# Ay119: Deep learning

In this assignment, you will apply your (newly acquired) knowledge of deep learning 101
to build a classifier for a real-life problem (well, actually not, but almost
-- see the very end of this exercise).

**Prerequisites**:

Run this in Google Colab! It has all that you will need.


## The problem: `rbp` classifier for the `GTF`

 You are running a state-of-the-art *robotic* sky survey called the Geeky Transient Facility,
 or GTF, perhaps somewhere in the Southern Hemisphere.
 You have a fancy image-differencing pipeline that allows you to find interesting objects in the
 dynamic sky every night, such as supernovae, variable stars, and asteroids, and send them
 out to the world as alerts.

 However, you notice two things:

 1. Your fancy image-differencing pipeline produces a lot of bogus detections.
 With the increasing data rates, you are worried that one day you will run out of
 graduate students to examine/scan the potentially interesting candidate sources.

 2. Your AGI-based robot-telescope is extraordinarily sophisticated and smart,
  but has one major soft spot: it really likes platypuses. In fact, the robot likes them
  so much that whenever a platypus appears in the vicinity, it halts observing
  and instead takes a picture of it and sends it out as an alert.

So you decide to build a `real, bogus, or platypus` (`rbp` for short) classifier for GTF.


## Transfer learning

You are not the first one to encounter a problem like this, so you decide to apply the
transfer learning technique and use the real-bogus classifier `braai` that is employed
by a similar survey called ZTF running in the Northern Hemisphere.


Import the necessities:

In [30]:
import matplotlib.pyplot as plt
import numpy as np
import pathlib
from sklearn.model_selection import train_test_split
import tensorflow as tf
from tqdm.keras import TqdmCallback

### The model

Download a pre-trained model:

In [2]:
!wget https://github.com/dmitryduev/kowalski/raw/master/kowalski/models/braai_d6_m9.h5

--2020-05-17 01:55:34--  https://github.com/dmitryduev/kowalski/raw/master/kowalski/models/braai_d6_m9.h5
Resolving github.com (github.com)... 192.30.255.113
Connecting to github.com (github.com)|192.30.255.113|:443... connected.
HTTP request sent, awaiting response... 302 Found
Location: https://raw.githubusercontent.com/dmitryduev/kowalski/master/kowalski/models/braai_d6_m9.h5 [following]
--2020-05-17 01:55:35--  https://raw.githubusercontent.com/dmitryduev/kowalski/master/kowalski/models/braai_d6_m9.h5
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 151.101.24.133
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|151.101.24.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 3809888 (3.6M) [application/octet-stream]
Saving to: ‘braai_d6_m9.h5.1’


2020-05-17 01:55:36 (7.22 MB/s) - ‘braai_d6_m9.h5.1’ saved [3809888/3809888]



Load the model:

In [3]:
model = tf.keras.models.load_model('braai_d6_m9.h5')

In [4]:
model.summary()

Model: "VGG6"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv1 (Conv2D)               (None, 61, 61, 16)        448       
_________________________________________________________________
conv2 (Conv2D)               (None, 59, 59, 16)        2320      
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 29, 29, 16)        0         
_________________________________________________________________
dropout (Dropout)            (None, 29, 29, 16)        0         
_________________________________________________________________
conv3 (Conv2D)               (None, 27, 27, 32)        4640      
_________________________________________________________________
conv4 (Conv2D)               (None, 25, 25, 32)        9248      
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 6, 6, 32)          0      
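
The output shapes and parameter counts in this summary can be checked by hand. The sketch below assumes 3x3 kernels with no padding and pooling windows of 2 and 4. Those values are not printed in the summary; they are inferred here from the numbers it does show. The helper names are hypothetical.

```python
def conv_out(size, kernel=3):
    """Spatial size after an unpadded ('valid') convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, window):
    """Spatial size after non-overlapping max pooling."""
    return size // window


def conv_params(channels_in, channels_out, kernel=3):
    """Kernel weights plus one bias per output channel."""
    return kernel * kernel * channels_in * channels_out + channels_out


size = 63                  # input cutouts are 63 x 63 x 3
size = conv_out(size)      # conv1
size = conv_out(size)      # conv2
size = pool_out(size, 2)   # max_pooling2d (assumed window of 2)
size = conv_out(size)      # conv3
size = conv_out(size)      # conv4
size = pool_out(size, 4)   # max_pooling2d_1 (assumed window of 4)

params = [
    conv_params(3, 16),    # conv1
    conv_params(16, 16),   # conv2
    conv_params(16, 32),   # conv3
    conv_params(32, 32),   # conv4
]
print(size, params)
```

If the assumptions hold, the final size and the four counts should match the `Output Shape` and `Param #` columns above.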

Use the functional API of `tf.keras`.

Remove the output layer, leave the feature extraction part of the network in place:

In [5]:
model_fe = tf.keras.Model(inputs=model.inputs, outputs=model.layers[-2].output)
model_fe.summary()

Model: "model"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv1_input (InputLayer)     [(None, 63, 63, 3)]       0         
_________________________________________________________________
conv1 (Conv2D)               (None, 61, 61, 16)        448       
_________________________________________________________________
conv2 (Conv2D)               (None, 59, 59, 16)        2320      
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 29, 29, 16)        0         
_________________________________________________________________
dropout (Dropout)            (None, 29, 29, 16)        0         
_________________________________________________________________
conv3 (Conv2D)               (None, 27, 27, 32)        4640      
_________________________________________________________________
conv4 (Conv2D)               (None, 25, 25, 32)        9248  

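Create a new output layer that will provide the class probabilities. The full `rbp` problem has three classes, but only the bogus and real data are loaded below (the platypus data is commented out), so the layer has two units here; use three once the platypus class is included.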

In [6]:
output = tf.keras.layers.Dense(2, activation='softmax')(model_fe.output)

In [7]:
model_tl = tf.keras.Model(inputs=model_fe.inputs, outputs=output)

In [8]:
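# Optionally freeze the pre-trained layers so that only the new output layer is trained.
# Left commented out here, so all layers are fine-tuned.
# for layer in model_tl.layers[:-1]:
#     layer.trainable = False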


In [19]:
model_tl.compile(
    optimizer='adam',
    loss='categorical_crossentropy',
    metrics=[
        tf.keras.metrics.TruePositives(name='tp'),
        tf.keras.metrics.FalsePositives(name='fp'),
        tf.keras.metrics.TrueNegatives(name='tn'),
        tf.keras.metrics.FalseNegatives(name='fn'),
        tf.keras.metrics.BinaryAccuracy(name='accuracy'),
        tf.keras.metrics.Precision(name='precision'),
        tf.keras.metrics.Recall(name='recall'),
        tf.keras.metrics.AUC(name='auc'),
    ]
)

### The data

In [9]:
p = pathlib.Path('data')

bogus = np.array([np.load(pp) for pp in (p / 'bogus').glob('*.npy')])
real = np.array([np.load(pp) for pp in (p / 'real').glob('*.npy')])
# platypus = np.array([np.load(pp) for pp in (p / 'platypus').glob('*.npy')])

data = np.vstack((bogus, real))

In [10]:
bogus.shape, real.shape

((100, 63, 63, 3), (100, 63, 63, 3))

Make labels:

In [11]:
indices = [0 for _ in range(bogus.shape[0])] +\
    [1 for _ in range(real.shape[0])]
labels = tf.one_hot(indices, depth=2, dtype=tf.int8).numpy()
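
To see what the one-hot encoding produces, and how it would extend once the platypus class is added, here is a small pure-Python sketch. The `one_hot` helper is hypothetical and only mimics what `tf.one_hot` returns for integer class indices. Assigning index 2 to platypus is an assumption.

```python
def one_hot(indices, depth):
    """Return one row per index, with a 1 in the column of that class."""
    return [[1 if column == index else 0 for column in range(depth)]
            for index in indices]


# Two classes, as in the cell above: 0 = bogus, 1 = real.
two_class = one_hot([0, 0, 1], depth=2)

# Three classes (assumed extension): 2 = platypus.
three_class = one_hot([0, 1, 2], depth=3)

print(two_class)
print(three_class)
```

The width of each label row has to match the number of units in the output layer, so a three-class label set needs a three-unit softmax.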
# labels


Train/validation/test split (we will use an 81% / 9% / 10% data split):

In [12]:
test_size = 0.1
val_size = 0.1
random_state = 42

train_indexes, test_indexes = train_test_split(range(len(labels)), shuffle=True,
                                               test_size=test_size, random_state=random_state)
train_indexes, val_indexes = train_test_split(train_indexes, shuffle=True,
                                              test_size=val_size, random_state=random_state)
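
The quoted percentages follow from applying the two 10% cuts in sequence. The sketch below reproduces the arithmetic for the 200 cutouts loaded above, assuming the held-out share is rounded up when it is not a whole number. `split_sizes` is a hypothetical helper, not part of scikit-learn.

```python
import math


def split_sizes(n_samples, test_size, val_size):
    """Sizes of the train/validation/test subsets after two successive splits."""
    n_test = math.ceil(n_samples * test_size)   # first cut: test set
    n_rest = n_samples - n_test
    n_val = math.ceil(n_rest * val_size)        # second cut: validation set
    n_train = n_rest - n_val
    return n_train, n_val, n_test


n_train, n_val, n_test = split_sizes(200, test_size=0.1, val_size=0.1)
fractions = [n / 200 for n in (n_train, n_val, n_test)]
print(n_train, n_val, n_test, fractions)
```

The validation share is 9% rather than 10% because `val_size` is taken from the 90% that remains after the test set is removed.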

In [13]:
batch_size = 4
shuffle_buffer_size = 4

epochs = 20


In [24]:
train_dataset = tf.data.Dataset.from_tensor_slices((data[train_indexes], labels[train_indexes]))
val_dataset = tf.data.Dataset.from_tensor_slices((data[val_indexes], labels[val_indexes]))
test_dataset = tf.data.Dataset.from_tensor_slices((data[test_indexes], labels[test_indexes]))

In [25]:
# Without augmentation, the pipeline would be:
# train_dataset = train_dataset.shuffle(shuffle_buffer_size).batch(batch_size).repeat(epochs)
# With augmentation by random horizontal and vertical flips:
train_dataset = train_dataset.cache().map(
    lambda image, label: (tf.image.random_flip_left_right(image), label)
).map(
    lambda image, label: (tf.image.random_flip_up_down(image), label)
).shuffle(shuffle_buffer_size).batch(batch_size).repeat(epochs)
val_dataset = val_dataset.batch(batch_size).repeat(epochs)
test_dataset = test_dataset.batch(batch_size)

In [26]:
steps_per_epoch_train = len(train_indexes) // batch_size - 1
steps_per_epoch_val = len(val_indexes) // batch_size - 1
steps_per_epoch_test = len(test_indexes) // batch_size - 1
steps_per_epoch_train, steps_per_epoch_val, steps_per_epoch_test

(39, 3, 4)
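
One way to read these step counts is as a batch budget: the datasets are batched and then repeated `epochs` times, so training must not request more batches than that pipeline can supply. The sketch below checks this under the assumption of 162 training and 18 validation samples, which is what the split of 200 cutouts gives. `batch_budget` is a hypothetical helper.

```python
import math


def batch_budget(n_samples, batch_size, epochs):
    """Compare the batches requested by fit() with those the dataset can supply."""
    steps_per_epoch = n_samples // batch_size - 1          # same formula as in the cell above
    batches_per_pass = math.ceil(n_samples / batch_size)   # includes the short last batch
    requested = steps_per_epoch * epochs
    available = batches_per_pass * epochs
    return steps_per_epoch, requested, available


train = batch_budget(162, batch_size=4, epochs=20)
val = batch_budget(18, batch_size=4, epochs=20)
print(train, val)
```

In both cases the requested number of batches stays below what is available, so the repeated datasets are not exhausted during training. The price is that each nominal epoch sees slightly fewer samples than one full pass over the data.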

In [31]:
early_stopping_callback = tf.keras.callbacks.EarlyStopping(
    monitor='val_loss',
    patience=5,
    restore_best_weights=True
)

tqdm_callback = TqdmCallback(verbose=1)

callbacks = [
    early_stopping_callback,
    tqdm_callback,
]
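
The effect of `patience=5` on `val_loss` can be illustrated without training anything. The sketch below is a simplified re-implementation of the patience rule, not the Keras code itself, and the list of validation losses is made up for illustration.

```python
def early_stopping_epoch(val_losses, patience):
    """Return (epoch at which training stops, epoch with the best loss)."""
    best = float("inf")
    best_epoch = -1
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            # Improvement: remember it and reset the counter.
            best, best_epoch, wait = loss, epoch, 0
            continue
        wait += 1
        if wait >= patience:
            return epoch, best_epoch
    # Never ran out of patience: training runs to the last epoch.
    return len(val_losses) - 1, best_epoch


# Hypothetical validation losses, one per epoch.
losses = [0.9, 0.5, 0.4, 0.45, 0.41, 0.42, 0.43, 0.44, 0.3]
stopped_epoch, best_epoch = early_stopping_epoch(losses, patience=5)
print(stopped_epoch, best_epoch)
```

In this made-up run, training stops before the late improvement in the final entry is ever reached. With `restore_best_weights=True`, the weights from the best epoch would be the ones kept.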


In [32]:
h = model_tl.fit(
    train_dataset, epochs=epochs, steps_per_epoch=steps_per_epoch_train,
    callbacks=callbacks,
    validation_data=val_dataset, validation_steps=steps_per_epoch_val,
    verbose=0)




In [33]:
model_tl.evaluate(test_dataset)



[0.00018121306493412703, 20.0, 0.0, 20.0, 0.0, 1.0, 1.0, 1.0, 1.0]
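
The returned list is easier to read with names attached. The sketch below assumes the order is the loss followed by the metrics in the order they were passed to `compile`. It also recomputes accuracy, precision, and recall from the four confusion-matrix counts as a sanity check.

```python
names = ["loss", "tp", "fp", "tn", "fn", "accuracy", "precision", "recall", "auc"]
values = [0.00018121306493412703, 20.0, 0.0, 20.0, 0.0, 1.0, 1.0, 1.0, 1.0]
report = dict(zip(names, values))

tp, fp, tn, fn = (report[key] for key in ("tp", "fp", "tn", "fn"))
n_entries = tp + fp + tn + fn      # total number of counted label entries
accuracy = (tp + tn) / n_entries
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(n_entries, accuracy, precision, recall)
```

The counts add up to 40 although the test set holds 20 cutouts. That is consistent with these metrics being accumulated over both one-hot columns of every label, but this reading is an assumption rather than something the output states.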