# Exercise 4

Work on this before the next lecture on 26 April. We will talk about questions, comments, and solutions during the exercise after the third lecture.

Please do form study groups! When you do, make sure you can explain everything in your own words; do not simply copy and paste from others.

The solutions to a lot of these problems can probably be found with Google. Please don't. You will not learn a lot by copying and pasting from the internet.

If you want to get credit/examination for this course, please upload your work to your GitHub repository for this course before the next lecture starts and post a link to your repository in [this thread](https://github.com/wildtreetech/advanced-computing-2018/issues/8). If you worked on things together with others, please add their names to the notebook so we can see who formed groups.

The overall idea of this exercise is to get you using and building convolutional neural networks.

## Question 1

In the last exercise you built a neural network that can classify fashion items using only densely connected layers.

Build on this by using convolutions, pooling, dropout, batch norm, etc. in your neural network. Can you outperform your densely connected network?
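As a reminder of what the two central building blocks compute, here is a minimal NumPy sketch of an unpadded 3x3 convolution followed by ReLU and max pooling on a single image. The helper names `conv2d_valid` and `max_pool` are made up for this illustration, and the random image and filter are stand-ins; like the Keras layer, the "convolution" here does not flip the kernel.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` with stride 1 and no padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def max_pool(feature_map, size, stride):
    """Keep the maximum of each size x size window; incomplete windows are dropped."""
    out_h = (feature_map.shape[0] - size) // stride + 1
    out_w = (feature_map.shape[1] - size) // stride + 1
    out = np.empty((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            r, c = i * stride, j * stride
            out[i, j] = feature_map[r:r + size, c:c + size].max()
    return out

rng = np.random.default_rng(0)
image = rng.random((28, 28))          # stand-in for one 28x28 grayscale image
kernel = rng.standard_normal((3, 3))  # one random 3x3 filter (a trained layer learns these)

feature_map = np.maximum(conv2d_valid(image, kernel), 0)  # convolution followed by ReLU
pooled = max_pool(feature_map, size=2, stride=2)

print(feature_map.shape, pooled.shape)
```

A real convolutional layer applies many such filters at once and learns their values during training; the loops here are only meant to make the operation visible.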

Start with a small network and a fraction of the data to check if you hooked everything up correctly. Don't go overboard with the size of the network either as even small networks take quite a while to train.
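One way to get "a fraction of the data" is to draw a random subset of the training arrays before fitting. The sketch below assumes the data are NumPy arrays with samples along the first axis; the helper `subsample` and the demo arrays are hypothetical and only mimic the layout of the real data.

```python
import numpy as np

def subsample(X, y, fraction, seed=0):
    """Return a random subset containing roughly `fraction` of the samples."""
    rng = np.random.default_rng(seed)
    n_keep = max(1, int(len(X) * fraction))
    idx = rng.choice(len(X), size=n_keep, replace=False)
    return X[idx], y[idx]

# Stand-in arrays with the same layout as the Fashion-MNIST training data
X_demo = np.zeros((1000, 28, 28, 1), dtype=np.float32)
y_demo = np.arange(1000) % 10

X_small, y_small = subsample(X_demo, y_demo, fraction=0.1)
print(X_small.shape, y_small.shape)
```

Training for an epoch or two on such a subset is usually enough to reveal shape mismatches or a loss that does not move.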

(If you want to experiment with a free GPU, check out https://kaggle.com/kernels.)

In [42]:
# Prepare data
from sklearn.model_selection import train_test_split
from keras import utils
from keras.datasets import fashion_mnist
import numpy as np

(X_train, y_train), (X_test, y_test) = fashion_mnist.load_data()

# Add a channel axis and convert to float32 (a single cast is enough)
X_train = X_train.reshape(X_train.shape[0], 28, 28, 1).astype('float32')
X_test = X_test.reshape(X_test.shape[0], 28, 28, 1).astype('float32')
# Scale pixel values from 0..255 to 0..1
X_train /= 255
X_test /= 255

X_train, X_val, y_train, y_val = train_test_split(X_train, y_train,
                                                  test_size=10000,
                                                  random_state=42)


num_classes = 10
y_train_ = utils.to_categorical(y_train, num_classes)
y_val_ = utils.to_categorical(y_val, num_classes)
y_test_ = utils.to_categorical(y_test, num_classes)

In [46]:
# Build the neural network
from keras.models import Model
from keras.layers import Input, Dense, Activation, Flatten, Conv2D, MaxPool2D

# we define the input shape (i.e., how many input features) **without** the batch size
x = Input(shape=(28, 28, 1))

h = Conv2D(20, 3, activation='relu', strides=1)(x)
h = MaxPool2D(3, strides=3)(h)
#h = Conv2D(12, 3, activation='relu')(h)
#h = MaxPool2D(2, strides=2)(h)
h = Flatten()(h)

h = Dense(10, activation='relu')(h)

# we want to predict one of ten classes
h = Dense(10)(h)
y = Activation('softmax')(h)

# Package it all up in a Model
net = Model(x, y)

In [47]:
net.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_17 (InputLayer)        (None, 28, 28, 1)         0         
_________________________________________________________________
conv2d_27 (Conv2D)           (None, 26, 26, 20)        200       
_________________________________________________________________
max_pooling2d_25 (MaxPooling (None, 8, 8, 20)          0         
_________________________________________________________________
flatten_14 (Flatten)         (None, 1280)              0         
_________________________________________________________________
dense_40 (Dense)             (None, 10)                12810     
_________________________________________________________________
dense_41 (Dense)             (None, 10)                110       
_________________________________________________________________
activation_14 (Activation)   (None, 10)                0         
Total para
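The output shapes and parameter counts in the summary above can be checked by hand. The sketch below redoes that arithmetic for the layers listed, assuming unpadded convolution and pooling windows; the helper function names are made up for this illustration.

```python
def conv_output_size(n, kernel, stride=1):
    """Output width/height of an unpadded ('valid') convolution or pooling window."""
    return (n - kernel) // stride + 1

def conv_params(kernel, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel * kernel * in_channels * filters + filters

def dense_params(n_in, n_out):
    """Weights plus one bias per output unit."""
    return n_in * n_out + n_out

after_conv = conv_output_size(28, kernel=3, stride=1)          # 26
after_pool = conv_output_size(after_conv, kernel=3, stride=3)  # 8
flat = after_pool * after_pool * 20                            # 1280

params = [
    conv_params(3, 1, 20),   # 200
    dense_params(flat, 10),  # 12810
    dense_params(10, 10),    # 110
]
print(after_conv, after_pool, flat, params, sum(params))
```

Almost all of the parameters sit in the first dense layer, so the convolutional part of this network is very small compared with the classifier attached to it.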

In [48]:
# Run networks
net.compile(loss='categorical_crossentropy',
            optimizer='sgd',
            metrics=['accuracy'])
batch_size = 128
history = net.fit(X_train, y_train_,
                  batch_size=batch_size,
                  epochs=20,
                  verbose=1,
                  validation_data=(X_val, y_val_))

Train on 50000 samples, validate on 10000 samples
Epoch 1/20
...
Epoch 20/20


In [49]:
score = net.evaluate(X_test, y_test_, batch_size=128)
for na, sc in zip(net.metrics_names, score):
    print('{} is {}'.format(na, sc))

loss is 0.4814257012844086
acc is 0.8236


For some reason I cannot easily improve on the fully connected network from exercise 2. The convolutional network is much slower to train and does not converge any faster.

## Question 2

For most real-world applications we do not have enough labelled images to train a large neural network from scratch. Instead we can use a pre-trained network as a feature transformer and train a smaller model (or even just a logistic regression) on the output of the pre-trained network.
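To make the second half of that idea concrete, here is a minimal sketch of a logistic regression trained on fixed feature vectors. Everything in it is an assumption for illustration: the features are synthetic Gaussian blobs standing in for flattened network outputs, and the model is fitted with plain batch gradient descent in NumPy rather than with a library implementation.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for flattened network outputs: 200 "images", 64 features each
n_per_class, n_features = 100, 64
features = np.concatenate([
    rng.normal(-1.0, 1.0, size=(n_per_class, n_features)),  # class 0
    rng.normal(+1.0, 1.0, size=(n_per_class, n_features)),  # class 1
])
labels = np.concatenate([np.zeros(n_per_class), np.ones(n_per_class)])

# Shuffle, then hold out a quarter of the samples for evaluation
order = rng.permutation(len(labels))
features, labels = features[order], labels[order]
n_train = 150
X_tr, y_tr = features[:n_train], labels[:n_train]
X_te, y_te = features[n_train:], labels[n_train:]

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-np.clip(z, -30, 30)))

# Logistic regression fitted with plain batch gradient descent
w = np.zeros(n_features)
b = 0.0
learning_rate = 0.1
for _ in range(200):
    p = sigmoid(X_tr @ w + b)
    grad_w = X_tr.T @ (p - y_tr) / n_train
    grad_b = np.mean(p - y_tr)
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

test_accuracy = np.mean((sigmoid(X_te @ w + b) > 0.5) == y_te)
print('held-out accuracy:', test_accuracy)
```

With real extracted features the arrays would come from the pre-trained network instead of the random generator, and a library classifier would probably be the more convenient choice.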

There are several pre-trained networks available as part of Keras: https://keras.io/applications/. The documentation usually gives some information or links about each network.

The documentation also contains snippets on how to use a pre-trained network as a feature transformer ("Extract features with VGG16"). You should be able to generalise from that example using VGG16 to approximately any of the networks available there.

One important thing not to forget is that you need to preprocess your images before feeding them into a pre-trained network. Keras provides the functions to do that as well, use them :) You might also need to resize your images first.
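To give a feel for what such a preprocessing step involves, the sketch below imitates the Caffe-style convention that, to my understanding, the VGG16 preprocessing follows: reorder the channels from RGB to BGR and subtract the ImageNet channel means, without rescaling. The function name `caffe_style_preprocess` is made up, and in real code the Keras-provided function for the chosen network should be used instead.

```python
import numpy as np

# ImageNet per-channel means in BGR order, as used by Caffe-style preprocessing
IMAGENET_MEAN_BGR = np.array([103.939, 116.779, 123.68])

def caffe_style_preprocess(batch_rgb):
    """Approximate Caffe-style preprocessing for a batch of RGB images.

    batch_rgb: float array of shape (n, height, width, 3) with values in 0..255.
    """
    batch_bgr = batch_rgb[..., ::-1]        # reorder channels RGB -> BGR
    return batch_bgr - IMAGENET_MEAN_BGR    # zero-centre each channel, no rescaling

batch = np.full((2, 224, 224, 3), 128.0)    # stand-in for two mid-grey images
processed = caffe_style_preprocess(batch)
print(processed.shape, processed[0, 0, 0])
```

Other networks expect different conventions (for example inputs scaled to a fixed range), which is why each network ships with its own preprocessing function.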

The task for this question is to build a classifier that can tell road bikes from mountain bikes. Start by using a pre-trained network as a feature transformer and logistic regression as the classifier on the output of the pre-trained network. Once this works you can experiment further: extract features from earlier layers of the pre-trained network, compare your performance to a small network trained from scratch, try to beat your neural net by extracting features by hand and feeding them to a random forest, increase your dataset size by [augmenting the data](https://keras.io/preprocessing/image/), etc.
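The simplest form of augmentation is to add a mirrored copy of every image, which should not change whether a bike is a road or a mountain bike. The NumPy sketch below shows that idea on a tiny stand-in array; the helper `augment_with_flips` is hypothetical, and the Keras tools linked above offer many more transformations.

```python
import numpy as np

def augment_with_flips(images, labels):
    """Append a left-right mirrored copy of every image, keeping its label."""
    flipped = images[:, :, ::-1, :]   # reverse the width axis of (n, h, w, channels)
    return (np.concatenate([images, flipped]),
            np.concatenate([labels, labels]))

# Tiny stand-in batch: two 4x4 RGB "images" with labels 0 and 1
images = np.arange(2 * 4 * 4 * 3, dtype=np.float32).reshape(2, 4, 4, 3)
labels = np.array([0, 1])

aug_images, aug_labels = augment_with_flips(images, labels)
print(aug_images.shape, aug_labels)
```

Augmented copies should only be added to the training split, otherwise mirrored versions of the same image can end up on both sides of the train/validation split.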

The dataset, containing about 100 labelled images each of road and mountain bikes, is here: https://github.com/wildtreetech/advanced-computing-2018/blob/master/data/road-and-mountain-bikes.zip

In [36]:
from keras import applications
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input, decode_predictions
import numpy as np


# for example load the VGG16 network
model = applications.VGG16(include_top=False,
                           weights='imagenet')

img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224, 224))
x = image.img_to_array(img)
print(x.shape)
x = np.expand_dims(x, axis=0)
print(x.shape)
x = preprocess_input(x)

features = model.predict(x)


(224, 224, 3)
(1, 224, 224, 3)


In [41]:
# Road vs mountain bikes
import glob
from sklearn.model_selection import train_test_split

mobikes = glob.glob('data/bikes/mountain_bikes/*.jpg')
robikes = glob.glob('data/bikes/road_bikes/*.jpg')

mobikes_data = np.array([image.img_to_array(image.load_img(path, target_size=(224, 224))) for path in mobikes])
robikes_data = np.array([image.img_to_array(image.load_img(path, target_size=(224, 224))) for path in robikes])

mobikes_labels = np.array([[1,0]]*len(mobikes_data))
robikes_labels = np.array([[0,1]]*len(robikes_data))

full_data = np.concatenate((mobikes_data, robikes_data))
full_labels = np.concatenate((mobikes_labels, robikes_labels))


In [45]:
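# Load VGG16 without its classification head, to use as a feature extractor
model = applications.VGG16(include_top=False,
                           weights='imagenet')

# Apply the VGG16-specific preprocessing before extracting features
# (a copy is passed because preprocessing may modify the array in place)
full_features = model.predict(preprocess_input(full_data.copy()))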


In [47]:
bikes_train, bikes_val, labs_train, labs_val = train_test_split(full_features, full_labels,
                                                  random_state=42)

In [77]:
#bikes_train[0] - bikes_train[1]

In [78]:
# Build the neural network
from keras.models import Model
from keras.layers import Input, Dense, Activation, Flatten, Conv2D, MaxPool2D


x = Input(shape=(7, 7, 512))

h = Flatten()(x)
h = Dense(20, activation='relu')(h)

h = Dense(2)(h)
y = Activation('softmax')(h)

net = Model(x, y)

In [79]:
net.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_30 (InputLayer)        (None, 7, 7, 512)         0         
_________________________________________________________________
flatten_11 (Flatten)         (None, 25088)             0         
_________________________________________________________________
dense_25 (Dense)             (None, 20)                501780    
_________________________________________________________________
dense_26 (Dense)             (None, 2)                 42        
_________________________________________________________________
activation_10 (Activation)   (None, 2)                 0         
Total params: 501,822
Trainable params: 501,822
Non-trainable params: 0
_________________________________________________________________


In [80]:
# Run networks
net.compile(loss='categorical_crossentropy',
            optimizer='sgd',
            metrics=['accuracy'])
batch_size = 128

dtrain, dval, ltrain, lval = train_test_split(bikes_train, labs_train,
                                                  random_state=42)
history = net.fit(dtrain, ltrain,
                  batch_size=batch_size,
                  epochs=120,
                  verbose=1,
                  validation_data=(dval, lval))

Train on 118 samples, validate on 40 samples
Epoch 1/120
...
Epoch 120/120


There's a bug I do not understand. Nothing is happening.
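One thing that might contribute (a guess, not a confirmed diagnosis) is how few weight updates this run performs: with 118 training samples and a batch size of 128, every epoch consists of a single gradient step. The arithmetic below uses the numbers from the training logs in this notebook. It may also be worth confirming that the images went through `preprocess_input` before the features were extracted.

```python
import math

n_train = 118        # from the log line "Train on 118 samples"
batch_size = 128
epochs = 120

updates_per_epoch = math.ceil(n_train / batch_size)
total_updates = updates_per_epoch * epochs
print(updates_per_epoch, total_updates)

# For comparison, the Fashion-MNIST run in Question 1: 50000 samples, 20 epochs
fashion_updates = math.ceil(50000 / 128) * 20
print(fashion_updates)
```

A smaller batch size or more epochs would give the optimizer more steps to work with; whether that is enough to make the loss move here would need to be tested.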

## Question 3

Think about what project you want to do. What makes a good project? It should use some of what you learnt in this class, there should be labelled data available already, and it should be something you are interested in.

You will have to write a short report on what you did. To write an interesting report you need to tell a story, not just "first I did A, then I did B, then I did X, and finally D".

It also has to go a bit beyond simply training a classifier or regression model.

An example based on the bike images from the previous question:

A local bike shop wants to keep an eye on sales of bikes on eBay. They specialise in road bikes so they want to be able to filter out all adverts for mountain bikes. They have found that people writing eBay adverts are not very good at correctly labelling their adverts. Can they use machine learning to help classify adverts?

We investigate labelling adverts based on the image in the advert and study different trade-offs in misclassifying bikes. The network is trained on 100 images from a catalog which show bikes on a white background. We compare the performance of the network on the training data and on a small set of hand-labelled images of bikes in the wild.
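One way such a trade-off study could look is to vary the decision threshold and watch precision and recall move in opposite directions. The sketch below is purely illustrative: the labels and scores are made up, and the helper `precision_recall` is hypothetical.

```python
# Hypothetical classifier scores: estimated probability that an advert shows a road bike.
# 1 = road bike, 0 = mountain bike. These numbers are made up for illustration.
true_labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.4, 0.7, 0.3, 0.2, 0.1]

def precision_recall(true_labels, scores, threshold):
    """Precision and recall for the 'road bike' class at a given threshold."""
    predicted = [s >= threshold for s in scores]
    tp = sum(p and t == 1 for p, t in zip(predicted, true_labels))
    fp = sum(p and t == 0 for p, t in zip(predicted, true_labels))
    fn = sum((not p) and t == 1 for p, t in zip(predicted, true_labels))
    precision = tp / (tp + fp) if tp + fp else 1.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

for threshold in (0.35, 0.5, 0.75):
    p, r = precision_recall(true_labels, scores, threshold)
    print('threshold={:.2f} precision={:.2f} recall={:.2f}'.format(threshold, p, r))
```

For the bike shop, a low threshold means few road bike adverts are missed but more mountain bikes slip through, while a high threshold gives a cleaner list at the cost of missed adverts.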