# Linear model in Keras from scratch


In [1]:
#Allow relative imports to directories above lesson1/
import os, sys
sys.path.insert(1, os.path.join(sys.path[0], '..'))

#import modules
from utils import *
from vgg16 import Vgg16

#Instantiate plotting tool
#In Jupyter notebooks, you will need to run this command before doing any plotting
%matplotlib inline


Using gpu device 0: GeForce GTX 1080 Ti (CNMeM is disabled, cuDNN 5103)
Using Theano backend.


# Introduction

We are going to train a linear model that takes the 1,000 predictions of the ImageNet model for each image as input and the dog/cat label as target.

In [2]:
%matplotlib inline
from __future__ import division,print_function
import os, json
from glob import glob
import numpy as np
import scipy
from sklearn.preprocessing import OneHotEncoder
from sklearn.metrics import confusion_matrix
np.set_printoptions(precision=4, linewidth=100)
from matplotlib import pyplot as plt
import utils; reload(utils)
from utils import plots, get_batches, plot_confusion_matrix, get_data

In [3]:
from numpy.random import random, permutation
from scipy import misc, ndimage
from scipy.ndimage.interpolation import zoom

import keras
from keras import backend as K
from keras.utils.data_utils import get_file
from keras.models import Sequential
from keras.layers import Input
from keras.layers.core import Flatten, Dense, Dropout, Lambda
from keras.layers.convolutional import Convolution2D, MaxPooling2D, ZeroPadding2D
from keras.optimizers import SGD, RMSprop
from keras.preprocessing import image

# Linear models in Keras

Each of the Dense() layers is just a *linear model*, followed by a *simple activation function*.

A linear model is simply a model where the output for each row is calculated as sum(row * weights) plus a constant bias term. The weights and the bias need to be learnt from the data and are the same for every row.
Let's create some data that we know is linearly related:


In [4]:
x = random((30,2))
y = np.dot(x, [2., 3.]) + 1

In [5]:
x[:5]

array([[ 0.7836,  0.7027],
       [ 0.5702,  0.7641],
       [ 0.4695,  0.5952],
       [ 0.7783,  0.3036],
       [ 0.6816,  0.5919]])

In [6]:
y[:5]

array([ 4.6752,  4.4326,  3.7247,  3.4674,  4.1387])
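
To make the row-wise definition concrete, here is a minimal sketch that recomputes `y` as an explicit weighted sum per row and compares it with the `np.dot` form used above. The names `weights`, `bias` and `y_loop` are illustrative, and the fixed seed is an assumption added only for reproducibility, so the values differ from the ones printed above.

```python
import numpy as np

rng = np.random.RandomState(0)   # fixed seed: an assumption, not in the notebook
x = rng.random_sample((30, 2))
weights = np.array([2., 3.])
bias = 1.

# Vectorised form, as in the cell above
y = np.dot(x, weights) + bias

# Same computation spelled out row by row: sum(row * weights) + bias
y_loop = np.array([sum(row * weights) + bias for row in x])

print(np.allclose(y, y_loop))
```

Both forms describe the same model; the vectorised one is simply what a Dense layer without activation computes for a whole batch at once.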

We use Keras to create a simple linear model (a *Dense()* layer with no activation) and optimize it with SGD to minimize mean squared error (MSE):

In [7]:
lm = Sequential([ Dense(1, input_shape=(2,)) ])
lm.compile(optimizer=SGD(lr=0.1), loss='mse')

The model has not been trained yet, so its weights are still at their random initial values and the bias is zero. We can inspect them and evaluate the loss function (MSE) as a baseline:

In [8]:
lm.get_weights()

[array([[-0.7223],
        [ 0.2224]], dtype=float32), array([ 0.], dtype=float32)]

In [9]:
lm.evaluate(x, y, verbose=0)

14.826991081237793
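
For a single-output model with the 'mse' loss, the number returned by `evaluate` should correspond to the mean of the squared differences between predictions and targets. The sketch below computes that quantity by hand in NumPy. The helper `mse` is hypothetical, and because `x` is regenerated here with an assumed seed, the result will not match the value printed above.

```python
import numpy as np

def mse(x, y, w, b):
    """Mean squared error of the linear model x.w + b against targets y."""
    pred = np.dot(x, w) + b
    return np.mean((pred - y) ** 2)

rng = np.random.RandomState(0)          # assumed seed, for reproducibility only
x = rng.random_sample((30, 2))
true_w = np.array([2., 3.])
y = np.dot(x, true_w) + 1

w_init = np.array([-0.7223, 0.2224])    # initial weights printed above
b_init = 0.

print(mse(x, y, w_init, b_init))        # large: the weights are untrained
print(mse(x, y, true_w, 1.))            # zero at the true parameters
```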

Let's start training the model, for 5 epochs with a batch size of 1:

In [10]:
lm.fit(x, y, nb_epoch=5, batch_size = 1)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7fa8617f5090>

In [11]:
# The loss function improves
lm.evaluate(x, y, verbose=0)

0.012117226608097553

In [12]:
# The weights improve as well, moving towards the true values (2., 3.) and the bias towards 1.
lm.get_weights()

[array([[ 1.7155],
        [ 2.8075]], dtype=float32), array([ 1.2782], dtype=float32)]

Another round of training and evaluation:

In [13]:
lm.fit(x, y, nb_epoch=5, batch_size = 1)
lm.get_weights()

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


[array([[ 1.946 ],
        [ 2.9694]], dtype=float32), array([ 1.0459], dtype=float32)]

In [14]:
lm.evaluate(x, y, verbose=0)

0.00034131854772567749

In [15]:
lm.fit(x, y, nb_epoch=5, batch_size = 1)
lm.get_weights()

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


[array([[ 1.992 ],
        [ 2.9945]], dtype=float32), array([ 1.0078], dtype=float32)]

lm.evaluate(x, y, verbose=0)

In [16]:
lm.fit(x, y, nb_epoch=5, batch_size = 1)
lm.get_weights()

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


[array([[ 1.9984],
        [ 2.9991]], dtype=float32), array([ 1.0015], dtype=float32)]

In [17]:
lm.evaluate(x, y, verbose=0)

3.8757369225095317e-07
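
As a sanity check on the values SGD converges to, the same parameters can be obtained in closed form with ordinary least squares. This swaps in a different technique (`np.linalg.lstsq`) from the one used in the notebook; the augmented matrix `x_aug` and the seed are assumptions for illustration.

```python
import numpy as np

rng = np.random.RandomState(0)   # assumed seed, for reproducibility only
x = rng.random_sample((30, 2))
y = np.dot(x, [2., 3.]) + 1

# Append a column of ones so the bias is fitted as a third weight
x_aug = np.hstack([x, np.ones((len(x), 1))])

# Ordinary least squares; rcond=None selects NumPy's current default cutoff
solution, residuals, rank, singular_values = np.linalg.lstsq(x_aug, y, rcond=None)

print(solution)   # close to [2., 3., 1.]
```

For a problem this small the closed form is exact and instant; SGD matters when the model or the data is too large for that, which is the case for the networks used later.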

# Train linear model on predictions

Now that we have seen how Keras fits a *linear model*, we can use a *Dense()* layer to convert the 1,000 predictions given by the ImageNet model (the input) into a probability of dog vs. cat (the output), learning from the Kaggle data.
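
The shape of that computation can be sketched in NumPy: a linear map from 1,000 inputs to 2 outputs, followed by a softmax so that each row reads as a pair of probabilities. Everything here is a placeholder. The inputs are random vectors standing in for real ImageNet predictions, the parameters `W` and `b` are untrained, and the use of softmax is an assumption about how the two outputs are turned into probabilities.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

rng = np.random.RandomState(0)

# Placeholder inputs: 4 fake "images", each a 1,000-way probability vector
features = rng.dirichlet(np.ones(1000), size=4)

# Hypothetical, untrained parameters of a 1000 -> 2 linear layer
W = rng.normal(scale=0.01, size=(1000, 2))
b = np.zeros(2)

probs = softmax(np.dot(features, W) + b)

print(probs.shape)        # (4, 2)
print(probs.sum(axis=1))  # each row sums to 1
```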

## Training the model

We start with some basic configuration steps and copy a small amount of our data into a 'sample' directory with exactly the same structure as our 'train' directory.
It's *always* a good idea in machine learning to run initial tests on a smaller dataset to save time.
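
One way to build such a sample is sketched below. The helper `make_sample`, the class folder names `cats` and `dogs`, and the file names are all hypothetical; the demonstration runs on a throwaway directory with empty placeholder files rather than on the real Kaggle data.

```python
import os
import shutil
import tempfile
from glob import glob
from random import Random

def make_sample(train_dir, sample_dir, n_per_class, seed=0):
    """Copy n_per_class random files from each class folder of train_dir."""
    rnd = Random(seed)
    for class_dir in sorted(glob(os.path.join(train_dir, '*'))):
        if not os.path.isdir(class_dir):
            continue
        # Mirror the class folder name under sample_dir
        dest = os.path.join(sample_dir, os.path.basename(class_dir))
        if not os.path.isdir(dest):
            os.makedirs(dest)
        files = sorted(os.listdir(class_dir))
        for name in rnd.sample(files, min(n_per_class, len(files))):
            shutil.copyfile(os.path.join(class_dir, name),
                            os.path.join(dest, name))

# Demonstration on a temporary tree with empty placeholder files
root = tempfile.mkdtemp()
train_dir = os.path.join(root, 'train')
for cls in ('cats', 'dogs'):
    os.makedirs(os.path.join(train_dir, cls))
    for i in range(10):
        open(os.path.join(train_dir, cls, '%s.%d.jpg' % (cls[:-1], i)), 'w').close()

sample_dir = os.path.join(root, 'sample')
make_sample(train_dir, sample_dir, n_per_class=3)

print(sorted(os.listdir(sample_dir)))
```

Files are copied rather than moved, so the full 'train' directory stays intact for the later, complete training run.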