# Deep learning part 2 (convolutional neural network)
We will go through the following in this notebook:
exploring and processing the data
building and training our convolutional network
testing with our own images

In [2]:
# Let's get our data!
import keras

from keras.datasets import cifar10
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

# We are going to use Keras to build this architecture.
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Conv2D, MaxPooling2D
model = Sequential()

In [3]:
print('x_train shape:', x_train.shape)
print('y_train shape:', y_train.shape)


x_train shape: (50000, 32, 32, 3)
y_train shape: (50000, 1)


In [4]:
# We will now take a look at an individual image. Let's look at the second image from the training set.
print(x_train[1])

[[[154 177 187]
  [126 137 136]
  [105 104  95]
  ...
  [ 91  95  71]
  [ 87  90  71]
  [ 79  81  70]]

 [[140 160 169]
  [145 153 154]
  [125 125 118]
  ...
  [ 96  99  78]
  [ 77  80  62]
  [ 71  73  61]]

 [[140 155 164]
  [139 146 149]
  [115 115 112]
  ...
  [ 79  82  64]
  [ 68  70  55]
  [ 67  69  55]]

 ...

 [[175 167 166]
  [156 154 160]
  [154 160 170]
  ...
  [ 42  34  36]
  [ 61  53  57]
  [ 93  83  91]]

 [[165 154 128]
  [156 152 130]
  [159 161 142]
  ...
  [103  93  96]
  [123 114 120]
  [131 121 131]]

 [[163 148 120]
  [158 148 122]
  [163 156 133]
  ...
  [143 133 139]
  [143 134 142]
  [143 133 144]]]


Our kernel kept dying when displaying the image with matplotlib, so we are removing this section.

What we really want is the likelihood of each of the 10 classes, so the neural network needs 10 output neurons. Since there are 10 output neurons, our labels must have a matching shape, so we convert each label into a set of 10 numbers, where each number indicates whether the image belongs to that class. If an image belongs to the first class, the first number in this set is 1 and all the other numbers are 0. To convert the labels to this one-hot encoding we will use a Keras function.

In [5]:
y_train_one_hot = keras.utils.to_categorical(y_train, 10)
y_test_one_hot = keras.utils.to_categorical(y_test, 10)

print('The one hot label is:', y_train_one_hot[1])

The one hot label is: [0. 0. 0. 0. 0. 0. 0. 0. 0. 1.]
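
To make the conversion concrete, here is a minimal pure-Python sketch of what one-hot encoding does. The helper `one_hot` is a hypothetical stand-in written for illustration, not the Keras implementation, and the label 9 is inferred from the output above, where the 1 sits in the last position.

```python
def one_hot(label, num_classes=10):
    """Return a list of num_classes floats with a 1.0 at index `label`."""
    if not 0 <= label < num_classes:
        raise ValueError("label must be in [0, num_classes)")
    encoded = [0.0] * num_classes
    encoded[label] = 1.0
    return encoded

# A label of 9 puts the single 1 in the last of the 10 positions.
print(one_hot(9))
```

Each encoded label sums to 1, which is what lets it be compared directly with the 10 probabilities the network outputs.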


In [6]:
# A common step is to scale the values to lie between 0 and 1, which aids the training of our neural network. Since our pixel values lie between 0 and 255, we simply need to divide by 255.

In [7]:
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train = x_train / 255
x_test = x_test / 255

x_train[0]

array([[[0.23137255, 0.24313726, 0.24705882],
        [0.16862746, 0.18039216, 0.1764706 ],
        [0.19607843, 0.1882353 , 0.16862746],
        ...,
        [0.61960787, 0.5176471 , 0.42352942],
        [0.59607846, 0.49019608, 0.4       ],
        [0.5803922 , 0.4862745 , 0.40392157]],

       [[0.0627451 , 0.07843138, 0.07843138],
        [0.        , 0.        , 0.        ],
        [0.07058824, 0.03137255, 0.        ],
        ...,
        [0.48235294, 0.34509805, 0.21568628],
        [0.46666667, 0.3254902 , 0.19607843],
        [0.47843137, 0.34117648, 0.22352941]],

       [[0.09803922, 0.09411765, 0.08235294],
        [0.0627451 , 0.02745098, 0.        ],
        [0.19215687, 0.10588235, 0.03137255],
        ...,
        [0.4627451 , 0.32941177, 0.19607843],
        [0.47058824, 0.32941177, 0.19607843],
        [0.42745098, 0.28627452, 0.16470589]],

       ...,

       [[0.8156863 , 0.6666667 , 0.3764706 ],
        [0.7882353 , 0.6       , 0.13333334],
        [0.7764706 , 0
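
As a quick sanity check of the scaling step, the sketch below divides a few raw pixel values by 255 in plain Python. The raw values 59, 62 and 63 are an assumption: they are the integers that would produce the first row of the output above, not values printed in this notebook. The helper name `scale_pixels` is also made up for this example.

```python
def scale_pixels(pixels, max_value=255):
    """Map integer pixel intensities in [0, max_value] to floats in [0, 1]."""
    return [p / max_value for p in pixels]

raw = [59, 62, 63]            # assumed raw RGB values for one pixel
scaled = scale_pixels(raw)
print([round(v, 4) for v in scaled])
```

Because 0 maps to 0.0 and 255 maps to 1.0, every scaled value is guaranteed to stay inside the range the network expects.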

Here is the breakdown of the architecture that we will build and train for this network:
Conv Layer (Filter size 3x3, Depth 32)
Conv Layer (Filter size 3x3, Depth 32)
Max Pool Layer (Filter size 2x2)
Dropout Layer (Prob of dropout 0.25)
Conv Layer (Filter size 3x3, Depth 64)
Conv Layer (Filter size 3x3, Depth 64)
Max Pool Layer (Filter size 2x2)
Dropout Layer (Prob of dropout 0.25)
FC Layer (512 neurons)
Dropout Layer (Prob of dropout 0.5)
FC Layer, Softmax (10 neurons)
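
The dropout layers in this list may be easier to picture with a small sketch. The function below is a hypothetical pure-Python illustration of the usual "inverted dropout" scheme at training time: each value is zeroed with probability `rate`, and the survivors are scaled by 1 / (1 - rate) so that the expected sum stays the same. It is not the Keras implementation, and the seed is only there to make the run repeatable.

```python
import random

def dropout(values, rate, rng):
    """Zero each value with probability `rate`; scale the rest by 1/(1-rate)."""
    keep_scale = 1.0 / (1.0 - rate)
    return [0.0 if rng.random() < rate else v * keep_scale for v in values]

rng = random.Random(0)          # fixed seed, chosen arbitrarily
activations = [1.0] * 8         # toy activations, not real layer output
dropped = dropout(activations, rate=0.25, rng=rng)
print(dropped)
```

Dropout is only active during training; when the model is evaluated or used for prediction, the values pass through unchanged.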

The first layer is a conv layer with filter size 3x3, stride 1 (in both dimensions), and depth 32. The padding is 'same' and the activation is 'relu' (these two settings apply to all conv layers in our CNN). We add this layer to our empty sequential model using the function model.add().

The first number, 32, refers to the depth. The next pair of numbers, (3,3), refers to the filter height and width. Then we specify the activation, which is 'relu', and the padding, which is 'same'. Notice that we did not specify the stride. This is because stride=1 is the default setting, so we need not specify it unless we want to change it.

If you recall, we also need to specify an input size for our first layer; subsequent layers do not need this, since they can infer their input size from the output size of the previous layer.
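
To see why 'same' padding with stride 1 keeps the width and height unchanged, here is a simplified sketch of a single 3x3 filter sliding over a single-channel image. The function `conv2d_same` and the toy inputs are assumptions made for illustration: a real conv layer applies many filters (32 here), each spanning all input channels and carrying a bias, none of which is modelled below.

```python
def conv2d_same(image, kernel):
    """Stride-1 2D convolution (cross-correlation) with zero 'same' padding."""
    h, w = len(image), len(image[0])
    kh, kw = len(kernel), len(kernel[0])
    pad_h, pad_w = kh // 2, kw // 2      # valid for odd kernel sizes
    out = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            total = 0.0
            for di in range(kh):
                for dj in range(kw):
                    r, c = i + di - pad_h, j + dj - pad_w
                    if 0 <= r < h and 0 <= c < w:   # outside counts as zero
                        total += image[r][c] * kernel[di][dj]
            out[i][j] = max(total, 0.0)             # ReLU activation
    return out

image = [[1.0] * 4 for _ in range(4)]     # toy 4x4 single-channel image
kernel = [[1.0] * 3 for _ in range(3)]    # toy 3x3 filter of ones
result = conv2d_same(image, kernel)
print(len(result), len(result[0]))
```

The output has the same 4x4 size as the input. Values near the border are smaller than in the centre because part of the filter falls on the zero padding.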

In [8]:
#first layer
model.add(Conv2D(32, (3, 3), activation='relu', padding='same', input_shape=(32,32,3)))

#second layer
model.add(Conv2D(32, (3, 3), activation='relu', padding='same'))

#third layer
model.add(MaxPooling2D(pool_size=(2, 2)))

#dropout layer
model.add(Dropout(0.25))


#and then our next four layers where the depth of conv layer is 64 rather than 32
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))
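
The max pooling step used above can be sketched in a few lines. This is a hypothetical single-channel illustration with a made-up 4x4 feature map, assuming a 2x2 window that moves two positions at a time (the stride equals the pool size), so each output value is the maximum of one non-overlapping 2x2 block.

```python
def max_pool_2x2(feature_map):
    """2x2 max pooling with stride 2 on a single-channel 2D list."""
    h, w = len(feature_map), len(feature_map[0])
    return [
        [
            max(feature_map[i][j], feature_map[i][j + 1],
                feature_map[i + 1][j], feature_map[i + 1][j + 1])
            for j in range(0, w - 1, 2)
        ]
        for i in range(0, h - 1, 2)
    ]

fmap = [
    [1, 3, 2, 0],
    [4, 2, 1, 5],
    [0, 1, 7, 2],
    [3, 6, 1, 1],
]
pooled = max_pool_2x2(fmap)
print(pooled)
```

Both the width and the height are halved, which is why a 32x32 feature map becomes 16x16 after pooling.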

Now we need to code our fully connected layer. However, our neurons are arranged in a cube format rather than in a row. To condense them into a row, we will use a Flatten layer.
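
To get a feel for the size of that cube, the sketch below follows the (height, width, channels) shape through the conv and pooling layers added so far. The helper `trace_shapes` is hypothetical; it assumes 'same' padding with stride 1 for every conv layer and 2x2 pooling, and it leaves out the dropout layers because they do not change the shape.

```python
def trace_shapes(height, width, channels, layers):
    """Follow (height, width, channels) through 'same' convs and 2x2 pools."""
    shapes = [(height, width, channels)]
    for kind, arg in layers:
        if kind == "conv":        # 'same' padding, stride 1: only depth changes
            channels = arg
        elif kind == "pool":      # pool size == stride: spatial dims shrink
            height, width = height // arg, width // arg
        shapes.append((height, width, channels))
    return shapes

layers = [("conv", 32), ("conv", 32), ("pool", 2),
          ("conv", 64), ("conv", 64), ("pool", 2)]
shapes = trace_shapes(32, 32, 3, layers)
final_h, final_w, final_c = shapes[-1]
flattened = final_h * final_w * final_c
print(shapes[-1], flattened)
```

Under these assumptions the cube is 8 x 8 x 64, so flattening it gives a row of 4096 values for the fully connected layer.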

In [9]:
model.add(Flatten())
#and add a dense (FC) layer of 512 neurons with a relu activation
model.add(Dense(512, activation='relu'))
#add dropout with probability 0.5
model.add(Dropout(0.5))
#and we then have a dense (FC) layer with 10 neurons and a softmax activation
model.add(Dense(10, activation='softmax'))
#now we are done with specifying our architecture

#lets see a summary of our architecture
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 32, 32, 32)        896       
                                                                 
 conv2d_1 (Conv2D)           (None, 32, 32, 32)        9248      
                                                                 
 max_pooling2d (MaxPooling2D  (None, 16, 16, 32)       0         
 )                                                               
                                                                 
 dropout (Dropout)           (None, 16, 16, 32)        0         
                                                                 
 conv2d_2 (Conv2D)           (None, 16, 16, 64)        18496     
                                                                 
 conv2d_3 (Conv2D)           (None, 16, 16, 64)        36928     
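
The Param # column for the conv layers can be reproduced by hand: each filter has kernel height * kernel width * input channels weights plus one bias. The sketch below uses two hypothetical helpers to check the four conv rows shown above. The dense layer counts at the end are derived from the architecture (a flattened input of 8 * 8 * 64 values), since those rows are not visible in the summary output here.

```python
def conv_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter for a conv layer."""
    return kernel_h * kernel_w * in_channels * filters + filters

def dense_params(inputs, units):
    """Weights plus one bias per unit for a fully connected layer."""
    return inputs * units + units

conv_counts = [
    conv_params(3, 3, 3, 32),    # conv2d
    conv_params(3, 3, 32, 32),   # conv2d_1
    conv_params(3, 3, 32, 64),   # conv2d_2
    conv_params(3, 3, 64, 64),   # conv2d_3
]
print(conv_counts)

# Derived from the architecture, not read from the summary output.
dense_counts = [dense_params(8 * 8 * 64, 512), dense_params(512, 10)]
print(dense_counts)
```

Pooling, dropout and flatten layers have no weights, which is why they show 0 parameters.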
                                                        

Now that the architecture is specified, we need to find the best numbers (weights) for it. We will compile the model with the settings below. The loss function is called categorical cross entropy, which is applicable to a classification problem with many classes. The optimizer is Adam, a type of stochastic gradient descent (with a few modifications) so that it trains better. We will also track the accuracy of the model.
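
For a rough idea of what those modifications are, here is a sketch of the textbook Adam update applied to a single number. Everything in it is an illustration rather than what Keras runs internally: the function `adam_minimize` is hypothetical, the toy objective f(x) = (x - 3)^2 stands in for the real loss, and the learning rate of 0.1 is chosen only so the toy problem converges quickly.

```python
import math

def adam_minimize(grad_fn, x, steps, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    """Run Adam updates on a single scalar parameter x."""
    m = 0.0   # running mean of gradients (momentum term)
    v = 0.0   # running mean of squared gradients (per-parameter scale)
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        m_hat = m / (1 - beta1 ** t)      # bias correction for early steps
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

# Toy objective f(x) = (x - 3)^2 with gradient 2 * (x - 3).
x_final = adam_minimize(lambda x: 2 * (x - 3), x=0.0, steps=500, lr=0.1)
print(round(x_final, 3))
```

Compared with plain gradient descent, the two running averages smooth out noisy gradients and adapt the step size for each parameter separately.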

In [10]:
model.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['accuracy'])
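
To make the loss less abstract, the sketch below computes softmax probabilities and the categorical cross entropy for one sample by hand. The function names and the raw scores are made up for illustration, and the small `eps` guard against log(0) is an assumption of this sketch, not a statement about how Keras handles it.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shifted = [s - max(scores) for s in scores]   # improves numerical stability
    exps = [math.exp(s) for s in shifted]
    total = sum(exps)
    return [e / total for e in exps]

def categorical_cross_entropy(one_hot_label, probabilities, eps=1e-7):
    """Negative log of the probability assigned to the true class."""
    return -sum(y * math.log(max(p, eps))
                for y, p in zip(one_hot_label, probabilities))

label = [0.0] * 9 + [1.0]                  # true class is index 9
confident = softmax([0.0] * 9 + [5.0])     # high score on the true class
unsure = softmax([0.0] * 10)               # uniform guess over 10 classes
print(categorical_cross_entropy(label, confident))
print(categorical_cross_entropy(label, unsure))
```

A uniform guess over 10 classes gives a loss of ln(10), roughly 2.3, so a loss near that value at the start of training means the network is still guessing.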

In [11]:
#Let's train with a batch size of 32 for 20 epochs. We set validation_split to 0.2 rather than passing validation_data.
#With this shortcut, we do not need to split our dataset into a train and validation set at the start.
#Rather, we specify how much of our dataset will be used as a validation set. In this case
#20% of the dataset is used as a validation set. I am prepared for this to take a moment.

hist = model.fit(x_train, y_train_one_hot, 
           batch_size=32, epochs=20, 
           validation_split=0.2)

Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20
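
The effect of validation_split=0.2 and batch_size=32 on the 50000 training images can be worked out with a little arithmetic. The helper `split_sizes` below is hypothetical. It assumes the validation samples are held out from the end of the arrays before any shuffling, which is how the Keras documentation describes validation_split.

```python
import math

def split_sizes(num_samples, validation_split, batch_size):
    """Return (train_count, val_count, train_batches_per_epoch)."""
    train_count = int(num_samples * (1.0 - validation_split))
    val_count = num_samples - train_count
    batches = math.ceil(train_count / batch_size)
    return train_count, val_count, batches

train_count, val_count, batches = split_sizes(50000, 0.2, 32)
print(train_count, val_count, batches)
```

So each epoch trains on 40000 images in 1250 batches and validates on the remaining 10000.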


In [12]:
#We then evaluate our model on the test set and save our trained model.

#evaluate returns [loss, accuracy]; print the accuracy so that it is displayed
print('Test accuracy:', model.evaluate(x_test, y_test_one_hot)[1])
model.save('my_cifar10_model.h5')
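
The accuracy reported by the evaluation step is the fraction of images whose most likely predicted class matches the label. Here is a small sketch of that calculation with made-up probabilities for three samples and three classes; the helper names are hypothetical and the numbers are not real model output.

```python
def argmax(values):
    """Index of the largest entry."""
    return max(range(len(values)), key=lambda i: values[i])

def accuracy(one_hot_labels, predicted_probs):
    """Fraction of samples whose top predicted class matches the label."""
    correct = sum(argmax(y) == argmax(p)
                  for y, p in zip(one_hot_labels, predicted_probs))
    return correct / len(one_hot_labels)

# Three made-up samples over 3 classes (not real model output).
labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
probs = [[0.7, 0.2, 0.1], [0.1, 0.3, 0.6], [0.2, 0.2, 0.6]]
print(accuracy(labels, probs))
```

In this toy case the second sample is misclassified, so two of the three predictions are correct.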




I will be omitting the image comparison portion of this assignment because the kernel keeps crashing when the matplotlib image display is called. Our model is trained and that works well. One possibility is that we are in the wrong environment. Nonetheless, we have covered a lot of ground.