<a href="https://colab.research.google.com/github/soumendra/cnn-visualisation/blob/master/notebooks/getting_started_with_deep_learning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Getting Started with Deep Learning (using Keras)

## MNIST

### 1. Importing dependencies and setting seeds

**Importance of Seed**  
Neural networks make heavy use of randomness, for example when initializing layer weights or deciding which neurons to drop out.  
Because of this randomness, each training run of a neural network is bound to produce slightly different results.  
This can be a nuisance while experimenting. To make our experiments reproducible (the same output on every run), for us or for anyone else who chooses to run them, we need to seed the random number generators involved.  

---
**Why are we setting two different seeds?**  
Keras relies on `numpy` for some of its randomness, so we need to seed numpy's random number generator.  
Additionally, `Tensorflow` uses its own random number generator, and since we are using Tensorflow, we need to seed that generator as well.  

---
More info: https://machinelearningmastery.com/reproducible-results-neural-networks-keras/
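
To see what seeding buys us, here is a minimal sketch that uses only numpy. The helper name `draw_weights` is hypothetical and not part of Keras; it simply stands in for a random weight initialization. Two draws with the same seed match, while a draw with a different seed does not.

```python
import numpy as np

def draw_weights(seed_value, shape=(2, 3)):
    """Return random 'weights' drawn right after seeding numpy's global generator."""
    np.random.seed(seed_value)
    return np.random.uniform(-1.0, 1.0, size=shape)

first = draw_weights(1)
second = draw_weights(1)   # same seed -> identical numbers
other = draw_weights(2)    # different seed -> different numbers

print(np.array_equal(first, second))  # True
print(np.array_equal(first, other))   # False
```

Note that this only illustrates the idea: seeding numpy and Tensorflow removes most, though not necessarily all, sources of run-to-run variation.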

In [0]:
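# Seed numpy's random number generator (Keras uses it for some of its randomness)
from numpy.random import seed
seed(1)
# Seed Tensorflow's own random number generator.
# Note: set_random_seed is the TensorFlow 1.x API; TensorFlow 2.x exposes tf.random.set_seed instead.
from tensorflow import set_random_seed
set_random_seed(2)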

import keras
from keras.datasets import mnist
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten
from keras.layers import Conv2D, MaxPooling2D

### 2. Declaring some constants

**Epochs**  
An `epoch` is complete when our model has trained on the entire training dataset once.  
Hence, the number of epochs is the total number of passes our model makes over the entire training dataset during training.  

---
**Batches**  
Although the model needs to run over the entire dataset in every epoch, giving the entire dataset to the model as a single input is usually not feasible. Most of the time our datasets are huge, and using the entire dataset as one input would consume a lot of memory.  
Hence, we use the concept of batches.  
That is, in a single epoch we make multiple forward and backward passes through the neural network, each time giving it a subset of the dataset as input. The size of this subset is called the `batch_size`.  
This consumes much less memory and also helps the network train faster, as we now update the network weights after every `batch` rather than after every `epoch`.
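
The arithmetic behind batches can be sketched as follows. This assumes the 60,000-image MNIST training set and a batch size of 128; `iterate_batches` is a hypothetical helper written for illustration, not a Keras function.

```python
import math
import numpy as np

num_samples = 60000   # size of the MNIST training set
batch_size = 128

# Number of weight updates (batches) in one epoch; the last batch may be smaller.
steps_per_epoch = math.ceil(num_samples / batch_size)
print(steps_per_epoch)  # 469

def iterate_batches(data, size):
    """Yield consecutive slices of `data` with at most `size` rows each."""
    for start in range(0, len(data), size):
        yield data[start:start + size]

# Toy example: 10 samples split into batches of 4.
toy_data = np.arange(10)
batches = [batch.tolist() for batch in iterate_batches(toy_data, 4)]
print(batches)  # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```

Unlike this sketch, Keras shuffles the training data before each epoch by default, so the batches differ from one epoch to the next.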


In [0]:
batch_size = 128
num_classes = 10
epochs = 12

# input image dimensions
img_rows, img_cols = 28, 28
input_shape = (img_rows, img_cols, 1)

### 3. Loading and making our dataset usable

**Why Normalize?**  
For image data, normalization here means dividing the pixel values by 255.  
This brings all pixel values into the range [0, 1].  
We've observed that this makes training converge much faster.
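
As a small illustration, here is the same normalization applied to a made-up 2x2 grayscale "image" (the pixel values are invented for the example):

```python
import numpy as np

# A fake 2x2 grayscale image with 8-bit pixel values (0..255).
pixels = np.array([[0, 51], [102, 255]], dtype=np.uint8)

# Convert to float first: an in-place `/=` on a uint8 array raises a casting error.
normalized = pixels.astype("float32") / 255

print(normalized)
print(normalized.min(), normalized.max())
```

The smallest possible pixel value maps to 0.0 and the largest to 1.0, with everything else in between.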

In [0]:
# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = mnist.load_data()

In [0]:
import matplotlib.pyplot as plt

def plot_grid(arr):
  f, axarr = plt.subplots(2,2)
  axarr[0,0].imshow(arr[0])
  axarr[0,1].imshow(arr[1])
  axarr[1,0].imshow(arr[2])
  axarr[1,1].imshow(arr[3])
  
  axarr[0,0].axis('off')
  axarr[0,1].axis('off')
  axarr[1,0].axis('off')
  axarr[1,1].axis('off')

plot_grid(x_train)

In [0]:
x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)

x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train images')
print(x_test.shape[0], 'test images')

### 4. One hot encoding of output classes

In [0]:
# convert class vectors to one hot encoding
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
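
For intuition, one-hot encoding turns each integer label into a vector of length `num_classes` with a single 1 at the label's index. Below is a minimal numpy sketch of the idea. The `one_hot` helper is hypothetical and only mimics conceptually what `to_categorical` does.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Map integer labels to the matching rows of the identity matrix."""
    return np.eye(num_classes, dtype="float32")[labels]

labels = np.array([0, 2, 1])
encoded = one_hot(labels, num_classes=3)
print(encoded)
# [[1. 0. 0.]
#  [0. 0. 1.]
#  [0. 1. 0.]]
```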

### 5. Our first Model 😁

#### 5.1 Architecture

In [0]:
model = Sequential()

model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))


model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Dropout(0.25))

model.add(Flatten())

model.add(Dense(128, activation='relu'))
model.add(Dropout(0.5))

model.add(Dense(num_classes, activation='softmax'))
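
To check how the image shrinks as it passes through this architecture, here is a rough sketch of the shape and parameter arithmetic. It assumes the Keras defaults used above ('valid' padding and stride 1 for `Conv2D`, non-overlapping windows for `MaxPooling2D`); the helper names `conv_out` and `pool_out` are hypothetical.

```python
def conv_out(size, kernel=3):
    """Output size of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Output size of non-overlapping max pooling."""
    return size // pool

size = 28                  # MNIST images are 28x28
size = conv_out(size)      # Conv2D(32, 3x3)   -> 26
size = conv_out(size)      # Conv2D(64, 3x3)   -> 24
size = pool_out(size)      # MaxPooling2D(2x2) -> 12
flattened = size * size * 64
print(size, flattened)     # 12 9216

# Trainable parameters: weights plus one bias per filter or unit.
params = (
    (3 * 3 * 1 * 32 + 32)        # first Conv2D
    + (3 * 3 * 32 * 64 + 64)     # second Conv2D
    + (flattened * 128 + 128)    # Dense(128)
    + (128 * 10 + 10)            # Dense(num_classes)
)
print(params)              # 1199882
```

Almost all of the parameters sit in the first `Dense` layer, because it connects to every one of the 9216 flattened values. Calling `model.summary()` should report the same numbers.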

In [0]:
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.SGD(momentum=0.9, nesterov=True),
              metrics=['accuracy'])
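
As a rough illustration of the loss chosen above, categorical cross-entropy for a one-hot target is the negative log of the probability the model assigns to the correct class. The function below is a simplified numpy sketch written for this example, not the Keras implementation, and the probabilities are made up.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Per-sample cross-entropy between one-hot targets and predicted probabilities."""
    y_pred = np.clip(y_pred, eps, 1.0 - eps)   # avoid log(0)
    return -np.sum(y_true * np.log(y_pred), axis=-1)

y_true = np.array([[0.0, 1.0, 0.0]])           # the correct class is index 1
confident = np.array([[0.05, 0.90, 0.05]])     # most probability on the right class
unsure = np.array([[0.40, 0.30, 0.30]])        # little probability on the right class

loss_confident = categorical_crossentropy(y_true, confident)[0]
loss_unsure = categorical_crossentropy(y_true, unsure)[0]
print(loss_confident, loss_unsure)
```

The confident, correct prediction yields a much smaller loss than the unsure one, which is what the optimizer exploits when updating the weights.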

#### 5.2 Training (≖ ‿ ≖)

In [0]:
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))

#### 5.3 Evaluation ᕦ⊙෴⊙ᕤ

In [0]:
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
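
The accuracy reported here is the fraction of test images whose most probable predicted class matches the true class. A minimal numpy sketch of that calculation, using made-up predictions and a hypothetical `accuracy` helper:

```python
import numpy as np

def accuracy(y_true_one_hot, y_prob):
    """Fraction of rows where the most probable class equals the true class."""
    predicted = np.argmax(y_prob, axis=1)
    actual = np.argmax(y_true_one_hot, axis=1)
    return float(np.mean(predicted == actual))

y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]])
y_prob = np.array([[0.8, 0.1, 0.1],
                   [0.2, 0.7, 0.1],
                   [0.3, 0.4, 0.3],   # wrong: predicts class 1, truth is class 2
                   [0.1, 0.6, 0.3]])

result = accuracy(y_true, y_prob)
print(result)  # 0.75
```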

### 6. Let's go Deeper! ᕕ( ᐛ )ᕗ

![Go Deeper](https://cdn-images-1.medium.com/max/1600/1*RuDCBpDFK4fuBo6W5OFsEw.jpeg)

#### 6.1 Architecture

In [0]:
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))

model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())

model.add(Dense(128, activation='relu'))

model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))

model.add(Dense(num_classes, activation='softmax'))
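
With more convolution and pooling layers, the feature maps shrink quickly. The sketch below traces the spatial size through the stack above, again assuming 'valid' 3x3 convolutions with stride 1 and 2x2 pooling. `LAYERS` and `trace` are hypothetical names used only for this illustration.

```python
# (layer kind, filters) in the order used by the deeper model
LAYERS = [
    ("conv", 32), ("conv", 64), ("pool", None),
    ("conv", 128), ("pool", None),
    ("conv", 128), ("pool", None),
]

def trace(size):
    """Return (spatial size, channels, flattened length) after the conv/pool stack."""
    channels = None
    for kind, filters in LAYERS:
        if kind == "conv":
            size = size - 2       # 3x3 'valid' convolution, stride 1
            channels = filters
        else:
            size = size // 2      # 2x2 max pooling
    return size, channels, size * size * channels

print(trace(28))  # (1, 128, 128)
print(trace(32))  # (2, 128, 512)
```

On 28x28 MNIST images the feature maps are already down to 1x1 before `Flatten`, so this stack cannot take another 'valid' 3x3 convolution at this input size. The same stack applied to 32x32 images, as in the CIFAR10 part, ends at 2x2.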

In [0]:
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.SGD(momentum=0.9, nesterov=True),
              metrics=['accuracy'])

#### 6.2 Training

In [0]:
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))

#### 6.3 Evaluation

In [0]:
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

## CIFAR10

### 1. Importing dependencies and setting seeds

In [0]:
from numpy.random import seed
seed(1)
from tensorflow import set_random_seed
set_random_seed(2)

import keras
from keras.datasets import cifar10
from keras.models import Sequential
from keras.layers import Dense, Dropout, Flatten, Activation
from keras.layers import Conv2D, MaxPooling2D
from keras.preprocessing.image import ImageDataGenerator
GENERATOR_SEED = 0

### 2. Declaring some constants

In [0]:
batch_size = 32
num_classes = 10
epochs = 25

# input image dimensions
img_rows, img_cols = 32, 32
input_shape = (img_rows, img_cols, 3)

### 3. Loading and making our dataset usable

In [0]:
# the data, split between train and test sets
(x_train, y_train), (x_test, y_test) = cifar10.load_data()

In [0]:
import matplotlib.pyplot as plt

def plot_grid(arr):
  f, axarr = plt.subplots(2,2)
  axarr[0,0].imshow(arr[0])
  axarr[0,1].imshow(arr[1])
  axarr[1,0].imshow(arr[2])
  axarr[1,1].imshow(arr[3])
  
  axarr[0,0].axis('off')
  axarr[0,1].axis('off')
  axarr[1,0].axis('off')
  axarr[1,1].axis('off')

plot_grid(x_train)

In [0]:
x_train = x_train.astype('float32')
x_test = x_test.astype('float32')
x_train /= 255
x_test /= 255
print('x_train shape:', x_train.shape)
print(x_train.shape[0], 'train samples')
print(x_test.shape[0], 'test samples')

### 4. One hot encoding of output classes

In [0]:
# convert class vectors to one hot encodings
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)

### 5. Our Star Model 😁

#### 5.1 Architecture

In [0]:
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))

model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())

model.add(Dense(128, activation='relu'))

model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))

model.add(Dense(num_classes, activation='softmax'))

In [0]:
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.SGD(momentum=0.9, nesterov=True),
              metrics=['accuracy'])

#### 5.2 Training (≖ ‿ ≖)

In [0]:
model.fit(x_train, y_train,
          batch_size=batch_size,
          epochs=epochs,
          verbose=1,
          validation_data=(x_test, y_test))

#### 5.3 Evaluation ᕦ⊙෴⊙ᕤ

In [0]:
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

### 6. Our Star Model with Data Augmentation

#### 6.1 Architecture

In [0]:
model = Sequential()
model.add(Conv2D(32, kernel_size=(3, 3),
                 activation='relu',
                 input_shape=input_shape))

model.add(Conv2D(64, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))

model.add(Conv2D(128, (3, 3), activation='relu'))
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

model.add(Flatten())

model.add(Dense(128, activation='relu'))

model.add(Dense(64, activation='relu'))
model.add(Dropout(0.5))

model.add(Dense(num_classes, activation='softmax'))

In [0]:
model.compile(loss=keras.losses.categorical_crossentropy,
              optimizer=keras.optimizers.SGD(momentum=0.9, nesterov=True),
              metrics=['accuracy'])

#### 6.2 Training (≖ ‿ ≖)

In [0]:
datagen = ImageDataGenerator(
        width_shift_range=0.1,  # randomly shift images horizontally (fraction of total width)
        height_shift_range=0.1,  # randomly shift images vertically (fraction of total height)
        zoom_range=0.1,  # range for random zoom
        horizontal_flip=True,  # randomly flip images
)

# Fit the model on the batches generated by datagen.flow()
model.fit_generator(datagen.flow(x_train, y_train, batch_size=batch_size, seed=GENERATOR_SEED),
                    epochs=epochs,
                    steps_per_epoch=x_train.shape[0] // batch_size,
                    validation_data=(x_test, y_test))
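
To make one of these augmentations concrete, here is a simplified numpy sketch of a random horizontal flip on a batch of images. It is not how `ImageDataGenerator` is implemented; `random_horizontal_flip` is a hypothetical helper, and the tiny batch is made up.

```python
import numpy as np

def random_horizontal_flip(batch, rng):
    """Flip each image in an (N, H, W, C) batch left-right with probability 0.5."""
    flipped = batch.copy()
    mask = rng.random(len(batch)) < 0.5            # which images get flipped
    flipped[mask] = flipped[mask][:, :, ::-1, :]   # reverse the width axis
    return flipped, mask

rng = np.random.default_rng(0)
batch = np.arange(4 * 2 * 3 * 1, dtype="float32").reshape(4, 2, 3, 1)
augmented, mask = random_horizontal_flip(batch, rng)

print(mask)
print(augmented.shape)
```

The labels stay unchanged: a mirrored image still belongs to the same class, which is why the model effectively sees more varied training data without any new labelling effort.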

#### 6.3 Evaluation ᕦ⊙෴⊙ᕤ

In [0]:
score = model.evaluate(x_test, y_test, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])