# Mini GoogLeNet on CIFAR-10

GoogLeNet was proposed in 2014 by [Szegedy et al.](https://arxiv.org/pdf/1409.4842.pdf). This Convolutional Neural Network (CNN) is built around the concept of micro-architecture: the model is composed of a number of small building blocks (micro-architectures) that are stacked to form the overall macro-architecture.

GoogLeNet introduced the inception module, which applies three convolutions with kernel sizes of $1 \times 1$, $3 \times 3$ and $5 \times 5$ to the same input, each running in parallel with the others. This design made it possible to increase the depth of the CNN while keeping the running time reasonable. At the end of the inception module, the outputs of the parallel branches are concatenated along the channel dimension into a single feature map. If another inception module follows, further convolutions are performed; otherwise a pooling step is applied and the feature map is fed into the fully-connected layer to make predictions. This model won the ImageNet Large-Scale Visual Recognition Challenge 2014.

In this example, we consider a reduced form of GoogLeNet: we train Mini GoogLeNet on the CIFAR-10 dataset. Mini GoogLeNet uses fewer convolutional layers, and its inception module performs only two convolutions, $1 \times 1$ and $3 \times 3$.
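
To make the two-branch module concrete, here is a minimal NumPy sketch of a single mini inception block applied to one image. It is not the `compvis` implementation: the helper names `conv2d_same` and `mini_inception`, the filter counts (8 and 16) and the random weights are assumptions chosen for illustration, and details a real implementation may include (such as batch normalization) are omitted.

```python
import numpy as np

def conv2d_same(x, kernels):
    """Naive stride-1 convolution with zero 'same' padding.

    x: (H, W, C_in); kernels: (k, k, C_in, C_out) with odd k.
    """
    k = kernels.shape[0]
    pad = k // 2
    xp = np.pad(x, ((pad, pad), (pad, pad), (0, 0)))
    h, w, _ = x.shape
    out = np.zeros((h, w, kernels.shape[-1]), dtype=x.dtype)
    for i in range(k):
        for j in range(k):
            # Each kernel offset contributes a shifted view of the input
            # multiplied by that offset's (C_in, C_out) weight matrix.
            out += xp[i:i + h, j:j + w, :] @ kernels[i, j]
    return out

def mini_inception(x, k1, k3):
    """Run a 1x1 and a 3x3 branch on the same input and concatenate them."""
    branch_1x1 = np.maximum(conv2d_same(x, k1), 0)  # ReLU
    branch_3x3 = np.maximum(conv2d_same(x, k3), 0)  # ReLU
    # Both branches keep the spatial size, so they can be stacked channel-wise.
    return np.concatenate([branch_1x1, branch_3x3], axis=-1)

rng = np.random.default_rng(0)
x = rng.standard_normal((32, 32, 3)).astype("float32")       # one CIFAR-sized image
k1 = rng.standard_normal((1, 1, 3, 8)).astype("float32")     # 8 filters (assumed)
k3 = rng.standard_normal((3, 3, 3, 16)).astype("float32")    # 16 filters (assumed)

out = mini_inception(x, k1, k3)
print(out.shape)  # (32, 32, 24)
```

The output keeps the $32 \times 32$ spatial size, and its channel count is the sum of the filters of the two branches.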

## Importing Libraries

In [1]:
import tensorflow as tf
from tensorflow import keras
from sklearn.preprocessing import LabelBinarizer
from compvis.nn.cnns import MiniGoogLeNet
from compvis.callbacks import TrainingMonitor
from compvis.nn.lr import LRFunc
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.callbacks import LearningRateScheduler
from tensorflow.keras.optimizers import SGD
from tensorflow.keras.datasets import cifar10
import numpy as np
import os

## Loading and splitting the data

In [2]:
# Loading and splitting the dataset
((X_train, y_train), (X_test, y_test)) = cifar10.load_data()
X_train = X_train.astype('float32')
X_test = X_test.astype("float32")

[INFO] loading the CIFAR-10 dataset ...


**Mean subtraction**

In [3]:
# Normalizing with mean subtraction (mean computed on the training set only)
mean = np.mean(X_train, axis = 0)
X_train -= mean
X_test -= mean
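
As a small illustration of what `axis = 0` does in this step, the sketch below runs the same operations on a synthetic stand-in for the image arrays (the array names, sizes and random values are made up). The mean is taken across images, so it holds one value per pixel position and channel, and the training mean is reused for the test set.

```python
import numpy as np

rng = np.random.default_rng(0)
# Synthetic stand-ins for the CIFAR-10 arrays: (num_images, height, width, channels)
X_train_demo = rng.uniform(0, 255, size=(100, 32, 32, 3)).astype("float32")
X_test_demo = rng.uniform(0, 255, size=(20, 32, 32, 3)).astype("float32")

# Averaging over axis 0 gives a "mean image": one value per pixel and channel
mean_demo = np.mean(X_train_demo, axis=0)
print(mean_demo.shape)  # (32, 32, 3)

# Broadcasting subtracts the mean image from every image in the array
X_train_demo -= mean_demo
X_test_demo -= mean_demo  # the test set is centered with the training mean

# After subtraction, the per-pixel mean of the training set is close to zero
print(np.abs(X_train_demo.mean(axis=0)).max())
```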

**Encoding labels**

In [4]:
# Converting the labels into numerical (one-hot) vectors
lb = LabelBinarizer()
y_train = lb.fit_transform(y_train)
# Reuse the encoder fitted on the training labels instead of refitting it
y_test = lb.transform(y_test)
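
For integer class labels like those of CIFAR-10, this encoding amounts to one-hot vectors. The NumPy sketch below shows the shape of the result on a few made-up labels; it illustrates the idea and is not a replacement for `LabelBinarizer`. The helper name `one_hot` is hypothetical.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer labels of shape (N,) or (N, 1) into an (N, num_classes) array."""
    labels = np.asarray(labels).ravel()
    encoded = np.zeros((labels.size, num_classes), dtype=int)
    # Set a single 1 per row, at the column given by the class index
    encoded[np.arange(labels.size), labels] = 1
    return encoded

# CIFAR-10 labels arrive as a column vector of integers in [0, 9]
y_demo = np.array([[3], [0], [9]])
encoded_demo = one_hot(y_demo, num_classes=10)
print(encoded_demo.shape)  # (3, 10)
print(encoded_demo)
```

Each row contains a single 1 at the index of its class, which is the target format expected by the categorical cross-entropy loss.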

## Training the model

To train the model, we use $70$ epochs. During training, the learning rate decreases following a polynomial decay function.

**Learning rate schedule and regularization**

We use the class LRFunc, which offers several learning rate functions to be used with LearningRateScheduler. The required arguments for this example are l_r (initial learning rate), epochs (number of epochs) and degree (the degree of the polynomial; here $1$, i.e. linear decay).
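
The implementation of `LRFunc.poly_decay` is not shown in this example. A common form of polynomial decay is `l_r * (1 - epoch / epochs) ** degree`, and with the arguments used here it reproduces the learning rates printed in the training log. The sketch below assumes that form; the names `make_poly_decay` and `schedule` are illustrative, not part of `compvis`.

```python
def make_poly_decay(l_r=0.001, epochs=70, degree=1):
    """Build a schedule that decays l_r polynomially towards 0 over `epochs`."""
    def schedule(epoch):
        # `epoch` is a zero-based index, as passed by Keras to schedule functions
        return l_r * (1.0 - epoch / float(epochs)) ** degree
    return schedule

poly_decay = make_poly_decay(l_r=0.001, epochs=70, degree=1)

for epoch in (0, 10, 20, 60):
    print("epoch {:2d}: {:.6f}".format(epoch, poly_decay(epoch)))
# epoch  0: 0.001000
# epoch 10: 0.000857
# epoch 20: 0.000714
# epoch 60: 0.000143
```

With `degree=1` the rate falls along a straight line; a larger degree would make it drop faster in the early epochs.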

We also apply image augmentation (random horizontal and vertical shifts and horizontal flips) to reduce overfitting.

In [5]:
lrs = LRFunc(l_r = 0.001, epochs = 70, degree=1) # defining the LRFunc class

# Defining the data augmentation to reduce overfitting
aug = ImageDataGenerator(width_shift_range=0.1, height_shift_range=0.1,
                         horizontal_flip=True, fill_mode="nearest")

# Building the set of callbacks

figPath = os.path.sep.join(["/path/to/output", "{}.png".format(os.getpid())])
jsonPath = os.path.sep.join(["/path/to/output", "{}.json".format(os.getpid())])

callbacks = [TrainingMonitor(figPath, jsonPath=jsonPath),
             LearningRateScheduler(lrs.poly_decay)]  # poly_decay is used as the schedule function

**Building the model**

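We use Stochastic Gradient Descent (SGD) as the optimizer, with a momentum of $0.9$. The optimizer is created with a learning rate of $1e-2$, but LearningRateScheduler sets the learning rate at the start of every epoch, so the rate actually used is the one returned by the schedule, starting at $0.001$ as the training log shows.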

The input images are $32 \times 32$ pixels with $3$ channels, and there are ten classes to predict.

In [6]:
# Constructing the model and the optimizer
opt = SGD(learning_rate=1e-2, momentum=0.9)  # `lr` is a deprecated alias of `learning_rate`
model = MiniGoogLeNet.build(width=32, height=32, depth=3, classes=10)
model.compile(loss="categorical_crossentropy", optimizer=opt, metrics=["accuracy"])

**Training the model**

In [7]:
model.fit(aug.flow(X_train, y_train, batch_size=64), validation_data=(X_test, y_test),
          steps_per_epoch=len(X_train) // 64, epochs=70, 
          callbacks=callbacks, verbose=1)

Train for 781 steps, validate on 10000 samples
Learning rate  0.001000
Epoch 1/70
Epoch 2/70
Epoch 3/70
Epoch 4/70
Epoch 5/70
Epoch 6/70
Epoch 7/70
Epoch 8/70
Epoch 9/70
Epoch 10/70
Learning rate  0.000857
Epoch 11/70
Epoch 12/70
Epoch 13/70
Epoch 14/70
Epoch 15/70
Epoch 16/70
Epoch 17/70
Epoch 18/70
Epoch 19/70
Epoch 20/70
Learning rate  0.000714
Epoch 21/70
Epoch 22/70
Epoch 23/70
Epoch 24/70
Epoch 25/70
Epoch 26/70
Epoch 27/70
Epoch 28/70
Epoch 29/70
Epoch 30/70
Learning rate  0.000571
Epoch 31/70
Epoch 32/70
Epoch 33/70
Epoch 34/70
Epoch 35/70
Epoch 36/70
Epoch 37/70
Epoch 38/70
Epoch 39/70
Epoch 40/70
Learning rate  0.000429
Epoch 41/70
Epoch 42/70
Epoch 43/70
Epoch 44/70
Epoch 45/70
Epoch 46/70
Epoch 47/70
Epoch 48/70
Epoch 49/70
Epoch 50/70
Learning rate  0.000286
Epoch 51/70
Epoch 52/70
Epoch 53/70
Epoch 54/70
Epoch 55/70
Epoch 56/70
Epoch 57/70
Epoch 58/70
Epoch 59/70
Epoch 60/70
Learning rate  0.000143
Epoch 61/70
Epoch 62/70
Epoch 63/70
Epoch 64/70
Epoch 65/70
Epoch 66/70
Epoch 67/70
Epoch 68/70
Epoch 69/70
Epoch 70/70


<tensorflow.python.keras.callbacks.History at 0x7f426454df10>

## Saving the model

In [8]:
# Saving the model on the disk
model.save("output/minigooglenet_cifar10.hdf5")

## Conclusions

The model achieved a good result on the training set, with an accuracy of $0.93$, the best result for this dataset in this training project. On the validation set the accuracy is $0.86$, which is also a good result. The problem lies in the learning curves: there is a considerable gap between the training and validation curves, which indicates overfitting. This problem recurs with the CIFAR-10 dataset.