<a href="https://colab.research.google.com/github/Ami190/Densenet_CIFAR/blob/master/DNST_CIFAR10_27Oct_me.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#### Problem Statement:<br>
Image classification on CIFAR-10 data using a **Densely Connected Convolutional Network (DenseNet).**<br>
Constraints:<br>
1. You MUST use SGD.<br>
2. You MUST perform image augmentation (and it is recommended that you read the Session 3 docs again).<br>
3. You need to get a validation score of 92% or higher.<br>
4. You cannot use more than 250 epochs.<br>
5. You cannot have more than 1M parameters.<br>
6. Assignment default weightage is 100 pts (if you beat the 92% target). For each 0.1% improvement, you get 1 point (i.e. 94% = 120 pts).<br>



#### Solution:
1. Using SGD with a base learning rate of 0.10 and a cyclic learning rate of 0.10, 0.01, 0.001.<br>
2. Data preprocessing / augmentation is done using the Keras `ImageDataGenerator` class.<br>
3. Number of parameters: 925,672.
4. Validation accuracy is 93.00%.

In [0]:
# Importing Dependencies 
%matplotlib inline
import numpy as np
import matplotlib.pyplot as plt
import keras
from PIL import Image
import tensorflow as tf
from keras.preprocessing.image import ImageDataGenerator
from keras.datasets import cifar10
from keras.models import Model, Sequential
from keras.layers import Dense, Dropout, Flatten, Input, AveragePooling2D, Activation
from keras.layers import Conv2D, MaxPooling2D, BatchNormalization
from keras.layers import Concatenate
from keras.optimizers import Adam, SGD
from keras import backend as k

##  1. Data Visualization

In [0]:
(x_train,y_train),(x_test,y_test) = cifar10.load_data() # loading data from CIFAR10

single_img = x_train[45]
classes = ['plane', 'car', 'bird', 'cat', 'deer', 'dog', 'frog', 'horse', 'ship', 'truck']


def visualize_images(x_train, y_train, classes, samples_per_class=10):
    """visualize images from training data """
    num_classes = len(classes)
    for y, cls in enumerate(classes):
        idxs = np.flatnonzero(y_train == y) # get all the indexes of cls
        idxs = np.random.choice(idxs, samples_per_class, replace=False)
        for i, idx in enumerate(idxs): # plot the image one by one
            plt_idx = i * num_classes + y + 1 # i*num_classes and y+1 determine the row and column respectively
            plt.subplot(samples_per_class, num_classes, plt_idx)
            plt.imshow(x_train[idx].astype('uint8'))
            plt.axis('off')
            if i == 0:
                plt.title(cls)
    plt.show()

visualize_images(x_train,y_train,classes)  


## 2. Data Preprocessing
Data preprocessing transforms raw data into a format that a model can learn from. Real-world data is often incomplete, inconsistent, or lacking in certain behaviors or trends, and is likely to contain errors. Preprocessing is a standard way of resolving such issues.

Image augmentation is done with the built-in Keras class `ImageDataGenerator`.

##### ImageDataGenerator class:
Generates batches of tensor image data with real-time data augmentation. The data is looped over in batches.

The CIFAR-10 images are small (32x32 pixels) rather than high resolution. The dataset contains 50,000 training and 10,000 test images from 10 different classes.
Augmentation shifts the images along their width and height so that small features are learned regardless of position. Because the images are not digits, mirrored versions remain valid examples, so it is vital for the network to learn from horizontal flips as well.
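To make two of these augmentations concrete, here is a minimal NumPy sketch of a horizontal flip and a width shift with nearest-edge filling. It is an illustration only and not how `ImageDataGenerator` is implemented: the helper names `horizontal_flip` and `shift_width_nearest` are hypothetical, and the shift is a fixed number of whole pixels rather than a random fraction of the width.

```python
import numpy as np

def horizontal_flip(img):
    """Mirror an (H, W, C) image left to right."""
    return img[:, ::-1, :]

def shift_width_nearest(img, dx):
    """Shift an (H, W, C) image dx pixels to the right (negative dx: left).

    Columns that would come from outside the frame repeat the nearest
    edge column, similar in spirit to fill_mode='nearest'.
    """
    width = img.shape[1]
    src_cols = np.clip(np.arange(width) - dx, 0, width - 1)
    return img[:, src_cols, :]

# Tiny 1x4 single-channel "image" so the effect is easy to read.
toy = np.array([10, 20, 30, 40]).reshape(1, 4, 1)
flipped = horizontal_flip(toy)
shifted = shift_width_nearest(toy, 1)
print(flipped[0, :, 0])
print(shifted[0, :, 0])
```

The shifted image keeps its original size; the column that moves out of the frame is lost and the vacated column is filled from the nearest edge.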

In [0]:
# Image augmentation using the built-in Keras ImageDataGenerator
datagen = ImageDataGenerator(
        rotation_range=40,        # random rotations of up to 40 degrees
        width_shift_range=0.2,    # random horizontal shift, as a fraction of total width (values < 1)
        height_shift_range=0.2,   # random vertical shift, as a fraction of total height (values < 1)
        zoom_range=0.05,          # random zoom in the range [0.95, 1.05]
        channel_shift_range=0.1,  # random channel shifts
        fill_mode='nearest',      # points outside the boundaries are filled according to this mode
        horizontal_flip=True      # random mirror images
)
        

In [0]:
# Number of data samples
n_training = len(x_train)
n_testing = len(x_test)
print('Loaded CIFAR10 database with {} training and {} testing samples'.format(n_training, n_testing))

In [0]:
# This prevents TensorFlow from allocating all of the available GPU memory up front

# Don't pre-allocate memory; allocate as-needed
config = tf.ConfigProto()
config.gpu_options.allow_growth = True

# Create a session with the above options specified.
k.tensorflow_backend.set_session(tf.Session(config=config))

In [0]:
# Hyperparameters
batch_size = 64 # number of samples per batch
num_classes = 10 # number of classes
epochs = 50 # number of passes over the training data
l = 10 # number of conv layers in each dense block
num_filter = 24 # base number of filters
compression = 0.9 # each conv layer outputs int(num_filter*compression) filters
#dropout_rate = 0.0

In [0]:
# Read the image dimensions from the training data
img_height, img_width, channel = x_train.shape[1],x_train.shape[2],x_train.shape[3]

# convert labels to one-hot encoding
y_train = keras.utils.to_categorical(y_train, num_classes)
y_test = keras.utils.to_categorical(y_test, num_classes)
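For readers unfamiliar with one-hot encoding, the following NumPy sketch shows the transformation on a few labels. It assumes integer class labels of shape `(N,)` or `(N, 1)`, as returned by `cifar10.load_data()`; the helper name `one_hot` is hypothetical and stands in for what `keras.utils.to_categorical` produces.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Turn integer labels of shape (N,) or (N, 1) into an (N, num_classes) matrix."""
    labels = np.asarray(labels).reshape(-1)
    encoded = np.zeros((labels.size, num_classes), dtype="float32")
    # Set a single 1.0 per row, in the column given by the label.
    encoded[np.arange(labels.size), labels] = 1.0
    return encoded

demo = one_hot([[3], [0], [9]], num_classes=10)
print(demo.shape)
print(demo[0])
```

Each row contains exactly one 1.0, which is the target format that `categorical_crossentropy` expects.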

## 3. Design the Densely Connected Convolutional Network:
The idea behind a dense network is that many features are lost in a sequential network on the way from the first layer to the last. In a dense network, every layer receives the outputs of all preceding layers in its block and passes its own output on to all succeeding ones, thus forming a **Dense Block.**

DenseNet targets the *vanishing-gradient* problem and aims to *strengthen feature propagation*, *encourage feature reuse*, and *substantially reduce the number of parameters*.

Densely connected layers can easily make the number of network parameters explode. It is vital to implement compression, dropout, and pooling to keep the cost down.

Some critical factors in dense networks:
1. Dense layers are concatenated with each other, not added.
2. The concatenated layers have to be of the same height and width.
3. The `add_transition` function applies batch normalization, ReLU, a compressed 1x1 convolution, and average pooling before passing the features to the next block.
4. `l` sets the number of layers in each dense block and therefore the depth of the network. Each layer adds `int(num_filter*compression)` channels.
5. `num_filter` is the number of channels (kernels).
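Points 1 and 2 can be checked with plain NumPy arrays standing in for feature maps. This is a sketch with made-up shapes: 24 channels for the existing features and 21 for a new layer, where 21 is `int(24 * 0.9)` under the hyperparameters above.

```python
import numpy as np

features = np.zeros((32, 32, 24))      # existing feature maps: 24 channels
new_features = np.zeros((32, 32, 21))  # output of one new layer: 21 channels

# Concatenation along the channel axis keeps both sets of features.
stacked = np.concatenate([features, new_features], axis=-1)
print(stacked.shape)

# Feature maps with a different height/width cannot be concatenated.
pooled = np.zeros((16, 16, 21))
try:
    np.concatenate([features, pooled], axis=-1)
    mismatch_ok = True
except ValueError:
    mismatch_ok = False
print(mismatch_ok)
```

Concatenation makes the channel count grow with every layer, which is why a transition step is needed before the spatial size is reduced.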

In [0]:
# Dense Block
def add_denseblock(input, num_filter, dropout_rate):
    global compression
    temp = input
    for _ in range(l):
        BatchNorm = BatchNormalization()(temp)
        relu = Activation('relu')(BatchNorm)
        Conv2D_3_3 = Conv2D(int(num_filter*compression), (3,3), use_bias=False ,padding='same')(relu)
        if dropout_rate>0:
            Conv2D_3_3 = Dropout(dropout_rate)(Conv2D_3_3)
        concat = Concatenate(axis=-1)([temp,Conv2D_3_3])
        
        temp = concat
        
    return temp

In [0]:
def add_transition(input, num_filter, dropout_rate):
    global compression
    BatchNorm = BatchNormalization()(input)
    relu = Activation('relu')(BatchNorm)
    Conv2D_BottleNeck = Conv2D(int(num_filter*compression), (1,1), use_bias=False ,padding='same')(relu)
    if dropout_rate>0:
        Conv2D_BottleNeck = Dropout(dropout_rate)(Conv2D_BottleNeck)
    avg = AveragePooling2D(pool_size=(2,2))(Conv2D_BottleNeck)
    
    return avg

In [0]:
def output_layer(input):
    global compression
    BatchNorm = BatchNormalization()(input)
    relu = Activation('relu')(BatchNorm)
    AvgPooling = AveragePooling2D(pool_size=(2,2))(relu)
    flat = Flatten()(AvgPooling)
    output = Dense(num_classes, activation='softmax')(flat)
    
    return output

In [0]:
num_filter = 24
dropout_rate = 0.0 # no dropout: the images are small, so it is critical to learn all features
l = 10
input = Input(shape=(img_height, img_width, channel,))
First_Conv2D = Conv2D(num_filter, (3,3), use_bias=False ,padding='same')(input)

First_Block = add_denseblock(First_Conv2D, num_filter, dropout_rate)
First_Transition = add_transition(First_Block, num_filter, dropout_rate)

Second_Block = add_denseblock(First_Transition, num_filter, dropout_rate)
Second_Transition = add_transition(Second_Block, num_filter, dropout_rate)

Third_Block = add_denseblock(Second_Transition, num_filter, dropout_rate)
Third_Transition = add_transition(Third_Block, num_filter, dropout_rate)


Last_Block = add_denseblock(Third_Transition,  num_filter, dropout_rate)
output = output_layer(Last_Block)


In [0]:
model = Model(inputs=[input], outputs=[output])
model.summary()
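As a cross-check on `model.summary()`, the total parameter count can be reproduced by hand. The sketch below walks the same layer sequence as the cells above. It assumes that each `BatchNormalization` layer contributes four values per channel (gamma, beta, moving mean, moving variance), that the convolutions have no bias, and that the total includes non-trainable parameters. The function name `count_densenet_params` is hypothetical.

```python
def count_densenet_params(num_filter=24, compression=0.9, layers_per_block=10,
                          img_size=32, in_channels=3, num_classes=10,
                          num_blocks=4):
    """Count the parameters of the DenseNet layout used in this notebook."""
    growth = int(num_filter * compression)  # filters added by each conv layer

    def batchnorm(channels):
        # gamma, beta, moving mean, moving variance
        return 4 * channels

    total = 3 * 3 * in_channels * num_filter  # first 3x3 conv, no bias
    channels = num_filter
    size = img_size

    for block in range(num_blocks):
        # Dense block: BN + 3x3 conv per layer, output concatenated to the input
        for _ in range(layers_per_block):
            total += batchnorm(channels) + 3 * 3 * channels * growth
            channels += growth
        # Transition after every block except the last: BN + 1x1 conv + 2x2 pooling
        if block < num_blocks - 1:
            total += batchnorm(channels) + channels * growth
            channels = growth
            size //= 2

    # Output layer: BN, 2x2 average pooling, flatten, dense with bias
    total += batchnorm(channels)
    size //= 2
    total += size * size * channels * num_classes + num_classes
    return total

total_params = count_densenet_params()
print(total_params)
```

With the notebook's hyperparameters this evaluates to 925,672, the figure quoted in the Solution section, which stays under the 1M parameter limit.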

In [0]:
from keras.callbacks import LearningRateScheduler
# learning rate is updated according to the epoch number
def lr_schedule(epoch):
    lrate = 0.10
    # check the larger threshold first; otherwise that branch is unreachable
    if epoch > 100:
        lrate = 0.0001
    elif epoch > 75:
        lrate = 0.001
    return lrate
 
sgd = keras.optimizers.SGD(lr=0.10,momentum = 0.0, decay = 0.0 , nesterov = False)
model.compile(loss='categorical_crossentropy',
              optimizer=sgd,
              metrics=['accuracy'])


In [0]:
from keras.callbacks import ModelCheckpoint

# Use a checkpoint callback to save the weights of the model with the best validation accuracy
filepath="weight_final_27oct.hdf5"

checkpoint = ModelCheckpoint(filepath, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
callbacks_list = [checkpoint,LearningRateScheduler(lr_schedule)]


In [0]:
# fit_generator trains the model on augmented images.
# Key point: steps_per_epoch is doubled because augmentation effectively enlarges the dataset, so each epoch draws twice as many random batches.
model.fit_generator(datagen.flow(x_train, y_train,batch_size=64),steps_per_epoch=2*x_train.shape[0] // batch_size,epochs=50,validation_data=(x_test, y_test),callbacks=callbacks_list) 
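The effect of doubling `steps_per_epoch` is easy to work out by hand. The sketch below uses the CIFAR-10 training set size of 50,000 and the batch size of 64 from the cells above. Note that `2 * n // batch_size` is evaluated left to right, so the doubling happens before the floor division.

```python
n_train = 50000      # CIFAR-10 training images
batch_size = 64

steps_per_epoch = 2 * n_train // batch_size       # (2 * 50000) // 64
images_per_epoch = steps_per_epoch * batch_size   # augmented images drawn per epoch

print(steps_per_epoch, images_per_epoch)
```

Each epoch therefore draws roughly two augmented variants of every training image.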

In [0]:
# continue training for 20 more epochs
model.fit_generator(datagen.flow(x_train, y_train,batch_size=64),steps_per_epoch=2*x_train.shape[0] // batch_size,epochs=20,validation_data=(x_test, y_test),callbacks=callbacks_list) 

In [0]:
# The connection was lost, so reload the best saved weights and continue training.
model.load_weights("weight_final_27oct.hdf5")
model.fit_generator(datagen.flow(x_train, y_train,batch_size=64),steps_per_epoch=2*x_train.shape[0] // batch_size,epochs=10,validation_data=(x_test, y_test),callbacks=callbacks_list) 

**ReduceLROnPlateau** reduces the learning rate when the monitored metric has stopped improving. Models often benefit from a lower learning rate, and slower learning, once progress stalls.
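The plateau logic can be sketched in a few lines of plain Python. This is a simplified illustration, not the Keras implementation: it ignores options such as cooldown and a minimum learning rate, and the function name `simulate_reduce_on_plateau` is hypothetical.

```python
def simulate_reduce_on_plateau(val_losses, lr=0.1, factor=0.1,
                               patience=10, min_delta=1e-4):
    """Return the learning rate in effect after each epoch."""
    best = float("inf")
    wait = 0
    history = []
    for loss in val_losses:
        if loss < best - min_delta:
            # Meaningful improvement: remember it and reset the counter.
            best = loss
            wait = 0
        else:
            # No improvement: count the epoch, and cut the rate once
            # `patience` such epochs have accumulated.
            wait += 1
            if wait >= patience:
                lr *= factor
                wait = 0
        history.append(lr)
    return history

rates = simulate_reduce_on_plateau([1.0, 0.9, 0.9, 0.9, 0.9], patience=2)
print(rates)
```

In this toy run the loss stops improving after the second epoch, so with `patience=2` the rate is multiplied by `factor` two epochs later.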

In [0]:
reduce_lr = keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.1, patience=10, verbose=1, mode='auto', min_delta=0.0001)

callbacks_list = [checkpoint,reduce_lr]
model.fit(x_train, y_train,batch_size=64,epochs=50,validation_data=(x_test, y_test),callbacks=callbacks_list) 

In [0]:
callbacks_list = [checkpoint]
model.fit(x_train, y_train,batch_size=64,epochs=20,validation_data=(x_test, y_test),callbacks=callbacks_list) 

In [0]:
# Test the model
score = model.evaluate(x_test, y_test, verbose=1)
print('Test loss:', score[0])
print('Test accuracy:', score[1])

In [0]:
# Save the trained weights in .h5 format
model.save_weights("DNST_models_Final.h5")
print("Saved model to disk")