## ALEXNET

#### The most important features of the AlexNet paper are:

- Because the model had to train 60 million parameters, it was prone to overfitting. According to the paper, Dropout and data augmentation significantly reduced overfitting. The first and second fully connected layers in the architecture therefore use a dropout rate of 0.5. Data augmentation artificially enlarges the training set by generating transformed images on the fly during training, which helps the model generalize better.

- Another distinguishing factor was the use of the ReLU activation function instead of tanh or sigmoid, which made training faster: the paper reports that a network with ReLUs reached a given training error about six times faster than an equivalent network with tanh units. Deep networks commonly use the ReLU non-linearity for this reason, since tanh and sigmoid saturate for inputs of large magnitude, where their gradients become very small.

- AlexNet is a classic convolutional neural network that rose to prominence by winning the 2012 ImageNet challenge (ILSVRC 2012). The network architecture is given below:
<img src="78.png">

<img src="79.png">
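
The first two points above (dropout and ReLU) can be illustrated with a small, library-free sketch. The function names `relu_grad`, `tanh_grad` and `dropout` are illustrative and not taken from the paper or from Keras. The dropout shown is the "inverted" variant that rescales surviving units during training; the paper instead scales outputs by 0.5 at test time, so treat this as a sketch of the idea rather than the paper's exact procedure.

```python
import math
import random

def relu_grad(x):
    """Derivative of ReLU: 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0

def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)^2, which shrinks towards 0 as |x| grows."""
    return 1.0 - math.tanh(x) ** 2

def dropout(activations, rate, rng):
    """Inverted dropout: zero each unit with probability `rate`, rescale the rest."""
    keep_prob = 1.0 - rate
    return [a / keep_prob if rng.random() < keep_prob else 0.0 for a in activations]

# Gradients for increasingly large positive inputs
inputs = [0.5, 2.0, 5.0, 10.0]
for v in inputs:
    print(f"x={v}: relu_grad={relu_grad(v)}, tanh_grad={tanh_grad(v):.2e}")

# Dropout with rate 0.5 on a layer of 10,000 units that all output 1.0
rng = random.Random(0)
dropped = dropout([1.0] * 10000, rate=0.5, rng=rng)
print("fraction zeroed:", sum(1 for v in dropped if v == 0.0) / len(dropped))
print("mean activation:", sum(dropped) / len(dropped))
```

The ReLU gradient stays at 1 for every positive input, while the tanh gradient falls below 1e-6 by x = 10, which is the saturation effect mentioned above. With a rate of 0.5, roughly half of the units are zeroed and the mean activation stays close to 1 because the survivors are doubled.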

### Model Explanation : 
- The input to this model has dimensions 227x227x3, followed by a convolutional layer with 96 filters of size 11x11, ‘valid’ padding, and a stride of 4. The resulting output dimensions are given by:

              floor(((n + 2*padding - filter)/stride) + 1) * floor(((n + 2*padding - filter)/stride) + 1)

##### Note : This formula is for square input with height = width = n
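
As a minimal sketch, the formula can be written as a small Python helper. The names `conv_output_size` and `same_padding` are hypothetical, and `same_padding` assumes an odd filter size with a stride of 1.

```python
def conv_output_size(n, filter_size, stride=1, padding=0):
    """Output height (= width) for a square input of side n.

    Integer floor division implements floor((n + 2*padding - filter)/stride) + 1.
    """
    return (n + 2 * padding - filter_size) // stride + 1

def same_padding(filter_size):
    """Padding that keeps the size unchanged for stride 1 and an odd filter size."""
    return (filter_size - 1) // 2

# 'valid' padding means padding = 0
print(conv_output_size(227, 11, stride=4, padding=0))  # 55

# 'same' padding with a 5x5 filter and stride 1 keeps a 27x27 input at 27x27
print(conv_output_size(27, 5, stride=1, padding=same_padding(5)))  # 27
```

Since adding 1 inside or outside the floor gives the same result, the helper matches the formula as written.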

- For the first layer, with a 227x227x3 input and a convolutional layer of 96 filters of size 11x11, ‘valid’ padding (padding = 0) and stride = 4, the output dimensions are:

= floor(((227 + 0 - 11)/4) + 1) * floor(((227 + 0 - 11)/4) + 1)

= floor((216/4) + 1) * floor((216/4) + 1)

= floor(54 + 1) * floor(54 + 1)

= 55 * 55

- Since the number of filters is 96, the output of the first layer is 55x55x96.
- Next comes a MaxPooling layer with a (3, 3) window and a stride of 2, which reduces the output to 27x27x96. It is followed by another convolutional layer with 256 filters of size (5, 5) and ‘same’ padding, meaning the output height and width match those of the previous layer, so the output of this layer is 27x27x256.
- MaxPooling is applied again, reducing the size to 13x13x256. A convolution with 384 filters of size (3, 3) and ‘same’ padding is then applied twice, giving a 13x13x384 output, followed by another convolutional layer with 256 filters of size (3, 3) and ‘same’ padding, resulting in a 13x13x256 output.
- This is max-pooled, reducing the dimensions to 6x6x256. The result is then flattened and passed through 2 fully connected layers with 4096 units each, which connect to a 1000-unit softmax layer.
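
The layer-by-layer dimensions described above can be checked with a short script. This is only a sketch of the size arithmetic: the `LAYERS` table and `trace_shapes` function are illustrative names, and ‘same’ padding is expressed as an explicit padding of (filter - 1) / 2, which assumes a stride of 1 and an odd filter size.

```python
def out_size(n, kernel, stride, padding):
    """floor((n + 2*padding - kernel)/stride) + 1 for a square input."""
    return (n + 2 * padding - kernel) // stride + 1

# (name, kernel, stride, padding, output channels); None means channels are unchanged
LAYERS = [
    ("conv1", 11, 4, 0, 96),    # 'valid' padding
    ("pool1", 3, 2, 0, None),
    ("conv2", 5, 1, 2, 256),    # 'same' padding for a 5x5 filter
    ("pool2", 3, 2, 0, None),
    ("conv3", 3, 1, 1, 384),    # 'same' padding for a 3x3 filter
    ("conv4", 3, 1, 1, 384),
    ("conv5", 3, 1, 1, 256),
    ("pool3", 3, 2, 0, None),
]

def trace_shapes(size=227, channels=3):
    """Return (name, height, width, channels) after each layer."""
    shapes = []
    for name, kernel, stride, padding, out_channels in LAYERS:
        size = out_size(size, kernel, stride, padding)
        if out_channels is not None:
            channels = out_channels
        shapes.append((name, size, size, channels))
    return shapes

shapes = trace_shapes()
for name, h, w, c in shapes:
    print(f"{name}: {h}x{w}x{c}")

# Number of values fed to the first fully connected layer after flattening
_, h, w, c = shapes[-1]
flattened = h * w * c
print("flattened:", flattened)  # 6 * 6 * 256 = 9216
```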

- The network can be used to classify a large number of classes, as required. In our case, however, the output softmax layer will have 6 units, as we have to classify images into 6 classes. The softmax layer gives the probability of each class to which an input image might belong.
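
To make the role of the softmax layer concrete, here is a minimal sketch of the function applied to 6 output units. The `logits` values are made up for illustration and do not come from a trained model.

```python
import math

def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtracting the maximum avoids overflow in exp
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical raw scores from a 6-unit output layer
logits = [2.0, 1.0, 0.1, -1.0, 0.5, 3.0]
probs = softmax(logits)

# The predicted class is the index with the highest probability
predicted = max(range(len(probs)), key=probs.__getitem__)

print([round(p, 3) for p in probs])
print("sum of probabilities:", round(sum(probs), 6))
print("predicted class index:", predicted)
```

Here `predicted` is 5, the index of the largest score, since softmax preserves the ordering of its inputs.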

In [8]:
#Importing library
import keras
from keras.models import Sequential
# BatchNormalization is imported from keras.layers; the keras.layers.normalization path fails in newer Keras releases
from keras.layers import Dense, Activation, Dropout, Flatten, Conv2D, MaxPooling2D, BatchNormalization
import numpy as np

np.random.seed(1000)


In [9]:
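model = Sequential() 

# Note: this cell is a simplified variant of the architecture described above. It uses a
# 224x224x3 input, 2x2 pooling windows, 'valid' padding in every layer, an 11x11 kernel in
# the 2nd convolution and a dropout rate of 0.4, so its output shapes differ from the walkthrough.

# 1st Convolutional Layer (224x224x3 input, 'valid' padding: floor((224 - 11)/4) + 1 = 54)
model.add(Conv2D(filters = 96, input_shape = (224, 224, 3), kernel_size = (11, 11), strides = (4, 4), padding = 'valid')) 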
model.add(Activation('relu')) 
# Max-Pooling 
model.add(MaxPooling2D(pool_size = (2, 2), strides = (2, 2), padding = 'valid')) 
# Batch Normalisation 
model.add(BatchNormalization()) 

# 2nd Convolutional Layer 
model.add(Conv2D(filters = 256, kernel_size = (11, 11), strides = (1, 1), padding = 'valid')) 
model.add(Activation('relu')) 
# Max-Pooling 
model.add(MaxPooling2D(pool_size = (2, 2), strides = (2, 2), padding = 'valid')) 
# Batch Normalisation 
model.add(BatchNormalization()) 

# 3rd Convolutional Layer 
model.add(Conv2D(filters = 384, kernel_size = (3, 3), strides = (1, 1), padding = 'valid')) 
model.add(Activation('relu')) 
# Batch Normalisation 
model.add(BatchNormalization()) 

# 4th Convolutional Layer 
model.add(Conv2D(filters = 384, kernel_size = (3, 3), strides = (1, 1), padding = 'valid')) 
model.add(Activation('relu')) 
# Batch Normalisation 
model.add(BatchNormalization()) 

# 5th Convolutional Layer 
model.add(Conv2D(filters = 256, kernel_size = (3, 3), strides = (1, 1), padding = 'valid')) 
model.add(Activation('relu')) 
# Max-Pooling 
model.add(MaxPooling2D(pool_size = (2, 2), strides = (2, 2),padding = 'valid')) 
# Batch Normalisation 
model.add(BatchNormalization()) 

# Flattening 
model.add(Flatten()) 

# 1st Dense Layer (its input size is inferred from the Flatten layer, so no input_shape is needed)
model.add(Dense(4096)) 
model.add(Activation('relu')) 
# Add Dropout to prevent overfitting 
model.add(Dropout(0.4)) 
# Batch Normalisation 
model.add(BatchNormalization()) 

# 2nd Dense Layer 
model.add(Dense(4096)) 
model.add(Activation('relu')) 
# Add Dropout 
model.add(Dropout(0.4)) 
# Batch Normalisation 
model.add(BatchNormalization()) 

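# Output Softmax Layer (the number of units must equal the number of classes;
# this cell uses 10, whereas the text above describes a 6-class problem)
model.add(Dense(10)) 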
model.add(Activation('softmax')) 
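
Because the cell above uses a 224x224x3 input, 2x2 pooling and ‘valid’ padding throughout, its shapes differ from the walkthrough. The following sketch recomputes the expected output size and parameter count of each convolution and pooling layer for that configuration. The helper names (`conv_out`, `conv_params`, `summarize`) are hypothetical, and the Activation and BatchNormalization layers are left out because they do not change the output shape.

```python
def conv_out(n, kernel, stride):
    """Output side length for 'valid' padding (no zeros added at the borders)."""
    return (n - kernel) // stride + 1

def conv_params(kernel, in_channels, filters):
    """Weights (kernel * kernel * in_channels per filter) plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters

# (name, kind, kernel, stride, filters), mirroring the Sequential model above
CELL_LAYERS = [
    ("conv 1", "conv", 11, 4, 96),
    ("pool 1", "pool", 2, 2, None),
    ("conv 2", "conv", 11, 1, 256),
    ("pool 2", "pool", 2, 2, None),
    ("conv 3", "conv", 3, 1, 384),
    ("conv 4", "conv", 3, 1, 384),
    ("conv 5", "conv", 3, 1, 256),
    ("pool 3", "pool", 2, 2, None),
]

def summarize(size=224, channels=3):
    """Return (name, side length, channels, trainable parameters) per layer."""
    rows = []
    for name, kind, kernel, stride, filters in CELL_LAYERS:
        size = conv_out(size, kernel, stride)  # pooling follows the same size arithmetic
        params = 0
        if kind == "conv":
            params = conv_params(kernel, channels, filters)
            channels = filters
        rows.append((name, size, channels, params))
    return rows

rows = summarize()
for name, size, channels, params in rows:
    print(f"{name}: ({size}, {size}, {channels})  params={params}")
```

The first rows should agree with the `model.summary()` output: 54x54x96 with 34,944 parameters for the first convolution, and 17x17x256 with 2,973,952 parameters for the second.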


In [10]:
#Model Summary
model.summary()

Model: "sequential_3"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d_15 (Conv2D)           (None, 54, 54, 96)        34944     
_________________________________________________________________
activation_24 (Activation)   (None, 54, 54, 96)        0         
_________________________________________________________________
max_pooling2d_9 (MaxPooling2 (None, 27, 27, 96)        0         
_________________________________________________________________
batch_normalization_23 (Batc (None, 27, 27, 96)        384       
_________________________________________________________________
conv2d_16 (Conv2D)           (None, 17, 17, 256)       2973952   
_________________________________________________________________
activation_25 (Activation)   (None, 17, 17, 256)       0         
_________________________________________________________________
max_pooling2d_10 (MaxPooling (None, 8, 8, 256)        