---
### **Introduction**
---

Alex Krizhevsky, Geoffrey Hinton and Ilya Sutskever created a convolutional neural network architecture called ‘AlexNet’ and won the ImageNet Large Scale Visual Recognition Challenge (ILSVRC) in 2012. They trained the network to classify 1.2 million high-resolution images into 1000 different classes; the network has 60 million parameters and 650,000 neurons. Training was split across two GPUs, with each GPU holding part of every layer, because the network was too large to fit in the 3 GB of memory of a single GTX 580 GPU.

The original paper is available at [ImageNet Classification with Deep Convolutional Neural Networks](https://papers.nips.cc/paper/4824-imagenet-classification-with-deep-convolutional-neural-networks.pdf)

Also check: [Convolutional Neural Network](https://engmrk.com/convolutional-neural-network-3/) and [LeNet-5](https://engmrk.com/lenet-5-a-classic-cnn-architecture/)

**Reference** : [How to implement alexnet cnn architecture using keras](https://www.datacamp.com/community/news/how-to-implement-alexnet-cnn-architecture-using-keras-7vq9ilt9qb7)

---
### **AlexNet Architecture**
---

The AlexNet architecture consists of five convolutional layers, some of which are followed by max-pooling layers, then three fully connected layers, and finally a 1000-way softmax classifier.

<p align="center"><img src="https://neurohive.io/wp-content/uploads/2018/10/AlexNet-1.png" width="80%"/></p>



---
### **Overview**
---

**First Layer** : <br/> 
The input to AlexNet is a 227x227x3 RGB image, which passes through the first convolutional layer with 96 filters (feature maps) of size 11×11 and a stride of 4. The output dimensions become 55x55x96.
AlexNet then applies a max-pooling (sub-sampling) layer with a 3×3 window and a stride of 2, which reduces the dimensions to 27x27x96.
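
The sizes quoted for the first layer follow from the usual output-size formula for an unpadded convolution or pooling window, floor((n - k) / s) + 1, where n is the input size, k the window size and s the stride. A minimal sketch in plain Python; the helper name `out_size` is hypothetical and not part of Keras:

```python
def out_size(n, k, s):
    """Output width/height for input size n, window size k, stride s, no padding."""
    return (n - k) // s + 1

conv1 = out_size(227, 11, 4)   # 11x11 convolution, stride 4
pool1 = out_size(conv1, 3, 2)  # 3x3 max pooling, stride 2
print(conv1, pool1)  # 55 27
```

With a 224x224 input the same formula gives 54 and 26, which matches the shapes reported by the Keras model summary further down.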

**Second Layer** : <br/>
Next, there is a second convolutional layer with 256 feature maps of size 5×5 and a stride of 1. The sizes quoted here imply that this layer pads its input by 2 pixels, so its output stays at 27x27x256.
It is again followed by a max-pooling layer with a 3×3 window and a stride of 2. This pooling layer is the same as the one in the first layer except that it operates on 256 feature maps, so the output is reduced to 13x13x256.
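
The 13x13x256 result depends on how the second convolution pads its input. The sketch below extends the output-size formula with a padding term p, floor((n + 2p - k) / s) + 1, and compares a padding of 2 with no padding. The helper `out_size` is a hypothetical name, and the padding of 2 is an assumption inferred from the quoted sizes rather than something stated in the text.

```python
def out_size(n, k, s, p=0):
    """Output size for input n, window k, stride s and padding p on each side."""
    return (n + 2 * p - k) // s + 1

padded = out_size(27, 5, 1, p=2)   # 5x5 convolution, padding 2 (assumed)
unpadded = out_size(27, 5, 1)      # 5x5 convolution, no padding

print(padded, out_size(padded, 3, 2))      # 27 13
print(unpadded, out_size(unpadded, 3, 2))  # 23 11
```

Only the padded variant reproduces the 13x13 feature maps described above.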

**Third, Fourth and Fifth Layers** : <br/>
The third, fourth and fifth layers are convolutional layers with a filter size of 3×3 and a stride of 1. The third and fourth layers use 384 feature maps each, while the fifth uses 256.
These three convolutional layers are followed by a max-pooling layer with a 3×3 window and a stride of 2, which leaves 256 feature maps of size 6x6.
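
As a rough check of these layers, the following sketch traces the feature-map shape from the 13x13x256 input through the three convolutions and the final pooling layer. The layer table, the `trace` helper and the padding of 1 on each 3×3 convolution are assumptions made for illustration; the text above does not state the padding.

```python
# (name, kernel, stride, padding, filters); padding=1 on the convolutions is assumed
layers = [
    ("conv3", 3, 1, 1, 384),
    ("conv4", 3, 1, 1, 384),
    ("conv5", 3, 1, 1, 256),
    ("pool5", 3, 2, 0, None),  # pooling keeps the channel count
]

def trace(size, channels, layers):
    """Return the (name, height, width, channels) output shape of each layer."""
    shapes = []
    for name, k, s, p, filters in layers:
        size = (size + 2 * p - k) // s + 1
        if filters is not None:
            channels = filters
        shapes.append((name, size, size, channels))
    return shapes

for shape in trace(13, 256, layers):
    print(shape)
```

Under these assumptions the last entry is a 6x6x256 volume, which is the input to the fully connected part of the network.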

**Sixth Layer** : <br/>
The output of the last pooling layer is flattened into a vector of 9216 values (6×6×256) and passed through a fully connected layer with 9216 units.
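
Flattening only rearranges the 6×6×256 values into a single vector; no values are combined. A minimal sketch, assuming a channels-last layout flattened in row-major order (the function name `flat_index` is hypothetical):

```python
H, W, C = 6, 6, 256  # height, width, channels of the last pooling output

def flat_index(h, w, c):
    """Position of feature-map element (h, w, c) in the flattened vector."""
    return (h * W + w) * C + c

print(H * W * C)              # 9216
print(flat_index(0, 0, 0))    # 0
print(flat_index(5, 5, 255))  # 9215
```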

**Seventh and Eighth Layers** : <br/>
Next come two more fully connected layers, each with 4096 units.
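
To give a sense of how large these layers are, the sketch below counts the weights of the two 4096-unit layers, assuming the first one receives 9216 inputs as described above. The helper `dense_params` is a hypothetical name used only for this illustration.

```python
def dense_params(n_in, n_out):
    """Weights plus one bias per output unit of a fully connected layer."""
    return n_in * n_out + n_out

seventh = dense_params(9216, 4096)
eighth = dense_params(4096, 4096)
print(seventh)  # 37752832
print(eighth)   # 16781312
```

Together these two layers account for roughly 54.5 million parameters, far more than any single convolutional layer.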

**Output Layer** : <br/>

Finally, there is a softmax output layer ŷ with 1000 possible values.
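
The softmax turns the raw scores of the last layer into probabilities that sum to 1. A minimal sketch in plain Python, using three scores instead of 1000 for brevity; the input values are made up for illustration:

```python
import math

def softmax(logits):
    """Convert a list of raw scores into probabilities."""
    m = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])
print(probs)
print(sum(probs))
```

The predicted class is the index of the largest probability, which is also the index of the largest raw score.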

---
### **Summary of AlexNet Architecture**
---

<p align="center"><img src="https://engmrk.com/wp-content/uploads/2018/10/AlexNet_Summary_Table.jpg" width="60%"/></p>

---
### **Define the AlexNet Model in Keras**
---

#### **Import package**

In [1]:
import keras
from keras.models import Sequential
# BatchNormalization is imported from keras.layers; the keras.layers.normalization path fails in recent Keras releases
from keras.layers import Dense, Activation, Dropout, Flatten, Conv2D, MaxPooling2D, BatchNormalization
import numpy as np

Using TensorFlow backend.


In [6]:
print("[INFO] Model architecture ... ")
#Instantiate an empty model
model = Sequential()

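# 1st Convolutional Layer
# Note: the input here is 224x224x3 rather than the 227x227x3 described in the overview, and every
# convolution uses padding='valid', so the shapes in the summary below are smaller than in the overview.
model.add(Conv2D(filters=96 , kernel_size=(11,11) , strides=(4,4) , padding='valid',activation='relu',kernel_initializer='he_normal', input_shape=(224,224,3)))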
# Max Pooling
model.add(MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='valid'))
# Batch Normalisation before passing it to the next layer
model.add(BatchNormalization())

# 2nd Convolutional Layer
model.add(Conv2D(filters=256 , kernel_size=(5,5) , strides=(1,1) , padding='valid',activation='relu',kernel_initializer='he_normal'))
# Max Pooling
model.add(MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='valid'))
# Batch Normalisation before passing it to the next layer
model.add(BatchNormalization())


# 3rd Convolutional Layer
model.add(Conv2D(filters=384 , kernel_size=(3,3) , strides=(1,1) , padding='valid',activation='relu',kernel_initializer='he_normal'))
# Batch Normalisation before passing it to the next layer
model.add(BatchNormalization())

# 4th Convolutional Layer
model.add(Conv2D(filters=384 , kernel_size=(3,3) , strides=(1,1) , padding='valid',activation='relu',kernel_initializer='he_normal'))
# Batch Normalisation before passing it to the next layer
model.add(BatchNormalization())

# 5th Convolutional Layer
model.add(Conv2D(filters=256 , kernel_size=(3,3) , strides=(1,1) , padding='valid',activation='relu',kernel_initializer='he_normal'))
# Max Pooling
model.add(MaxPooling2D(pool_size=(3,3), strides=(2,2), padding='valid'))
# Batch Normalisation before passing it to the next layer
model.add(BatchNormalization())

# Passing it to a Fully Connected layer
model.add(Flatten())

# 1st Fully Connected Layer
model.add(Dense(units=9216,activation='relu',kernel_initializer='he_normal'))
# Add Dropout to prevent overfitting
model.add(Dropout(0.4))

# 2nd Fully Connected Layer
model.add(Dense(units=4096,activation='relu',kernel_initializer='he_normal'))
# Add Dropout to prevent overfitting
model.add(Dropout(0.4))

# 3rd Fully Connected Layer
model.add(Dense(units=4096,activation='relu',kernel_initializer='he_normal'))
# Add Dropout to prevent overfitting
model.add(Dropout(0.4))

# Output Layer
model.add(Dense(units=1000,activation='softmax'))

[INFO] Model architecture ... 


In [7]:
print("[INFO] Model Summary ... ")
model.summary()

[INFO] Model Summary ... 
Model: "sequential_1"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_1 (Conv2D)            (None, 54, 54, 96)        34944     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 26, 26, 96)        0         
_________________________________________________________________
batch_normalization_1 (Batch (None, 26, 26, 96)        384       
_________________________________________________________________
conv2d_2 (Conv2D)            (None, 22, 22, 256)       614656    
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 10, 10, 256)       0         
_________________________________________________________________
batch_normalization_2 (Batch (None, 10, 10, 256)       1024      
_________________________________________________________________
conv2d_3 (Conv2D)           
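
The parameter counts in the summary above can be checked by hand. A convolution has one k×k×c_in kernel plus one bias per output filter, and batch normalisation keeps four values per channel (scale, offset, moving mean and moving variance). The helper names below are hypothetical and used only for this check:

```python
def conv_params(k, c_in, c_out):
    """Kernel weights plus one bias per output filter."""
    return k * k * c_in * c_out + c_out

def bn_params(channels):
    """Scale, offset, moving mean and moving variance per channel."""
    return 4 * channels

print(conv_params(11, 3, 96))   # 34944, conv2d_1
print(bn_params(96))            # 384, batch_normalization_1
print(conv_params(5, 96, 256))  # 614656, conv2d_2
print(bn_params(256))           # 1024, batch_normalization_2
```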

#### **Compile**


In [None]:
model.compile(loss='categorical_crossentropy', optimizer='adam',
              metrics=['accuracy'])
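
The `categorical_crossentropy` loss expects one-hot encoded labels and compares them with the softmax output. The sketch below shows the idea in plain Python for a single sample with four classes; the helper names, the example values and the clipping constant `eps` are illustrative assumptions, not the Keras implementation.

```python
import math

def one_hot(label, num_classes):
    """Encode an integer class label as a one-hot list."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Cross-entropy for one sample; eps guards against log(0)."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(y_true, y_pred))

y_true = one_hot(2, 4)           # [0.0, 0.0, 1.0, 0.0]
y_pred = [0.1, 0.2, 0.6, 0.1]    # made-up softmax output
loss = categorical_crossentropy(y_true, y_pred)
print(loss)  # -log(0.6), about 0.51
```

If the labels are plain integers rather than one-hot vectors, Keras provides `sparse_categorical_crossentropy` for the same purpose.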