# AlexNet Architecture

We are ready to describe the overall architecture of our CNN. As depicted in Figure 2, the network consists of eight layers with weights. The first five layers are convolutional, and the remaining three layers are fully connected. The output of the last fully connected layer is fed to a 1000-way softmax, which produces a distribution over the 1000 class labels. Our network maximizes the multinomial logistic regression objective, equivalent to maximizing the average log-probability of the correct label under the prediction distribution across training cases.
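
Written out, the objective in the paragraph above can be sketched as follows. The notation is an assumption introduced here for illustration and does not come from the original text: $N$ is the number of training cases, $z^{(i)}_c$ is the output of the last fully connected layer for class $c$ on training case $i$, $y_i$ is the correct label, and $\theta$ stands for the network weights.

```latex
p\bigl(c \mid x_i\bigr) = \frac{\exp\bigl(z^{(i)}_c\bigr)}{\sum_{j=1}^{1000} \exp\bigl(z^{(i)}_j\bigr)},
\qquad
\max_{\theta} \; \frac{1}{N} \sum_{i=1}^{N} \log p\bigl(y_i \mid x_i\bigr)
```

The left-hand expression is the 1000-way softmax; the right-hand one is the average log-probability of the correct label across training cases.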

## Layers

### Convolutional Layers

1. **First Convolutional Layer**
    - Filters the 224×224×3 input image with 96 kernels of size 11×11×3
    - Stride of 4 pixels
2. **Second Convolutional Layer**
    - Takes the response-normalized and pooled output of the first convolutional layer
    - Filters with 256 kernels of size 5×5×48
3. **Third Convolutional Layer**
    - Connected to all kernel maps in the second layer
    - 384 kernels of size 3×3×256
4. **Fourth Convolutional Layer**
    - 384 kernels of size 3×3×192
5. **Fifth Convolutional Layer**
    - 256 kernels of size 3×3×192
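
The spatial size of a convolution's output follows from the input size, kernel size, and stride. The sketch below applies the standard formula `floor((input + 2*padding - kernel) / stride) + 1` to the first layer; the helper name `conv_output_size` is hypothetical. It illustrates one common explanation for why implementations, including the Keras code later in this notebook, use a 227×227 input rather than the 224×224 stated above: with no padding, 224 does not tile evenly under an 11×11 kernel at stride 4.

```python
def conv_output_size(input_size: int, kernel_size: int, stride: int = 1, padding: int = 0) -> int:
    """Output size along one spatial dimension of a convolution or pooling window."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


# First convolutional layer: 11x11 kernels, stride 4, no padding (an assumption).
remainder_224 = (224 - 11) % 4  # 213 is not a multiple of 4
remainder_227 = (227 - 11) % 4  # 216 is a multiple of 4

size_from_224 = conv_output_size(224, 11, stride=4)
size_from_227 = conv_output_size(227, 11, stride=4)

print(remainder_224, remainder_227)  # 1 0
print(size_from_224, size_from_227)  # 54 55
```

The 55×55 result for a 227×227 input matches the first `Conv2D` output shape reported by `model.summary()`.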

### Fully Connected Layers

1. **First Fully Connected Layer**
    - 4096 neurons
2. **Second Fully Connected Layer**
    - 4096 neurons
3. **Third Fully Connected Layer**
    - 1000-way softmax, producing a distribution over the 1000 class labels

## Connections

- **Convolutional Layers**
  - Kernels of the second, fourth, and fifth convolutional layers are connected only to those kernel maps in the previous layer which reside on the same GPU
  - Kernels of the third convolutional layer are connected to all kernel maps in the second layer

- **Fully Connected Layers**
  - Neurons in the fully connected layers are connected to all neurons in the previous layer
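
The connectivity rules above explain the kernel depths listed under "Convolutional Layers". A minimal sketch, assuming the kernel maps of each layer are split evenly across two GPUs (the value `NUM_GPUS = 2` is an assumption consistent with the depth of 48 for the second layer, not something stated in this section): a kernel that only sees same-GPU maps has half the depth of the previous layer's map count, while the third layer sees all of them.

```python
NUM_GPUS = 2  # assumption: kernel maps are split evenly across two GPUs

# (name, number of kernels, kernel size, connected to all previous kernel maps?)
layers = [
    ("conv2", 256, 5, False),
    ("conv3", 384, 3, True),
    ("conv4", 384, 3, False),
    ("conv5", 256, 3, False),
]

prev_maps = 96  # kernel maps produced by the first convolutional layer
kernel_depths = {}
weight_counts = {}
for name, num_kernels, size, sees_all in layers:
    # Depth of each kernel = number of input maps that kernel is connected to.
    depth = prev_maps if sees_all else prev_maps // NUM_GPUS
    kernel_depths[name] = depth
    weight_counts[name] = num_kernels * size * size * depth  # biases excluded
    prev_maps = num_kernels

print(kernel_depths)
# {'conv2': 48, 'conv3': 256, 'conv4': 192, 'conv5': 192}
print(weight_counts)
# {'conv2': 307200, 'conv3': 884736, 'conv4': 663552, 'conv5': 442368}
```

The computed depths reproduce the 5×5×48, 3×3×256, 3×3×192, and 3×3×192 kernel sizes given above.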

## Additional Components

- **Response-normalization Layers**
  - Follow the first and second convolutional layers

- **Max-pooling Layers**
  - Follow both response-normalization layers and the fifth convolutional layer

- **ReLU Non-linearity**
  - Applied to the output of every convolutional and fully connected layer
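
To make two of these components concrete, here is a small pure-Python sketch of ReLU followed by max pooling on a single 4×4 feature map. The helper names and the toy values are hypothetical; the 2×2 window with stride 2 mirrors the `MaxPooling2D` settings used in the Keras code in this notebook, and response normalization is left out of the sketch.

```python
def relu(x: float) -> float:
    """Rectified linear unit: negative responses are clamped to zero."""
    return max(0.0, x)


def max_pool_2d(feature_map, pool_size=2, stride=2):
    """Max pooling over a 2D list of numbers, without padding."""
    rows, cols = len(feature_map), len(feature_map[0])
    pooled = []
    for r in range(0, rows - pool_size + 1, stride):
        pooled_row = []
        for c in range(0, cols - pool_size + 1, stride):
            window = [
                feature_map[r + i][c + j]
                for i in range(pool_size)
                for j in range(pool_size)
            ]
            pooled_row.append(max(window))
        pooled.append(pooled_row)
    return pooled


# Toy convolution responses for one kernel map (illustrative values only).
responses = [
    [-1.0, 2.0, 0.5, -3.0],
    [4.0, -2.0, 1.5, 0.0],
    [0.0, -1.0, -0.5, -2.0],
    [3.0, 1.0, -4.0, 2.5],
]

activated = [[relu(value) for value in row] for row in responses]
pooled = max_pool_2d(activated)
print(pooled)  # [[4.0, 1.5], [3.0, 2.5]]
```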


In [13]:
import numpy as np
import tensorflow as tf



In [None]:
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (
    Activation, BatchNormalization, Conv2D, Dense, Dropout, Flatten, MaxPooling2D,
)




# Create a sequential model
model = Sequential()

# 1st Convolutional Layer
# A 227x227 input makes the 11x11, stride-4 'valid' convolution produce a 55x55 map
model.add(Conv2D(filters=96, input_shape=(227, 227, 3), kernel_size=(11, 11),
                 strides=(4, 4), padding='valid'))
model.add(Activation('relu'))
# Pooling 
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='valid'))
# Batch Normalisation (used here in place of response normalization) before the next layer
model.add(BatchNormalization())

# 2nd Convolutional Layer (note: 11x11 kernels are used here, whereas the description above lists 5x5)
model.add(Conv2D(filters=256, kernel_size=(11,11), strides=(1,1), padding='valid'))
model.add(Activation('relu'))
# Pooling
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='valid'))
# Batch Normalisation
model.add(BatchNormalization())

# 3rd Convolutional Layer
model.add(Conv2D(filters=384, kernel_size=(3,3), strides=(1,1), padding='valid'))
model.add(Activation('relu'))
# Batch Normalisation
model.add(BatchNormalization())

# 4th Convolutional Layer
model.add(Conv2D(filters=384, kernel_size=(3,3), strides=(1,1), padding='valid'))
model.add(Activation('relu'))
# Batch Normalisation
model.add(BatchNormalization())

# 5th Convolutional Layer
model.add(Conv2D(filters=256, kernel_size=(3,3), strides=(1,1), padding='valid'))
model.add(Activation('relu'))
# Pooling
model.add(MaxPooling2D(pool_size=(2,2), strides=(2,2), padding='valid'))
# Batch Normalisation
model.add(BatchNormalization())

# Passing it to a dense layer
model.add(Flatten())
# 1st Dense Layer (its input size is inferred from the flattened feature map)
model.add(Dense(4096))
model.add(Activation('relu'))
# Add Dropout to prevent overfitting
model.add(Dropout(0.4))
# Batch Normalisation
model.add(BatchNormalization())

# 2nd Dense Layer
model.add(Dense(4096))
model.add(Activation('relu'))
# Add Dropout
model.add(Dropout(0.4))
# Batch Normalisation
model.add(BatchNormalization())

# 3rd Dense Layer (an extra 1000-unit hidden layer here; in the description above this layer is the softmax output)
model.add(Dense(1000))
model.add(Activation('relu'))
# Add Dropout
model.add(Dropout(0.4))
# Batch Normalisation
model.add(BatchNormalization())

# Output Layer: 17-way softmax, i.e. 17 target classes rather than the 1000 described above
model.add(Dense(17))
model.add(Activation('softmax'))

In [6]:
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 55, 55, 96)        34944     
                                                                 
 activation (Activation)     (None, 55, 55, 96)        0         
                                                                 
 max_pooling2d (MaxPooling2  (None, 27, 27, 96)        0         
 D)                                                              
                                                                 
 batch_normalization (Batch  (None, 27, 27, 96)        384       
 Normalization)                                                  
                                                                 
 conv2d_1 (Conv2D)           (None, 17, 17, 256)       2973952   
                                                                 
 activation_1 (Activation)   (None, 17, 17, 256)       0
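
The `Output Shape` and `Param #` columns in the summary above can be checked by hand. The sketch below uses hypothetical helper names and assumes the usual Keras conventions: a `Conv2D` layer has one bias per filter, and `BatchNormalization` stores four values per channel (gamma, beta, moving mean, moving variance).

```python
def conv2d_params(kernel_size: int, in_channels: int, filters: int) -> int:
    """Weights plus one bias per filter for a square-kernel Conv2D layer."""
    return kernel_size * kernel_size * in_channels * filters + filters


def batch_norm_params(channels: int) -> int:
    """Gamma, beta, moving mean, and moving variance for each channel."""
    return 4 * channels


def valid_output_size(input_size: int, window: int, stride: int) -> int:
    """Output size of a 'valid' (unpadded) convolution or pooling step."""
    return (input_size - window) // stride + 1


# Spatial sizes: input -> conv2d -> max_pooling2d -> conv2d_1
after_conv1 = valid_output_size(227, window=11, stride=4)
after_pool1 = valid_output_size(after_conv1, window=2, stride=2)
after_conv2 = valid_output_size(after_pool1, window=11, stride=1)
print(after_conv1, after_pool1, after_conv2)  # 55 27 17

# Parameter counts: conv2d, batch_normalization, conv2d_1
print(conv2d_params(11, 3, 96))    # 34944
print(batch_norm_params(96))       # 384
print(conv2d_params(11, 96, 256))  # 2973952
```

The `conv2d_1` count of 2,973,952 corresponds to the 11×11 kernels used in the code for the second convolutional layer.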

In [10]:
model.compile(
    loss="sparse_categorical_crossentropy",
    optimizer="Adam",
    metrics=["accuracy"]
)
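
The loss selected in the compile call, spelled `sparse_categorical_crossentropy` in Keras, takes integer class labels rather than one-hot vectors. The following is a pure-Python sketch of the quantity it computes, not Keras's actual implementation; the function name reuses the Keras identifier for readability and the probabilities are made-up values. Minimizing this mean negative log-probability is the same as maximizing the average log-probability of the correct label.

```python
import math


def sparse_categorical_crossentropy(labels, probabilities):
    """Mean of -log(probability assigned to the correct integer label)."""
    losses = [-math.log(probs[label]) for label, probs in zip(labels, probabilities)]
    return sum(losses) / len(losses)


# Two training cases over three classes (illustrative values only).
labels = [0, 2]
probabilities = [
    [0.7, 0.2, 0.1],  # correct class 0 gets probability 0.7
    [0.1, 0.1, 0.8],  # correct class 2 gets probability 0.8
]

loss = sparse_categorical_crossentropy(labels, probabilities)
print(round(loss, 4))
```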

In [None]:
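# Training call, left commented out (X_train, y_train, X_val, y_val are not defined in the cells shown):
# history = model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=20)
# With training skipped, this saves the model with its initial (untrained) weights.
model.save("AlexNet_model.h5")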