## **AlexNet** Model  
5 convolutional layers, 1 flatten step, 2 fully connected layers, and an output layer

**First Layer:**
The input to AlexNet is a 227x227x3 RGB image, which passes through the first convolutional layer with 96 filters (feature maps) of size 11×11 and a stride of 4. The output dimensions become 55x55x96.
AlexNet then applies a max-pooling (sub-sampling) layer with a 3×3 window and a stride of 2, which reduces the dimensions to 27x27x96.
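
These sizes follow from the usual output-size formula for a convolution or pooling window, floor((n + 2p - k) / s) + 1, where n is the input width, k the window size, s the stride, and p the padding. A minimal sketch in plain Python; the helper name `output_size` is introduced here for illustration and is not part of the notebook:

```python
def output_size(n, kernel, stride, padding=0):
    """Spatial output size of a conv/pool window sliding over an n-wide input."""
    return (n + 2 * padding - kernel) // stride + 1

conv1 = output_size(227, kernel=11, stride=4)   # first convolution, no padding
pool1 = output_size(conv1, kernel=3, stride=2)  # first max pooling
print(conv1, pool1)
```

This prints 55 and 27, the spatial sizes quoted for the first layer.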



**Second Layer:**
Next is a second convolutional layer with 256 feature maps of size 5×5 and a stride of 1. With padding that preserves the spatial size, its output is 27x27x256.
It is followed by another max-pooling layer with a 3×3 window and a stride of 2. This pooling layer is the same as the one in the first layer, except that it operates on 256 feature maps, so the output is reduced to 13x13x256.



**Third, Fourth and Fifth Layers:**
The third, fourth, and fifth layers are convolutional layers with a filter size of 3×3 and a stride of 1. The third and fourth use 384 feature maps, while the fifth uses 256.
These three convolutional layers are followed by a max-pooling layer with a 3×3 window and a stride of 2, which produces a 6x6x256 output.
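
The whole convolutional stack can be traced with the same output-size formula. The sketch below is plain Python and makes one assumption explicit: the 5×5 convolution uses a padding of 2 and the 3×3 convolutions a padding of 1, which is what size-preserving `'same'` padding amounts to at stride 1. The `layers` table and the `output_size` helper are illustrative names, not part of the notebook.

```python
def output_size(n, kernel, stride, padding=0):
    """Spatial output size of a conv/pool window sliding over an n-wide input."""
    return (n + 2 * padding - kernel) // stride + 1

# (name, kernel, stride, padding, output channels; None keeps the channel count)
layers = [
    ("conv1", 11, 4, 0, 96),
    ("pool1", 3, 2, 0, None),
    ("conv2", 5, 1, 2, 256),
    ("pool2", 3, 2, 0, None),
    ("conv3", 3, 1, 1, 384),
    ("conv4", 3, 1, 1, 384),
    ("conv5", 3, 1, 1, 256),
    ("pool3", 3, 2, 0, None),
]

size, channels = 227, 3  # 227x227 RGB input
shapes = {}
for name, kernel, stride, padding, out_channels in layers:
    size = output_size(size, kernel, stride, padding)
    if out_channels is not None:
        channels = out_channels
    shapes[name] = (size, size, channels)
    print(name, shapes[name])
```

The last entry, `pool3`, comes out as (6, 6, 256), which is the tensor that gets flattened in the next step.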



**Sixth Layer:**
The output of the last convolutional block is flattened into a vector of 9216 values (6×6×256), which feeds the fully connected layers.
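
As a quick check of the 9216 figure, and of how large the first fully connected layer becomes, here is a small arithmetic sketch. It assumes a 6x6x256 tensor coming out of the last pooling layer and a dense layer of 4096 units with one bias per unit:

```python
height, width, channels = 6, 6, 256   # assumed output of the last pooling layer
flat = height * width * channels      # length of the flattened vector

units = 4096                          # width of the first fully connected layer
fc_params = flat * units + units      # weights plus one bias per unit
print(flat, fc_params)
```

The flattened vector has 9216 entries, and the first dense layer alone holds roughly 37.7 million parameters, which is why most of the model's weights sit in the fully connected part.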



**Seventh and Eighth Layers:**
Next are two fully connected layers with 4096 units each.



**Output Layer:**
Finally, there is a softmax output layer ŷ with 1000 possible values.
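
For reference, softmax turns the raw scores (logits) of the last layer into probabilities that sum to 1. A minimal sketch in plain Python, using three made-up logits instead of 1000; the function below is illustrative and is not the code the notebook runs:

```python
import math

def softmax(logits):
    """Convert a list of raw scores into probabilities that sum to 1."""
    m = max(logits)                            # subtract the max for numerical stability
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

probs = softmax([2.0, 1.0, 0.1])               # illustrative logits
print(probs, sum(probs))
```

The largest logit receives the largest probability, and the ordering of the scores is preserved.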

In [1]:
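import tensorflow as tf

# TensorFlow 1.x only: eager execution has to be switched on explicitly,
# before any other TensorFlow call.
if __name__ == '__main__':
    tf.enable_eager_execution()

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import (Conv2D, Dropout, Activation,
                                     BatchNormalization, MaxPool2D, Flatten, Dense)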

In [2]:
class Alexnet(tf.keras.Model):
    def __init__(self,classes):
        super(Alexnet,self).__init__()
        
        self.input_names = "None"
        #1st Conv Layer
        self.conv1 = Sequential()
        self.conv1.add(Conv2D(filters=96,kernel_size=(11,11),strides=(4,4),padding='valid'))
        self.conv1.add(Activation('relu'))
        self.conv1.add(MaxPool2D(pool_size=(3,3), strides=(2,2), padding='valid'))
        self.conv1.add(BatchNormalization())
        
        #2nd Conv Layer
        self.conv2 = Sequential()
        self.conv2.add(Conv2D(filters=256,kernel_size=(5,5),strides=(1,1),padding='same'))
        self.conv2.add(Activation('relu'))
        self.conv2.add(MaxPool2D(pool_size=(3,3), strides=(2,2), padding='valid'))
        self.conv2.add(BatchNormalization())
            
        #3rd Conv Layer    
        self.conv3 = Sequential()
        self.conv3.add(Conv2D(filters=384,kernel_size=(3,3),strides=(1,1),padding='same'))
        self.conv3.add(Activation('relu'))
        self.conv3.add(BatchNormalization())
        
        #4th Conv Layer
        self.conv4 = Sequential()
        self.conv4.add(Conv2D(filters=384,kernel_size=(3,3),strides=(1,1),padding='same'))
        self.conv4.add(Activation('relu'))
        self.conv4.add(BatchNormalization())
            
        #5th Conv Layer    
        self.conv5 = Sequential()
        self.conv5.add(Conv2D(filters=256,kernel_size=(3,3),strides=(1,1),padding='same'))
        self.conv5.add(Activation('relu'))
        self.conv5.add(BatchNormalization())
        # 3x3 window with stride 2, matching the pooling described for this block
        self.conv5.add(MaxPool2D(pool_size=(3,3), strides=(2,2), padding='valid'))
        
        # 6th layer (flatten) and 7th layer (fully connected)
        self.fc = Sequential()
        self.fc.add(Flatten())
        self.fc.add(Dense(4096))
        self.fc.add(Activation('relu'))
        self.fc.add(Dropout(0.4))

        # 8th layer (fully connected)
        self.fc.add(Dense(4096))
        self.fc.add(Activation('relu'))
        self.fc.add(Dropout(0.4))

        # Output layer: returns raw logits; softmax is applied inside the loss
        self.classification_layer = Sequential()
        self.classification_layer.add(Dense(classes))
    
    def call(self,x):
        #print(x.shape)
        conv1 = self.conv1(x)
        #print(conv1.shape)
        conv2 = self.conv2(conv1)
        #print(conv2.shape)
        conv3 = self.conv3(conv2)
        #print(conv3.shape)
        conv4 = self.conv4(conv3)
        #print(conv4.shape)
        conv5 = self.conv5(conv4)
        #print(conv5.shape)
        fc = self.fc(conv5)
        #print(fc.shape)
        out = self.classification_layer(fc)
        #print(out.shape)
        return out
        

Define the **Loss Function**

In [3]:
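class softmax_cross_entropy:
    # Callable loss for model.compile: Keras calls it with (labels, predictions).
    # The model outputs raw logits, so softmax is applied inside the loss (TensorFlow 1.x API).
    def __call__(self, onehot_labels, logits):
        return tf.losses.softmax_cross_entropy(onehot_labels, logits)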
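
To make explicit what this loss measures, here is a plain-Python sketch of softmax cross-entropy for a single example. It is illustrative only: it is not the TensorFlow implementation, which works on whole batches and, as far as we understand its default reduction, averages the per-example values.

```python
import math

def softmax_cross_entropy_single(onehot, logits):
    """Cross-entropy between a one-hot label and softmax(logits), for one example."""
    m = max(logits)
    # log of the softmax denominator, computed stably
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    # -sum(t * log_softmax(z)); only the true class contributes
    return -sum(t * (z - log_sum_exp) for t, z in zip(onehot, logits))

label = [0, 0, 1, 0, 0, 0, 0, 0, 0, 0]   # one-hot label for class 2 of 10
uniform = softmax_cross_entropy_single(label, [0.0] * 10)
confident = softmax_cross_entropy_single(label, [0.0, 0.0, 5.0] + [0.0] * 7)
print(uniform, confident)
```

With all-zero logits every class gets probability 1/10, so the loss equals ln 10 (about 2.303). That is a useful reference value for an untrained 10-class model.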

**Compile** your model

In [4]:
# CIFAR-10 has 10 classes, so the output layer has 10 units instead of AlexNet's original 1000
model = Alexnet(classes=10)
lr = 1e-5
optimizer = tf.train.AdamOptimizer(learning_rate=lr)
model.compile(optimizer=optimizer,loss=softmax_cross_entropy())

**Load** Dataset

In [5]:
import dataloader as dl

In [69]:
dataset = dl.cifar10_loader("../../Datasets/cifar-10-batches-py/",buffer_size=1024,batch_size=512)

**Train** the model

In [None]:
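epochs = 5

for e in range(epochs):
    epoch_loss = 0  # running sum of the per-batch losses for this epoch
    for i, (x, y) in enumerate(dataset('train')):
        # Each fit call runs one gradient step on the current batch and returns its loss
        epoch_loss += model.fit(x=x, y=y, epochs=1, verbose=0, batch_size=512).history['loss'][0]
        print('Epoch %d batch %d' % (e+1, i+1), end='\r')

    # Note: e is 0-based here, while the progress line above prints e+1
    print("Epoch %d Loss %.4f" % (e, epoch_loss))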

Epoch 0 Loss 213.7497
Epoch 1 Loss 155.6060
Epoch 3 batch 55
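
The values printed above are sums of per-batch losses, not averages. A rough conversion to a mean per-batch loss is sketched below. It assumes the loader yields the full 50,000-image CIFAR-10 training split in batches of 512 and keeps the last partial batch; `dataloader` is not shown, so the batch count is an assumption.

```python
import math

train_images = 50_000   # size of the standard CIFAR-10 training split
batch_size = 512        # from the cifar10_loader call
batches = math.ceil(train_images / batch_size)  # assumed number of batches per epoch

# Summed losses copied from the training output
means = {}
for epoch, loss_sum in [(0, 213.7497), (1, 155.6060)]:
    means[epoch] = loss_sum / batches
    print("Epoch %d mean batch loss %.3f" % (epoch, means[epoch]))

chance_level = math.log(10)  # cross-entropy of a uniform guess over 10 classes
print("Chance level %.3f" % chance_level)
```

Under that assumption the means come out at about 2.181 and 1.588, compared with a chance level of ln 10 ≈ 2.303, so the first epoch is only slightly better than guessing and the second shows clear progress.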