In [None]:
from keras.layers import (Input, Conv2D, MaxPooling2D, concatenate, Dense,
                          BatchNormalization, Dropout, GlobalAveragePooling2D)
from keras.models import Model


# 32x32 RGB images (e.g. CIFAR-10-sized inputs)
input_shape = (32, 32, 3)
input_layer = Input(shape=input_shape)

# Inception branch 1: 1x1 convolution
conv1x1 = Conv2D(64, (1, 1), activation='relu', padding='same')(input_layer)
conv1x1 = BatchNormalization()(conv1x1)

# Inception branch 2: 3x3 convolution
conv3x3 = Conv2D(128, (3, 3), activation='relu', padding='same')(input_layer)
conv3x3 = BatchNormalization()(conv3x3)

# Inception branch 3: 5x5 convolution
conv5x5 = Conv2D(32, (5, 5), activation='relu', padding='same')(input_layer)
conv5x5 = BatchNormalization()(conv5x5)

# Inception branch 4: 3x3 max pooling; stride 1 with 'same' padding keeps 32x32
max_pool = MaxPooling2D((3, 3), strides=(1, 1), padding='same')(input_layer)

# Stack all four branches along the channel axis (64 + 128 + 32 + 3 channels)
inception_module = concatenate([conv1x1, conv3x3, conv5x5, max_pool], axis=-1)

# Two further 3x3 convolutions, each followed by batch normalization and dropout
conv1 = Conv2D(128, (3, 3), activation='relu', padding='same')(inception_module)
conv1 = BatchNormalization()(conv1)
conv1 = Dropout(0.3)(conv1)

conv2 = Conv2D(64, (3, 3), activation='relu', padding='same')(conv1)
conv2 = BatchNormalization()(conv2)
conv2 = Dropout(0.3)(conv2)

# Average each of the 64 feature maps down to a single value
global_avg_pooling = GlobalAveragePooling2D()(conv2)

# Classifier head
fc1 = Dense(256, activation='relu')(global_avg_pooling)
fc1 = Dropout(0.5)(fc1)

# 10-way softmax output
output_layer = Dense(10, activation='softmax')(fc1)

model = Model(inputs=input_layer, outputs=output_layer)

# categorical_crossentropy expects one-hot encoded labels
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

model.summary()


### Batch Normalization:

Batch Normalization is added after each convolutional layer (the max-pooling branch is left unnormalized) to improve convergence and reduce internal covariate shift.
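
As a reminder of what the layer computes, here is the standard batch normalization transform written out. The notation is an assumption for illustration and does not come from the code above: $x_{i,c}$ is the $i$-th activation in channel $c$, $m$ is the number of activations per channel in the mini-batch (batch size times height times width), and $\gamma_c$, $\beta_c$ are the learned scale and shift.

$$
\mu_c = \frac{1}{m}\sum_{i=1}^{m} x_{i,c}, \qquad
\sigma_c^2 = \frac{1}{m}\sum_{i=1}^{m} \left(x_{i,c} - \mu_c\right)^2
$$

$$
\hat{x}_{i,c} = \frac{x_{i,c} - \mu_c}{\sqrt{\sigma_c^2 + \epsilon}}, \qquad
y_{i,c} = \gamma_c \, \hat{x}_{i,c} + \beta_c
$$

During training the statistics come from the current mini-batch; at inference time the layer uses moving averages accumulated during training instead.
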
### Dropout:

Dropout is applied after the two convolutional layers that follow the Inception module (rate 0.3) and after the fully connected layer (rate 0.5) for regularization, reducing the risk of overfitting.
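### Global Average Pooling:

Global Average Pooling is used instead of a Flatten layer. It averages each of the 64 feature maps down to a single value, which shrinks the input to the first Dense layer, reduces the number of parameters, and tends to improve generalization.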
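
To make the parameter saving concrete, the sketch below counts the weights of the first Dense layer under both choices. It is plain arithmetic rather than a Keras call; the helper `dense_params` is a hypothetical name introduced here, and the shapes are read off the model above (the last convolution outputs 32x32x64, and the Dense layer has 256 units).

```python
# Hypothetical helper: parameters of a Dense layer are one weight per
# (input, unit) pair plus one bias per unit.
def dense_params(n_inputs, n_units):
    return n_inputs * n_units + n_units

height, width, channels = 32, 32, 64   # output shape of the last Conv2D block
units = 256                            # units in the first Dense layer

flatten_features = height * width * channels   # what Flatten would pass on
gap_features = channels                        # global average pooling: one value per channel

print(dense_params(flatten_features, units))   # 16777472
print(dense_params(gap_features, units))       # 16640
```

Under these assumptions the Dense layer is roughly a thousand times smaller with Global Average Pooling than it would be with Flatten.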


---



## Inception Module:
In this example, 1x1, 3x3, and 5x5 convolutions run in parallel with max pooling on the same input, and their outputs are concatenated along the channel axis. This lets the network capture features at several receptive field sizes, improving its ability to recognize patterns at different scales.
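
The sketch below works out, by hand, how many channels the concatenated module produces and how many trainable weights each convolution branch has. The helper `conv2d_params` is a hypothetical name used only for this illustration; the filter counts and the 3-channel input are taken from the code above. Batch normalization parameters are not included in the counts.

```python
# Hypothetical helper: a Conv2D layer has one weight per
# (kernel position, input channel, filter) plus one bias per filter.
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    return kernel_h * kernel_w * in_channels * filters + filters

in_channels = 3  # RGB input

# name -> (kernel height, kernel width, number of filters), as in the model above
branches = {
    "conv1x1": (1, 1, 64),
    "conv3x3": (3, 3, 128),
    "conv5x5": (5, 5, 32),
}

branch_params = {
    name: conv2d_params(kh, kw, in_channels, filters)
    for name, (kh, kw, filters) in branches.items()
}
print(branch_params)  # {'conv1x1': 256, 'conv3x3': 3584, 'conv5x5': 2432}

# Max pooling has no weights and passes the 3 input channels through unchanged,
# so the concatenated tensor has 64 + 128 + 32 + 3 channels.
concat_channels = sum(filters for _, _, filters in branches.values()) + in_channels
print(concat_channels)  # 227
```

Because every branch uses `padding='same'` and a stride of 1, all four outputs stay 32x32 and can be concatenated along the channel axis.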


---




## Stride Parameter:
The stride parameter of a convolutional or pooling layer controls how far the window moves between positions, and therefore the spatial dimensions of the output feature maps. A larger stride shrinks the feature maps, giving more aggressive downsampling. This reduces computational cost and can help control overfitting. In the example, max pooling uses a stride of (1, 1) together with `padding='same'`, so its output keeps the 32x32 spatial dimensions and can be concatenated with the convolution branches.
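
The effect of stride on output size can be checked with a little arithmetic. The sketch below assumes the usual Keras/TensorFlow conventions for `'same'` and `'valid'` padding; `output_size` is a hypothetical helper written for this illustration, not a library function.

```python
import math

# Hypothetical helper: spatial output size along one dimension,
# assuming the usual 'same' / 'valid' padding conventions.
def output_size(n, window, stride, padding):
    if padding == "same":
        # Input is padded so that only the stride determines the output size.
        return math.ceil(n / stride)
    if padding == "valid":
        # No padding: the window must fit entirely inside the input.
        return (n - window) // stride + 1
    raise ValueError(f"unknown padding: {padding!r}")

print(output_size(32, 3, 1, "same"))    # 32 (the max pooling used in the example)
print(output_size(32, 3, 2, "same"))    # 16
print(output_size(32, 3, 1, "valid"))   # 30
print(output_size(32, 3, 2, "valid"))   # 15
```

With a stride of 1 and `'same'` padding the feature map keeps its size, while a stride of 2 halves it.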

---



## Convolutional Layers:

* Convolutional layers with different filter sizes (1x1, 3x3, 5x5) are used in the Inception module.
* Max pooling with a 3x3 window and a stride of (1, 1) runs alongside them as a fourth branch; it has no trainable weights.
* The convolutional layers perform feature extraction by detecting patterns in the input image. Combining different filter sizes lets the network learn features at several scales, which can improve performance.