
# Simple Self-Made CNN

In [None]:
# Load packages
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout

# Define the input shape
input_shape = (64, 64, 1)

# Initialize the model
model = Sequential()

# Add the first convolutional layer
model.add(Conv2D(32, (3, 3), activation='relu', input_shape=input_shape))

# Add the first max pooling layer
model.add(MaxPooling2D((2, 2)))

# Add the second convolutional layer
model.add(Conv2D(64, (3, 3), activation='relu'))

# Add the second max pooling layer
model.add(MaxPooling2D((2, 2)))

# Add the flatten layer
model.add(Flatten())

# Add the dense layer
model.add(Dense(128, activation='relu'))

# Add the dropout layer
model.add(Dropout(0.5))

# Add the output layer
model.add(Dense(28, activation='softmax'))

# Compile the model
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])

# Print the model summary
model.summary()

Model: "sequential"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d (Conv2D)             (None, 62, 62, 32)        320       
                                                                 
 max_pooling2d (MaxPooling2D  (None, 31, 31, 32)       0         
 )                                                               
                                                                 
 conv2d_1 (Conv2D)           (None, 29, 29, 64)        18496     
                                                                 
 max_pooling2d_1 (MaxPooling  (None, 14, 14, 64)       0         
 2D)                                                             
                                                                 
 flatten (Flatten)           (None, 12544)             0         
                                                                 
 dense (Dense)               (None, 128)               1
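
The output shapes and parameter counts in the summary above can be re-derived by hand. The following is a minimal sketch, assuming the Keras defaults used in the cell (`'valid'` padding and stride 1 for the convolutions, non-overlapping 2x2 pooling). The helper names `conv_out`, `pool_out`, and `conv_params` are illustrative and not part of Keras.

```python
def conv_out(size, kernel):
    """Spatial size after a stride-1 convolution with 'valid' padding."""
    return size - kernel + 1


def pool_out(size, pool):
    """Spatial size after non-overlapping pooling (any remainder is dropped)."""
    return size // pool


def conv_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters


size = 64
size = conv_out(size, 3)   # conv2d:          64 -> 62
size = pool_out(size, 2)   # max_pooling2d:   62 -> 31
size = conv_out(size, 3)   # conv2d_1:        31 -> 29
size = pool_out(size, 2)   # max_pooling2d_1: 29 -> 14

flat_units = size * size * 64  # flatten: 14 * 14 * 64

print(size, flat_units)         # 14 12544
print(conv_params(3, 1, 32))    # 320
print(conv_params(3, 32, 64))   # 18496
```

These values match the `Output Shape` and `Param #` columns reported for the two convolutional layers and the flatten layer.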

# Complex Self-Made CNN

In [None]:
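# Load packages (BatchNormalization is needed in addition to the layers used above)
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout, BatchNormalization

# Start from a fresh model so these layers are not appended to the simple CNN above
model = Sequential()

# Add the first convolutional layer with 32 filters of size (3, 3), using 'same' padding
# Input shape is (64, 64, 1) for grayscale images, and ReLU activation function
model.add(Conv2D(32, (3, 3), padding='same', input_shape=(64, 64, 1), activation='relu'))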

# Add batch normalization to normalize the activations of the previous layer
model.add(BatchNormalization())

# Add the second convolutional layer with 32 filters of size (3, 3), using 'same' padding
# ReLU activation function is used
model.add(Conv2D(32, (3, 3), padding='same', activation='relu'))

# Add batch normalization to normalize the activations of the previous layer
model.add(BatchNormalization())

# Add max pooling layer with a pool size of (2, 2) to downsample the image
# Dropout of 0.25 is applied to randomly set 25% of the input units to 0 during training
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

# Add the third convolutional layer with 64 filters of size (3, 3), using 'same' padding
# ReLU activation function is used
model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))

# Add batch normalization to normalize the activations of the previous layer
model.add(BatchNormalization())

# Add the fourth convolutional layer with 64 filters of size (3, 3), using 'same' padding
# ReLU activation function is used
model.add(Conv2D(64, (3, 3), padding='same', activation='relu'))

# Add batch normalization to normalize the activations of the previous layer
model.add(BatchNormalization())

# Add max pooling layer with a pool size of (2, 2) to downsample the image
# Dropout of 0.25 is applied to randomly set 25% of the input units to 0 during training
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

# Add the fifth convolutional layer with 128 filters of size (3, 3), using 'same' padding
# ReLU activation function is used
model.add(Conv2D(128, (3, 3), padding='same', activation='relu'))

# Add batch normalization to normalize the activations of the previous layer
model.add(BatchNormalization())

# Add the sixth convolutional layer with 128 filters of size (3, 3), using 'same' padding
# ReLU activation function is used
model.add(Conv2D(128, (3, 3), padding='same', activation='relu'))

# Add batch normalization to normalize the activations of the previous layer
model.add(BatchNormalization())

# Add max pooling layer with a pool size of (2, 2) to downsample the image
# Dropout of 0.25 is applied to randomly set 25% of the input units to 0 during training
model.add(MaxPooling2D(pool_size=(2, 2)))
model.add(Dropout(0.25))

# Flatten the output from the previous layer to a 1D vector
model.add(Flatten())

# Add a fully connected dense layer with 512 units and ReLU activation function
model.add(Dense(512, activation='relu'))

# Add batch normalization to normalize the activations of the previous layer
model.add(BatchNormalization())

# Add dropout of 0.5, so half of the units are randomly dropped during training
model.add(Dropout(0.5))

# Add the output layer with 28 units (one for each class) and softmax activation
model.add(Dense(28, activation='softmax'))



This deeper architecture is intended to suit Arabic sign language image classification better than the simple CNN. Its six convolutional layers and larger dense layer give it more parameters, so it can learn more complex features from the images. Batch normalization normalizes the activations of the preceding layer, which can improve convergence during training. The dropout layers help to prevent overfitting by randomly zeroing units during training. Stacking several convolution and max pooling stages lets the model learn hierarchical representations of the input, from local features in the early layers to more global ones in the later layers. As a result, it can handle more complex patterns and variations in the input data than a model with only a few layers, such as the simple CNN.
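
To make the batch normalization step concrete, here is a minimal pure-Python sketch of what the layer computes for a single feature during training: it standardizes the activations using the batch mean and variance, then applies a scale and shift. The function name `batch_norm` and the sample batch are assumptions for illustration. In Keras, `gamma` and `beta` are learned, and the layer also tracks moving averages for use at inference time, which this sketch omits.

```python
from statistics import fmean, pvariance


def batch_norm(activations, gamma=1.0, beta=0.0, eps=1e-3):
    """Standardize one feature over a batch, then scale and shift it."""
    mean = fmean(activations)
    var = pvariance(activations, mu=mean)  # population variance of the batch
    return [gamma * (a - mean) / (var + eps) ** 0.5 + beta for a in activations]


batch = [2.0, 4.0, 6.0, 8.0]      # hypothetical activations of one feature
normalized = batch_norm(batch)

# The normalized values have (approximately) zero mean and unit variance
print([round(v, 3) for v in normalized])
```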

# GoogLeNet Inspired CNN

## Option 1

In [None]:
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, Dropout, BatchNormalization, concatenate, Flatten, Dense

# Define the input shape
input_shape = (64, 64, 1)

# Input layer
input_layer = Input(shape=input_shape)

# First convolutional block
conv1_1 = Conv2D(64, (1,1), activation='relu', padding='same')(input_layer)
conv1_2 = Conv2D(64, (3,3), activation='relu', padding='same')(conv1_1)
conv1_3 = Conv2D(192, (3,3), activation='relu', padding='same')(conv1_2)
pool1 = MaxPooling2D((3,3), strides=(2,2), padding='same')(conv1_3)

# Second convolutional block
conv2_1 = Conv2D(64, (1,1), activation='relu', padding='same')(pool1)
conv2_2 = Conv2D(128, (3,3), activation='relu', padding='same')(conv2_1)
conv2_3 = Conv2D(256, (3,3), activation='relu', padding='same')(conv2_2)
pool2 = MaxPooling2D((3,3), strides=(2,2), padding='same')(conv2_3)

# Third convolutional block with additional dropout and batch normalization
conv3_1 = Conv2D(128, (1,1), activation='relu', padding='same')(pool2)
conv3_2 = Conv2D(256, (3,3), activation='relu', padding='same')(conv3_1)
conv3_3 = Conv2D(512, (3,3), activation='relu', padding='same')(conv3_2)
pool3 = MaxPooling2D((3,3), strides=(2,2), padding='same')(conv3_3)
dropout1 = Dropout(0.4)(pool3)
batch_norm1 = BatchNormalization()(dropout1)

# Fourth convolutional block with additional dropout and batch normalization
conv4_1 = Conv2D(256, (1,1), activation='relu', padding='same')(batch_norm1)
conv4_2 = Conv2D(512, (3,3), activation='relu', padding='same')(conv4_1)
conv4_3 = Conv2D(1024, (3,3), activation='relu', padding='same')(conv4_2)
dropout2 = Dropout(0.4)(conv4_3)
batch_norm2 = BatchNormalization()(dropout2)

# Fifth convolutional block with additional dropout and batch normalization
conv5_1 = Conv2D(256, (1,1), activation='relu', padding='same')(batch_norm2)
conv5_2 = Conv2D(512, (3,3), activation='relu', padding='same')(conv5_1)
conv5_3 = Conv2D(1024, (3,3), activation='relu', padding='same')(conv5_2)
dropout3 = Dropout(0.4)(conv5_3)
batch_norm3 = BatchNormalization()(dropout3)

# Sixth convolutional block with additional dropout and batch normalization
conv6_1 = Conv2D(512, (1,1), activation='relu', padding='same')(batch_norm3)
conv6_2 = Conv2D(1024, (3,3), activation='relu', padding='same')(conv6_1)
dropout4 = Dropout(0.4)(conv6_2)
batch_norm4 = BatchNormalization()(dropout4)

# Seventh convolutional block with additional dropout and batch normalization
conv7_1 = Conv2D(256, (1,1), activation='relu', padding='same')(batch_norm4)
conv7_2 = Conv2D(512, (3,3), activation='relu', padding='same')(conv7_1)
conv7_3 = Conv2D(1024, (3,3), activation='relu', padding='same')(conv7_2)
dropout5 = Dropout(0.4)(conv7_3)
batch_norm5 = BatchNormalization()(dropout5)

# Eighth convolutional block with additional dropout and batch normalization
conv8_1 = Conv2D(256, (1,1), activation='relu', padding='same')(batch_norm5)
conv8_2 = Conv2D(512, (3,3), strides=(2,2), activation='relu', padding='same')(conv8_1)
conv8_3 = Conv2D(1024, (3,3), activation='relu', padding='same')(conv8_2)
dropout6 = Dropout(0.4)(conv8_3)
batch_norm6 = BatchNormalization()(dropout6)

# Ninth convolutional block with additional dropout and batch normalization
conv9_1 = Conv2D(512, (1,1), activation='relu', padding='same')(batch_norm6)
conv9_2 = Conv2D(1024, (3,3), activation='relu', padding='same')(conv9_1)
dropout7 = Dropout(0.4)(conv9_2)
batch_norm7 = BatchNormalization()(dropout7)

# Flatten the output of the ninth convolutional block
flatten = Flatten()(batch_norm7)

# Add two fully connected (dense) layers with additional dropout and batch normalization
dense1 = Dense(1024, activation='relu')(flatten)
dropout8 = Dropout(0.4)(dense1)
batch_norm8 = BatchNormalization()(dropout8)
dense2 = Dense(10, activation='softmax')(batch_norm8) # output layer with 10 classes

# Define the model with the input layer and the output layer
model = Model(inputs=input_layer, outputs=dense2)

# Print the model summary
model.summary()


This network changes the original GoogLeNet architecture in the following ways:

- Using a grayscale input image (1 channel instead of 3)
- Changing the input image size to 64x64
- Adding dropout and batch normalization layers after each of the third through ninth convolutional blocks
- Setting the number of output classes in the last dense layer to 10, a placeholder value; the self-made CNNs above use 28 output units (one per class), so this should be set to match the dataset
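
As a rough check on the size of this network, the sketch below traces the spatial resolution through the Option 1 layers, assuming the nine blocks are chained one after another. With `padding='same'`, a layer with stride `s` maps a side length `n` to `ceil(n / s)`, so only the three pooling layers and the strided `conv8_2` shrink the feature maps. The helper name `same_out` is illustrative.

```python
import math


def same_out(size, stride=1):
    """Spatial size after a layer with 'same' padding and the given stride."""
    return math.ceil(size / stride)


size = 64
for name, stride in [("pool1", 2), ("pool2", 2), ("pool3", 2), ("conv8_2", 2)]:
    size = same_out(size, stride)
    print(name, size)                 # 32, 16, 8, 4

flat_units = size * size * 1024       # conv9_2 outputs 1024 channels
dense1_params = flat_units * 1024 + 1024  # weights plus biases of Dense(1024)

print(flat_units)      # 16384
print(dense1_params)   # 16778240
```

The first dense layer alone therefore holds about 16.8 million parameters, which is worth keeping in mind when comparing this model with the self-made CNNs.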

## Option 2

In [None]:
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Input, Conv2D, MaxPooling2D, concatenate, AveragePooling2D, Flatten, Dense, Dropout

# Define input shape
input_shape = (64, 64, 1)

# Define input layer
inputs = Input(shape=input_shape)

# Define GoogLeNet-like architecture
# First convolutional layer
conv1_7x7_s2 = Conv2D(64, (7, 7), strides=(2, 2), padding='same', activation='relu')(inputs)

# Max pooling layer
pool1_3x3_s2 = MaxPooling2D(pool_size=(3, 3), strides=(2, 2), padding='same')(conv1_7x7_s2)

# Second convolutional layer
conv2_3x3_reduce = Conv2D(64, (1, 1), padding='same', activation='relu')(pool1_3x3_s2)
conv2_3x3 = Conv2D(192, (3, 3), padding='same', activation='relu')(conv2_3x3_reduce)

# Max pooling layer
pool2_3x3_s2 = MaxPooling2D(pool_size=(3, 3), strides=(2, 2), padding='same')(conv2_3x3)

# Inception module 1
inception_3a_1x1 = Conv2D(64, (1, 1), padding='same', activation='relu')(pool2_3x3_s2)

# These 3 convolutional layers are added in the modified GoogLeNet architecture.
conv3_3x3 = Conv2D(64, (3, 3), padding='same', activation='relu')(pool2_3x3_s2)
conv4_5x5 = Conv2D(32, (5, 5), padding='same', activation='relu')(pool2_3x3_s2)
conv5_1x1 = Conv2D(32, (1, 1), padding='same', activation='relu')(pool2_3x3_s2)

# Concatenate the output of the convolutional layers above
concatenate_1 = concatenate([inception_3a_1x1, conv3_3x3, conv4_5x5, conv5_1x1], axis=3)

# Inception module 2
inception_3b_1x1 = Conv2D(128, (1, 1), padding='same', activation='relu')(concatenate_1)

# These 3 convolutional layers are added in the modified GoogLeNet architecture.
conv6_3x3 = Conv2D(128, (3, 3), padding='same', activation='relu')(concatenate_1)
conv7_5x5 = Conv2D(64, (5, 5), padding='same', activation='relu')(concatenate_1)
conv8_1x1 = Conv2D(64, (1, 1), padding='same', activation='relu')(concatenate_1)

# Concatenate the output of the convolutional layers above
concatenate_2 = concatenate([inception_3b_1x1, conv6_3x3, conv7_5x5, conv8_1x1], axis=3)

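# Flatten the concatenated feature maps so they can feed the dense layers
# (assumed step; the original cell did not connect the dense layers to the graph)
flatten = Flatten()(concatenate_2)

# Add a fully connected layer with 1024 neurons
dense1 = Dense(1024, activation='relu')(flatten)

# Add a dropout layer to prevent overfitting
dropout1 = Dropout(0.4)(dense1)

# Add the final output layer with 10 neurons (one for each class)
outputs = Dense(10, activation='softmax')(dropout1)

# Define the model from the input layer to the output layer
model = Model(inputs=inputs, outputs=outputs)

# Compile the model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])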




The proposed architecture is inspired by GoogLeNet, a well-known deep learning architecture for image classification. For Arabic sign language classification, the model takes 64x64 grayscale images as input and passes them through convolutional and pooling layers followed by two inception-style modules, whose parallel 1x1, 3x3, and 5x5 convolutions extract features at different scales. The added branches are intended to help the model capture more complex patterns in the input images.
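
To illustrate how the parallel branches of the first inception module in Option 2 combine, the sketch below tallies the channels produced by the concatenation and the parameter count of each branch. It assumes the module's input has 192 channels, which is the output of `conv2_3x3`; the helper name `conv_params` and the `branches` dictionary are illustrative.

```python
def conv_params(kernel, in_channels, filters):
    """Kernel weights plus one bias per filter."""
    return (kernel * kernel * in_channels + 1) * filters


in_channels = 192  # channels entering inception module 1

# branch name -> (kernel size, number of filters), as in the Option 2 cell
branches = {
    "inception_3a_1x1": (1, 64),
    "conv3_3x3": (3, 64),
    "conv4_5x5": (5, 32),
    "conv5_1x1": (1, 32),
}

# Concatenating along the channel axis adds up the filters of every branch
concat_channels = sum(filters for _, filters in branches.values())
params = {name: conv_params(k, in_channels, f) for name, (k, f) in branches.items()}

print(concat_channels)  # 192
for name, count in params.items():
    print(name, count)
```

The 5x5 branch is the most expensive even though it has half as many filters as the 3x3 branch, because every branch here reads all 192 input channels directly.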

In particular, the parallel branches of each inception module combine fine local detail with wider spatial context, which is important for recognizing signs in Arabic sign language. The dropout layer randomly drops a fraction of the neurons (40% here) during training, which helps prevent overfitting and can improve the model's ability to generalize.
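
The effect of a dropout layer during training can be sketched in a few lines of plain Python. This is a minimal illustration of inverted dropout, where the kept activations are scaled by `1 / (1 - rate)` so that their expected sum is unchanged, as Keras does in training mode. The function name `dropout`, the seed, and the sample activations are assumptions for illustration.

```python
import random


def dropout(activations, rate, rng):
    """Zero each activation with probability `rate`; rescale the survivors."""
    keep_prob = 1.0 - rate
    return [a / keep_prob if rng.random() >= rate else 0.0 for a in activations]


rng = random.Random(0)            # fixed seed so the run is repeatable
activations = [1.0] * 10          # hypothetical layer outputs
dropped = dropout(activations, rate=0.4, rng=rng)

# Each entry is either 0.0 (dropped) or 1 / 0.6 (kept and rescaled)
print(dropped)
```

At inference time the layer is a no-op: all activations pass through unchanged.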

Overall, the proposed architecture is well suited to Arabic sign language classification because it can capture the intricate hand shapes and poses that distinguish the signs in static images.
