<a href="https://colab.research.google.com/github/ElyorS/AI-application-system/blob/main/12204556_week6_hw.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

***TASK 2.1***
===========

In [1]:
import tensorflow as tf

# Load the MNIST dataset
mnist = tf.keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Normalize the images
train_images = train_images / 255.0
test_images = test_images / 255.0

# Build the CNN model with Dropout and Batch Normalization
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.BatchNormalization(),  # Add Batch Normalization
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Dropout(0.25),  # Add Dropout
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.BatchNormalization(),  # Add Batch Normalization
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Dropout(0.25),  # Add Dropout
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.BatchNormalization(),  # Add Batch Normalization
    tf.keras.layers.Dropout(0.5),  # Add Dropout
    tf.keras.layers.Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
model.fit(train_images, train_labels, epochs=5)

# Evaluate the model
test_loss, test_acc = model.evaluate(test_images, test_labels)

# Save the model
model.save('mnist_cnn_model_with_dropout_bn.h5')

# Load the saved model
loaded_model = tf.keras.models.load_model('mnist_cnn_model_with_dropout_bn.h5')

# Evaluate the loaded model
test_loss, test_acc = loaded_model.evaluate(test_images, test_labels)

print("Test accuracy:", test_acc)


Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


Test accuracy: 0.9861999750137329


In this code, I implemented a Convolutional Neural Network (CNN) that classifies handwritten digits from the MNIST dataset. It adds Batch Normalization after each convolutional layer and after the dense layer, and Dropout after each pooling layer (rate 0.25) and before the output layer (rate 0.5). Dropout helps prevent overfitting by randomly deactivating neurons during training, while Batch Normalization normalizes the inputs of each layer over the current mini-batch, which tends to make training faster and more stable. The code trains the model for 5 epochs, evaluates its accuracy on the test dataset, saves the trained model for future use, and then reloads it and evaluates it again. These additions can lead to better generalization and faster training for the CNN model.

Test accuracy: 0.9861999750137329
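
To make the two techniques described above more concrete, here is a minimal pure-Python sketch of what Dropout and Batch Normalization compute on a small list of activations during training. It is an illustration under simplifying assumptions, not the Keras implementation: the helper names `dropout` and `batch_norm` are hypothetical, the input is a flat list instead of a 4-D tensor, and the learned scale and shift (`gamma`, `beta`) are fixed at 1 and 0.

```python
import math
import random


def dropout(values, rate, rng):
    """Inverted dropout: zero each value with probability `rate`,
    and scale the survivors by 1 / (1 - rate)."""
    keep = 1.0 - rate
    return [v / keep if rng.random() >= rate else 0.0 for v in values]


def batch_norm(values, gamma=1.0, beta=0.0, eps=1e-3):
    """Normalize a batch to roughly zero mean and unit variance,
    then apply the scale `gamma` and shift `beta`."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return [gamma * (v - mean) / math.sqrt(var + eps) + beta for v in values]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
activations = [0.5, 1.5, 2.0, 4.0]  # made-up values for illustration

dropped = dropout(activations, rate=0.5, rng=rng)
normalized = batch_norm(activations)

print("after dropout:   ", dropped)
print("after batch norm:", normalized)
```

Both layers behave differently at inference time: Dropout is switched off, and Batch Normalization uses moving averages of the mean and variance collected during training instead of the statistics of the current batch.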

***TASK 2.2***
=========

In [5]:
import tensorflow as tf

# Load the MNIST dataset
mnist = tf.keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Normalize the images
train_images = train_images / 255.0
test_images = test_images / 255.0

# Define a list of activation functions to experiment with
activation_functions = ['sigmoid', 'tanh', 'relu']

for activation_function in activation_functions:
    # Build the CNN model with the specified activation function
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, (3, 3), activation=activation_function, input_shape=(28, 28, 1)),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Conv2D(64, (3, 3), activation=activation_function),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation=activation_function),
        tf.keras.layers.Dense(10, activation='softmax')
    ])

    # Compile the model
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])

    # Train the model
    model.fit(train_images, train_labels, epochs=5)

    # Evaluate the model
    test_loss, test_acc = model.evaluate(test_images, test_labels)

    print(f"Activation Function: {activation_function}")
    print("Test accuracy:", test_acc)
    print("------------------")


Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Activation Function: sigmoid
Test accuracy: 0.9861000180244446
------------------
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Activation Function: tanh
Test accuracy: 0.9869999885559082
------------------
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Activation Function: relu
Test accuracy: 0.9908000230789185
------------------


Test accuracy with sigmoid: 0.9861000180244446
_____________________________________
Test accuracy with tanh: 0.9869999885559082
_____________________________________
Test accuracy with relu: 0.9908000230789185

The choice of activation function affects CNN performance. Sigmoid and tanh have been used historically, but both saturate for large inputs and can suffer from the vanishing gradient problem. ReLU (Rectified Linear Unit) is a popular choice because it tends to converge faster and often achieves better results. However, it's important to experiment with different activation functions to see which one works best for a particular problem. In this case, relu had the best accuracy (0.9908), tanh had the second best (0.9870), and sigmoid had the lowest (0.9861).
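
The vanishing gradient problem mentioned above can be illustrated by evaluating the derivative of each activation at a few points. The following is a small sketch, not part of the original experiment; the function names are hypothetical helpers and the sample inputs are arbitrary.

```python
import math


def sigmoid_grad(x):
    """Derivative of the sigmoid; its maximum is 0.25 at x = 0."""
    s = 1.0 / (1.0 + math.exp(-x))
    return s * (1.0 - s)


def tanh_grad(x):
    """Derivative of tanh; its maximum is 1 at x = 0."""
    return 1.0 - math.tanh(x) ** 2


def relu_grad(x):
    """Derivative of ReLU; 1 for positive inputs, 0 otherwise."""
    return 1.0 if x > 0 else 0.0


for x in (0.5, 2.0, 5.0):
    print(f"x={x}: sigmoid'={sigmoid_grad(x):.4f}, "
          f"tanh'={tanh_grad(x):.4f}, relu'={relu_grad(x):.1f}")

# Best-case gradient scaling through 10 stacked sigmoid layers,
# ignoring the weights: 0.25 ** 10.
sigmoid_chain = sigmoid_grad(0.0) ** 10
print("10 sigmoid layers, best case:", sigmoid_chain)
```

The gradients of sigmoid and tanh shrink toward zero as the input grows, while ReLU keeps a gradient of 1 for any positive input. The network in this task has only three activated layers, which may be one reason the accuracy gap between the three functions is small here.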

***TASK 2.3***
========


In [3]:
import tensorflow as tf

# Load the MNIST dataset
mnist = tf.keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Normalize the images
train_images = train_images / 255.0
test_images = test_images / 255.0

# Build the CNN model with more convolutional and pooling layers
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Conv2D(128, (3, 3), activation='relu'),  # Add another convolutional layer
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

# Compile the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])

# Train the model
model.fit(train_images, train_labels, epochs=5)

# Evaluate the model
test_loss, test_acc = model.evaluate(test_images, test_labels)

print("Test accuracy:", test_acc)


Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Test accuracy: 0.9839000105857849


I added a third convolutional layer with 128 filters, followed by another max-pooling layer. This makes the model deeper, allowing it to learn more complex features. The impact of adding more layers depends on the specific dataset and problem. Deeper networks can potentially recognize more intricate patterns but may also require more data and careful regularization to prevent overfitting.

Test accuracy: 0.9839000105857849
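
One way to see what the extra block does to a 28×28 input is to trace the spatial size through the layers. The sketch below assumes the layer defaults used in the code above (3×3 convolutions with stride 1 and no padding, 2×2 pooling with stride 2); the helper names `conv_out`, `pool_out`, and `trace` are hypothetical.

```python
def conv_out(size, kernel=3):
    """Output size of an unpadded ('valid') convolution with stride 1."""
    return size - kernel + 1


def pool_out(size, pool=2):
    """Output size of unpadded max pooling with stride equal to the pool size."""
    return (size - pool) // pool + 1


def trace(input_size, num_blocks):
    """Return the spatial size after every conv and pooling layer."""
    sizes = []
    size = input_size
    for _ in range(num_blocks):
        size = conv_out(size)
        sizes.append(size)
        size = pool_out(size)
        sizes.append(size)
    return sizes


two_blocks = trace(28, 2)    # model with two conv/pool blocks
three_blocks = trace(28, 3)  # model with three conv/pool blocks

# Number of values passed to the dense layer after Flatten
flat_two = two_blocks[-1] ** 2 * 64
flat_three = three_blocks[-1] ** 2 * 128

print(two_blocks, flat_two)
print(three_blocks, flat_three)
```

Under these assumptions the third block shrinks each feature map to 1×1, so the dense layer receives 128 values instead of the 1600 produced by the two-block model. This may limit how much the extra depth can help on images as small as MNIST digits.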

***TASK 2.4***
======

In [4]:
import tensorflow as tf

# Load the MNIST dataset
mnist = tf.keras.datasets.mnist
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

# Normalize the images
train_images = train_images / 255.0
test_images = test_images / 255.0

# Define a list of optimizers to experiment with
optimizers = ['sgd', 'rmsprop', 'adam']

for optimizer in optimizers:
    # Build the CNN model
    model = tf.keras.Sequential([
        tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(10, activation='softmax')
    ])

    # Compile the model with the specified optimizer
    if optimizer == 'sgd':
        optimizer_obj = tf.keras.optimizers.SGD(learning_rate=0.01)
    elif optimizer == 'rmsprop':
        optimizer_obj = tf.keras.optimizers.RMSprop(learning_rate=0.001)
    else:  # Default to Adam
        optimizer_obj = 'adam'

    model.compile(optimizer=optimizer_obj,
                  loss='sparse_categorical_crossentropy',
                  metrics=['accuracy'])

    # Train the model
    model.fit(train_images, train_labels, epochs=5)

    # Evaluate the model
    test_loss, test_acc = model.evaluate(test_images, test_labels)

    print(f"Optimizer: {optimizer}")
    print("Test accuracy:", test_acc)
    print("------------------")


Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Optimizer: sgd
Test accuracy: 0.980400025844574
------------------
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Optimizer: rmsprop
Test accuracy: 0.9915000200271606
------------------
Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Optimizer: adam
Test accuracy: 0.9915000200271606
------------------


Test accuracy with sgd: 0.980400025844574
_________________
Test accuracy with rmsprop: 0.9915000200271606
_________________
Test accuracy with adam: 0.9915000200271606

In this code, I iterated through three different optimizers (SGD, RMSprop, and Adam), trained a separate CNN model for each optimizer, and evaluated and printed the test accuracy. The choice of optimizer can significantly affect training speed and performance. SGD is a basic optimizer that can work well with proper tuning but may converge slowly. RMSprop adapts the learning rate per parameter and can be faster in some cases. Adam is a popular choice because it combines adaptive learning rates with momentum and often provides faster convergence and better performance. However, the best optimizer can vary depending on the specific dataset and problem, so experimentation is key. In this scenario, rmsprop and adam reached the same test accuracy (0.9915), and both performed better than sgd (0.9804).
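
The difference between the three optimizers comes down to how each one turns a gradient into a weight update. Below is a minimal sketch of the textbook update rules applied to a toy one-parameter loss, f(w) = w². It is not the Keras implementation; the function names are hypothetical, and the hyperparameter values other than the learning rates (0.01 for SGD and 0.001 for RMSprop, as in the code above) are commonly used defaults that I am assuming here.

```python
import math


def grad(w):
    """Gradient of the toy loss f(w) = w ** 2."""
    return 2.0 * w


def sgd_step(w, lr=0.01):
    """Plain SGD: step against the gradient with a fixed learning rate."""
    return w - lr * grad(w)


def rmsprop_step(w, v, lr=0.001, rho=0.9, eps=1e-7):
    """RMSprop: divide the step by a running RMS of recent gradients."""
    g = grad(w)
    v = rho * v + (1.0 - rho) * g * g
    return w - lr * g / (math.sqrt(v) + eps), v


def adam_step(w, m, v, t, lr=0.001, beta1=0.9, beta2=0.999, eps=1e-7):
    """Adam: momentum (m) plus RMS scaling (v), both bias-corrected."""
    g = grad(w)
    m = beta1 * m + (1.0 - beta1) * g
    v = beta2 * v + (1.0 - beta2) * g * g
    m_hat = m / (1.0 - beta1 ** t)
    v_hat = v / (1.0 - beta2 ** t)
    return w - lr * m_hat / (math.sqrt(v_hat) + eps), m, v


# One update from the same starting point w = 1.0
w_sgd = sgd_step(1.0)
w_rms, _ = rmsprop_step(1.0, v=0.0)
w_adam, _, _ = adam_step(1.0, m=0.0, v=0.0, t=1)

print("sgd:", w_sgd, "rmsprop:", w_rms, "adam:", w_adam)
```

SGD's step is proportional to the raw gradient, while RMSprop and Adam rescale the step by the recent gradient magnitude, so their step size depends much less on how large the gradient happens to be.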

***TASK 2.5***
=========


In [6]:
import tensorflow as tf

# Load Data
mnist = tf.keras.datasets.mnist

# Split data into train and test
(train_images, train_labels), (test_images, test_labels) = mnist.load_data()

train_images = train_images / 255.0
test_images = test_images / 255.0

# Build the CNN model
model = tf.keras.Sequential([
    tf.keras.layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Conv2D(64, (3, 3), activation='relu'),
    tf.keras.layers.MaxPooling2D((2, 2)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation='relu'),
    tf.keras.layers.Dense(10, activation='softmax')
])

# Compile and train the model
model.compile(optimizer='adam',
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])
model.fit(train_images, train_labels, epochs=5)

# Evaluate the model, then save it
test_loss, test_acc = model.evaluate(test_images, test_labels)
model.save('mnist_cnn_model.h5')

# Load the saved model
loaded_model = tf.keras.models.load_model('mnist_cnn_model.h5')

# Evaluate the loaded model
test_loss, test_acc = loaded_model.evaluate(test_images, test_labels)

print("Test accuracy:", test_acc)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
Test accuracy: 0.9894000291824341


In this task, I evaluated my trained CNN model on the test dataset. This evaluation shows how well the model generalizes to unseen data. By saving the trained model, loading it back, and calling the evaluate method on the loaded model, I obtained the test loss and test accuracy.

Test accuracy: 0.9894000291824341
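
To show what the two numbers returned by the evaluate method mean, here is a small pure-Python sketch that computes the sparse categorical cross-entropy loss and the accuracy from softmax outputs. The function names are hypothetical and the three predictions are made-up values for illustration, not output from the model above.

```python
import math


def sparse_categorical_crossentropy(probs, labels):
    """Mean of -log(probability assigned to the true class)."""
    return sum(-math.log(p[y]) for p, y in zip(probs, labels)) / len(labels)


def accuracy(probs, labels):
    """Fraction of samples whose most probable class equals the label."""
    correct = sum(1 for p, y in zip(probs, labels) if p.index(max(p)) == y)
    return correct / len(labels)


# Three made-up softmax outputs over 3 classes, with integer labels
probs = [
    [0.7, 0.2, 0.1],  # predicts class 0, label 0: correct
    [0.1, 0.8, 0.1],  # predicts class 1, label 1: correct
    [0.3, 0.4, 0.3],  # predicts class 1, label 2: wrong
]
labels = [0, 1, 2]

loss = sparse_categorical_crossentropy(probs, labels)
acc = accuracy(probs, labels)

print("loss:", loss)
print("accuracy:", acc)
```

The loss also reflects how confident each prediction is, while the accuracy only counts whether the top class is right, so two models with the same accuracy can still have different test losses.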