# Assignment 2 - Question 4: Convolutional Neural Network
**Course Name:** Machine Learning (DDA3020)

<font color=Red>*Please enter your personal information (Double-click this block first)*</font>

**Name:** 文杰

**Student ID:** 123090612

**It's highly recommended to finish Question 3 first.**

### Overview

In this question, you will implement two CNN models and train them on the same dataset as Question 3 (Fashion-MNIST). We will see how well suited CNNs are to data-intensive tasks such as image processing, compared with traditional machine learning algorithms (like the tree-based models in Question 3). As before, your task is to **run all the code in this script and complete the parts marked with** <font color=Red>\[TASK\]</font>.

### Introduction of TensorFlow

TensorFlow is a powerful open-source package for machine learning and deep learning that makes it easy to implement complex models such as neural networks efficiently. First, you need to install TensorFlow version 2.9

```bash
pip install numpy==1.26 tensorflow==2.9 -i https://mirrors.aliyun.com/pypi/simple/ 
```

by running this command in your command line window. To check that the package was installed successfully, try running the following import block.

In [1]:
import gzip
import numpy as np
import random
import os
import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.utils import to_categorical

You don't need to read this block carefully, since it only loads the dataset. Just run it.

In [2]:
def load_mnist(path, kind, subset=None):
    labels_path = os.path.join(path, '%s-labels-idx1-ubyte.gz'%kind)
    images_path = os.path.join(path, '%s-images-idx3-ubyte.gz'%kind)

    with gzip.open(labels_path, 'rb') as lbpath:
        labels = np.frombuffer(lbpath.read(), dtype=np.uint8, offset=8)
    with gzip.open(images_path, 'rb') as imgpath:
        images = np.frombuffer(imgpath.read(), dtype=np.uint8, offset=16).reshape(len(labels), 784)
    
    if subset is not None:
        selected_images, selected_labels = [], []
        for label in range(10):
            indices = np.where(labels == label)[0]
            selected_indices = np.random.choice(indices, subset, replace=False)
            selected_images.append(images[selected_indices])
            selected_labels.append(labels[selected_indices])
        images = np.concatenate(selected_images, axis=0)
        labels = np.concatenate(selected_labels, axis=0)

        paired = list(zip(images, labels))
        random.shuffle(paired)
        images, labels = zip(*paired)
    
    return np.array(images), np.array(labels)
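
For readers who do want to know what `offset=8` and `offset=16` skip: an IDX label file starts with a 4-byte magic number and a 4-byte item count, and an IDX image file adds 4-byte row and column counts after those. The sketch below illustrates this on a synthetic in-memory buffer rather than the real dataset files; the helper name `parse_idx_images` is hypothetical and not part of the assignment.

```python
import struct
import numpy as np

def parse_idx_images(raw: bytes):
    """Parse an uncompressed IDX3 image buffer (hypothetical helper)."""
    # Header: four big-endian 32-bit integers = magic, count, rows, cols (16 bytes).
    magic, count, rows, cols = struct.unpack('>IIII', raw[:16])
    # Everything after the header is one unsigned byte per pixel.
    pixels = np.frombuffer(raw, dtype=np.uint8, offset=16)
    return magic, pixels.reshape(count, rows * cols)

# Synthetic buffer: 2 all-zero 28x28 images; 2051 is the magic number of IDX3 image files.
fake = struct.pack('>IIII', 2051, 2, 28, 28) + bytes(2 * 28 * 28)
magic, images = parse_idx_images(fake)
print(magic, images.shape)
```

`load_mnist` skips the header without reading it and takes the image count from the label array instead, which works because both files describe the same samples.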

In this question, we will use the full Fashion-MNIST dataset and apply some light preprocessing.

In [3]:
X_train, y_train = load_mnist('./data/', kind='train')
X_test, y_test = load_mnist('./data/', kind='t10k')

X_train = X_train.reshape((X_train.shape[0], 28, 28, 1)).astype('float32') / 255
X_test = X_test.reshape((X_test.shape[0], 28, 28, 1)).astype('float32') / 255
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)
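
The preprocessing above does three things: it reshapes each flat 784-pixel row into a 28x28x1 image, scales pixel values to the range [0, 1], and one-hot encodes the labels. A minimal sketch on synthetic data (random pixels and made-up labels, not the real dataset) shows the effect. It uses NumPy indexing into an identity matrix as a stand-in for `to_categorical`, which is an assumption made to keep the example free of TensorFlow.

```python
import numpy as np

# Synthetic stand-in for the loaded data: 4 flattened 28x28 images and 4 integer labels.
rng = np.random.default_rng(0)
X = rng.integers(0, 256, size=(4, 784), dtype=np.uint8)
y = np.array([0, 3, 9, 3], dtype=np.uint8)

# Reshape to (samples, height, width, channels) and scale pixels to [0, 1].
X_img = X.reshape((X.shape[0], 28, 28, 1)).astype('float32') / 255

# One-hot encode: row i of the identity matrix is the encoding of class i.
y_onehot = np.eye(10, dtype='float32')[y]

print(X_img.shape, y_onehot.shape)
print(y_onehot[1])  # a single 1.0 at index 3, zeros elsewhere
```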

### Task 1

To begin, we build a very simple CNN model with the following structure:
1. A 2D convolutional layer with 16 filters, each of size 3*3 (ReLU activation function)
2. A 2D max-pooling layer with a 2*2 pooling window
3. A flatten layer to convert the 2D feature maps into a 1D vector
4. A fully connected layer using Softmax activation
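
To see how the tensor shape changes through these four layers, the arithmetic can be sketched in plain Python. The sketch assumes no padding and stride 1 for the convolution, a pooling stride equal to the window size, and 10 output units (one per class); the helper names `conv_out` and `pool_out` are hypothetical.

```python
def conv_out(size, kernel):
    # Output side length of a convolution with no padding and stride 1 (assumed).
    return size - kernel + 1

def pool_out(size, window):
    # Output side length of non-overlapping pooling (stride equals window, assumed).
    return size // window

side = conv_out(28, 3)             # 28x28 input -> 26x26 feature maps
side = pool_out(side, 2)           # 26x26 -> 13x13
flat = side * side * 16            # 16 feature maps flattened into one vector

conv_params = 3 * 3 * 1 * 16 + 16  # kernel weights plus one bias per filter
dense_params = flat * 10 + 10      # weights plus one bias per output unit

print(side, flat, conv_params, dense_params, conv_params + dense_params)
# prints: 13 2704 160 27050 27210
```

Under these assumptions almost all trainable parameters sit in the final fully connected layer, not in the convolution.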

Remember that this is an image classification task, so we use the categorical cross-entropy function as the loss function. <font color=Red>\[TASK\]</font> (10 points)
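
As a rough illustration of what this loss measures, here is a NumPy sketch of categorical cross-entropy for one-hot labels, averaged over the batch. The function name and the toy probabilities are made up for the example, and the clipping constant `eps` is an assumption added to avoid `log(0)`.

```python
import numpy as np

def categorical_cross_entropy(y_true, y_pred, eps=1e-7):
    # Clip predictions so the logarithm stays finite (eps is an assumed constant).
    y_pred = np.clip(y_pred, eps, 1.0 - eps)
    # Per sample: minus the log-probability assigned to the true class, then average.
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

# Two samples, three classes (toy numbers, not model output).
y_true = np.array([[0, 0, 1], [1, 0, 0]], dtype='float32')
confident = np.array([[0.05, 0.05, 0.90], [0.80, 0.10, 0.10]])
uncertain = np.array([[0.30, 0.30, 0.40], [0.40, 0.30, 0.30]])

loss_confident = categorical_cross_entropy(y_true, confident)
loss_uncertain = categorical_cross_entropy(y_true, uncertain)
print(loss_confident, loss_uncertain)
```

Both predictions pick the correct class, but the confident one receives the lower loss, which is why the loss keeps decreasing even when accuracy stops changing.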

In [4]:
# Define the simple CNN model
model = models.Sequential()

# Add a convolutional layer with 16 filters, 3x3 kernel, and ReLU activation
# Extracts basic features from the 28x28 grayscale images
model.add(layers.Conv2D(16, (3, 3), activation='relu', input_shape=(28, 28, 1)))

# Add a max-pooling layer with 2x2 window to reduce spatial dimensions
model.add(layers.MaxPooling2D((2, 2)))

# Flatten the 2D feature maps into a 1D vector for the dense layer
model.add(layers.Flatten())

# Add a fully connected layer with 10 units and Softmax activation for classification
model.add(layers.Dense(10, activation='softmax'))

# Compile the model with Adam optimizer, categorical cross-entropy loss, and accuracy metric
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])

# Train the model for 2 epochs with a batch size of 32 and 10% validation split
model.fit(X_train, y_train, epochs=2, batch_size=32, validation_split=0.1)

# Evaluate the model on the test set
test_loss, test_acc = model.evaluate(X_test, y_test)
print(f"Simple CNN Test Accuracy: {test_acc}")

Epoch 1/2
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.7741 - loss: 0.6558 - val_accuracy: 0.8640 - val_loss: 0.3809
Epoch 2/2
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 3s 2ms/step - accuracy: 0.8713 - loss: 0.3694 - val_accuracy: 0.8768 - val_loss: 0.3434
313/313 ━━━━━━━━━━━━━━━━━━━━ 0s 968us/step - accuracy: 0.8775 - loss: 0.3562
Simple CNN Test Accuracy: 0.8738999962806702
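
The step counts `1688` and `313` in the log follow from the batch size and the validation split. The sketch below reproduces them, assuming the standard Fashion-MNIST sizes of 60,000 training and 10,000 test images (the notebook does not state these) and a batch size of 32 for evaluation.

```python
import math

n_train, n_test = 60000, 10000          # assumed standard Fashion-MNIST sizes
batch_size, validation_split = 32, 0.1

n_val = int(n_train * validation_split)  # samples held out for validation
n_fit = n_train - n_val                  # samples actually used for weight updates

train_steps = math.ceil(n_fit / batch_size)
test_steps = math.ceil(n_test / batch_size)
print(n_fit, train_steps, test_steps)
# prints: 54000 1688 313
```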


### Task 2

Next, we take on the challenge of implementing a more complex CNN model that achieves better classification performance. A reference structure is given below, but you may also design your own CNN model. The only requirement is that it outperforms the simple CNN from Task 1 (higher accuracy on the test set).

The reference structure is divided into three parts:
1. Primary Feature Extraction Part
    1. A 2D convolutional layer with 32 filters, each of size 3*3 (ReLU activation function)
    2. A normalization layer
    3. A 2D max-pooling layer with a 2*2 pooling window
    4. A dropout layer that randomly drops 25% of units to prevent overfitting
2. Advanced Feature Extraction Part

    This part is mostly the same as the previous one. The only difference is that the convolutional layer uses more filters (for example, 64) to capture higher-level features.
3. Classification Part
    1. A flatten layer to convert the 2D feature maps into a 1D vector
    2. A fully connected layer with 512 units using ReLU to summarize the high-dimensional features
    3. Another fully connected layer with Softmax activation for classification
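
The normalization and dropout layers are the two new ingredients here. The NumPy sketch below shows roughly what each does during training. It assumes the normalization layer is batch normalization and leaves out its learnable scale and shift; the function names, the input tensor, and the `eps` value are illustrative assumptions, not part of the assignment.

```python
import numpy as np

def batch_norm_train(x, eps=1e-3):
    # Normalize every channel (last axis) to zero mean and unit variance,
    # using statistics taken over the batch and spatial axes.
    mean = x.mean(axis=(0, 1, 2), keepdims=True)
    var = x.var(axis=(0, 1, 2), keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def dropout_train(x, rate, rng):
    # Inverted dropout: zero a random fraction `rate` of the units and rescale
    # the survivors by 1 / (1 - rate) so the expected activation is unchanged.
    keep = rng.random(x.shape) >= rate
    return x * keep / (1.0 - rate)

rng = np.random.default_rng(0)
# Fake activations shaped (batch, height, width, channels).
x = rng.normal(loc=5.0, scale=3.0, size=(8, 13, 13, 32))

normed = batch_norm_train(x)
dropped = dropout_train(normed, rate=0.25, rng=rng)
zero_fraction = float((dropped == 0).mean())

print(f"mean={normed.mean():.3f} std={normed.std():.3f}")
print(f"fraction of zeroed units={zero_fraction:.2f}")
```

At inference time dropout is switched off and batch normalization uses running averages instead of batch statistics, which this sketch does not model.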

Remember that this is an image classification task, so we use the categorical cross-entropy function as the loss function. <font color=Red>\[TASK\]</font> (10 points)

In [5]:
# Define the complex CNN model
model_complex = models.Sequential()

# Primary Feature Extraction Part
# Convolutional layer with 32 filters, 3x3 kernel, and ReLU activation
model_complex.add(layers.Conv2D(32, (3, 3), activation='relu', input_shape=(28, 28, 1)))

# Batch normalization to stabilize and accelerate training
model_complex.add(layers.BatchNormalization())

# Max-pooling layer with 2x2 window
model_complex.add(layers.MaxPooling2D((2, 2)))

# Dropout layer to prevent overfitting by randomly dropping 25% of units
model_complex.add(layers.Dropout(0.25))

# Advanced Feature Extraction Part
# Convolutional layer with 64 filters to capture higher-level features
model_complex.add(layers.Conv2D(64, (3, 3), activation='relu'))

# Another batch normalization layer
model_complex.add(layers.BatchNormalization())

# Another max-pooling layer
model_complex.add(layers.MaxPooling2D((2, 2)))

# Another dropout layer
model_complex.add(layers.Dropout(0.25))

# Classification Part
# Flatten the feature maps into a 1D vector
model_complex.add(layers.Flatten())

# Fully connected layer with 512 units and ReLU activation to learn complex patterns
model_complex.add(layers.Dense(512, activation='relu'))

# Output layer with 10 units and Softmax activation for classification
model_complex.add(layers.Dense(10, activation='softmax'))

# Compile the model with Adam optimizer, categorical cross-entropy loss, and accuracy metric
model_complex.compile(optimizer='adam',
                      loss='categorical_crossentropy',
                      metrics=['accuracy'])

# Train the model for 2 epochs with a batch size of 32 and 10% validation split
model_complex.fit(X_train, y_train, epochs=2, batch_size=32, validation_split=0.1)

# Evaluate the model on the test set
test_loss_complex, test_acc_complex = model_complex.evaluate(X_test, y_test)
print(f"Complex CNN Test Accuracy: {test_acc_complex}")

Epoch 1/2
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 17s 9ms/step - accuracy: 0.7858 - loss: 0.6424 - val_accuracy: 0.8753 - val_loss: 0.3465
Epoch 2/2
1688/1688 ━━━━━━━━━━━━━━━━━━━━ 15s 9ms/step - accuracy: 0.8826 - loss: 0.3158 - val_accuracy: 0.8978 - val_loss: 0.2818
313/313 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.8959 - loss: 0.2892
Complex CNN Test Accuracy: 0.8949999809265137
