<a href="https://colab.research.google.com/github/Polaris0116/awesome_lists/blob/main/001_A3.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

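## Notes:
## 1. Delete unnecessary comments and Markdown cells (press D twice to delete a cell)
## 2. Rename the variables and update the dataset path
## 3. Adjust the model hyperparameters (filters_layer, kernel_size, strides, pool_size, activation, etc.)
## 4. Adjust the training settings (VALIDATION_SPLIT, NUM_TRAILS, EPOCH, etc.)

## The two files required for this assignment have been merged into this single file; the model summary and accuracy are shown at the end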

## 1. Data Preprocessing

In [None]:
# Basic Libraries
import numpy as np

# Neural Networks
from tensorflow import keras
import tensorflow.keras.layers as layers

# Data Augmentation
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Early Stopping
from tensorflow.keras.callbacks import EarlyStopping

# Import Hyper-parameters tuning utility
import optuna

# Save the best model
from os import path

In [None]:
# Hyper-parameters for the training process (not the model params)

BATCH_SIZE = 256
EPOCH = 20
VALIDATION_SPLIT = 0.2 # Fraction of training data for validation (not passed to fit() in this notebook)
NUM_CLASSES = 10 # Number of target classes
NUM_TRAILS = 1 # Number of Optuna trials
TIME_OUT = 2000  # Timeout in seconds (optional)
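
The arithmetic behind these settings can be sketched in plain Python. This is a minimal illustration, assuming the 60,000 MNIST training images and assuming that Keras holds out the last `validation_split` fraction of the samples when that argument is passed to `fit`. The helpers `split_sizes` and `steps_per_epoch` are hypothetical names, not Keras functions.

```python
import math

BATCH_SIZE = 256
VALIDATION_SPLIT = 0.2
NUM_TRAIN_SAMPLES = 60000  # size of the MNIST training set


def split_sizes(num_samples, validation_split):
    """Return (train, validation) counts when the tail fraction is held out."""
    split_at = int(math.floor(num_samples * (1.0 - validation_split)))
    return split_at, num_samples - split_at


def steps_per_epoch(num_samples, batch_size):
    """Number of batches per epoch; the final batch may be smaller."""
    return math.ceil(num_samples / batch_size)


train_size, val_size = split_sizes(NUM_TRAIN_SAMPLES, VALIDATION_SPLIT)
print(train_size, val_size)                            # 48000 12000
print(steps_per_epoch(NUM_TRAIN_SAMPLES, BATCH_SIZE))  # 235 (no validation split)
print(steps_per_epoch(train_size, BATCH_SIZE))         # 188 (with the split)
```

The `fit` calls in this notebook do not pass `validation_split`, so each epoch runs over all 60,000 samples.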

In [None]:
# Load the MNIST training and testing datasets
(x_train, y_train), (x_test, y_test) = keras.datasets.mnist.load_data("/Users/steven/mnist.npz")

# Check Shape
print(x_train.shape)
print(y_train.shape)
print(x_test.shape)
print(y_test.shape)
print("")

# Check dimensions of the image
# Image dimensions: 28*28 pixels
img_rows, img_cols = x_train.shape[1:]
print("The image dimension is as follows:")
print(img_rows, img_cols)

(60000, 28, 28)
(60000,)
(10000, 28, 28)
(10000,)

The image dimension is as follows:
28 28


In [None]:
# Data conversion: reshape to add a channel axis; normalized data lets the model train more effectively
if keras.backend.image_data_format() == 'channels_first':
    # x_train.shape = (60000, 1, 28, 28)
    x_train = x_train.reshape(x_train.shape[0], 1, img_rows, img_cols)
    x_test = x_test.reshape(x_test.shape[0], 1, img_rows, img_cols)
    input_shape = (1, img_rows, img_cols)
else:
    # x_train.shape = (60000, 28, 28, 1)
    x_train = x_train.reshape(x_train.shape[0], img_rows, img_cols, 1)
    x_test = x_test.reshape(x_test.shape[0], img_rows, img_cols, 1)
    input_shape = (img_rows, img_cols, 1)

# Convert pixel values to [0, 1]
x_train = x_train.astype('float32') / 255
x_test = x_test.astype('float32') / 255

# Print X_train Shape
print(x_train.shape)
# Print Input Shape
print(input_shape)

# Convert y to one-hot
y_train = keras.utils.to_categorical(y_train, NUM_CLASSES)
y_test = keras.utils.to_categorical(y_test, NUM_CLASSES)

(60000, 28, 28, 1)
(28, 28, 1)
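
The two conversions in this cell (pixel scaling and one-hot encoding) can be illustrated without Keras. This is a minimal pure-Python sketch; `scale_pixel` and `one_hot` are hypothetical helpers that mimic what dividing by 255 and `keras.utils.to_categorical` do for a single value.

```python
NUM_CLASSES = 10


def scale_pixel(value):
    """Map an 8-bit pixel value (0..255) into the range [0, 1]."""
    return value / 255


def one_hot(label, num_classes=NUM_CLASSES):
    """Return a list with 1.0 at the label index and 0.0 elsewhere."""
    vector = [0.0] * num_classes
    vector[label] = 1.0
    return vector


print(scale_pixel(0), scale_pixel(255))  # 0.0 1.0
print(one_hot(3))
```

After this step each label becomes a length-10 vector, which is the target format that `categorical_crossentropy` expects.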


## 2. Model Params Tuning

In [None]:
# Define an objective function to be maximized
def objective(trial):
    # Construct a sequential model
    model = keras.models.Sequential()
    # Suggest values of the model hyperparameters

    filters_layer1 = trial.suggest_categorical("filters_layer1", [96, 128, 512])
    filters_layer2 = trial.suggest_categorical("filters_layer2", [32, 64, 96])

    kernel_size = trial.suggest_categorical("kernel_size", [3, 4])
    strides = trial.suggest_categorical("strides", [1, 2, 3])
    activation = trial.suggest_categorical("activation", ["relu", "tanh"])
    pool_size = trial.suggest_categorical("pool_size", [1, 2, 3])

    # Add 1st Conv2D Layer
    model.add(layers.Conv2D(filters=filters_layer1,
                            kernel_size=kernel_size,
                            strides=strides,
                            activation=activation,
                            padding="same",
                            input_shape=input_shape))
    # Add Pooling Layer
    model.add(layers.MaxPool2D(pool_size=pool_size))
    # Add 2nd Conv2D Layer
    model.add(layers.Conv2D(filters=filters_layer2,
                            kernel_size=kernel_size,
                            strides=strides,
                            activation=activation,
                            padding="same"))
    # Add Pooling Layer
    model.add(layers.MaxPool2D(pool_size=pool_size))
    # Add Dropout Layer
    model.add(layers.Dropout(rate=trial.suggest_categorical("rate", [0.1, 0.15, 0.2])))
    # Add Flatten Layer
    model.add(layers.Flatten())
    # Add 1st Fully-connected Layer
    model.add(layers.Dense(units=trial.suggest_categorical("units", [16, 32, 128]), activation="relu"))
    # Add 2nd Fully-connected Layer
    model.add(layers.Dense(NUM_CLASSES, activation="softmax"))
    # Compile the model
    model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=['accuracy'])

    # Define early stopping callback
    early_stopping = EarlyStopping(monitor="loss", patience=7)

    # Train the model (ImageDataGenerator is imported above, but no augmentation is applied here)
    history = model.fit(x_train, y_train, callbacks=[early_stopping], batch_size=BATCH_SIZE, epochs=EPOCH, verbose=1)

    # Evaluate the model accuracy on the testing set.
    score = model.evaluate(x_test, y_test, verbose=1) # 1 for showing progress bar
    # score is [loss, accuracy]; accuracy is the 2nd element (index 1)
    return score[1]
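
The `EarlyStopping(monitor="loss", patience=7)` callback used above stops training once the monitored loss has failed to improve for `patience` consecutive epochs. The rule can be sketched in plain Python as follows. This is a simplified illustration, not the Keras implementation: it ignores options such as `min_delta` and `restore_best_weights`, and `early_stopping_epoch` is a hypothetical helper name.

```python
def early_stopping_epoch(losses, patience):
    """Return the 1-based epoch at which training would stop, or None.

    Training stops once the loss has not improved on its best value
    for `patience` consecutive epochs.
    """
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Loss stops improving after epoch 2, so with patience=3 training ends at epoch 5.
print(early_stopping_epoch([0.5, 0.4, 0.41, 0.42, 0.43], patience=3))  # 5
# A loss that keeps decreasing never triggers the stop.
print(early_stopping_epoch([1.0 / e for e in range(1, 21)], patience=7))  # None
```

With `EPOCH = 20` and `patience=7`, the stop can only trigger if the training loss stalls for seven epochs in a row.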

In [None]:
# Create a study object and optimize the objective function.
study = optuna.create_study(direction="maximize")
study.optimize(objective, n_trials=NUM_TRAILS) # you can also set a time limit for the study: timeout=TIME_OUT

[I 2023-12-09 23:45:18,363] A new study created in memory with name: no-name-d80ae026-81c0-4420-b991-4b223153ee68
2023-12-09 23:45:18.367362: I tensorflow/core/platform/cpu_feature_guard.cc:151] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 FMA
To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.


Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


[I 2023-12-09 23:49:31,883] Trial 0 finished with value: 0.9891999959945679 and parameters: {'filters_layer1': 128, 'filters_layer2': 96, 'kernel_size': 3, 'strides': 2, 'activation': 'tanh', 'pool_size': 2, 'rate': 0.15, 'units': 16}. Best is trial 0 with value: 0.9891999959945679.
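
To put a single trial in context, the size of the categorical search space defined in `objective` can be counted directly. This is a small sketch; the `search_space` dictionary simply restates the choice lists passed to `trial.suggest_categorical` above.

```python
import math

# Choice lists copied from the suggest_categorical calls in objective()
search_space = {
    "filters_layer1": [96, 128, 512],
    "filters_layer2": [32, 64, 96],
    "kernel_size": [3, 4],
    "strides": [1, 2, 3],
    "activation": ["relu", "tanh"],
    "pool_size": [1, 2, 3],
    "rate": [0.1, 0.15, 0.2],
    "units": [16, 32, 128],
}

# Total number of distinct hyper-parameter combinations
num_combinations = math.prod(len(choices) for choices in search_space.values())
print(num_combinations)  # 2916
```

With `NUM_TRAILS = 1`, the study samples only one of these 2916 combinations, so the "best" trial is simply the only one evaluated.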


In [None]:
# Print total number of trials
print(f"Number of finished trials: {len(study.trials)}")
print("")
# Find the best trial
best_trial = study.best_trial
print(f"Accuracy for the Best Trial: {best_trial.value}")
print("")
print("Params for the Best Trial: ")
for key, value in best_trial.params.items():
    print(f"{key}: {value}")

Number of finished trials: 1

Accuracy for the Best Trial: 0.9891999959945679

Params for the Best Trial: 
filters_layer1: 128
filters_layer2: 96
kernel_size: 3
strides: 2
activation: tanh
pool_size: 2
rate: 0.15
units: 16


## 3. Model Reconstruction

In [None]:
# Acquire Best Model Hyper-parameters

best_params = best_trial.params
best_params_list = list(best_params.values())
print("The best model hyper-parameters are as follows:")
print(best_params_list)

# Look up each value by name rather than by list position
filters_layer1 = best_params["filters_layer1"]
filters_layer2 = best_params["filters_layer2"]
kernel_size = best_params["kernel_size"]
strides = best_params["strides"]
activation = best_params["activation"]
pool_size = best_params["pool_size"]
rate = best_params["rate"]
unit = best_params["units"]

The best model hyper-parameters are as follows:
[128, 96, 3, 2, 'tanh', 2, 0.15, 16]
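
The printed list follows the insertion order of the `best_trial.params` dictionary, which in turn follows the order of the `suggest_categorical` calls. The sketch below uses the values reported for trial 0 to show how positions map to names. `best_params` here is a hand-written copy of that output, not a live Optuna object.

```python
# Parameters reported for trial 0 in the study log above
best_params = {
    "filters_layer1": 128,
    "filters_layer2": 96,
    "kernel_size": 3,
    "strides": 2,
    "activation": "tanh",
    "pool_size": 2,
    "rate": 0.15,
    "units": 16,
}

# Positional view: same values, but each index only has meaning via the key order
positional = list(best_params.values())
for index, (name, value) in enumerate(best_params.items()):
    print(index, name, value)

# Index 4 is the activation; looking it up by name avoids counting positions
print(positional[4], best_params["activation"])  # tanh tanh
```

Lookup by key keeps working if a parameter is added or reordered in `objective`, whereas positional indices would silently shift.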


In [None]:
# Best CNN Model Reconstruction
best_model = keras.models.Sequential()

# Use the activation chosen by the best trial (it was previously hard-coded to "relu")
best_model.add(layers.Conv2D(filters=filters_layer1,
                        kernel_size=kernel_size,
                        strides=strides,
                        activation=best_trial.params["activation"],
                        padding="same",
                        input_shape=input_shape))

best_model.add(layers.MaxPool2D(pool_size=pool_size))

best_model.add(layers.Conv2D(filters=filters_layer2,
                        kernel_size=kernel_size,
                        strides=strides,
                        activation=best_trial.params["activation"],
                        padding="same"))
# Add Pooling Layer
best_model.add(layers.MaxPool2D(pool_size=pool_size))
# Add Dropout Layer
best_model.add(layers.Dropout(rate=rate))
# Add Flatten Layer
best_model.add(layers.Flatten())
# Add 1st Fully-connected Layer
best_model.add(layers.Dense(units=unit, activation="relu"))
# Add 2nd Fully-connected Layer
best_model.add(layers.Dense(NUM_CLASSES, activation="softmax"))

# Compile the best model
best_model.compile(loss="categorical_crossentropy", optimizer="adam", metrics=['accuracy'])

# Define early stopping callback
early_stopping = EarlyStopping(monitor="loss", patience=5)

# Train the best model (no data augmentation is applied here)
print("Model constructing ...")
history = best_model.fit(x_train, y_train, callbacks=[early_stopping], batch_size=BATCH_SIZE, epochs=EPOCH, verbose=1)

Model constructing ...
Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


## Model Summary & Performance (ACC and LOSS)

In [None]:
# Best Model Summary
best_model.summary()

Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 conv2d_5 (Conv2D)           (None, 14, 14, 128)       1280      
                                                                 
 max_pooling2d_5 (MaxPooling  (None, 7, 7, 128)        0         
 2D)                                                             
                                                                 
 conv2d_6 (Conv2D)           (None, 4, 4, 96)          110688    
                                                                 
 max_pooling2d_6 (MaxPooling  (None, 2, 2, 96)         0         
 2D)                                                             
                                                                 
 dropout_2 (Dropout)         (None, 2, 2, 96)          0         
                                                                 
 flatten_2 (Flatten)         (None, 384)              
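
The output shapes and parameter counts in this summary can be checked by hand. The sketch below assumes the best-trial settings (kernel size 3, stride 2, pool size 2, `padding="same"` for the convolutions, and the default `"valid"` padding for max pooling). The helper names are hypothetical.

```python
import math


def conv_same_out(size, stride):
    """Spatial output size of a convolution with padding='same'."""
    return math.ceil(size / stride)


def pool_out(size, pool):
    """Spatial output size of max pooling with stride == pool size, no padding."""
    return (size - pool) // pool + 1


def conv_params(kernel, in_channels, out_channels):
    """Weights (kernel * kernel * in * out) plus one bias per output channel."""
    return kernel * kernel * in_channels * out_channels + out_channels


size = 28
sizes = []
for _ in range(2):  # two Conv2D + MaxPool2D stages
    size = conv_same_out(size, stride=2)
    sizes.append(size)
    size = pool_out(size, pool=2)
    sizes.append(size)

print(sizes)                    # [14, 7, 4, 2]
print(size * size * 96)         # 384 values after Flatten
print(conv_params(3, 1, 128))   # 1280
print(conv_params(3, 128, 96))  # 110688
```

These values match the `conv2d_5`, `conv2d_6`, and `flatten_2` rows shown above.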

In [None]:
# Evaluate the model accuracy on the testing set.
score = best_model.evaluate(x_test, y_test, verbose=1) # 1 for showing progress bar
# score is [loss, accuracy]: score[0] is the loss, score[1] is the accuracy
print("")
print("Model performances are as follows:")
print(f"Loss: {score[0]}")
print(f"Accuracy: {score[1]}")


Model performances are as follows:
Loss: 0.03738785535097122
Accuracy: 0.9890999794006348
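
The reported accuracy can be turned back into a count of test images. This is a small sketch, assuming the 10,000-image MNIST test set whose shape was printed earlier.

```python
NUM_TEST_SAMPLES = 10000        # x_test.shape[0]
accuracy = 0.9890999794006348   # value printed above

# Accuracy is the fraction of correct predictions, so scale and round to a count
correct = round(accuracy * NUM_TEST_SAMPLES)
errors = NUM_TEST_SAMPLES - correct
print(correct, errors)  # 9891 109
```

In other words, the model misclassifies 109 of the 10,000 test digits.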


## 4. Save Model

In [None]:
# Save the best model
model_path = "***best_model_A3***.keras"
best_model.save(model_path)

# Check that the model file exists
if path.isfile(model_path):
    print("The best model has been saved successfully.")
else:
    print("Error occurred: the model file was not found.")

The best model has been saved successfully.
