# Assignment - classification

Hi there! In this assignment, you will use a fully connected neural network (FCNN) to solve an adapted Question 1 of the winter 2023 exam in applied machine learning:

As in Assignment 1, the primary objective of this exam is to perform image classification using the PCam dataset. For a detailed description of the dataset, please refer to the Assignment 1 description. The assignment is posted as a Kaggle competition and is available here: https://www.kaggle.com/t/cda2949c5097437581cdb9abd32091ae

To get you started, I have provided a complete working example, which is decent but not very impressive.

When you are done, submit your results on the Kaggle webpage for this competition. If you would rather not show your score to everyone, you may use an anonymous username on Kaggle.

However, I suggest you use your real name; after all, it is just meant as an exercise, and it is more fun that way. You can submit 5 times every day, so you can experiment without being "locked in".

# Details

The metric used to score this assignment is accuracy (as in the first assignment).
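
As a reminder, accuracy is the fraction of predictions that match the true labels. The following is a minimal NumPy sketch; the helper name `accuracy` and the labels are made up for illustration and are not part of the assignment code.

```python
import numpy as np

def accuracy(y_true, y_pred):
    """Fraction of positions where the predicted label equals the true label."""
    # Flatten both arrays so that shapes (N, 1) and (N,) compare element-wise
    y_true = np.asarray(y_true).ravel()
    y_pred = np.asarray(y_pred).ravel()
    return float(np.mean(y_true == y_pred))

# Made-up labels, for illustration only (not PCam data)
y_true_demo = np.array([0, 1, 1, 0, 1])
y_pred_demo = np.array([0, 1, 0, 0, 1])
print(accuracy(y_true_demo, y_pred_demo))  # 4 of 5 correct -> 0.8
```

The `ravel()` calls matter in this notebook: the labels are reshaped to `(N, 1)` while `np.argmax` returns shape `(N,)`, and comparing those two shapes directly would broadcast to an `(N, N)` matrix.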

### Question (adapted from the exam):
Use an FCNN to perform image classification (tumor detection). Consider, among other things, the following:
1. Different activation functions
2. Different number of layers
3. Different number of neurons in each layer
4. Different learning rates
5. Different batch sizes
6. Different number of epochs
7. Different optimizers
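
Items 5 and 6 in the list above interact: together, batch size and number of epochs determine how many gradient updates the network receives. Here is a small sketch of that arithmetic, assuming every epoch visits each sample once; `updates_per_run` and the sample count are hypothetical and chosen only for illustration.

```python
import math

def updates_per_run(n_samples, batch_size, epochs):
    """Number of gradient updates when every epoch visits all samples once."""
    # The last batch of an epoch may be smaller, hence the ceiling
    steps_per_epoch = math.ceil(n_samples / batch_size)
    return steps_per_epoch * epochs

# n_samples is a hypothetical training-set size chosen for illustration
n_samples = 20_000
for batch_size in (8, 32, 128):
    print(batch_size, updates_per_run(n_samples, batch_size, epochs=10))
```

Smaller batches give more (but noisier) updates for the same number of epochs, which is one reason the best learning rate often changes when the batch size does.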

**Note:** When you do hyperparameter tuning, you should use the validation set. The test set should only be used for the final evaluation.
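
To make the note above concrete, here is a conceptual hold-out split in plain NumPy on synthetic data. It is only a sketch of the idea; all names ending in `_demo`, `_tr` and `_va` are made up for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-ins for features and labels (not PCam data)
X_demo = rng.random((100, 4))
y_demo = rng.integers(0, 2, size=100)

# Shuffle the indices once, then carve out disjoint train / validation parts
idx = rng.permutation(len(X_demo))
n_val = len(X_demo) // 5  # hold out 20 % for validation
val_idx, train_idx = idx[:n_val], idx[n_val:]

X_tr, y_tr = X_demo[train_idx], y_demo[train_idx]
X_va, y_va = X_demo[val_idx], y_demo[val_idx]

# Compare hyperparameter settings by their score on (X_va, y_va) only;
# the test set is used once, for the final evaluation.
print(X_tr.shape, X_va.shape)  # (80, 4) (20, 4)
```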


# Hints to get you started (with a very simple model)

In [79]:
import tensorflow as tf
from tensorflow.keras.optimizers import SGD, Adam
import numpy as np
from sklearn.preprocessing import StandardScaler
import pandas as pd

In [80]:
print(tf.__version__)

2.10.1


We define a function that takes a (None, 96, 96, 3) array and turns it into a (None, 32, 32, 1) array by resizing, converting to grayscale, and normalizing. This function may also come in handy if the original images are too large for your hardware configuration.

In [81]:
def resize_and_normalize_image(image):
    # Downscale to 32x32, convert RGB to a single grayscale channel,
    # and scale the pixel values by 1/255
    image = tf.image.resize(image, [32, 32])
    image = tf.image.rgb_to_grayscale(image)
    return image / 255.0

def convert_sample(data):
    # Create a TensorFlow dataset from the image array
    dataset = tf.data.Dataset.from_tensor_slices(data)

    # Apply the resize/grayscale/normalize function to each image
    resized_dataset = dataset.map(resize_and_normalize_image)

    # Convert the processed dataset back to a NumPy array
    resized_arr = np.array(list(resized_dataset.as_numpy_iterator()))

    return resized_arr
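
To see the shape change without running TensorFlow, the same idea can be approximated in plain NumPy. This is only a rough stand-in under stated assumptions: it averages 3x3 pixel blocks instead of using the interpolation in `tf.image.resize`, it uses common luma weights for the grayscale step, and it assumes pixel values in 0..255. The name `to_small_grayscale` is made up for this sketch.

```python
import numpy as np

def to_small_grayscale(batch):
    """Approximate (N, 96, 96, 3) pixel arrays -> (N, 32, 32, 1) floats in [0, 1]."""
    batch = np.asarray(batch, dtype=np.float32)
    n, h, w, c = batch.shape
    # Weighted sum over the colour channels (common luma weights, an assumption)
    gray = batch @ np.array([0.299, 0.587, 0.114], dtype=np.float32)
    # Average non-overlapping 3x3 blocks: 96 / 3 = 32 pixels per side
    small = gray.reshape(n, h // 3, 3, w // 3, 3).mean(axis=(2, 4))
    # Restore a channel axis and scale to [0, 1]
    return small[..., np.newaxis] / 255.0

# Random images standing in for real data
demo = np.random.default_rng(0).integers(0, 256, size=(5, 96, 96, 3), dtype=np.uint8)
out = to_small_grayscale(demo)
print(out.shape)  # (5, 32, 32, 1)
```

The output values will not match `convert_sample` exactly, but the shapes and value range do, which is what the flattening layer of the FCNN depends on.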

In [None]:
!pip install keras-tuner

In [82]:
import keras_tuner
from tensorflow import keras

In [129]:
def build_model(hp):
    model = keras.Sequential()
    #tf.keras.layers.Flatten(input_shape=(32,32,1)),
    model.add(keras.layers.Flatten(input_shape=(32, 32, 1)))
    
    # Determine the number of hidden layers
    #num_hidden_layers = hp.Int('num_hidden_layers', min_value=2, max_value=20)
    
    # Determine the number of nodes in each hidden layer
#     for i in range(num_hidden_layers):
#         model.add(keras.layers.Dense(
#             units=hp.Int(f'units_{i}', min_value=32, max_value=256, step=32),
#             activation='relu'
#         ))
        
    # Choose the number of hidden layers, then the width of each one.
    # Note: a single 'activation' choice is shared by all hidden layers.
    for i in range(hp.Int('num_hidden_layers', 2, 20)):
        model.add(keras.layers.Dense(
            units=hp.Int('num_of_neurons' + str(i), min_value=32, max_value=512, step=32),
            #activation='relu', # gave the best score so far
            activation=hp.Choice('activation', values=['sigmoid','relu','tanh','elu','gelu','selu']),
        ))

    model.add(keras.layers.Dense(2, activation='softmax'))

    # Choose an optimizer
    #optimizer = hp.Choice('optimizer', values=['sgd', 'adam'])

    # Choose a learning rate
    # learning_rate = hp.Float('learning_rate', min_value=1e-4, max_value=1e-2, sampling='log')

#     if optimizer == 'sgd':
#         momentum = hp.Float('momentum', min_value=0.0, max_value=1.0)
        
#         model.compile(
#             optimizer=SGD(
#                 hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4]), 
#                 momentum=0.9, 
#                 nesterov=True
#             ),
#             loss='sparse_categorical_crossentropy',
#             metrics=['accuracy']
#         )
#     else:
#         model.compile(
#             optimizer=Adam(hp.Choice('learning_rate',values=[1e-2, 1e-3, 1e-4])),
#             loss='sparse_categorical_crossentropy',
#             metrics=['accuracy']
#         )

    model.compile(
        optimizer=SGD(
            learning_rate=hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4]),
            #momentum=hp.Choice('momentum', values=[0.1, 0.25, 0.5, 0.75, 0.9]), # did not help for the best one
            # Note: momentum keeps its default of 0.0 here, so nesterov=True has no effect
            nesterov=True
        ),
        loss='sparse_categorical_crossentropy',
        metrics=['accuracy']
    )

    return model

In [130]:
# Initialize Keras Tuner
tuner = keras_tuner.RandomSearch(
    build_model,
    objective='val_accuracy',
    max_trials=10,  # Number of hyperparameter combinations to try
    executions_per_trial=4,
    directory='keras_tuner',
    project_name='a2_t5'
)
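
Conceptually, a random search draws `max_trials` configurations from the search space, scores each one (here averaged over `executions_per_trial` repeated runs), and keeps the best. The sketch below illustrates that loop in plain Python. It is not Keras Tuner's implementation; `sample_config`, `random_search` and `fake_score` are hypothetical names, and `fake_score` merely stands in for "train the model and return the validation accuracy".

```python
import random

def sample_config(rng):
    """Draw one configuration from ranges mirroring build_model above."""
    n_layers = rng.randint(2, 20)  # both bounds inclusive
    return {
        "num_hidden_layers": n_layers,
        "units": [rng.randrange(32, 513, 32) for _ in range(n_layers)],
        "activation": rng.choice(["sigmoid", "relu", "tanh", "elu", "gelu", "selu"]),
        "learning_rate": rng.choice([1e-2, 1e-3, 1e-4]),
    }

def random_search(score_fn, max_trials, executions_per_trial, seed=0):
    rng = random.Random(seed)
    best_score, best_config = float("-inf"), None
    for _ in range(max_trials):
        config = sample_config(rng)
        # Average repeated runs of the same configuration to reduce noise
        runs = [score_fn(config) for _ in range(executions_per_trial)]
        score = sum(runs) / len(runs)
        if score > best_score:
            best_score, best_config = score, config
    return best_score, best_config

def fake_score(config):
    """Hypothetical stand-in for a real training run; peaks at 6 hidden layers."""
    return 1.0 / (1 + abs(config["num_hidden_layers"] - 6))

best_score, best_config = random_search(fake_score, max_trials=10, executions_per_trial=4)
print(best_score, best_config["num_hidden_layers"])
```

With `max_trials=10` and `executions_per_trial=4` as in the cell above, the search performs 40 training runs in total, so the number of epochs per run has a large effect on the total search time.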

In [131]:
tuner.search_space_summary()

Search space summary
Default search space size: 5
num_hidden_layers (Int)
{'default': None, 'conditions': [], 'min_value': 2, 'max_value': 20, 'step': 1, 'sampling': 'linear'}
num_of_neurons0 (Int)
{'default': None, 'conditions': [], 'min_value': 32, 'max_value': 512, 'step': 32, 'sampling': 'linear'}
activation (Choice)
{'default': 'sigmoid', 'conditions': [], 'values': ['sigmoid', 'relu', 'tanh', 'elu', 'gelu', 'selu'], 'ordered': False}
num_of_neurons1 (Int)
{'default': None, 'conditions': [], 'min_value': 32, 'max_value': 512, 'step': 32, 'sampling': 'linear'}
learning_rate (Choice)
{'default': 0.01, 'conditions': [], 'values': [0.01, 0.001, 0.0001], 'ordered': True}


In [132]:
# Load the training data features
X_train_raw = np.load('Xtrain.npy')
print(f'Shape of the raw training data: {X_train_raw.shape}')
X_test_raw = np.load('Xtest.npy')
print(f'Shape of the raw test data: {X_test_raw.shape}')

X = convert_sample(X_train_raw)
print(f'Shape of the resized training data: {X.shape}')

X_test = convert_sample(X_test_raw)
print(f'Shape of the resized test data: {X_test.shape}')

y = np.load('ytrain.npy')
y = y.reshape(-1,1) 
print(f'Shape of the raw labels: {y.shape}')

Shape of the raw training data: (26214, 96, 96, 3)
Shape of the raw test data: (1638, 96, 96, 3)
Shape of the resized training data: (26214, 32, 32, 1)
Shape of the resized test data: (1638, 32, 32, 1)
Shape of the raw labels: (26214, 1)


In [134]:
# Split the data into training and validation set
from sklearn.model_selection import train_test_split
X_train, X_val, y_train, y_val = train_test_split(X, y, test_size=0.2, random_state=48)

In [135]:
# Search for the best hyperparameters

# Define a batch size range to search within
#batch_size = tuner.Int('batch_size', min_value=32, max_value=256, step=32)

#tuner.search(X_train, y_train, epochs=10, batch_size=batch_size, validation_data=(X_val, y_val))
tuner.search(X_train, y_train, epochs=10, validation_data=(X_val, y_val))
best_model = tuner.get_best_models()[0]
tuner.results_summary()

Trial 10 Complete [00h 04m 28s]
val_accuracy: 0.50658018887043

Best val_accuracy So Far: 0.7082300186157227
Total elapsed time: 00h 29m 00s
Results summary
Results in keras_tuner\a2_t5
Showing 10 best trials
Objective(name="val_accuracy", direction="max")

Trial 08 summary
Hyperparameters:
num_hidden_layers: 6
num_of_neurons0: 224
activation: selu
num_of_neurons1: 128
learning_rate: 0.01
num_of_neurons2: 448
num_of_neurons3: 96
num_of_neurons4: 160
num_of_neurons5: 96
num_of_neurons6: 352
num_of_neurons7: 96
num_of_neurons8: 32
num_of_neurons9: 128
num_of_neurons10: 288
num_of_neurons11: 192
num_of_neurons12: 480
num_of_neurons13: 480
num_of_neurons14: 192
num_of_neurons15: 96
num_of_neurons16: 320
Score: 0.7082300186157227

Trial 03 summary
Hyperparameters:
num_hidden_layers: 4
num_of_neurons0: 480
activation: elu
num_of_neurons1: 352
learning_rate: 0.01
num_of_neurons2: 224
num_of_neurons3: 320
num_of_neurons4: 64
num_of_neurons5: 256
num_of_neurons6: 256
num_of_neurons7: 256
num_of

In [136]:
# Get the best hyperparameters
best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]
best_accuracy = tuner.oracle.get_best_trials(num_trials=1)[0].score

print("Optimal Hyperparameters:")

print(f"Number of Hidden Layers: {best_hps.get('num_hidden_layers')}")
for i in range(best_hps.get('num_hidden_layers')):
    print(f"Hidden Layer {i+1} Units: {best_hps.get(f'num_of_neurons{i}')}")
    print(f"Hidden Layer {i+1} Activation: {best_hps.get('activation')}")
    
#print(f"Optimizer: {best_hps.get('optimizer')}")

print(f"Learning Rate: {best_hps.get('learning_rate')}")

#print(f"Momentum: {best_hps.get('momentum')}")

print(f"Optimal Accuracy: {best_accuracy}")

Optimal Hyperparameters:
Number of Hidden Layers: 6
Hidden Layer 1 Units: 224
Hidden Layer 1 Activation: selu
Hidden Layer 2 Units: 128
Hidden Layer 2 Activation: selu
Hidden Layer 3 Units: 448
Hidden Layer 3 Activation: selu
Hidden Layer 4 Units: 96
Hidden Layer 4 Activation: selu
Hidden Layer 5 Units: 160
Hidden Layer 5 Activation: selu
Hidden Layer 6 Units: 96
Hidden Layer 6 Activation: selu
Learning Rate: 0.01
Optimal Accuracy: 0.7082300186157227
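
To get a feel for the size of the selected network, the number of trainable parameters can be worked out by hand: each dense layer has one weight per input-unit pair plus one bias per unit. Below is a small sketch of that arithmetic for the six hidden layers reported above; the helper name `dense_param_count` is made up for this example.

```python
def dense_param_count(n_inputs, layer_sizes):
    """Weights plus biases for a stack of fully connected layers."""
    total = 0
    for n_units in layer_sizes:
        total += n_inputs * n_units + n_units  # weight matrix + bias vector
        n_inputs = n_units  # this layer's outputs feed the next layer
    return total

# 32 * 32 * 1 flattened inputs, the six hidden layers reported above, 2 outputs
hidden = [224, 128, 448, 96, 160, 96]
print(dense_param_count(32 * 32 * 1, hidden + [2]))  # 390466
```

More than half of these parameters sit in the first layer (1024 x 224 weights), so the input resolution influences model size at least as much as the number of hidden layers.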


In [137]:
# Build the final model with the best hyperparameters
final_model = tuner.hypermodel.build(best_hps)

In [138]:
# Train the final model on all labelled data (training + validation)
final_model.fit(X, y, epochs=10, verbose=1)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x2590b9ffd00>

The code below makes predictions and then saves them (after checking that they are in the correct format).

The argmax converts the predicted probabilities to class predictions.

Finally, the predictions are written to a $\texttt{.csv}$ file for submission to Kaggle.
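
As a small illustration of these steps, here is a sketch with made-up probabilities for three images. The column names `Id` and `Predicted` follow the submission code at the end of this notebook; the format checks are an assumption about what a reasonable sanity check could look like.

```python
import csv
import io
import numpy as np

# Made-up softmax outputs for three images: columns are P(class 0), P(class 1)
probs = np.array([[0.9, 0.1],
                  [0.3, 0.7],
                  [0.5, 0.5]])

# argmax over the class axis picks the most probable class per row;
# on a tie it returns the first (lowest) index
labels = np.argmax(probs, axis=1)
print(labels)  # [0 1 0]

# Basic format checks before writing: one label per row, only 0 or 1
assert labels.shape == (len(probs),)
assert set(labels.tolist()) <= {0, 1}

# Write Id/Predicted rows; io.StringIO stands in for a real file here
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")
writer.writerow(["Id", "Predicted"])
writer.writerows(enumerate(labels.tolist()))
print(buffer.getvalue())
```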

In [92]:
# build model with the best hyperparameter and then search for momentum
# def build_model_with_batch(learning_rate, momentum):
#     model = tf.keras.models.Sequential([
#         tf.keras.layers.Flatten(input_shape=(32, 32, 1)),
#         tf.keras.layers.Dense(224, activation='relu'),
#         tf.keras.layers.Dense(224, activation='relu'),
#         tf.keras.layers.Dense(32, activation='relu'),
#         tf.keras.layers.Dense(64, activation='relu'),
#         tf.keras.layers.Dense(32, activation='relu'),
#         tf.keras.layers.Dense(32, activation='relu'),
#         tf.keras.layers.Dense(32, activation='relu'),
#         tf.keras.layers.Dense(2, activation='softmax'),  # two classes, as in build_model
#     ])
    
#     optimizer = tf.keras.optimizers.SGD(learning_rate=learning_rate, momentum=momentum)
    
#     model.compile(
#         loss='sparse_categorical_crossentropy',
#         optimizer=optimizer,
#         metrics=['accuracy'],
#     )
    
#     return model

In [103]:
# Build the optimal model with the best hyperparameters
# learning_rate = 0.001
# momentum = 0.9
# batch_size = 32

# learning_rate = 0.0001
# momentum = 0.9
# batch_size = 32

# optimal_model = build_model_with_batch(learning_rate=learning_rate, momentum=momentum) 

# # Train the optimal model and save the predictions
# optimal_model.fit(X, y, epochs=10, verbose=1)

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


<keras.callbacks.History at 0x259007aaaa0>

In [None]:
# from tqdm import tqdm
# #learning_rates = [0.1, 0.01, 0.001, 0.00001] # must be positive floats. Default depends on optimizer
# batch_sizes = [8,16,32,64,128] # # must be positive ints. Default is 32
# #momentums = [0.1,0.25,0.5,0.75,0.9] # must be in [0, 1). Default (for SGD) is 0.0
# learning_rate = 0.001
# momentum = 0.9

# results = []

# # for learning_rate in tqdm(learning_rates):
# for batch_size in batch_sizes:
#     #for momentum in momentums:
#         # model = build_model_with_momentum(learning_rate, momentum=momentum) 
#     model = build_model_with_batch(learning_rate=learning_rate, momentum=momentum) 
#     model.fit(X_train, y_train, validation_data=(X_val, y_val), epochs=10, batch_size=batch_size, verbose=1)  # remember to pass in batch_size here!
#     loss, acc = model.evaluate(X_val, y_val)  # x_test / y_test are not defined in this notebook; score on the validation set
#     results.append((acc, batch_size))
    
# results = pd.DataFrame(results, columns=['Accuracy', 'Batch size'])
# results

In [None]:
# results[results['Accuracy'] == results['Accuracy'].max()]

In [None]:
# Train and evaluate final model.
# Remember to use both train and val data for training for best performance! 
# Similar to what we have done in all the other exercises/assignments

In [139]:
y_test_hat = final_model.predict(X_test)
y_test_hat = np.argmax(y_test_hat, axis=1)

ytest_hat_pd = pd.DataFrame({
    'Id': list(range(len(y_test_hat))),
    'Predicted': y_test_hat.reshape(-1,),
})

ytest_hat_pd.to_csv('y_test_hat_fcnn.csv', index=False)

