# Assignment 1 - Topics From Labs 1 & 2
  <a target="_blank" href="https://colab.research.google.com/github/andrew-nash/CS6421-labs-2025/blob/main/CS6421_Assignment_01.ipynb">
  <img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/>
</a>

Due on 20/02/2025 at 23:59:59 UTC

In [None]:
import tensorflow as tf
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

# Data Loading And Cleaning

For this lab, we will use a house pricing dataset (credit: https://www.kaggle.com/datasets/shree1992/housedata). However, instead of predicting house prices, we are going to attempt to classify the condition of the property based on the other features.

In [None]:
%%bash
wget -O data.csv https://github.com/andrew-nash/CS6421-labs-2025/raw/refs/heads/main/data.csv

In [None]:
raw_data = pd.read_csv("data.csv")

In [None]:
def get_seed_from_s_no(s_no):
  ### DO NOT CHANGE THIS FUNCTION
  # The seed depends on whether the student number is even
  # and on whether its last digit is below 5.
  seeds = [[16, 81], [30, 18]]
  i = int(s_no%2==0)
  j = int(s_no%10<5)
  return seeds[i][j]

In [None]:
features = ["price","bedrooms","bathrooms","sqft_living","sqft_lot","floors","waterfront","view","sqft_above","sqft_basement","yr_built","yr_renovated"]

reduced_data = raw_data[features+['condition']]

# enter your student number here
student_no = STUDENT NUMBER

# shuffle the rows (axis=0); axis=1 would shuffle the columns instead
df = reduced_data.sample(axis=0, frac=1, random_state=get_seed_from_s_no(student_no))

train_split = 0.85

x_train = df[features].to_numpy()[:int(train_split*len(df))]
y_train = df['condition'].to_numpy()[:int(train_split*len(df))]

x_test = df[features].to_numpy()[int(train_split*len(df)):]
y_test = df['condition'].to_numpy()[int(train_split*len(df)):]

In [None]:
# scaling the x values to [0,1] on each column
# is done as follows
# axis=0 means that the min/max is taken down the rows, giving one value per column
x_train_clean = (x_train-x_train.min(axis=0))/(x_train.max(axis=0)-x_train.min(axis=0))

#### IMPORTANT - observe that we scale by x_train.min() and x_train.max()
# when we are scaling the test dataset - why is this?
x_test_clean = (x_test-x_train.min(axis=0))/(x_train.max(axis=0)-x_train.min(axis=0))

### Convert the y data to one hot encoding.
### There are 5 values for condition: 1,2,3,4,5
###
### If we want to use the tf.one_hot function then
### convert these to 0,1,2,3,4

y_train_clean = tf.one_hot(y_train-1, depth=5)
y_test_clean  = tf.one_hot(y_test-1, depth=5)
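
The preprocessing steps above can be checked on a few toy rows. The sketch below uses NumPy only and made-up values (not taken from `data.csv`). It applies min-max scaling, `(x - min) / (max - min)`, with statistics computed on the training rows, and builds one-hot labels with `np.eye` as a stand-in for `tf.one_hot`. All names ending in `_toy` are hypothetical.

```python
import numpy as np

# Toy feature rows (illustrative values only, not from data.csv).
x_train_toy = np.array([[100.0, 1.0],
                        [300.0, 3.0],
                        [200.0, 2.0]])
x_test_toy = np.array([[400.0, 2.0]])

# Column-wise statistics come from the training rows only.
col_min = x_train_toy.min(axis=0)
col_range = x_train_toy.max(axis=0) - col_min

x_train_scaled = (x_train_toy - col_min) / col_range
x_test_scaled = (x_test_toy - col_min) / col_range

# Labels 1..5 are shifted to 0..4 before one-hot encoding.
y_toy = np.array([1, 5, 3])
y_one_hot = np.eye(5)[y_toy - 1]

print(x_train_scaled)
print(x_test_scaled)  # a test value can fall outside [0, 1]
print(y_one_hot)
```

Note that the scaled training columns span exactly [0, 1], while a test row is only guaranteed to do so if its values lie within the training range.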


# Task 1: Hyper-parameter Optimization

We will use the Keras tuner to partially automate this process (https://www.tensorflow.org/tutorials/keras/keras_tuner)

### Model Builder Function

The first step is to define a model builder function that takes the hyper-parameters of interest (`hp`) and returns a compiled model. The tuner trains each model it builds and records its validation metrics.

We will then search over these arguments to find their optimal values.

For this task, you should search over the following hyper-parameters


| Hyper-Parameter | Min | Max |
| -- | -- | -- |
| No. Layers | 2 | 10 |
| Neurons in layer `i` | 10 | 750 |
| Regularization | 0.0001 | 0.1 |

<br>

| Hyper-Parameter | Choices |
|---|---|
| Activation | [relu, elu, sigmoid]|
| Type of Regularization | [L1,L2,L1L2] |
| Use BatchNorm | [True, False] |
| batch_size | [16,64,124] |
| Learning Rate | [0.01,0.001,0.0001] |
| kernel_initializer |  ["glorot_normal","glorot_uniform", "zeros"] |
| bias_initializer |  ["zeros", "ones"] |
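
To make the idea of a random search over this space concrete, here is a minimal pure-Python sketch that draws random configurations from the ranges in the tables. It is an illustration only, not how Keras Tuner is implemented. The dictionary layout and the helper names `sample_log_uniform` and `sample_config` are hypothetical, and log-uniform sampling for the neuron counts and the regularization level is an assumption.

```python
import math
import random

# Hypothetical search space mirroring the tables above.
INT_RANGES = {"num_layers": (2, 10), "num_neurons": (10, 750)}
LOG_FLOAT_RANGES = {"reg_level": (0.0001, 0.1)}
CHOICES = {
    "activation": ["relu", "elu", "sigmoid"],
    "reg_type": ["L1", "L2", "L1L2"],
    "use_batchnorm": [True, False],
    "batch_size": [16, 64, 124],
    # learning rates as used in the model builder function
    "learning_rate": [0.01, 0.001, 0.0001],
    "kernel_initializer": ["glorot_normal", "glorot_uniform", "zeros"],
    "bias_initializer": ["zeros", "ones"],
}

def sample_log_uniform(rng, low, high):
    """Draw a value whose logarithm is uniform on [log(low), log(high)]."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))

def sample_config(rng):
    """Draw one random configuration from the search space."""
    config = {name: rng.choice(options) for name, options in CHOICES.items()}
    config["num_layers"] = rng.randint(*INT_RANGES["num_layers"])
    low, high = INT_RANGES["num_neurons"]
    # One neuron count per layer, clamped to the allowed range after rounding.
    config["neurons"] = [
        min(high, max(low, round(sample_log_uniform(rng, low, high))))
        for _ in range(config["num_layers"])
    ]
    config["reg_level"] = sample_log_uniform(rng, *LOG_FLOAT_RANGES["reg_level"])
    return config

rng = random.Random(42)
for trial in range(3):
    print(trial, sample_config(rng))
```

Each printed dictionary corresponds to one trial. A tuner would build, train and score a model for each configuration and keep the best-scoring ones.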




In [None]:
!pip install -U keras-tuner

In [None]:
%load_ext tensorboard

In [None]:
import keras_tuner as kt

In [None]:
def task1_search_model_coarse(hp):
  model = tf.keras.models.Sequential()

  model.add(tf.keras.layers.Input(shape=(12,)))

  num_layers = hp.Int("num_layers", min_value=2, max_value=10, step=2)
  reg_type = hp.Choice("reg_type", ["L1", "L2", "L1L2"])
  reg_level = hp.Choice("reg_level", [0.0001, 0.001, 0.01, 0.1])

  act_fuc = hp.Choice("act_fuc", ["relu", "elu", "sigmoid"])

  for i in range(num_layers):
    neurons = hp.Int(f"num_neurons_layer_{i}", min_value=10, max_value=750, step=2, sampling='log')

    if reg_type == "L1":
      reg = tf.keras.regularizers.L1(l1=reg_level)
    elif reg_type == "L2":
      reg = tf.keras.regularizers.L2(l2=reg_level)
    else:
      reg = tf.keras.regularizers.L1L2(l1=reg_level, l2=reg_level)

    kernel_initializer = hp.Choice("kernel_initializer", ["glorot_normal","glorot_uniform", "zeros"])
    bias_initializer = hp.Choice("bias_initializer", ["zeros", "ones"])

    model.add(tf.keras.layers.Dense(neurons, activation=act_fuc,
                                    activity_regularizer = reg,
                                    kernel_initializer=kernel_initializer,
                                    bias_initializer=bias_initializer))

    add_batchnorm = hp.Boolean("add_batchnorm")
    if add_batchnorm:
      model.add(tf.keras.layers.BatchNormalization())


  model.add(tf.keras.layers.Dense(5, activation="softmax"))

  lr = hp.Choice("learning_rate", values=[0.01, 0.001, 0.0001])

  model.compile(
    optimizer= tf.keras.optimizers.Adam(learning_rate=lr),
    loss = "categorical_crossentropy",
    metrics = ["accuracy"]
  )

  return model


In [None]:
tuner = kt.RandomSearch(task1_search_model_coarse,
                        objective='val_accuracy',
                        max_trials=60,
                        seed=42,
                        overwrite=True,
                        directory="./hyp_searches/",
                        project_name="coarse_search_bs16")

In [None]:
tuner.search_space_summary()

In [None]:
%tensorboard --logdir "./hyp_searches/coarse_search_bs16"

In [None]:
tuner.search(
    x_train_clean,
    y_train_clean,

    validation_split = 0.2,  # fraction of the training data held out for validation
    batch_size=16,
    epochs=10,
    callbacks=[tf.keras.callbacks.TensorBoard("./hyp_searches/coarse_search_bs16/tb_logs")]
)

In [None]:
tuner.get_best_hyperparameters()[0].values

### Search Over other Values of batch_size

In [None]:
tuner = kt.RandomSearch(task1_search_model_coarse,
                        objective='val_accuracy',
                        max_trials=60,
                        seed=42,
                        overwrite=True,
                        directory="./hyp_searches/",
                        project_name="coarse_search_bs64")

tuner.search(
    x_train_clean,
    y_train_clean,
    validation_split = 0.2,  # fraction of the training data held out for validation
    batch_size=64,
    epochs=10,
    callbacks=[tf.keras.callbacks.TensorBoard("./hyp_searches/coarse_search_bs64/tb_logs")]
)

In [None]:
tuner = kt.RandomSearch(task1_search_model_coarse,
                        objective='val_accuracy',
                        max_trials=60,
                        seed=42,
                        overwrite=True,
                        directory="./hyp_searches/",
                        project_name="coarse_search_bs124")
tuner.search(
    x_train_clean,
    y_train_clean,
    validation_split = 0.2,  # fraction of the training data held out for validation
    batch_size=124,
    epochs=10,
    callbacks=[tf.keras.callbacks.TensorBoard("./hyp_searches/coarse_search_bs124/tb_logs")]
)

### Continue Optimizing

Continue tuning this model.

**IMPORTANT** Marks will be awarded based on your model-tuning process. Make sure all model builder function names start with `task1_search_`

Based on the TensorBoard output above, you should be able to fix a few of the hyper-parameters (hint: such as the activation function).

Continue to perform additional hyper-parameter searches to narrow in on the optimal set of hyper-parameters from the table above.

Don't just focus on taking the best validation accuracy - look at the TensorBoard outputs and try to find models that have good performance, but also minimal overfitting.
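
One possible way to act on this advice is to compare the gap between training and validation accuracy across trials. The sketch below is a hypothetical selection rule, not part of Keras Tuner. The trial names and accuracy values are invented for illustration, and `tolerance` is an arbitrary assumption.

```python
# Hypothetical per-trial results: final training and validation accuracy.
# The numbers are invented for illustration, not measured on this dataset.
trials = [
    {"name": "trial_a", "train_acc": 0.95, "val_acc": 0.62},
    {"name": "trial_b", "train_acc": 0.66, "val_acc": 0.61},
    {"name": "trial_c", "train_acc": 0.70, "val_acc": 0.55},
]

def generalization_gap(trial):
    """Training accuracy minus validation accuracy (larger means more overfitting)."""
    return trial["train_acc"] - trial["val_acc"]

def select_trial(trials, tolerance=0.02):
    """Among trials within `tolerance` of the best val_acc, pick the smallest gap."""
    best_val = max(t["val_acc"] for t in trials)
    candidates = [t for t in trials if best_val - t["val_acc"] <= tolerance]
    return min(candidates, key=generalization_gap)

chosen = select_trial(trials)
print(chosen["name"], round(generalization_gap(chosen), 2))
```

Here the trial with the highest validation accuracy is not selected, because a near-equal trial shows a much smaller gap between training and validation accuracy.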



#### Evaluate

It is good practice when tuning these hyper-parameters to not use the test dataset for tuning - we will perform a separate split on our training data, and evaluate on the test dataset post-optimization
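
As a rough illustration of what such a split does, the sketch below mimics the behaviour of the Keras `validation_split` argument, which holds out the last fraction of the provided samples for validation. The function name `split_like_validation_split` and the toy arrays are hypothetical, and the value 0.2 is only an example.

```python
import numpy as np

def split_like_validation_split(x, y, validation_split):
    """Hold out the last `validation_split` fraction of rows for validation."""
    split_at = int(len(x) * (1.0 - validation_split))
    return (x[:split_at], y[:split_at]), (x[split_at:], y[split_at:])

# Toy data: 10 samples with 2 features each.
x_toy = np.arange(20).reshape(10, 2)
y_toy = np.arange(10)

(x_fit, y_fit), (x_val, y_val) = split_like_validation_split(x_toy, y_toy, 0.2)
print(len(x_fit), len(x_val))
```

The fraction passed in is the share reserved for validation, so a larger value leaves fewer samples for fitting. The test dataset is not involved at any point.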


Once you are happy, define a Sequential model using the best parameters extracted above.

Train it for 20 epochs (or fewer if you wish) and evaluate its performance on the test dataset.

### Discussion

Explain, in a **short paragraph**

1. Some key observations you made in the hyper-parameter tuning.
2. Whether the evaluation on the test dataset gave the results you expected, and why.

In [None]:
task1_explanation = '''

'''

## Task 2

Now, consider that unlike our image labels in the MNIST problems, the labels here are in fact ordinal data, existing on a discrete scale from 1-5.

### Re-process The Label Data to Prepare for a model that will predict a single value

Instead of one-hot encoding, now we must scale the y_train values (from [1,5]) to be between [0,1]

In [None]:
y_train_clean = None
y_test_clean = None

### Define a Custom Accuracy Function

Since our model predicts a single scalar value to predict discrete ordinal classes, TensorFlow cannot directly assign a class to a prediction itself.

The loss can still be computed as normal, but it means that we will need to define the accuracy metric ourselves.

\begin{equation}
  f_\text{acc}(y_\text{true},y_\text{pred})=   
  \begin{cases}
  1& \text{if}\; round( 5* y_\text{true})=round( 5* y_\text{pred}) \\
  0& \text{otherwise}
\end{cases}
\end{equation}

Hint: use the tf.math.round() and tf.equal() functions
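
The formula can be checked by hand on a few values before writing the TensorFlow version. The NumPy sketch below is an illustration only: it assumes the labels 1..5 were scaled as `label / 5`, so that `5 * y` recovers the original class. The function name `ordinal_accuracy_np` and the predictions are made up.

```python
import numpy as np

def ordinal_accuracy_np(y_true, y_pred):
    """Fraction of samples where round(5 * y_true) == round(5 * y_pred)."""
    true_class = np.round(5.0 * np.asarray(y_true))
    pred_class = np.round(5.0 * np.asarray(y_pred))
    return float(np.mean(true_class == pred_class))

# Assumption: labels 1..5 were scaled as label / 5, so 5 * y recovers the class.
labels = np.array([1, 3, 3, 5, 2])
y_true = labels / 5.0
y_pred = np.array([0.22, 0.55, 0.61, 0.95, 0.52])  # made-up predictions

print(ordinal_accuracy_np(y_true, y_pred))
```

In this toy case the first four predictions round to the correct class and the last one rounds to class 3 instead of 2.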


In [None]:
@tf.function
def accuracy(y_true, y_pred):
  '''
  process
  '''

### Optimize the Hyper-parameters Similarly to Task 1

You do not need to start from scratch: use a range of hyper-parameters close to the optimal values found in Task 1.


**REMEMBER** that you must change the **output layer** from 5 neurons to 1 neuron, and change the **loss function**. Also, think about what activation function is most appropriate for this output layer.

## Compare the models from Task 1 and Task 2

Which is better, and why do you think this is the case? Are these results what you expected? Explain in no more than 10 sentences.

In [None]:
final_explanation =  '''
  Put your explanation here
'''