## Exercise

Time to experiment with activation functions and optimizers! While we're at it, let's also use this as an introduction to regression with neural networks.

1. Use the **fetch_california_housing** data (remember to split your data into a train and a test set). Use the five optimizers presented in class to train five neural networks (identical aside from the optimizer used). How well do the networks perform on the test set, as measured by MSE and MAE? Rank the optimizers.
1. Select the best optimizer and use it for this exercise. Experiment with different activation functions, including at least sigmoid, tanh, and relu. Rank the activation functions you try. 
1. Using your findings, as well as experimenting with more layers, try to minimize the test MSE.

**Note**: You may want to use https://www.tensorflow.org/api_docs/python/tf/keras/activations and https://www.tensorflow.org/api_docs/python/tf/keras/optimizers.

**See slides for more details!**

# Exercise 1

Use the **fetch_california_housing** data (remember to split your data into a train and a test set). Use the five optimizers presented in class to train five neural networks (identical aside from the optimizer used). How well do the networks perform on the test set, as measured by MSE and MAE? Rank the optimizers.

In [1]:
import tensorflow as tf
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from sklearn.preprocessing import StandardScaler

x, y = fetch_california_housing(return_X_y=True)

# Use `train_test_split` to split your data into a train and a test set.
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=1)

# Scale
scaler = StandardScaler()
z_train = scaler.fit_transform(x_train)
z_test = scaler.transform(x_test)

print(z_train.shape, z_test.shape, y_train.shape, y_test.shape)

(16512, 8) (4128, 8) (16512,) (4128,)
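The scaling step above fits the scaler on the training data only and then reuses those statistics for the test data. The following is a minimal pure-Python sketch of that idea on a single made-up column; the helper names `fit_standardizer` and `standardize` and the toy values are hypothetical and only illustrate what `fit_transform` followed by `transform` does conceptually.

```python
from statistics import mean, pstdev

def fit_standardizer(train_column):
    """Estimate mean and (population) standard deviation from training values only."""
    return mean(train_column), pstdev(train_column)

def standardize(column, mu, sigma):
    """Apply previously fitted statistics to any column (train or test)."""
    return [(v - mu) / sigma for v in column]

# Made-up toy values, not taken from the housing data.
train_toy = [1.0, 2.0, 3.0, 4.0, 5.0]
test_toy = [3.0, 6.0]

mu, sigma = fit_standardizer(train_toy)          # "fit" uses the training column only
z_train_toy = standardize(train_toy, mu, sigma)  # mean 0, std 1 by construction
z_test_toy = standardize(test_toy, mu, sigma)    # reuses the training statistics

print(z_train_toy)
print(z_test_toy)
```

The test column is deliberately not re-fitted, so its scaled values need not have mean 0 and standard deviation 1. That is expected and avoids leaking test information into preprocessing.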


Here is a small function you can use as a starting point for your network - but feel free to experiment!

In [2]:
def build_nn(activation = 'sigmoid'):
    your_regression_nn = tf.keras.models.Sequential([
        tf.keras.layers.Dense(64, activation=activation, input_shape=(8,)), # input_shape=8 since 8 features
        tf.keras.layers.Dense(1, activation='linear'), # linear is used for regression. 1 node since 1 output (per observation)
        ])

    return your_regression_nn

**Important note**: Remember to use "mse" as your loss function! It is okay to try something else, but at least do not use cross-entropy (remember, that is for classification).
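To make the two quantities used in this exercise concrete, here is a small pure-Python sketch of MSE (the loss) and MAE (the extra metric). The function names and the toy numbers are made up for illustration; in the exercise itself Keras computes both for you.

```python
def mse(y_true, y_pred):
    """Mean squared error: average of squared differences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mae(y_true, y_pred):
    """Mean absolute error: average of absolute differences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Toy values: three perfect predictions and one large miss.
y_true_toy = [1.0, 2.0, 3.0, 4.0]
y_pred_toy = [1.0, 2.0, 3.0, 8.0]

print(mse(y_true_toy, y_pred_toy))  # 4.0
print(mae(y_true_toy, y_pred_toy))  # 1.0
```

The single miss of 4 contributes 16 to the squared-error sum but only 4 to the absolute-error sum, which is why MSE reacts more strongly to large errors than MAE does.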

Go through each of the five optimizers covered in class and rank their performance on this dataset.

In [3]:
# SGD
# This code is completed for you - use it to construct the other 4 cases (i.e. for the other 4 optimizers covered in class).
nn_sgd = build_nn()
nn_sgd.compile(
    optimizer='SGD',
    loss='mse',
    metrics=['mae'], # to also track MAE. MSE is "automatically" measured since it is the loss
    )
nn_sgd.fit(z_train, y_train, epochs=5)
mse, mae = nn_sgd.evaluate(z_test, y_test)
print(f'Test mse = {round(mse, 3)}, test mae = {round(mae, 3)}.')

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5
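One possible way to organize the comparison is a loop that builds a fresh network per optimizer and stores the test metrics in a dictionary. The sketch below is self-contained and therefore uses small synthetic data instead of the housing data. The list of optimizer names is an assumption, so replace it with the five covered in class. The names `x_demo`, `y_demo`, and `results` are hypothetical.

```python
import numpy as np
import tensorflow as tf

def build_nn(activation="sigmoid"):
    # Same architecture as the starting-point network: 8 inputs, 64 hidden units, 1 output.
    return tf.keras.models.Sequential([
        tf.keras.Input(shape=(8,)),
        tf.keras.layers.Dense(64, activation=activation),
        tf.keras.layers.Dense(1, activation="linear"),
    ])

# Synthetic stand-in data (NOT the housing data), split into train and test parts.
rng = np.random.default_rng(0)
x_demo = rng.normal(size=(256, 8)).astype("float32")
y_demo = x_demo.sum(axis=1)
x_tr, x_te, y_tr, y_te = x_demo[:200], x_demo[200:], y_demo[:200], y_demo[200:]

# ASSUMPTION: these may not be the five optimizers from class; adjust as needed.
optimizer_names = ["SGD", "Adagrad", "RMSprop", "Adam", "Adadelta"]

results = {}
for name in optimizer_names:
    tf.keras.utils.set_random_seed(0)  # same initial weights for a fairer comparison
    model = build_nn()
    model.compile(optimizer=name, loss="mse", metrics=["mae"])
    model.fit(x_tr, y_tr, epochs=2, verbose=0)
    test_mse, test_mae = model.evaluate(x_te, y_te, verbose=0)
    results[name] = (test_mse, test_mae)

# Rank by test MSE, lowest first.
for name, (test_mse, test_mae) in sorted(results.items(), key=lambda kv: kv[1][0]):
    print(f"{name}: test mse = {test_mse:.3f}, test mae = {test_mae:.3f}")
```

In the exercise you would pass `z_train`, `y_train`, `z_test`, and `y_test` instead of the synthetic arrays. Resetting the seed before each build keeps the starting weights identical, so differences in the ranking are more likely due to the optimizer than to initialization.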


# Exercise 2

Select the best optimizer and use it for this exercise. Experiment with different activation functions, including at least sigmoid, tanh, and relu. Rank the activation functions you try. 
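Before comparing them empirically, it can help to see what the three required activation functions compute. This is a plain-Python sketch of the standard definitions; the function names are chosen for illustration and are not the Keras implementations.

```python
import math

def sigmoid(x):
    """Squashes any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))

def tanh(x):
    """Squashes any real number into the interval (-1, 1)."""
    return math.tanh(x)

def relu(x):
    """Keeps positive values unchanged and maps negative values to 0."""
    return max(0.0, x)

for x in (-2.0, 0.0, 2.0):
    print(f"x = {x:+.1f}: sigmoid = {sigmoid(x):.3f}, tanh = {tanh(x):.3f}, relu = {relu(x):.3f}")
```

Sigmoid and tanh flatten out for large positive or negative inputs, while relu stays linear for positive inputs. Whether that matters on this dataset is what the exercise asks you to find out.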

In [None]:
# Example of using tanh
nn_tanh = build_nn('tanh')
nn_tanh.compile(
    optimizer=??,
    loss='mse',
    metrics=['mae'],
    )
nn_tanh.fit(z_train, y_train, epochs=5)
mse, mae = nn_tanh.evaluate(z_test, y_test)
print(f'Test mse = {round(mse, 3)}, test mae = {round(mae, 3)}.')

# Exercise 3

Using your findings, as well as experimenting with more layers, try to minimize the test MSE.

In [None]:
# Try to experiment a bit, but here is an example of a model with more layers
def build_better_nn(activation):
    your_regression_nn = tf.keras.models.Sequential([
        tf.keras.layers.Dense(32, activation=activation, input_shape=(8,)), # input_shape=8 since 8 features
        tf.keras.layers.Dense(64, activation=activation),
        tf.keras.layers.Dense(128, activation=activation),
        tf.keras.layers.Dense(1, activation='linear'), # linear is used for regression. 1 node since 1 output
        ])

    return your_regression_nn
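Adding layers increases the number of trainable parameters quickly. The following sketch counts weights and biases for a stack of fully connected layers, using the layer sizes of the two networks in this notebook. The helper name `dense_param_count` is hypothetical; `model.summary()` in Keras reports the same kind of count.

```python
def dense_param_count(n_inputs, layer_sizes):
    """Count weights and biases in a stack of fully connected (Dense) layers."""
    total = 0
    for units in layer_sizes:
        total += n_inputs * units + units  # weight matrix plus one bias per unit
        n_inputs = units                   # this layer's output feeds the next layer
    return total

small = dense_param_count(8, [64, 1])             # starting-point network
larger = dense_param_count(8, [32, 64, 128, 1])   # deeper example network

print(small)   # 641
print(larger)  # 10849
```

More parameters give the network more capacity, but they can also make it easier to overfit, so it is worth watching the test MSE rather than only the training loss.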

In [None]:
nn_final = build_better_nn(??)
nn_final.compile(
    optimizer=??,
    loss='mse',
    metrics=['mae'],
    )
nn_final.fit(z_train, y_train, epochs=??)
mse, mae = nn_final.evaluate(z_test, y_test)
print(f'Final model test mse = {round(mse, 3)}, test mae = {round(mae, 3)}.')