# Deep Learning week - Day 1 - Regression

### Exercise objectives:
- Write a Neural Network suited for a regression
- Optimize the model with the right loss and metrics

<hr>

The overall objective of this exercise is to predict house prices in the Boston area (USA) from the input features, using a Neural Network.

The goals of this exercise are to:
- prepare the data for a NN (Neural Network)
- train a _regression_ NN
- check the NN loss during the training and adapt accordingly
- select the hyperparameters of the NN

# Data

We will predict the price of houses in Boston and its suburbs from input variables such as the pupil-teacher ratio (in the related town), the nitric oxides concentration, the per capita crime rate, and the weighted distances to five Boston employment centers.

You can find additional information about the dataset here: https://towardsdatascience.com/machine-learning-project-predicting-boston-house-prices-with-regression-b4e47493633d

This classic dataset is provided in the Keras library. It can be loaded as follows:

In [None]:
from tensorflow.keras.datasets import boston_housing

(X_train, y_train), (X_test, y_test) = boston_housing.load_data()

`shape` is a useful attribute of the data arrays. It gives the (rows, columns) dimensions of the data.

In [None]:
print("Size of training data: {}".format(X_train.shape))
print("Size of test data: {}".format(X_test.shape))

❓ **Question** ❓ What kind of Machine Learning problem is this? Supervised, regression, unsupervised, clustering, classification, ...?

In [None]:
### YOUR ANSWER HERE


For reasons we will see during the week, it is important to standardize the data so that each feature is centered around 0 with a variance of 1.

❓ **Question** ❓ Use the StandardScaler from scikit-learn [(see documentation)](https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html) to standardize the data.

Warning: use it wisely on the train and test sets. _Hint_: you can check what was done in the multiclass classification tutorial.
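
The usual reading of this warning is that the scaling statistics should be learned on the train set only and then reused unchanged on the test set. The toy sketch below illustrates the idea in plain Python on a single made-up feature column (the values and the helper names `fit_standardizer` and `standardize` are hypothetical); `StandardScaler` follows the same logic column by column through its `fit` and `transform` methods.

```python
from statistics import fmean, pstdev

def fit_standardizer(column):
    """Learn the mean and population standard deviation of one feature."""
    return fmean(column), pstdev(column)

def standardize(column, mean, std):
    """Apply already-learned statistics to any column (train or test)."""
    return [(value - mean) / std for value in column]

# Hypothetical values for one feature
train_col = [2.0, 4.0, 6.0, 8.0]
test_col = [5.0, 10.0]

# Statistics come from the train set only ...
mean, std = fit_standardizer(train_col)

# ... and are reused as-is on both sets
train_scaled = standardize(train_col, mean, std)
test_scaled = standardize(test_col, mean, std)

print(fmean(train_scaled), pstdev(train_scaled))
print(test_scaled)
```

The scaled train column has mean 0 and standard deviation 1 by construction; the scaled test column generally does not, and that is expected.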

In [None]:
from sklearn.preprocessing import StandardScaler

### YOUR CODE HERE

❓ **Question** ❓ Plot each variable of the train set to check that it is roughly centered around 0 with a variance close to 1

In [None]:
### YOUR PLOTS HERE

# Model


Now that we have the data, we will define a first architecture and run the model.

❓ **Question** ❓ Initialize a model which has:
- a first layer with 64 neurons (do not forget the activation and the input dimension)
- a second layer with 32 neurons
- a final layer that outputs a predicted value

Hint: for a regression, your final layer will look similar to `layers.Dense(SOME_NUMBER, activation='linear')` where `SOME_NUMBER` is the dimension of the output you want to predict.
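
To see what a linear activation means in practice, here is a minimal plain-Python sketch of a single output neuron (the inputs, weights, and bias are made up, and `dense_output` is a hypothetical helper, not a Keras function). With a linear activation the neuron returns the weighted sum unchanged, so the prediction is not squashed or clipped; a ReLU, for comparison, would clip negative values to 0.

```python
def dense_output(inputs, weights, bias, activation="linear"):
    """Output of one neuron: weighted sum of inputs plus bias, then activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    if activation == "linear":
        return z             # identity: any real value can be predicted
    if activation == "relu":
        return max(0.0, z)   # negative values are clipped to 0
    raise ValueError(f"unsupported activation: {activation}")

# Hypothetical outputs of the previous layer, with made-up weights and bias
hidden = [0.5, -1.0, 2.0]
weights = [1.0, 2.0, -0.5]
bias = 0.25

linear_value = dense_output(hidden, weights, bias, activation="linear")
relu_value = dense_output(hidden, weights, bias, activation="relu")
print(linear_value, relu_value)
```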

In [None]:
def initialize_model():

    # Model architecture
    ### YOUR CODE HERE
    
    
    # Model optimization: optimizer, loss and metric
    model.compile(optimizer='adam',
                  loss='mse',       # MSE stands for Mean Squared Error
                  metrics=['mae'])  # MAE stands for Mean Absolute Error
    
    return model


model = initialize_model()
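
As a reference for the `compile` call above, the sketch below computes both quantities by hand on made-up values (the numbers are hypothetical, not taken from the dataset). MSE averages the squared errors, so a single large error dominates it; MAE averages the absolute errors and stays in the same unit as the target.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def mean_absolute_error(y_true, y_pred):
    """Average of the absolute differences between targets and predictions."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical targets and predictions; the last prediction is off by 10
y_true = [20.0, 22.0, 30.0, 50.0]
y_pred = [21.0, 21.0, 29.0, 40.0]

mse = mean_squared_error(y_true, y_pred)   # (1 + 1 + 1 + 100) / 4 = 25.75
mae = mean_absolute_error(y_true, y_pred)  # (1 + 1 + 1 + 10) / 4 = 3.25
print(mse, mae)
```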

❓ **Question** ❓ What can you say about the loss and the metrics?

In [None]:
### YOUR ANSWER HERE

❓ **Question** ❓ Train the model on the train data. As in the previous exercise, add `validation_data=(X_test, y_test)` to track the MSE and MAE values on the test set at each epoch, then plot the history for the train and test sets.

In [None]:
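import matplotlib.pyplot as plt

def plot_loss_mae(history):
    # Plot the train/test loss (MSE), then the train/test MAE, per epoch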
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('Model loss')
    plt.ylabel('Loss')
    plt.xlabel('Epoch')
    plt.legend(['Train', 'Test'], loc='best')
    plt.show()
    
    plt.plot(history.history['mae'])
    plt.plot(history.history['val_mae'])
    plt.title('Model MAE')
    plt.ylabel('MAE')
    plt.xlabel('Epoch')
    plt.legend(['Train', 'Test'], loc='best')
    plt.show()
    
### YOUR CODE HERE
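
The object returned by `model.fit` keeps one value per epoch for each tracked quantity in its `history` dictionary, which is what `plot_loss_mae` reads. As a small sketch, assuming a history dictionary filled with made-up values (`fake_history` and `best_epoch` are hypothetical names), the epoch with the lowest validation MAE can be located like this:

```python
def best_epoch(history_dict, key="val_mae"):
    """Return (1-based epoch, value) where the tracked quantity is lowest."""
    values = history_dict[key]
    index = min(range(len(values)), key=values.__getitem__)
    return index + 1, values[index]

# Made-up numbers standing in for `history.history` after 5 epochs
fake_history = {
    "loss": [90.0, 40.0, 20.0, 12.0, 9.0],
    "val_loss": [95.0, 45.0, 25.0, 22.0, 24.0],
    "mae": [7.0, 4.5, 3.2, 2.6, 2.2],
    "val_mae": [7.2, 4.8, 3.5, 3.3, 3.4],
}

epoch, value = best_epoch(fake_history)
print(f"Lowest val_mae: {value} at epoch {epoch}")
```

With a real run, pass `history.history` instead of `fake_history`. In these made-up numbers the train MAE keeps falling while the test MAE starts rising after epoch 4, which is the typical pattern to look for in the plots.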

❓ **Question** ❓ Try architectures of your own to get the best possible MAE. (We will see tomorrow how to avoid overfitting.) In particular, you can try to:

- Change the number of layers
- Change the number of neurons in each layer
- Change the activation function within each layer
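
When comparing architectures, it can help to know how many trainable parameters each one has. A fully connected layer with `n_in` inputs and `n_out` neurons has `(n_in + 1) * n_out` parameters (the weights plus one bias per neuron). The sketch below applies this to a few candidate layouts, assuming 13 input features (check `X_train.shape` to confirm); `count_dense_params` is a hypothetical helper, not a Keras function, and the candidate layouts are only examples.

```python
def count_dense_params(input_dim, layer_sizes):
    """Total weights and biases of a stack of fully connected layers."""
    total = 0
    n_in = input_dim
    for n_out in layer_sizes:
        total += (n_in + 1) * n_out  # n_in weights + 1 bias per neuron
        n_in = n_out                 # this layer feeds the next one
    return total

INPUT_DIM = 13  # assumption: number of input features in the dataset

# Example layouts; the last entry of each list is the single output neuron
candidates = {
    "64-32-1": [64, 32, 1],
    "16-8-1": [16, 8, 1],
    "128-64-32-1": [128, 64, 32, 1],
}

param_counts = {
    name: count_dense_params(INPUT_DIM, sizes)
    for name, sizes in candidates.items()
}
for name, count in param_counts.items():
    print(f"{name}: {count} parameters")
```

For a real Keras model, the same totals appear in the "Param #" column printed by `model.summary()`.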