### Peer-graded Assignment: Build a Regression Model in Keras

#### A. Build a baseline model

In [26]:
import pandas as pd
import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error


In [2]:
concrete_data = pd.read_csv('https://cocl.us/concrete_data')
concrete_data.head()

| | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age | Strength |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1040.0 | 676.0 | 28 | 79.99 |
| 1 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1055.0 | 676.0 | 28 | 61.89 |
| 2 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 270 | 40.27 |
| 3 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 365 | 41.05 |
| 4 | 198.6 | 132.4 | 0.0 | 192.0 | 0.0 | 978.4 | 825.5 | 360 | 44.3 |


As seen in the lab, this dataset contains about 1000 samples and no missing values. Split the data into training and test sets, then separate the predictors from the target in each.

In [31]:
# Hold out 30% of the rows for testing
train, test = train_test_split(concrete_data, test_size=0.3)

In [32]:
concrete_data_col = concrete_data.columns
predictors_train = train[concrete_data_col[concrete_data_col != 'Strength']]
target_train = train['Strength']
n_cols = predictors_train.shape[1]  # number of predictors, used as the model's input size

In [33]:
predictors_test = test[concrete_data_col[concrete_data_col != 'Strength']]
target_test = test['Strength']

#### Build the regression model:
- One hidden layer of 10 nodes with a ReLU activation function.
- Use the adam optimizer and the mean squared error as the loss function.


In [36]:
def reg_model():
    # Baseline network: one hidden layer of 10 ReLU units and a single linear output unit
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(n_cols,)))
    model.add(Dense(1))
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model
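
To make the architecture concrete, the following NumPy sketch shows the computation a network of this shape performs on a batch: a dense layer with ReLU, followed by a single linear output. The weights are random placeholders, `n_cols = 8` assumes the eight predictor columns shown in the table, and the names `relu`, `forward`, `W1`, `b1`, `W2`, `b2` are illustrative, not Keras internals.

```python
import numpy as np

def relu(z):
    """Element-wise ReLU: negative values become 0."""
    return np.maximum(z, 0.0)

def forward(X, W1, b1, W2, b2):
    """One hidden ReLU layer followed by a linear output unit."""
    hidden = relu(X @ W1 + b1)   # shape (batch, 10)
    return hidden @ W2 + b2      # shape (batch, 1)

rng = np.random.default_rng(0)
n_cols = 8                       # assumption: eight predictor columns
X = rng.normal(size=(5, n_cols)) # placeholder batch of 5 samples
W1 = rng.normal(size=(n_cols, 10))
b1 = np.zeros(10)
W2 = rng.normal(size=(10, 1))
b2 = np.zeros(1)

y_hat = forward(X, W1, b1, W2, b2)
print(y_hat.shape)
```

Training with the adam optimizer adjusts `W1`, `b1`, `W2` and `b2` to reduce the mean squared error between `y_hat` and the target.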

#### Fit model

1. Randomly split the data into training and test sets by holding out 30% of the data for testing. You can use the train_test_split helper function from Scikit-learn.
2. Train the model on the training data using 50 epochs.
3. Evaluate the model on the test data and compute the mean squared error between the predicted concrete strength and the actual concrete strength. You can use the mean_squared_error function from Scikit-learn.

In [37]:
model = reg_model()

In [39]:
model.fit(predictors_train, target_train, epochs=50, verbose=0)

In [40]:
prediction = model.predict(predictors_test)

In [50]:
mean_squared_error(target_test, prediction)

76.25510915848686
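
For reference, the value above is the mean squared error over the $n$ test samples, where $y_i$ is the actual strength and $\hat{y}_i$ the predicted strength of sample $i$. It is expressed in squared units of Strength:

$$
\mathrm{MSE} = \frac{1}{n} \sum_{i=1}^{n} \left( y_i - \hat{y}_i \right)^2
$$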

4. Repeat steps 1 - 3, 50 times, i.e., create a list of 50 mean squared errors.
5. Report the mean and the standard deviation of the mean squared errors.
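
Steps 1 to 5 can be sketched as a loop that draws a fresh split and fits a fresh model in every repetition. This is a minimal sketch under stated assumptions, not the code this notebook ran: `repeated_holdout_mse` and `least_squares_fit_predict` are hypothetical helper names, the data is synthetic, and an ordinary least-squares fit stands in for the Keras model so the example runs quickly.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

def repeated_holdout_mse(X, y, fit_predict, n_repeats=50, test_size=0.3):
    """Return one test MSE per repetition, re-splitting and re-fitting each time."""
    scores = np.zeros(n_repeats)
    for i in range(n_repeats):
        # Step 1: a fresh random split (seeded with i so runs are reproducible).
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=test_size, random_state=i)
        # Steps 2-3: fit a new model on the training part, score it on the test part.
        scores[i] = mean_squared_error(y_te, fit_predict(X_tr, y_tr, X_te))
    return scores

def least_squares_fit_predict(X_tr, y_tr, X_te):
    """Stand-in model (assumption): ordinary least squares with an intercept."""
    A_tr = np.column_stack([X_tr, np.ones(len(X_tr))])
    A_te = np.column_stack([X_te, np.ones(len(X_te))])
    coef, *_ = np.linalg.lstsq(A_tr, y_tr, rcond=None)
    return A_te @ coef

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8))
y = X @ rng.normal(size=8) + rng.normal(scale=0.1, size=200)

scores = repeated_holdout_mse(X, y, least_squares_fit_predict)
print(scores.mean(), scores.std())  # steps 4-5: mean and spread of the 50 MSEs
```

To use a Keras model here, `fit_predict` would build a new network with `reg_model()`, fit it for 50 epochs, and return its predictions on `X_te`, so that no weights carry over between repetitions.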

In [60]:
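MSE = np.zeros(50)
# Each pass trains the same model for 50 more epochs on the same split
# (no re-split or re-initialization), then scores it on the test set.
for i in range(50):
    model.fit(predictors_train, target_train, epochs=50, verbose=0)
    prediction = model.predict(predictors_test)
    MSE[i] = mean_squared_error(target_test, prediction)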

In [62]:
MSE.mean()

41.71742535689672

In [63]:
MSE.std()

1.498309545575541
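
Note that `std()` on a NumPy array divides by N by default (population standard deviation), while the sample standard deviation divides by N - 1 and is requested with `ddof=1`. The sketch below illustrates the difference on a few made-up values; `mse_values` is hypothetical and not taken from this notebook's results.

```python
import numpy as np

# Hypothetical MSE values, for illustration only.
mse_values = np.array([40.0, 42.0, 41.0, 43.0, 44.0])

population_std = mse_values.std()     # ddof=0: divides by N
sample_std = mse_values.std(ddof=1)   # ddof=1: divides by N - 1
print(population_std, sample_std)
```

With 50 values the two estimates differ by a factor of sqrt(50/49), about 1%, so the choice has little effect on the comparison between parts.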

#### B. Normalize the data (5 marks)
Repeat Part A but use a normalized version of the data. Recall that one way to normalize the data is by subtracting the mean from the individual predictors and dividing by the standard deviation.

In [79]:
predictors_train_norm = (predictors_train-predictors_train.mean())/predictors_train.std()
predictors_test_norm = (predictors_test-predictors_test.mean())/predictors_test.std()
n_cols = predictors_train_norm.shape[1]
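
The cell above standardizes the test set with its own mean and standard deviation. A common alternative, sketched here as an assumption rather than what this notebook ran, reuses the training-set statistics for the test set, so both sets share one scale and no test information enters preprocessing. The helper name `standardize_with_train_stats` and the synthetic arrays are illustrative.

```python
import numpy as np

def standardize_with_train_stats(X_train, X_test):
    """Standardize both sets using the mean and std of the training set only."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0, ddof=1)  # ddof=1 matches the pandas default for .std()
    return (X_train - mean) / std, (X_test - mean) / std

# Synthetic data for illustration only.
rng = np.random.default_rng(0)
X_train = rng.normal(loc=50.0, scale=10.0, size=(70, 3))
X_test = rng.normal(loc=50.0, scale=10.0, size=(30, 3))

X_train_norm, X_test_norm = standardize_with_train_stats(X_train, X_test)
print(X_train_norm.mean(axis=0))  # approximately zero in every column
```

The same arithmetic works on pandas DataFrames, where the equivalent expression for the test set would be `(predictors_test - predictors_train.mean()) / predictors_train.std()`.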

In [89]:
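MSE_norm = np.zeros(50)
# `model` is the network already trained in Part A; it continues training
# here on the normalized predictors, 50 more epochs per pass.
for i in range(50):
    model.fit(predictors_train_norm, target_train, epochs=50, verbose=0)
    prediction_norm = model.predict(predictors_test_norm)
    MSE_norm[i] = mean_squared_error(target_test, prediction_norm)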

In [91]:
MSE_norm.mean()

39.627755031734836

In [92]:
MSE_norm.std()

0.46717421066748027

Compared to Part A, both the mean and the standard deviation of the MSE decreased in Part B (mean 39.63 vs. 41.72, standard deviation 0.47 vs. 1.50).

#### C. Increase the number of epochs
Repeat Part B but use 100 epochs this time for training.
How does the mean of the mean squared errors compare to that from Step B?

In [98]:
MSE_c = np.zeros(100)
# This cell runs 100 passes of 100 epochs each (Part B used 50 passes),
# again continuing from the same `model`.
for i in range(100):
    model.fit(predictors_train_norm, target_train, epochs=100, verbose=0)
    prediction_norm = model.predict(predictors_test_norm)
    MSE_c[i] = mean_squared_error(target_test, prediction_norm)

In [99]:
MSE_c.mean()

38.692448705267736

In [100]:
MSE_c.std()

0.7991862110330344

The mean of the MSE (38.69) is smaller than in Part B (39.63). However, the standard deviation is larger (0.80 vs. 0.47), which means the MSE values are more spread out across runs in Part C than in Part B.

#### D. Increase the number of hidden layers
Repeat part B but use a neural network with the following instead:

- Three hidden layers, each of 10 nodes and ReLU activation function.

How does the mean of the mean squared errors compare to that from Step B?

In [101]:
def reg_model_d():
    # Deeper network: three hidden layers of 10 ReLU units each and a single linear output unit
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(n_cols,)))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(1))
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model
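
One way to see how much capacity the extra layers add is to count trainable parameters: each dense layer has (inputs × units) weights plus one bias per unit. The sketch below does this arithmetic for the one-hidden-layer and three-hidden-layer networks; `dense_param_count` is a hypothetical helper, and `n_cols = 8` assumes the eight predictor columns shown in the table.

```python
def dense_param_count(n_inputs, layer_sizes):
    """Total weights and biases in a stack of fully connected layers."""
    total = 0
    fan_in = n_inputs
    for units in layer_sizes:
        total += fan_in * units + units  # weights + biases of this layer
        fan_in = units                   # this layer's output feeds the next layer
    return total

n_cols = 8  # assumption: eight predictor columns

one_hidden = dense_param_count(n_cols, [10, 1])            # 10 hidden units, 1 output
three_hidden = dense_param_count(n_cols, [10, 10, 10, 1])  # three hidden layers of 10
print(one_hidden, three_hidden)  # 101 321
```

Under that assumption the deeper network has roughly three times as many parameters to fit from the same training data.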

In [102]:
model_d = reg_model_d()

In [110]:
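MSE_norm_d = np.zeros(50)
# `model_d` keeps training across the 50 passes (50 more epochs each time)
# and is scored on the normalized test set after every pass.
for i in range(50):
    model_d.fit(predictors_train_norm, target_train, epochs=50, verbose=0)
    prediction_norm = model_d.predict(predictors_test_norm)
    MSE_norm_d[i] = mean_squared_error(target_test, prediction_norm)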

In [111]:
MSE_norm_d.mean()

40.72657283430016

In [112]:
MSE_norm_d.std()

1.6628651538200954

Both the mean (40.73) and the standard deviation (1.66) of the MSE are larger than in Part B (39.63 and 0.47).