# Keras-based regression model

<hr>

<a id='partA'></a>
## Part A - Building a baseline model

In [1]:
# importing the packages
import pandas as pd
import numpy as np
import keras
from keras.models import Sequential
from keras.layers import Dense

Using TensorFlow backend.


In [2]:
# loading the data
concrete_data = pd.read_csv('https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0101EN/labs/data/concrete_data.csv')
concrete_data.head()

|   | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age | Strength |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1040.0 | 676.0 | 28 | 79.99 |
| 1 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1055.0 | 676.0 | 28 | 61.89 |
| 2 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 270 | 40.27 |
| 3 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 365 | 41.05 |
| 4 | 198.6 | 132.4 | 0.0 | 192.0 | 0.0 | 978.4 | 825.5 | 360 | 44.3 |


**Splitting data into predictors and target**

- The target variable in this problem is the concrete sample strength. 


- The predictors will be all the other columns.

In [3]:
# splitting the data into predictors and target
concrete_data_col = concrete_data.columns

# defining predictors by including all feature columns except strength.
predictors = concrete_data[concrete_data_col[concrete_data_col != 'Strength']]

# defining the target column
target = concrete_data['Strength']

print('PREDICTORS: \n')
print(predictors.head()) 
print('\n')
print('TARGET: \n')
print(target.head())

PREDICTORS: 

   Cement  Blast Furnace Slag  Fly Ash  Water  Superplasticizer  \
0   540.0                 0.0      0.0  162.0               2.5   
1   540.0                 0.0      0.0  162.0               2.5   
2   332.5               142.5      0.0  228.0               0.0   
3   332.5               142.5      0.0  228.0               0.0   
4   198.6               132.4      0.0  192.0               0.0   

   Coarse Aggregate  Fine Aggregate  Age  
0            1040.0           676.0   28  
1            1055.0           676.0   28  
2             932.0           594.0  270  
3             932.0           594.0  365  
4             978.4           825.5  360  


TARGET: 

0    79.99
1    61.89
2    40.27
3    41.05
4    44.30
Name: Strength, dtype: float64


- I now store the number of predictor columns, which sets the input shape of the model.

In [4]:
# number of predictor columns, used as the input shape
n_cols = predictors.shape[1]

- I now build the neural network using the Keras library.


- The network as coded has **two hidden layers** with **10 nodes** each, followed by a single output node.


- Each hidden layer uses the **ReLU** activation function.


- I compile the model with the **adam** optimizer and **mean_squared_error** as the loss function.

In [5]:
# building the model
def regr_model():
    model = Sequential()
    model.add(Dense(10, activation = 'relu', input_shape = (n_cols, )))
    model.add(Dense(10, activation = 'relu'))
    model.add(Dense(1))
    
    # compiling the model
    model.compile(optimizer = 'adam', 
                  loss = 'mean_squared_error',
                  metrics = ['mse'])
    
    return model

model = regr_model()
model

<keras.engine.sequential.Sequential at 0x1315b1a10>
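
As a rough sanity check on the architecture, the number of trainable parameters can be counted by hand: a `Dense` layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. The sketch below assumes the 8 predictor columns shown earlier and the layer sizes used in `regr_model`; the helper `dense_params` is a hypothetical name introduced only for this illustration.

```python
# Layer widths: 8 inputs, then the Dense(10), Dense(10), Dense(1) stack
layer_sizes = [8, 10, 10, 1]

def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer."""
    return n_in * n_out + n_out

# Parameters of each layer, pairing every width with the next one
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total_params = sum(per_layer)

print(per_layer)      # [90, 110, 11]
print(total_params)   # 211
```

The same total should be reported by `model.summary()` if the model is built as above.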

- I now split the data.


- The simplest way is via Scikit-Learn.


- I import **train_test_split** from **sklearn.model_selection**.

In [105]:
# randomly splitting data
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(predictors, 
                                                    target, 
                                                    test_size=0.3, 
                                                    random_state = 4)

print('The validation (i.e: test) data: \n')
print(X_test.shape)
print(y_test.shape, '\n')

print('The training data: \n')
print(X_train.shape)
print(y_train.shape)

The validation (i.e: test) data: 

(309, 8)
(309,) 

The training data: 

(721, 8)
(721,)
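
The shapes above follow from the 30% test fraction. The sketch below reproduces the arithmetic only and does not call scikit-learn; it assumes 1030 rows in total (309 + 721) and assumes that the test size is rounded up to a whole number of samples.

```python
import math

n_samples = 309 + 721          # total rows, from the shapes printed above
test_size = 0.3

# Assumed rounding rule: test set rounded up, the remainder goes to training
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

print(n_test, n_train)
```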


- I now train the model with the .fit method.


- **Note**: 'shuffle = True' reshuffles the training data before each epoch. This is already the default in Keras, and setting it explicitly made no major difference at this stage.

In [118]:
model_fit = model.fit(X_train, 
                      y_train,
                      validation_data = (X_test, y_test),
                      epochs = 50,
                      verbose = 1)

Train on 721 samples, validate on 309 samples
Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50


- I now evaluate the model, first with the .predict method.


- I label the predicted values as **y_pred**.


- I note that **y_pred** has shape (309, 1) while y_test has shape (309,), so I make sure that both have the same shape before applying any evaluation metrics.
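
To illustrate why the shapes matter: in NumPy, subtracting an array of shape `(n,)` from one of shape `(n, 1)` broadcasts to an `(n, n)` matrix instead of giving element-wise differences, which would corrupt an error computed by hand. A minimal sketch with made-up values (the arrays `pred` and `truth` are hypothetical and not taken from the model):

```python
import numpy as np

pred = np.array([[1.0], [2.0], [3.0]])   # shape (3, 1), like the output of model.predict
truth = np.array([1.5, 2.5, 2.0])        # shape (3,), like the values of a pandas Series

wrong = pred - truth                      # broadcasts to shape (3, 3)
right = pred - truth.reshape(-1, 1)       # element-wise difference, shape (3, 1)

print(wrong.shape)   # (3, 3)
print(right.shape)   # (3, 1)
```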

In [119]:
# evaluating the model
y_pred = model.predict(X_test)

print(y_pred[:10]) # printing just first 10 values
print('\n')
print('Shape of array y_pred is:', y_pred.shape)

[[45.327934]
 [53.799763]
 [25.145731]
 [42.86279 ]
 [44.69525 ]
 [42.6908  ]
 [52.20326 ]
 [24.99718 ]
 [59.718204]
 [22.357647]]


Shape of array y_pred is: (309, 1)


In [120]:
# reshaping y_test into a column vector so that it matches the shape of y_pred
y_testarray = np.asarray(y_test).reshape(-1, 1)

print(y_testarray[:10]) # printing just first 10 values
print('\n')
print('Shape of array y_testarray is:', y_testarray.shape)

[[44.52]
 [50.53]
 [21.82]
 [38.8 ]
 [55.6 ]
 [39.42]
 [81.75]
 [24.5 ]
 [69.84]
 [19.99]]


Shape of array y_testarray is: (309, 1)


In [121]:
# calculating mean squared error
from sklearn.metrics import mean_squared_error
mean_squared_error(y_testarray, y_pred)

45.601026865748956
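
For reference, the mean squared error is the average of the squared differences between targets and predictions. The sketch below applies that definition in plain Python to the first ten test targets and predictions printed above only, so its result is not expected to match the value obtained for all 309 test samples; `mse_by_hand` is a hypothetical helper name.

```python
# First ten targets and predictions, copied from the outputs above
y_true = [44.52, 50.53, 21.82, 38.8, 55.6, 39.42, 81.75, 24.5, 69.84, 19.99]
y_hat = [45.327934, 53.799763, 25.145731, 42.86279, 44.69525,
         42.6908, 52.20326, 24.99718, 59.718204, 22.357647]

def mse_by_hand(targets, predictions):
    """Average of the squared differences between targets and predictions."""
    squared_errors = [(t - p) ** 2 for t, p in zip(targets, predictions)]
    return sum(squared_errors) / len(squared_errors)

print(mse_by_hand(y_true, y_hat))
```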

In [122]:
# getting the mean squared error for each of the 50 epochs
mse = model_fit.history['mse']

# listing the 50 individual mse values
list(mse)

[48.7318,
 50.457325,
 48.868332,
 45.721493,
 48.14556,
 48.37437,
 45.97351,
 46.862045,
 47.19001,
 45.82891,
 46.45313,
 46.894463,
 46.287964,
 48.685593,
 48.24435,
 46.40776,
 48.716995,
 50.353767,
 48.02197,
 45.864338,
 46.45332,
 47.833202,
 47.116405,
 47.02548,
 47.339943,
 46.85834,
 49.03749,
 47.435978,
 47.499664,
 45.900238,
 48.813908,
 49.224174,
 46.308567,
 45.37275,
 46.726833,
 49.937973,
 46.336105,
 46.33087,
 48.011654,
 46.45423,
 47.406853,
 47.123745,
 46.07556,
 46.54427,
 46.99451,
 45.72799,
 46.175213,
 46.587303,
 46.136856,
 45.567722]

In [123]:
mean = np.mean(mse)
st_dev = np.std(mse)

print('Mean of the 50 mse = ', mean)
print('Standard deviation of the 50 mse =', st_dev)

Mean of the 50 mse =  47.248817
Standard deviation of the 50 mse = 1.2687162
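
Note that `np.std` computes the population standard deviation (it divides by n) unless `ddof` is set. The sketch below uses the standard library to show the difference between the two conventions on the first five per-epoch values listed above; it is only an illustration and not part of the original analysis.

```python
import statistics

# First five per-epoch mse values from the list above
values = [48.7318, 50.457325, 48.868332, 45.721493, 48.14556]

population_sd = statistics.pstdev(values)  # divides by n, like np.std by default
sample_sd = statistics.stdev(values)       # divides by n - 1, like np.std(..., ddof=1)

print(population_sd, sample_sd)
```

With 50 values the two conventions differ by only about one percent, so the choice does not affect the comparisons made here.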


<hr>

<a id ='partB'></a>
## Part B - Normalizing the data

In [13]:
# normalizing the data
# We subtract the mean and divide by standard deviation
pred_norm = (predictors - predictors.mean()) / predictors.std()
pred_norm.head()

|   | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age |
|---|---|---|---|---|---|---|---|---|
| 0 | 2.476712 | -0.856472 | -0.846733 | -0.916319 | -0.620147 | 0.862735 | -1.217079 | -0.279597 |
| 1 | 2.476712 | -0.856472 | -0.846733 | -0.916319 | -0.620147 | 1.055651 | -1.217079 | -0.279597 |
| 2 | 0.491187 | 0.79514 | -0.846733 | 2.174405 | -1.038638 | -0.526262 | -2.239829 | 3.55134 |
| 3 | 0.491187 | 0.79514 | -0.846733 | 2.174405 | -1.038638 | -0.526262 | -2.239829 | 5.055221 |
| 4 | -0.790075 | 0.678079 | -0.846733 | 0.488555 | -1.038638 | 0.070492 | 0.647569 | 4.976069 |
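
The normalization above is a column-wise z-score. As a minimal sketch of the same computation for a single column in plain Python, the example below uses the five `Cement` values shown in the first table of Part A. Since it sees only five rows, its mean and standard deviation differ from the full-data statistics, so the results will not match the values above. It uses the sample standard deviation (dividing by n - 1), which is also the pandas default for `.std()`.

```python
import statistics

cement = [540.0, 540.0, 332.5, 332.5, 198.6]  # first five rows only

col_mean = statistics.mean(cement)
col_sd = statistics.stdev(cement)  # sample standard deviation (n - 1)

# Subtract the mean and divide by the standard deviation
cement_norm = [(x - col_mean) / col_sd for x in cement]

print(cement_norm)
```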


- I repeat Part A with the normalized data.


- I specify the input.


- I then build the model, split the data into training and validation sets, and fit the model.


- I then compare the mean of the new per-epoch mean squared errors with that of Part A.

In [14]:
# saving the number of normalized predictor columns in n_cols_norm, used as the input shape
n_cols_norm = pred_norm.shape[1]

In [124]:
# building the model
def regr_model_norm():
    model_norm = Sequential()
    model_norm.add(Dense(10, activation = 'relu', input_shape = (n_cols_norm, )))
    model_norm.add(Dense(10, activation = 'relu'))
    model_norm.add(Dense(1))
    
    # compiling the model
    model_norm.compile(optimizer = 'adam', 
                       loss = 'mean_squared_error',
                       metrics = ['mse'])
    
    return model_norm

model_norm = regr_model_norm()
model_norm

<keras.engine.sequential.Sequential at 0x10e48ea90>

In [125]:
# split the validation and training data randomly
from sklearn.model_selection import train_test_split
X_train1, X_test1, y_train1, y_test1 = train_test_split(pred_norm, 
                                                        target, 
                                                        test_size=0.3, 
                                                        random_state = 4)

print('The validation (i.e: test) data: \n')
print(X_test1.shape)
print(y_test1.shape, '\n')

print('The training data: \n')
print(X_train1.shape)
print(y_train1.shape)

The validation (i.e: test) data: 

(309, 8)
(309,) 

The training data: 

(721, 8)
(721,)


In [147]:
# fitting the model
model_norm_fit = model_norm.fit(X_train1, 
                                y_train1,
                                validation_data = (X_test1, y_test1),
                                epochs = 50,
                                verbose = 1)

Train on 721 samples, validate on 309 samples
Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50


In [148]:
# evaluating the model with normalised data
y_pred1 = model_norm.predict(X_test1)

print(y_pred1[:10]) # printing just first 10 values
print('\n')
print('Shape of array y_pred is:', y_pred1.shape)

[[47.03498 ]
 [51.693844]
 [20.899956]
 [37.824623]
 [52.644394]
 [43.076744]
 [56.85761 ]
 [24.296974]
 [59.39751 ]
 [28.411846]]


Shape of array y_pred is: (309, 1)


In [149]:
# reshaping array y_test1
y_testarray1 = np.transpose(([y_test1]))

print(y_testarray1[:10]) # printing just first 10 values
print('\n')
print('Shape of array y_testarray1 is:', y_testarray1.shape)

[[44.52]
 [50.53]
 [21.82]
 [38.8 ]
 [55.6 ]
 [39.42]
 [81.75]
 [24.5 ]
 [69.84]
 [19.99]]


Shape of array y_testarray1 is: (309, 1)


In [150]:
# calculating new mean squared error
from sklearn.metrics import mean_squared_error
mean_squared_error(y_testarray1, y_pred1)

38.08460959710628

In [151]:
# getting the mean squared error for each of the 50 epochs in the new model
mse_norm_data = model_norm_fit.history['mse']

# listing the 50 individual mse values
list(mse_norm_data)

[27.049284,
 27.146837,
 26.982853,
 27.20802,
 27.133114,
 27.01495,
 27.37964,
 27.151093,
 27.19383,
 27.158417,
 27.243437,
 27.386585,
 27.483265,
 27.311352,
 27.087736,
 27.237076,
 27.222233,
 27.07987,
 26.984148,
 27.138916,
 26.97658,
 27.101086,
 27.069777,
 27.10966,
 26.948702,
 26.824123,
 26.986301,
 26.880949,
 26.891207,
 27.40449,
 27.1748,
 26.878946,
 26.977251,
 26.912052,
 26.996416,
 26.686817,
 26.8456,
 26.897549,
 26.81136,
 26.824034,
 27.112597,
 26.791473,
 26.669579,
 26.755426,
 26.801664,
 26.72053,
 26.814726,
 26.91487,
 26.985079,
 26.75542]

In [152]:
mean_norm = np.mean(mse_norm_data)
st_dev_norm = np.std(mse_norm_data)

print('Mean of the 50 mse with normalized data = ', mean_norm)
print('Standard deviation of the 50 mse with normalized data =', st_dev_norm)

Mean of the 50 mse with normalized data =  27.022232
Standard deviation of the 50 mse with normalized data = 0.19578068


- The mean and standard deviation of the per-epoch mse are both smaller than in Part A (27.02 against 47.25 for the mean, 0.20 against 1.27 for the standard deviation), indicating that the algorithm already performs much better with the normalized data.

<hr>

<a id = 'partC'></a>
## Part C - Increasing the number of epochs

- I repeat Part B with an increased number of epochs (100 instead of 50).


- **Note**: I do not redefine the training and test variables, since I am using the same normalized training and test data. However, as the predictions are expected to differ, it makes sense to relabel the prediction data as, say, **y_pred2**.

In [75]:
# the model building and splitting are unchanged and are not repeated here.
# model_norm is not rebuilt, so fit continues from its current weights.

# fitting the model with 100 epochs
model_norm_fit = model_norm.fit(X_train1, 
                                y_train1,
                                validation_data = (X_test1, y_test1),
                                epochs = 100,
                                verbose = 1)

Train on 721 samples, validate on 309 samples
Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
Epoch 36/100
Epoch 37/100
Epoch 38/100
Epoch 39/100
Epoch 40/100
Epoch 41/100
Epoch 42/100
Epoch 43/100
Epoch 44/100
Epoch 45/100
Epoch 46/100
Epoch 47/100
Epoch 48/100
Epoch 49/100
Epoch 50/100
Epoch 51/100
Epoch 52/100
Epoch 53/100
Epoch 54/100
Epoch 55/100
Epoch 56/100
Epoch 57/100
Epoch 58/100
Epoch 59/100
Epoch 60/100
Epoch 61/100
Epoch 62/100
Epoch 63/100
Epoch 64/100
Epoch 65/100
Epoch 66/100
Epoch 67/100
Epoch 68/100
Epoch 69/100
Epoch 70/100
Epoch 71/100
Epoch 72/100
Epoch 73/100
Epoch 74/100
E

In [79]:
# evaluation
y_pred2 = model_norm.predict(X_test1)

print(y_pred2[:10]) # printing just first 10 values
print('\n')
print('Shape of array y_pred2 is:', y_pred2.shape)

# this part is repeated for the sake of simplicity.
# reshaping array y_test1.
y_testarray1 = np.transpose(([y_test1]))

print(y_testarray1[:10]) # printing just first 10 values
print('\n')
print('Shape of array y_testarray1 is:', y_testarray1.shape)

# calculating new mean squared error
mse = mean_squared_error(y_testarray1, y_pred2)
mse

[[43.522987]
 [46.294334]
 [17.977379]
 [37.939888]
 [46.4308  ]
 [42.125267]
 [59.85051 ]
 [23.745401]
 [63.746346]
 [20.421413]]


Shape of array y_pred2 is: (309, 1)
[[44.52]
 [50.53]
 [21.82]
 [38.8 ]
 [55.6 ]
 [39.42]
 [81.75]
 [24.5 ]
 [69.84]
 [19.99]]


Shape of array y_testarray1 is: (309, 1)


40.576328711682976

In [77]:
# getting the mean squared error for each of the 100 epochs in the new model
mse_norm_data1 = model_norm_fit.history['mse']

# listing the 100 individual mse values
list(mse_norm_data1)

[26.060368,
 26.295803,
 26.21351,
 26.28386,
 26.168173,
 26.137217,
 26.30051,
 26.253283,
 26.195427,
 26.0047,
 26.165976,
 26.30013,
 26.191015,
 26.150616,
 26.21371,
 26.03956,
 26.299803,
 26.239338,
 26.035156,
 25.968206,
 25.983742,
 26.061834,
 26.119642,
 25.963696,
 26.083628,
 26.122208,
 25.974592,
 25.835775,
 25.959957,
 26.07628,
 25.793644,
 25.93254,
 25.972265,
 25.809645,
 25.941063,
 25.887323,
 26.026222,
 25.878353,
 25.9667,
 25.871153,
 25.886118,
 25.807646,
 25.793905,
 25.842705,
 25.818548,
 25.748423,
 25.89999,
 25.830122,
 25.786093,
 25.890858,
 25.68104,
 25.668407,
 25.706446,
 25.676046,
 25.903254,
 25.975744,
 25.87215,
 25.760069,
 25.730541,
 25.790602,
 25.916996,
 25.65252,
 25.68079,
 25.602427,
 25.62966,
 25.601004,
 25.571983,
 25.6308,
 25.644222,
 25.593365,
 25.69668,
 25.70552,
 25.61599,
 25.627657,
 25.734453,
 25.693684,
 25.753502,
 25.731346,
 26.128296,
 25.540285,
 25.645412,
 25.714153,
 25.44098,
 25.427593,
 25.474564,
 25.

In [78]:
mean_norm1 = np.mean(mse_norm_data1)
st_dev_norm1 = np.std(mse_norm_data1)

print('Mean of the 100 mse with normalized data = ', mean_norm1)
print('Standard deviation of the 100 mse with normalized data =', st_dev_norm1)

Mean of the 100 mse with normalized data =  25.824774
Standard deviation of the 100 mse with normalized data = 0.25524512


- Increasing the number of epochs to 100 only slightly reduces the mean of the per-epoch mse (25.82 against 27.02), while the standard deviation is slightly larger (0.26 against 0.20). The small gain is as expected, since the algorithm performance already indicated a plateau at 50 epochs. Increasing the number of hidden layers is expected to have a larger effect.
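
One rough way to make the plateau visible is to compare the average of the first ten and the last ten per-epoch mse values of the 50-epoch run from Part B (values copied from the list printed there). The ten-epoch window is an arbitrary choice made for this illustration.

```python
# Per-epoch mse from the 50-epoch run in Part B: first ten and last ten values
first_ten = [27.049284, 27.146837, 26.982853, 27.20802, 27.133114,
             27.01495, 27.37964, 27.151093, 27.19383, 27.158417]
last_ten = [27.112597, 26.791473, 26.669579, 26.755426, 26.801664,
            26.72053, 26.814726, 26.91487, 26.985079, 26.75542]

mean_first = sum(first_ten) / len(first_ten)
mean_last = sum(last_ten) / len(last_ten)

# Relative improvement between the start and the end of the run
relative_change = (mean_first - mean_last) / mean_first

print(round(mean_first, 3), round(mean_last, 3), round(relative_change, 4))
```

A relative change of about one percent over 50 epochs is consistent with the plateau described above.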

<hr>

<a id = 'partD'></a>
## Part D - Increasing the number of hidden layers

In [27]:
# building a deeper model (four Dense(10) hidden layers as coded below)
def regr_model_norm3():
    model_norm3 = Sequential()
    model_norm3.add(Dense(10, activation = 'relu', input_shape = (n_cols_norm, )))
    model_norm3.add(Dense(10, activation = 'relu'))
    model_norm3.add(Dense(10, activation = 'relu'))
    model_norm3.add(Dense(10, activation = 'relu'))
    model_norm3.add(Dense(1))
    
    # compiling the model
    model_norm3.compile(optimizer = 'adam', 
                        loss = 'mean_squared_error',
                        metrics = ['mse'])
    
    return model_norm3

model_norm3 = regr_model_norm3()
model_norm3

<keras.engine.sequential.Sequential at 0x1315ab950>

In [61]:
# data split as in Part B 
# fitting the deeper model, training on normalized data again
model_norm3_fit = model_norm3.fit(X_train1, 
                                  y_train1,
                                  validation_data = (X_test1, y_test1),
                                  epochs = 50,
                                  verbose = 1)

Train on 721 samples, validate on 309 samples
Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50


In [81]:
# evaluating model_norm3 with normalised data
y_pred3 = model_norm3.predict(X_test1)

print(y_pred3[:10]) # printing just first 10 values
print('\n')
print('Shape of array y_pred3 is:', y_pred3.shape)

# reshaping array y_test1
y_testarray1 = np.transpose(([y_test1]))

print(y_testarray1[:10]) # printing just first 10 values
print('\n')
print('Shape of array y_testarray1 is:', y_testarray1.shape)

[[44.235912]
 [51.545326]
 [18.96126 ]
 [41.100327]
 [48.8388  ]
 [40.3017  ]
 [57.238564]
 [24.380571]
 [71.77685 ]
 [16.772387]]


Shape of array y_pred3 is: (309, 1)
[[44.52]
 [50.53]
 [21.82]
 [38.8 ]
 [55.6 ]
 [39.42]
 [81.75]
 [24.5 ]
 [69.84]
 [19.99]]


Shape of array y_testarray1 is: (309, 1)


In [82]:
# calculating the new mean squared error for model_norm3
mean_squared_error(y_testarray1, y_pred3)

29.938282695046624

In [64]:
# getting the mean squared error for each of the 50 epochs in model_norm3
mse_norm_data3 = model_norm3_fit.history['mse']

# listing the 50 individual mse values
list(mse_norm_data3)

[15.745217,
 16.06641,
 16.072306,
 15.996646,
 16.25396,
 16.414373,
 15.626202,
 16.558289,
 16.334295,
 15.763969,
 16.319944,
 16.021631,
 15.64377,
 15.587807,
 16.21166,
 15.761116,
 15.706262,
 15.686052,
 15.946257,
 15.48783,
 15.50806,
 15.437339,
 15.869812,
 16.081686,
 16.07079,
 15.694274,
 15.728847,
 15.871004,
 15.675888,
 15.491398,
 15.757222,
 16.193039,
 16.052158,
 15.622786,
 15.646333,
 15.610858,
 15.538747,
 15.604458,
 15.753019,
 15.600547,
 15.864119,
 15.696818,
 15.911787,
 16.063385,
 15.657878,
 15.873853,
 15.4486265,
 15.552235,
 16.018524,
 15.611313]

In [66]:
mean_norm3 = np.mean(mse_norm_data3)
st_dev_norm3 = np.std(mse_norm_data3)

print('Mean of the 50 mse of model_norm3 = ', mean_norm3)
print('Standard deviation of the 50 mse of model_norm3 =', st_dev_norm3)

Mean of the 50 mse of model_norm3 =  15.834216
Standard deviation of the 50 mse of model_norm3 = 0.27055964


- The mean of the per-epoch mse is greatly reduced in this deeper model (15.83 against 27.02 in Part B), while the standard deviation stays of the same order (0.27 against 0.20). Beyond the scope of this project, a number of things could be done to reduce the mse further, such as training for a larger number of epochs or trying a different optimizer.
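
To put the four experiments side by side, the sketch below collects the figures reported in Parts A to D (the mean of the per-epoch values in `history['mse']` and the mse on the test split) and prints them as one table. All numbers are copied from the outputs above; the dictionary layout and the labels are just one convenient choice.

```python
# (mean of history['mse'], mse on the test split), copied from the outputs above
results = {
    "A: raw data, 50 epochs": (47.248817, 45.601026865748956),
    "B: normalized, 50 epochs": (27.022232, 38.08460959710628),
    "C: normalized, 100 epochs": (25.824774, 40.576328711682976),
    "D: deeper model, 50 epochs": (15.834216, 29.938282695046624),
}

for name, (mean_epoch_mse, test_mse) in results.items():
    print(f"{name:<28} {mean_epoch_mse:8.2f} {test_mse:8.2f}")

# Experiment with the lowest mse on the test split
best = min(results, key=lambda name: results[name][1])
print("Lowest test mse:", best)
```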