![](https://pbs.twimg.com/profile_images/1240321823213907973/TFsX7hfq_400x400.jpg)
<h1 align=center><font size = 5><span style="color:gray">FINAL PROJECT DEEP LEARNING & NEURAL NETWORKS WITH KERAS</span></font></h1>

### <span style="color:gray">Project: *Build a regression model using the Deep learning <span style="color:lightblue">Keras</span> library*</span>

## Table of Contents

<div class="alert alert-block alert-info" style="margin-top: 20px">

<font size = 3>

1. <a href="#itemA">A. Build a baseline model</a>      
2. <a href="#itemB">B. Repeat Part A but use a normalized version of the data</a>     
3. <a href="#itemC">C. Increase the number of epochs</a>
4. <a href="#itemD">D. Increase the number of hidden layers</a>
5. <a href="#itemE">E. Additional</a>
6. <a href="#itemF">F. Report</a>

</font>
</div>

<a id='itemA'></a>
### A. Build a baseline model 

In [874]:
# First, import the 'pandas' and 'numpy' libraries
import pandas as pd # 'pd' is the conventional alias for pandas
import numpy as np  # 'np' is the conventional alias for numpy

In [875]:
# Import the Data Set
concrete_data=pd.read_csv('https://cocl.us/concrete_data')
# Display First five (5) rows of data set
concrete_data.head()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age,Strength
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28,79.99
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28,61.89
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270,40.27
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365,41.05
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360,44.3


In [876]:
# Check how many data points we have
concrete_data.shape

(1030, 9)

In [877]:
# Summary statistics; a 'count' of 1030 in every column means no values are missing
concrete_data.describe()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age,Strength
count,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0
mean,281.167864,73.895825,54.18835,181.567282,6.20466,972.918932,773.580485,45.662136,35.817961
std,104.506364,86.279342,63.997004,21.354219,5.973841,77.753954,80.17598,63.169912,16.705742
min,102.0,0.0,0.0,121.8,0.0,801.0,594.0,1.0,2.33
25%,192.375,0.0,0.0,164.9,0.0,932.0,730.95,7.0,23.71
50%,272.9,22.0,0.0,185.0,6.4,968.0,779.5,28.0,34.445
75%,350.0,142.95,118.3,192.0,10.2,1029.4,824.0,56.0,46.135
max,540.0,359.4,200.1,247.0,32.2,1145.0,992.6,365.0,82.6


In [878]:
# Count the null values in each column of the data frame
concrete_data.isnull().sum()

Cement                0
Blast Furnace Slag    0
Fly Ash               0
Water                 0
Superplasticizer      0
Coarse Aggregate      0
Fine Aggregate        0
Age                   0
Strength              0
dtype: int64
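
The two checks above complement each other: `describe()` reports how many non-null entries each numeric column has, while `isnull().sum()` counts the missing entries directly. A minimal sketch on a hypothetical toy frame (the values below are made up to include one gap and are not the concrete data set):

```python
import numpy as np
import pandas as pd

# Hypothetical toy frame with one missing value (not the concrete data set)
toy = pd.DataFrame({
    "Cement": [540.0, 332.5, np.nan, 198.6],
    "Water": [162.0, 228.0, 192.0, 192.0],
})

# describe() reports the non-null count per column
non_null_counts = toy.describe().loc["count"]

# isnull().sum() counts the missing entries per column
missing_counts = toy.isnull().sum()

print(non_null_counts)
print(missing_counts)
```

In the concrete data set both checks agree: every column has a count of 1030 and zero nulls.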

In [879]:
# Split data into predictors and target ('Strength')
concrete_data_columns=concrete_data.columns

# Select all columns except 'Strength' as our predictors
predictors=concrete_data[concrete_data_columns[concrete_data_columns!='Strength']]

# Use the 'Strength' column as the target
target=concrete_data['Strength']

In [880]:
# Check predictors data frame
print(predictors.shape)
predictors.head()

(1030, 8)


Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360


In [881]:
# Check target data frame
print(target.shape)
target.head()

(1030,)


0    79.99
1    61.89
2    40.27
3    41.05
4    44.30
Name: Strength, dtype: float64
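
The boolean-mask selection used for the predictors can also be written with `DataFrame.drop`. The sketch below checks that both forms give the same result on two rows and a subset of columns copied from the `head()` output above; the names `sample`, `predictors_mask` and `predictors_drop` are illustrative only.

```python
import pandas as pd

# Two rows (and a subset of columns) copied from the head() output above
sample = pd.DataFrame({
    "Cement": [540.0, 332.5],
    "Water": [162.0, 228.0],
    "Age": [28, 270],
    "Strength": [79.99, 40.27],
})

# Boolean-mask selection, as used in the notebook
cols = sample.columns
predictors_mask = sample[cols[cols != "Strength"]]

# Equivalent selection with drop()
predictors_drop = sample.drop(columns="Strength")

print(predictors_mask.equals(predictors_drop))
```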

In [882]:
# Randomly split the data into training and test sets, holding out 30% of the data for testing
from sklearn.model_selection import train_test_split

X_train, X_test, y_train, y_test=train_test_split(predictors, target, test_size=0.3, random_state=4)

# Show the shapes of the newly created training and test sets
print('Full Data set:', predictors.shape, target.shape)
print('Training set:', X_train.shape, y_train.shape)
print('Test set:', X_test.shape, y_test.shape)

Full Data set: (1030, 8) (1030,)
Training set: (721, 8) (721,)
Test set: (309, 8) (309,)
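
To illustrate what a shuffled hold-out split does, here is a minimal NumPy sketch. It is not scikit-learn's implementation: the helper `holdout_split_indices` is hypothetical, and its shuffling differs from `train_test_split(random_state=4)`, so the selected rows will not match. Only the split sizes are comparable.

```python
import numpy as np

def holdout_split_indices(n_samples, test_size=0.3, seed=4):
    """Return (train_idx, test_idx) for a shuffled hold-out split.

    Illustrative only: scikit-learn shuffles differently, so these
    indices will not match train_test_split(random_state=4).
    """
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)
    n_test = int(round(test_size * n_samples))
    return shuffled[n_test:], shuffled[:n_test]

train_idx, test_idx = holdout_split_indices(1030)
print(len(train_idx), len(test_idx))
```

With 1030 rows and a 30% hold-out this yields 721 training and 309 test indices, the same sizes as reported above.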


In [883]:
# Save the number of predictors to 'n_cols'. We'll need this number when building the network.
n_cols=predictors.shape[1]
n_cols

8

In [884]:
# The data looks clean, so we are ready to build our model.
# Let's import the Keras library
import keras

# Import the keras packages to build our regression model
from keras.models import Sequential # 'Sequential' model constructor
from keras.layers import Dense      # We use 'Dense' type layers in our network

In [885]:
# Define regression model function
def regression_model():
    # Create model
    model=Sequential()
    
    # Add one hidden layer with 10 nodes/neurons and the ReLU activation function
    model.add(Dense(10, activation='relu', input_shape=(n_cols,)))
    
    # Add Output layer with 1 node/neuron
    model.add(Dense(1))
    
    # Compile the model with the Adam optimizer and mean squared error as the loss function.
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model
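
To make the architecture concrete, the following NumPy sketch reproduces the forward pass of a `Dense(10, relu)` layer followed by a `Dense(1)` output. The weights are random stand-ins (an assumption for illustration, not Keras's initializer or the trained values), and `forward` is a hypothetical helper.

```python
import numpy as np

def forward(x, w1, b1, w2, b2):
    """Forward pass of Dense(10, relu) followed by Dense(1)."""
    hidden = np.maximum(0.0, x @ w1 + b1)  # ReLU activation
    return hidden @ w2 + b2                # linear output for regression

rng = np.random.default_rng(0)
n_cols = 8                                  # number of predictors, as above
w1, b1 = rng.normal(size=(n_cols, 10)), np.zeros(10)
w2, b2 = rng.normal(size=(10, 1)), np.zeros(1)

batch = rng.normal(size=(5, n_cols))        # five hypothetical samples
output = forward(batch, w1, b1, w2, b2)
print(output.shape)

# Trainable parameters: weights plus biases of each layer
n_params = w1.size + b1.size + w2.size + b2.size
print(n_params)
```

The output layer has no activation, so the network can predict any real-valued strength.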

In [886]:
# Build our model
model=regression_model()

# Train/fit the model on the training set using 50 epochs, and validate the model on test set.
error=model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=50, verbose=2)

Train on 721 samples, validate on 309 samples
Epoch 1/50
 - 9s - loss: 106006.3928 - val_loss: 63513.9886
Epoch 2/50
 - 0s - loss: 39472.5595 - val_loss: 20451.2612
Epoch 3/50
 - 0s - loss: 11557.9644 - val_loss: 5488.9410
Epoch 4/50
 - 0s - loss: 3003.6886 - val_loss: 1895.4335
Epoch 5/50
 - 0s - loss: 1553.9954 - val_loss: 1588.5794
Epoch 6/50
 - 0s - loss: 1428.5663 - val_loss: 1510.1569
Epoch 7/50
 - 0s - loss: 1350.9323 - val_loss: 1437.3049
Epoch 8/50
 - 0s - loss: 1282.0127 - val_loss: 1362.2999
Epoch 9/50
 - 0s - loss: 1210.5058 - val_loss: 1286.1179
Epoch 10/50
 - 0s - loss: 1134.0123 - val_loss: 1209.5958
Epoch 11/50
 - 0s - loss: 1068.6524 - val_loss: 1135.5245
Epoch 12/50
 - 0s - loss: 1000.5384 - val_loss: 1066.7246
Epoch 13/50
 - 0s - loss: 933.2642 - val_loss: 997.2015
Epoch 14/50
 - 0s - loss: 870.1212 - val_loss: 933.1631
Epoch 15/50
 - 0s - loss: 808.0316 - val_loss: 872.4827
Epoch 16/50
 - 0s - loss: 753.6012 - val_loss: 810.2669
Epoch 17/50
 - 0s - loss: 697.6107 - 

In [887]:
# Use the .predict() method to get an array of predicted 'Strength' values for the test set
y_hat=model.predict(X_test)

# Create a data frame comparing true Strength and predicted Strength
df=pd.DataFrame({'Strength': y_test.values, 'Predicted Strength': y_hat.ravel()})

#Select first row as sample to report
y_testA=df.iloc[0,0]
y_hatA=df.iloc[0,1]


df.head()

Unnamed: 0,Strength,Predicted Strength
0,44.52,40.439449
1,50.53,46.148174
2,21.82,42.090893
3,38.8,40.280331
4,55.6,61.221523


In [888]:
# Evaluate the model
scores=model.evaluate(X_test, y_test, verbose=0)

# Use Scikit-learn to corroborate that our MSE value between true Strength and Predicted Strength is correct
from sklearn.metrics import mean_squared_error

MSE=mean_squared_error(y_test, y_hat)
MSEA=MSE     # To report

print('Final Mean Squared Error on the test set after epoch 50/50: {},'.format(np.around(scores, decimals=4)))
print('and the same value: {} computed with Scikit-learn.'.format(np.around(MSE, decimals=4)))

Final Mean Squared Error on the test set after epoch 50/50: 137.6999,
and the same value: 137.6999 computed with Scikit-learn.
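
Both Keras and Scikit-learn compute the same quantity here: the mean of the squared differences between true and predicted values. A minimal sketch on the first five test rows from the comparison table above; because it covers only five rows, its result is not the full test-set MSE. The helper `mse` is illustrative.

```python
import numpy as np

# First five test rows from the comparison table above
y_true = np.array([44.52, 50.53, 21.82, 38.8, 55.6])
y_pred = np.array([40.439449, 46.148174, 42.090893, 40.280331, 61.221523])

def mse(y_true, y_pred):
    """Mean of the squared differences between targets and predictions."""
    return float(np.mean((y_true - y_pred) ** 2))

print(mse(y_true, y_pred))
```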


In [889]:
# Extract the per-epoch MSE values from our History object as a dictionary using the .history attribute
dictionary=error.history

# Select each array of values in the dictionary
loss_train=dictionary['loss']
val_loss_test=dictionary['val_loss']

# Create a data frame with the 50 per-epoch MSE values for the training and test sets
error_values=pd.DataFrame({'loss (train set)': loss_train, 'validated loss (test set)': val_loss_test})
error_values.head(50)

Unnamed: 0,loss (train set),validated loss (test set)
0,106006.39277,63513.988572
1,39472.559485,20451.261163
2,11557.964432,5488.941025
3,3003.688636,1895.433484
4,1553.995355,1588.579362
5,1428.56634,1510.156883
6,1350.932338,1437.30495
7,1282.012746,1362.29991
8,1210.505848,1286.117879
9,1134.012302,1209.595817


In [890]:
# Report the mean and the standard deviation of the per-epoch mean squared errors
# Get statistics with the .describe() method and keep only the 'mean' and 'std' rows of 'error_values'
stats=error_values.describe()
meanA=stats.loc['mean', 'validated loss (test set)'] # To report
stats[1:3]

Unnamed: 0,loss (train set),validated loss (test set)
mean,3654.410994,2322.370858
std,15836.846043,9300.250886


<a id='itemB'></a>
### B. Repeat Part A but use a normalized version of the data.

In [891]:
# Normalize the inputs/predictors in training set.
X_train_norm=(X_train - X_train.mean()) / X_train.std()
X_train_norm.head()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age
401,1.792477,-0.842033,-0.853275,-0.934934,0.545015,0.885941,-1.326981,-0.302526
288,-0.93285,-0.842033,1.73731,-0.578668,0.230955,1.033146,0.066649,0.847751
406,-1.085082,-0.842033,1.374318,-0.850555,-1.025288,0.398645,1.548133,-0.701928
945,-1.318072,0.712067,0.749165,0.860457,0.197896,-0.738382,-0.227,-0.302526
616,-0.045448,-0.842033,-0.853275,0.424501,-1.025288,-0.0785,1.00777,5.001532


In [892]:
# Normalize the inputs/predictors in test set (note: this uses the test set's own mean and std).
X_test_norm=(X_test - X_test.mean()) / X_test.std()
X_test_norm.head()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age
522,0.046854,-0.720173,1.409228,-0.078799,-0.122201,-1.699502,0.333673,0.206369
701,0.088213,1.281494,-0.830446,0.528287,-1.070083,-0.504189,-0.806759,0.733507
563,-0.711057,2.684923,-0.830446,0.234084,-1.070083,0.093467,-1.197411,-0.553329
678,0.088213,1.281494,-0.830446,0.528287,-1.070083,-0.504189,-0.806759,-0.227744
98,2.021765,0.453686,-0.830446,0.019269,0.463762,-1.565362,0.066384,-0.553329
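
A common alternative, which is not what this notebook does, is to reuse the training set's mean and standard deviation when normalizing the test set, so that both sets pass through exactly the same transformation. The sketch below shows the idea on hypothetical random stand-ins for `X_train` and `X_test`; the results reported in this notebook were produced with per-set statistics and would change if this variant were used.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)
# Hypothetical stand-ins for X_train and X_test
train = pd.DataFrame(rng.normal(300, 100, size=(70, 2)), columns=["Cement", "Water"])
test = pd.DataFrame(rng.normal(300, 100, size=(30, 2)), columns=["Cement", "Water"])

# Statistics are estimated on the training set only
mu, sigma = train.mean(), train.std()

train_norm = (train - mu) / sigma
test_norm = (test - mu) / sigma   # same shift and scale as the training set

print(train_norm.mean().round(6).tolist())
print(test_norm.mean().round(6).tolist())
```

With this variant the test set's normalized mean is generally close to, but not exactly, zero, because its statistics differ slightly from those of the training set.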


In [893]:
# Build our model
model=regression_model()

# Train/fit the model on the training set using 50 epochs, and validate the model on test set.
error=model.fit(X_train_norm, y_train, validation_data=(X_test_norm, y_test), epochs=50, verbose=2)

Train on 721 samples, validate on 309 samples
Epoch 1/50
 - 8s - loss: 1568.5704 - val_loss: 1504.4738
Epoch 2/50
 - 0s - loss: 1554.5798 - val_loss: 1491.1339
Epoch 3/50
 - 0s - loss: 1540.5810 - val_loss: 1477.8037
Epoch 4/50
 - 0s - loss: 1526.5726 - val_loss: 1464.1372
Epoch 5/50
 - 0s - loss: 1511.9989 - val_loss: 1450.2452
Epoch 6/50
 - 0s - loss: 1497.0109 - val_loss: 1435.6246
Epoch 7/50
 - 0s - loss: 1481.2603 - val_loss: 1420.5208
Epoch 8/50
 - 0s - loss: 1464.8205 - val_loss: 1404.2346
Epoch 9/50
 - 0s - loss: 1447.2504 - val_loss: 1387.1105
Epoch 10/50
 - 0s - loss: 1428.6036 - val_loss: 1368.8448
Epoch 11/50
 - 0s - loss: 1408.8812 - val_loss: 1349.4705
Epoch 12/50
 - 0s - loss: 1388.0260 - val_loss: 1329.0920
Epoch 13/50
 - 0s - loss: 1366.0374 - val_loss: 1307.5961
Epoch 14/50
 - 0s - loss: 1342.9561 - val_loss: 1285.2853
Epoch 15/50
 - 1s - loss: 1318.8044 - val_loss: 1261.9018
Epoch 16/50
 - 1s - loss: 1293.3602 - val_loss: 1237.7852
Epoch 17/50
 - 0s - loss: 1267.1831

In [894]:
# Use the .predict() method to get an array of predicted 'Strength' values for the test set
y_hat=model.predict(X_test_norm)

# Create a data frame comparing true Strength and predicted Strength
df=pd.DataFrame({'Strength': y_test.values, 'Predicted Strength': y_hat.ravel()})

#Select first row as sample to report
y_testB=df.iloc[0,0]
y_hatB=df.iloc[0,1]

df.head()

Unnamed: 0,Strength,Predicted Strength
0,44.52,27.473602
1,50.53,29.726318
2,21.82,32.074711
3,38.8,27.71534
4,55.6,36.944599


In [895]:
# Evaluate the model
scores=model.evaluate(X_test_norm, y_test, verbose=0)

# Use Scikit-learn to corroborate that our MSE value between true Strength and Predicted Strength is correct
from sklearn.metrics import mean_squared_error

MSE=mean_squared_error(y_test, y_hat)
MSEB=MSE   # To report

print('Final Mean Squared Error on the test set after epoch 50/50: {},'.format(np.around(scores, decimals=4)))
print('and the same value: {} computed with Scikit-learn, and it has INCREASED compared to step A.'.format(np.around(MSE, decimals=4)))

Final Mean Squared Error on the test set after epoch 50/50: 311.9161,
and the same value: 311.9161 computed with Scikit-learn, and it has INCREASED compared to step A.


In [896]:
# Extract the per-epoch MSE values from our History object as a dictionary using the .history attribute
dictionary=error.history

# Select each array of values in the dictionary
loss_train=dictionary['loss']
val_loss_test=dictionary['val_loss']

# Create a data frame with the 50 per-epoch MSE values for the training and test sets
error_values=pd.DataFrame({'loss (train set)': loss_train, 'validated loss (test set)': val_loss_test})
error_values.head(50)

Unnamed: 0,loss (train set),validated loss (test set)
0,1568.570357,1504.473831
1,1554.579803,1491.133865
2,1540.581046,1477.803674
3,1526.572587,1464.137177
4,1511.998912,1450.245159
5,1497.010886,1435.624647
6,1481.260323,1420.520828
7,1464.820463,1404.234571
8,1447.250401,1387.110547
9,1428.603563,1368.844789


In [897]:
# Report the mean and the standard deviation of the per-epoch mean squared errors
# Get statistics with the .describe() method and keep only the 'mean' and 'std' rows of 'error_values'
stats=error_values.describe()
meanB=stats.loc['mean', 'validated loss (test set)'] # To report
stats[1:3]

Unnamed: 0,loss (train set),validated loss (test set)
mean,963.424557,931.623949
std,425.926999,397.989829


#### The *mean* of the per-epoch Mean Squared Error (test set) has DECREASED in step B compared to step A, even though the final-epoch MSE is higher.

<a id='itemC'></a>
### C. Increase the number of epochs
#### Repeat Part B but use 100 epochs this time for training.

In [898]:
# Build our model
model=regression_model()

# Train/fit the model on the training set using 100 epochs, and validate the model on the test set.
error=model.fit(X_train_norm, y_train, validation_data=(X_test_norm, y_test), epochs=100, verbose=2)

Train on 721 samples, validate on 309 samples
Epoch 1/100
 - 8s - loss: 1618.1457 - val_loss: 1555.0297
Epoch 2/100
 - 0s - loss: 1605.0979 - val_loss: 1542.7327
Epoch 3/100
 - 0s - loss: 1592.2167 - val_loss: 1530.6456
Epoch 4/100
 - 0s - loss: 1579.5535 - val_loss: 1518.6370
Epoch 5/100
 - 0s - loss: 1566.7230 - val_loss: 1506.3661
Epoch 6/100
 - 0s - loss: 1553.4323 - val_loss: 1493.6529
Epoch 7/100
 - 0s - loss: 1539.5092 - val_loss: 1480.3573
Epoch 8/100
 - 0s - loss: 1524.8621 - val_loss: 1466.1366
Epoch 9/100
 - 0s - loss: 1509.1276 - val_loss: 1451.0277
Epoch 10/100
 - 0s - loss: 1492.4963 - val_loss: 1434.7579
Epoch 11/100
 - 0s - loss: 1474.4569 - val_loss: 1417.3850
Epoch 12/100
 - 0s - loss: 1454.9961 - val_loss: 1398.4876
Epoch 13/100
 - 0s - loss: 1433.8540 - val_loss: 1377.8350
Epoch 14/100
 - 0s - loss: 1411.0470 - val_loss: 1355.6406
Epoch 15/100
 - 0s - loss: 1386.4724 - val_loss: 1331.8139
Epoch 16/100
 - 0s - loss: 1360.2431 - val_loss: 1305.9919
Epoch 17/100
 - 0s 
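
Doubling the epochs doubles the number of gradient updates the optimizer performs. The sketch below counts them, assuming Keras's default `batch_size` of 32, since `fit()` is called here without an explicit `batch_size`; the helper `n_weight_updates` is illustrative.

```python
import math

def n_weight_updates(n_samples, epochs, batch_size=32):
    """Number of mini-batch gradient updates over a whole training run."""
    steps_per_epoch = math.ceil(n_samples / batch_size)
    return steps_per_epoch * epochs

updates_b = n_weight_updates(721, 50)    # step B: 50 epochs
updates_c = n_weight_updates(721, 100)   # step C: 100 epochs
print(updates_b, updates_c)
```

Under that assumption each epoch over the 721 training samples takes 23 mini-batch steps.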

In [899]:
# Use the .predict() method to get an array of predicted 'Strength' values for the test set
y_hat=model.predict(X_test_norm)

# Create a data frame comparing true Strength and predicted Strength
df=pd.DataFrame({'Strength': y_test.values, 'Predicted Strength': y_hat.ravel()})

#Select first row as sample to report
y_testC=df.iloc[0,0]
y_hatC=df.iloc[0,1]

df.head()

Unnamed: 0,Strength,Predicted Strength
0,44.52,27.100958
1,50.53,40.859787
2,21.82,42.636662
3,38.8,39.090347
4,55.6,59.708954


In [900]:
# Evaluate the model
scores=model.evaluate(X_test_norm, y_test, verbose=0)

# Use Scikit-learn to corroborate that our MSE value between true Strength and Predicted Strength is correct
from sklearn.metrics import mean_squared_error

MSE=mean_squared_error(y_test, y_hat)
MSEC=MSE   # To report

print('Final Mean Squared Error on the test set after epoch 100/100: {},'.format(np.around(scores, decimals=4)))
print('and the same value: {} computed with Scikit-learn, and it has DECREASED compared to step B but is still higher than in step A.'.format(np.around(MSE, decimals=4)))

Final Mean Squared Error on the test set after epoch 100/100: 192.497,
and the same value: 192.497 computed with Scikit-learn, and it has DECREASED compared to step B but is still higher than in step A.


In [901]:
# Extract the per-epoch MSE values from our History object as a dictionary using the .history attribute
dictionary=error.history

# Select each array of values in the dictionary
loss_train=dictionary['loss']
val_loss_test=dictionary['val_loss']

# Create a data frame with the 100 per-epoch MSE values for the training and test sets
error_values=pd.DataFrame({'loss (train set)': loss_train, 'validated loss (test set)': val_loss_test})
error_values.head(100)

Unnamed: 0,loss (train set),validated loss (test set)
0,1618.145725,1555.029743
1,1605.097920,1542.732692
2,1592.216662,1530.645569
3,1579.553503,1518.636982
4,1566.723007,1506.366063
5,1553.432342,1493.652889
6,1539.509176,1480.357290
7,1524.862104,1466.136634
8,1509.127612,1451.027713
9,1492.496312,1434.757903


In [902]:
# Report the mean and the standard deviation of the per-epoch mean squared errors
# Get statistics with the .describe() method and keep only the 'mean' and 'std' rows of 'error_values'
stats=error_values.describe()
meanC=stats.loc['mean', 'validated loss (test set)'] # To report
stats[1:3]

Unnamed: 0,loss (train set),validated loss (test set)
mean,611.933344,610.273983
std,509.011003,473.208693


#### The *mean* of Mean Squared Error on step C is lower than step B and step A

<a id='itemD'></a>
### D. Increase the number of hidden layers
#### Repeat part B but use a neural network with *Three hidden layers*, each of 10 nodes and ReLU activation function.

In [903]:
# Define regression model function
def regression_model():
    # Create model
    model=Sequential()
    
    # Add First hidden layer with 10 nodes/neurons and ReLU activation function
    model.add(Dense(10, activation='relu', input_shape=(n_cols,)))
    
    # Add Second hidden layer with 10 nodes/neurons and ReLU activation function 
    model.add(Dense(10, activation='relu'))
    
    # Add Third hidden layer with 10 nodes/neurons and ReLU activation function 
    model.add(Dense(10, activation='relu'))
    
    # Add Output layer with 1 node/neuron
    model.add(Dense(1))
    
    # Compile model adding adam Optimizer and the mean squared error as the loss function.
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model
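
Adding hidden layers increases the number of trainable parameters. As a rough sketch, the hypothetical helper `count_dense_params` below counts the weights and biases of a stack of fully connected layers, given the input width followed by each layer's width.

```python
def count_dense_params(layer_sizes):
    """Weights plus biases for a stack of fully connected layers.

    layer_sizes lists the input width followed by each layer's width.
    """
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

one_hidden = count_dense_params([8, 10, 1])            # parts A to C
three_hidden = count_dense_params([8, 10, 10, 10, 1])  # part D
print(one_hidden, three_hidden)
```

The three-hidden-layer network has 321 parameters against 101 for the single-hidden-layer baseline.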

In [904]:
# Build our model
model=regression_model()

# Train/fit the model on the training set using 50 epochs, and validate the model on test set.
error=model.fit(X_train_norm, y_train, validation_data=(X_test_norm, y_test), epochs=50, verbose=2)

Train on 721 samples, validate on 309 samples
Epoch 1/50
 - 9s - loss: 1554.4776 - val_loss: 1486.8938
Epoch 2/50
 - 0s - loss: 1526.6335 - val_loss: 1454.4655
Epoch 3/50
 - 0s - loss: 1484.7901 - val_loss: 1404.2561
Epoch 4/50
 - 0s - loss: 1420.6499 - val_loss: 1327.5391
Epoch 5/50
 - 0s - loss: 1322.0023 - val_loss: 1213.8024
Epoch 6/50
 - 0s - loss: 1178.0353 - val_loss: 1047.8646
Epoch 7/50
 - 0s - loss: 979.5445 - val_loss: 837.1843
Epoch 8/50
 - 0s - loss: 735.9884 - val_loss: 598.4784
Epoch 9/50
 - 0s - loss: 490.9768 - val_loss: 408.5086
Epoch 10/50
 - 0s - loss: 327.6814 - val_loss: 315.5331
Epoch 11/50
 - 1s - loss: 257.2625 - val_loss: 282.8456
Epoch 12/50
 - 1s - loss: 232.7503 - val_loss: 264.0977
Epoch 13/50
 - 0s - loss: 217.3433 - val_loss: 248.8995
Epoch 14/50
 - 0s - loss: 206.1156 - val_loss: 237.6190
Epoch 15/50
 - 0s - loss: 198.1912 - val_loss: 228.5942
Epoch 16/50
 - 0s - loss: 190.5329 - val_loss: 220.1448
Epoch 17/50
 - 0s - loss: 184.2429 - val_loss: 214.3927

In [905]:
# Use the .predict() method to get an array of predicted 'Strength' values for the test set
y_hat=model.predict(X_test_norm)

# Create a data frame comparing true Strength and predicted Strength
df=pd.DataFrame({'Strength': y_test.values, 'Predicted Strength': y_hat.ravel()})

#Select first row as sample to report
y_testD=df.iloc[0,0]
y_hatD=df.iloc[0,1]

df.head()

Unnamed: 0,Strength,Predicted Strength
0,44.52,34.165421
1,50.53,41.173103
2,21.82,38.667336
3,38.8,37.060841
4,55.6,62.264595


In [906]:
# Evaluate the model
scores=model.evaluate(X_test_norm, y_test, verbose=0)

# Use Scikit-learn to corroborate that our MSE value between true Strength and Predicted Strength is correct
from sklearn.metrics import mean_squared_error

MSE=mean_squared_error(y_test, y_hat)
MSED=MSE   # To report

print('Final Mean Squared Error on the test set after epoch 50/50: {},'.format(np.around(scores, decimals=4)))
print('and the same value: {} computed with Scikit-learn, and it has DECREASED compared to steps C and B but is still higher than in step A.'.format(np.around(MSE, decimals=4)))

Final Mean Squared Error on the test set after epoch 50/50: 147.2921,
and the same value: 147.2921 computed with Scikit-learn, and it has DECREASED compared to steps C and B but is still higher than in step A.


In [907]:
# Extract the per-epoch MSE values from our History object as a dictionary using the .history attribute
dictionary=error.history

# Select each array of values in the dictionary
loss_train=dictionary['loss']
val_loss_test=dictionary['val_loss']

# Create a data frame with the 50 per-epoch MSE values for the training and test sets
error_values=pd.DataFrame({'loss (train set)': loss_train, 'validated loss (test set)': val_loss_test})
error_values.head(50)

Unnamed: 0,loss (train set),validated loss (test set)
0,1554.477566,1486.893808
1,1526.63345,1454.465529
2,1484.79007,1404.256083
3,1420.649896,1327.539104
4,1322.00228,1213.802449
5,1178.035259,1047.864586
6,979.544489,837.184251
7,735.988383,598.478356
8,490.976783,408.508594
9,327.681398,315.533127


In [908]:
# Report the mean and the standard deviation of the per-epoch mean squared errors
# Get statistics with the .describe() method and keep only the 'mean' and 'std' rows of 'error_values'
stats=error_values.describe()
meanD=stats.loc['mean', 'validated loss (test set)'] # To report
stats[1:3]

Unnamed: 0,loss (train set),validated loss (test set)
mean,343.967644,347.211205
std,429.390928,385.832155


#### The *mean* of Mean Squared Error on step D is lower than steps C, B and A.

<a id='itemE'></a>
### E. Additional
#### Repeat part C but use a neural network with *three hidden layers*, each of 10 nodes with the *ReLU* activation function, trained for 100 *epochs*.

In [909]:
# Build our model
model=regression_model()

# Train/fit the model on the training set using 100 epochs, and validate the model on the test set.
error=model.fit(X_train_norm, y_train, validation_data=(X_test_norm, y_test), epochs=100, verbose=2)

Train on 721 samples, validate on 309 samples
Epoch 1/100
 - 9s - loss: 1564.7969 - val_loss: 1494.7517
Epoch 2/100
 - 0s - loss: 1533.1168 - val_loss: 1458.0330
Epoch 3/100
 - 0s - loss: 1489.2541 - val_loss: 1408.4382
Epoch 4/100
 - 0s - loss: 1427.2655 - val_loss: 1334.4746
Epoch 5/100
 - 0s - loss: 1333.4475 - val_loss: 1218.7755
Epoch 6/100
 - 0s - loss: 1185.1941 - val_loss: 1048.1217
Epoch 7/100
 - 0s - loss: 980.8907 - val_loss: 827.3225
Epoch 8/100
 - 0s - loss: 731.8673 - val_loss: 594.7780
Epoch 9/100
 - 0s - loss: 499.3447 - val_loss: 407.1117
Epoch 10/100
 - 0s - loss: 339.7944 - val_loss: 313.9081
Epoch 11/100
 - 0s - loss: 268.4868 - val_loss: 277.2577
Epoch 12/100
 - 0s - loss: 235.1848 - val_loss: 254.6318
Epoch 13/100
 - 1s - loss: 215.5390 - val_loss: 239.6493
Epoch 14/100
 - 1s - loss: 203.0392 - val_loss: 227.5382
Epoch 15/100
 - 0s - loss: 193.2609 - val_loss: 219.1504
Epoch 16/100
 - 0s - loss: 185.7954 - val_loss: 211.6006
Epoch 17/100
 - 0s - loss: 180.5979 - v

In [910]:
# Use the .predict() method to get an array of predicted 'Strength' values for the test set
y_hat=model.predict(X_test_norm)

# Create a data frame comparing true Strength and predicted Strength
df=pd.DataFrame({'Strength': y_test.values, 'Predicted Strength': y_hat.ravel()})

#Select first row as sample to report
y_testE=df.iloc[0,0]
y_hatE=df.iloc[0,1]

df.head()

Unnamed: 0,Strength,Predicted Strength
0,44.52,42.48904
1,50.53,49.544811
2,21.82,32.233387
3,38.8,35.205288
4,55.6,60.650658


In [911]:
# Evaluate the model
scores=model.evaluate(X_test_norm, y_test, verbose=0)

# Use Scikit-learn to corroborate that our MSE value between true Strength and Predicted Strength is correct
from sklearn.metrics import mean_squared_error

MSE=mean_squared_error(y_test, y_hat)
MSEE=MSE   # To report

print('Final Mean Squared Error on the test set after epoch 100/100: {},'.format(np.around(scores, decimals=4)))
print('and the same value: {} computed with Scikit-learn, and it has DECREASED compared to steps D, C, B and A.'.format(np.around(MSE, decimals=4)))

Final Mean Squared Error on the test set after epoch 100/100: 99.0254,
and the same value: 99.0254 computed with Scikit-learn, and it has DECREASED compared to steps D, C, B and A.


In [912]:
# Extract the per-epoch MSE values from our History object as a dictionary using the .history attribute
dictionary=error.history

# Select each array of values in the dictionary
loss_train=dictionary['loss']
val_loss_test=dictionary['val_loss']

# Create a data frame with the 100 per-epoch MSE values for the training and test sets
error_values=pd.DataFrame({'loss (train set)': loss_train, 'validated loss (test set)': val_loss_test})
error_values.head(100)

Unnamed: 0,loss (train set),validated loss (test set)
0,1564.796854,1494.751742
1,1533.116815,1458.033021
2,1489.254119,1408.438209
3,1427.265465,1334.474612
4,1333.447491,1218.775482
5,1185.194066,1048.121675
6,980.890710,827.322451
7,731.867312,594.778048
8,499.344653,407.111743
9,339.794368,313.908078


In [913]:
# Report the mean and the standard deviation of the per-epoch mean squared errors
# Get statistics with the .describe() method and keep only the 'mean' and 'std' rows of 'error_values'
stats=error_values.describe()
meanE=stats.loc['mean', 'validated loss (test set)'] # To report
stats[1:3]

Unnamed: 0,loss (train set),validated loss (test set)
mean,219.258341,228.945331
std,329.08612,296.614918


#### The *mean* of Mean Squared Error on step E is lower than steps D, C, B and A.

<a id='itemF'></a>
### F. Report

In [914]:
report_array=[
             {'Step': 'A', 'Mean of MSE (Test Set)':meanA, 'True Strength':y_testA, 'Predicted Strength':y_hatA, 'MSE last Epoch (Test Set)':MSEA},
             {'Step': 'B', 'Mean of MSE (Test Set)':meanB, 'True Strength':y_testB, 'Predicted Strength':y_hatB, 'MSE last Epoch (Test Set)':MSEB},
             {'Step': 'C', 'Mean of MSE (Test Set)':meanC, 'True Strength':y_testC, 'Predicted Strength':y_hatC, 'MSE last Epoch (Test Set)':MSEC},
             {'Step': 'D', 'Mean of MSE (Test Set)':meanD, 'True Strength':y_testD, 'Predicted Strength':y_hatD, 'MSE last Epoch (Test Set)':MSED},
             {'Step': 'E', 'Mean of MSE (Test Set)':meanE, 'True Strength':y_testE, 'Predicted Strength':y_hatE, 'MSE last Epoch (Test Set)':MSEE}
             ]

report_df=pd.DataFrame(report_array)
report_df

Unnamed: 0,MSE last Epoch (Test Set),Mean of MSE (Test Set),Predicted Strength,Step,True Strength
0,137.699862,2322.370858,40.439449,A,44.52
1,311.916119,931.623949,27.473602,B,44.52
2,192.497021,610.273983,27.100958,C,44.52
3,147.292055,347.211205,34.165421,D,44.52
4,99.025402,228.945331,42.48904,E,44.52
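
The two metrics in the report rank the steps differently. The sketch below re-creates the relevant columns from the values in the table above and sorts by each metric; the names `report`, `by_final` and `by_mean` are illustrative.

```python
import pandas as pd

# Values copied from the report table above
report = pd.DataFrame({
    "Step": ["A", "B", "C", "D", "E"],
    "MSE last Epoch (Test Set)": [137.699862, 311.916119, 192.497021, 147.292055, 99.025402],
    "Mean of MSE (Test Set)": [2322.370858, 931.623949, 610.273983, 347.211205, 228.945331],
})

# Rank the steps from best (lowest error) to worst under each metric
by_final = report.sort_values("MSE last Epoch (Test Set)")["Step"].tolist()
by_mean = report.sort_values("Mean of MSE (Test Set)")["Step"].tolist()

print(by_final)
print(by_mean)
```

Step E is best under both metrics, while the unnormalized baseline A is second best by final-epoch MSE and worst by per-epoch mean, because its first epochs start from very large losses.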


# <span style="color:gray"><strong>Thanks for Watching !</strong></span>
### <span style="color:gray"><strong>Diego H. Salazar A.</strong></span>
