# Concrete Strength - Keras Regression - 50 epochs - Normalized - 3 layers


In this project, we build a regression model with the Keras deep learning library to predict the compressive strength of concrete. We then experiment with the number of training epochs and the number of hidden layers to see how these parameters affect the performance of the model. This notebook covers the configuration with 50 epochs, normalized predictors, and 3 hidden layers.



The dataset is about the compressive strength of different samples of concrete based on the volumes of the different ingredients that were used to make them. Ingredients include:

1. Cement

2. Blast Furnace Slag

3. Fly Ash

4. Water

5. Superplasticizer

6. Coarse Aggregate

7. Fine Aggregate


![Concrete construction](https://www.structuralguide.com/wp-content/uploads/2020/12/Concrete-Construction.jpg)

## Download and Clean Dataset


In [44]:
import pandas as pd
import numpy as np

Let's read the data from the CSV file into a *pandas* dataframe.


In [45]:
concrete_data = pd.read_csv('concrete_data.csv')
concrete_data.head()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age,Strength
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28,79.99
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28,61.89
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270,40.27
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365,41.05
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360,44.3


In [46]:
concrete_data.shape

(1030, 9)

In [47]:
concrete_data.describe()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age,Strength
count,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0,1030.0
mean,281.167864,73.895825,54.18835,181.567282,6.20466,972.918932,773.580485,45.662136,35.817961
std,104.506364,86.279342,63.997004,21.354219,5.973841,77.753954,80.17598,63.169912,16.705742
min,102.0,0.0,0.0,121.8,0.0,801.0,594.0,1.0,2.33
25%,192.375,0.0,0.0,164.9,0.0,932.0,730.95,7.0,23.71
50%,272.9,22.0,0.0,185.0,6.4,968.0,779.5,28.0,34.445
75%,350.0,142.95,118.3,192.0,10.2,1029.4,824.0,56.0,46.135
max,540.0,359.4,200.1,247.0,32.2,1145.0,992.6,365.0,82.6


## Split data into predictors and target


The target variable in this problem is the concrete sample strength. Therefore, our predictors will be all the other columns.


In [48]:
concrete_data_columns = concrete_data.columns

target = concrete_data['Strength']
predictors = concrete_data[concrete_data_columns[concrete_data_columns != 'Strength']]

In [49]:
target.head()

0    79.99
1    61.89
2    40.27
3    41.05
4    44.30
Name: Strength, dtype: float64

In [50]:
predictors.head()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360



In [52]:
n_cols = predictors.shape[1]

## Normalize predictors with Mean and Standard deviation

In [53]:
#Normalize data with mean and std
predictors = (predictors - predictors.mean()) / predictors.std()
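The cell above standardizes each column using the mean and standard deviation of the full dataset, before any train/test split. A common variant, sketched below as an assumption rather than as what this notebook does, computes the statistics on the training rows only and reuses them for the test rows, so no information from the test set reaches the model. The tiny dataframe and the names `train`, `test`, `train_norm`, and `test_norm` are hypothetical.

```python
import pandas as pd

# Hypothetical miniature of the predictors table (values are illustrative only)
df = pd.DataFrame({
    "Cement": [540.0, 332.5, 198.6, 266.0],
    "Water": [162.0, 228.0, 192.0, 228.0],
})

# Pretend the first three rows are the training set and the last row is the test set
train = df.iloc[:3]
test = df.iloc[3:]

# Column-wise statistics from the training rows only
# (pandas uses the sample standard deviation, ddof=1, by default)
train_mean = train.mean()
train_std = train.std()

# Apply the same shift and scale to both subsets
train_norm = (train - train_mean) / train_std
test_norm = (test - train_mean) / train_std

print(train_norm)
print(test_norm)
```

After this transformation each training column has mean 0 and sample standard deviation 1, while the test columns are only approximately standardized.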

## Build a Neural Network


In [54]:
import keras
from keras.models import Sequential
from keras.layers import Dense

In [55]:
# define regression model
def regression_model():
    # create model
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(n_cols,)))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(1))
    
    # compile model
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model
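To get a feel for the size of this network, the number of trainable parameters can be worked out by hand: a dense layer with `n_in` inputs and `n_out` units has `n_in * n_out` weights plus `n_out` biases. The sketch below does this arithmetic in plain Python, assuming 8 input columns as shown in the `predictors.head()` output; the helper name `dense_params` is hypothetical.

```python
def dense_params(n_in, n_out):
    """Weights plus biases of a fully connected layer."""
    return n_in * n_out + n_out

n_cols = 8  # number of predictor columns, taken from the table above

# Layer widths of the model: input -> 10 -> 10 -> 10 -> 1
layer_sizes = [n_cols, 10, 10, 10, 1]

# Sum the parameters of each consecutive pair of layers
total_params = sum(
    dense_params(n_in, n_out)
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
)

print(total_params)  # 90 + 110 + 110 + 11 = 321
```

Calling `regression_model().summary()` should report the same total, which is small relative to the roughly 720 training rows in each split.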


## Train, fit and predict with model

In [56]:
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error


In [57]:
mse_list_normalized = []

for i in range(50):
    print(f"Sample number:{i}")
    
    #Split the data into 30% test and 70% training data
    x_train, x_test, y_train, y_test = train_test_split(predictors, target, test_size = 0.3)
    
    #Fit the model to training data
    model = regression_model()
    model.fit(x_train, y_train, epochs=50)
    
    #Predict the target of testing data
    y_hat = model.predict(x_test)
    
    #Evaluate the error of model
    mse = mean_squared_error(y_test, y_hat)
    mse_list_normalized.append(mse)

Sample number:0
Epoch 1/50
Epoch 2/50
...
Epoch 50/50
Sample number:1
Epoch 1/50
Epoch 2/50
...
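The metric collected in the loop is the mean squared error, the average of the squared differences between observed and predicted strength. The sketch below computes it by hand with NumPy on made-up values. One detail worth noting is that Keras `predict` returns a column of shape `(n, 1)` for this model, so a hand-written version should flatten both arrays first; otherwise the subtraction broadcasts into an `(n, n)` matrix. The function name `mse` and the sample values are hypothetical.

```python
import numpy as np

# Made-up observed strengths, shape (3,)
y_true = np.array([40.0, 35.5, 52.0])

# Made-up predictions shaped like Keras output, shape (3, 1)
y_pred = np.array([[38.0], [36.5], [50.0]])

def mse(y_true, y_pred):
    """Mean squared error after flattening both inputs to 1-D."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    return float(np.mean((y_true - y_pred) ** 2))

error = mse(y_true, y_pred)
print(error)  # squared differences are 4, 1, 4, so the mean is 3.0

# Without flattening, broadcasting silently produces a 3x3 matrix
naive_shape = (y_true - y_pred).shape
print(naive_shape)
```

`sklearn.metrics.mean_squared_error` handles the shape difference itself, which is why the loop above works without an explicit reshape.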

## Evaluating the model with Mean Squared Error

In [58]:
mse_list_normalized

[145.58210306356767,
 106.2216452843658,
 115.39560386509162,
 125.41507815616353,
 106.75739909313192,
 108.90289529010528,
 107.87849400601444,
 149.8413971203843,
 123.19615960448358,
 123.54887570478508,
 130.2137298189586,
 105.29234110129588,
 142.60432672197578,
 138.35088423970006,
 133.72612045650501,
 107.16720043934991,
 120.97269460234833,
 135.19761213418315,
 134.03908404628825,
 142.5929208158684,
 114.39762119163628,
 138.19225285228882,
 145.79072441906214,
 115.72545818904756,
 120.52266507494075,
 128.31233301332983,
 141.34515617235004,
 138.70422606013813,
 130.05517161013532,
 136.7060118117281,
 98.90492244061885,
 129.76506098786916,
 131.01177067443595,
 95.7336729189115,
 112.96098636330406,
 99.79940696757468,
 139.1477348223805,
 129.8596539450355,
 121.38407821800332,
 120.95464458973896,
 90.76005662214841,
 149.61698936551429,
 132.55275637457092,
 141.74248576140698,
 122.77459721539033,
 133.80342403787418,
 108.23414434108432,
 144.32395831982018,
 148

### Mean of the 50 models' Mean Squared Errors

In [59]:
np.mean(mse_list_normalized)

125.88634128609378

### Standard Deviation of the 50 models' Mean Squared Errors

In [60]:
np.std(mse_list_normalized)

15.265862349133252
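When summarizing a list of errors like this, it helps to know that `np.std` computes the population standard deviation (`ddof=0`) by default, whereas pandas defaults to the sample version (`ddof=1`). The sketch below illustrates the difference on the first five values of the list above only, so its numbers are not the notebook's 50-run results; the variable names are hypothetical.

```python
import numpy as np

# First five MSE values from the list above (a subset, for illustration)
mse_sample = [
    145.58210306356767,
    106.2216452843658,
    115.39560386509162,
    125.41507815616353,
    106.75739909313192,
]

mean_mse = np.mean(mse_sample)

# Population standard deviation: divides by n (NumPy's default)
pop_std = np.std(mse_sample)

# Sample standard deviation: divides by n - 1
sample_std = np.std(mse_sample, ddof=1)

print(mean_mse, pop_std, sample_std)
```

With 50 runs the two estimates differ by only about 1%, but stating which one is reported keeps comparisons between notebooks consistent.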

### Conclusion

Increasing the number of hidden layers to 3 lowers the mean MSE to 125.89, compared with 376.30 for the model with only 1 hidden layer, so the error is lower overall. The standard deviation also drops to 15.27, compared with 93.75 for 1 layer, so the errors across the 50 runs are more uniform and lie closer to their mean.
It seems that increasing the number of hidden layers is more effective at lowering the error than increasing the number of epochs.
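To put the comparison in relative terms, the short calculation below expresses the change as a fraction of the 1-layer figures. It assumes the 1-layer values quoted above (376.30 and 93.75) come from an earlier run that is not shown in this notebook; the helper `relative_reduction` is hypothetical.

```python
# Figures quoted in the conclusion (1 hidden layer) and computed above (3 hidden layers)
one_layer = {"mean": 376.30, "std": 93.75}
three_layers = {"mean": 125.88634128609378, "std": 15.265862349133252}

def relative_reduction(before, after):
    """Fraction by which a quantity decreased relative to its starting value."""
    return (before - after) / before

mean_reduction = relative_reduction(one_layer["mean"], three_layers["mean"])
std_reduction = relative_reduction(one_layer["std"], three_layers["std"])

print(f"Mean MSE reduced by {mean_reduction:.1%}")
print(f"Std of MSE reduced by {std_reduction:.1%}")
```

Under these figures the mean MSE falls by roughly two thirds and its spread by more than four fifths.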