# Concrete Strength Prediction
Build a regression model with the Keras library that predicts the compressive strength of concrete.

concrete_data.csv contains the following features:
1. Cement: the amount of cement in the mix, in kg per m<sup>3</sup> of mixture
2. Blast Furnace Slag: the amount of blast furnace slag in the mix, in kg per m<sup>3</sup> of mixture
3. Fly Ash: the amount of fly ash in the mix, in kg per m<sup>3</sup> of mixture
4. Water: the amount of water in the mix, in kg per m<sup>3</sup> of mixture
5. Superplasticizer: the amount of superplasticizer in the mix, in kg per m<sup>3</sup> of mixture
6. Coarse Aggregate: the amount of coarse aggregate in the mix, in kg per m<sup>3</sup> of mixture
7. Fine Aggregate: the amount of fine aggregate in the mix, in kg per m<sup>3</sup> of mixture
8. Age: the age of the concrete, in days

The main objective of this project is to determine how data normalization, the number of epochs, and the number of hidden layers influence the performance of a neural network.

In [2]:
# import modules
import keras
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from keras.models import Sequential
from keras.layers import Dense



## Build a baseline model

In [3]:
# load the dataset from csv file
concrete_data = pd.read_csv('concrete_data.csv')
concrete_data.head()

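|   | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age | Strength |
|---|--------|--------------------|---------|-------|------------------|------------------|----------------|-----|----------|
| 0 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1040.0 | 676.0 | 28 | 79.99 |
| 1 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1055.0 | 676.0 | 28 | 61.89 |
| 2 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 270 | 40.27 |
| 3 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 365 | 41.05 |
| 4 | 198.6 | 132.4 | 0.0 | 192.0 | 0.0 | 978.4 | 825.5 | 360 | 44.3 |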


In [4]:
# split the dataset
predictors = concrete_data.drop(columns='Strength', inplace=False)
target = concrete_data[['Strength']]
predictors_train, predictors_test, target_train, target_test = train_test_split(predictors, target, test_size=0.3, random_state=5)

In [13]:
# build a neural network
def regression_model(features, activation, optimizer, loss):
    model = Sequential()
    
    n_cols = features.shape[1]
    
    model.add(Dense(10, activation=activation, input_shape=(n_cols,)))
    model.add(Dense(1))
    
    model.compile(optimizer=optimizer, loss=loss)
    return model
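
To make the architecture concrete, here is a minimal pure-Python sketch of what a network with one ReLU hidden layer and a single linear output computes for one input row. The helper names (`relu`, `dense`, `forward`) and the toy weights are assumptions made for illustration; they are not the weights Keras learns, and the toy network uses 2 inputs and 2 hidden units rather than 8 and 10.

```python
def relu(values):
    """Element-wise ReLU: negative values become 0."""
    return [max(0.0, v) for v in values]


def dense(inputs, weights, biases):
    """One dense layer; weights[j][i] connects input i to unit j."""
    return [
        sum(w * x for w, x in zip(row, inputs)) + b
        for row, b in zip(weights, biases)
    ]


def forward(inputs, hidden_w, hidden_b, out_w, out_b):
    """Hidden layer with ReLU, then a single linear output unit."""
    hidden = relu(dense(inputs, hidden_w, hidden_b))
    return dense(hidden, out_w, out_b)[0]


# Toy values (hypothetical): 2 inputs, 2 hidden units, 1 output.
x = [1.0, 2.0]
hidden_w = [[0.5, -1.0], [1.0, 1.0]]
hidden_b = [0.0, -1.0]
out_w = [[2.0, 3.0]]
out_b = [0.5]

prediction = forward(x, hidden_w, hidden_b, out_w, out_b)
print(prediction)  # 6.5: hidden = [0.0, 2.0], output = 3.0 * 2.0 + 0.5
```

The output layer has no activation, which is the usual choice for regression because the prediction must be free to take any real value.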

In [14]:
def train_regression_model(model, features_train, targets_train, features_test, targets_test, epochs):
    # fit the model
    model.fit(features_train, targets_train, epochs=epochs, verbose=0)
    
    # evaluate the model
    predictions = model.predict(features_test)
    MSE = mean_squared_error(targets_test, predictions)
    print('Mean squared error of this model is: {}'.format(np.around(MSE, decimals=4)))
    return MSE

# build the baseline model on the raw features and train it for 50 epochs
model = regression_model(predictors, 'relu', 'adam', 'mean_squared_error')
train_regression_model(model, predictors_train, target_train, predictors_test, target_test, 50)

Mean squared error of this model is: 543.8628


543.862820964712
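
The metric reported above is the average of the squared differences between targets and predictions. A minimal sketch of that definition in plain Python follows; `mean_squared_error_manual` is a hypothetical name, the two targets are the first two `Strength` values from the table, and the predictions are made up for illustration.

```python
import math


def mean_squared_error_manual(y_true, y_pred):
    """Average squared difference between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Targets from the first two rows; predictions are invented for the example.
example_mse = mean_squared_error_manual([79.99, 61.89], [69.99, 66.89])
print(round(example_mse, 4))  # 62.5, i.e. (10**2 + 5**2) / 2

# The square root puts the error back in the units of Strength.
baseline_rmse = math.sqrt(543.8628)
print(round(baseline_rmse, 2))  # 23.32
```

Read this way, the baseline MSE of 543.8628 corresponds to a typical error of roughly 23.3 strength units, which is large compared with the `Strength` values of 40 to 80 shown in the first rows.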

In [15]:
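# call train_regression_model 50 times on the same model; fit() resumes from the
# current weights, so each call adds 50 more epochs to the already-trained model
# collect the test MSE after each call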
MSEs = []
for i in range(50):
    MSEs.append(train_regression_model(model, predictors_train, target_train, predictors_test, target_test, 50))
MSEs = np.array(MSEs)

Mean squared error of this model is: 251.5042
Mean squared error of this model is: 182.9886
Mean squared error of this model is: 143.3611
Mean squared error of this model is: 116.3093
Mean squared error of this model is: 108.9885
Mean squared error of this model is: 107.9778
Mean squared error of this model is: 109.7633
Mean squared error of this model is: 108.2241
Mean squared error of this model is: 113.6541
Mean squared error of this model is: 108.1719
Mean squared error of this model is: 107.9627
Mean squared error of this model is: 111.5672
Mean squared error of this model is: 114.7291
Mean squared error of this model is: 111.64
Mean squared error of this model is: 109.6346
Mean squared error of this model is: 107.6573
Mean squared error of this model is: 112.9737
Mean squared error of this model is: 114.5837
Mean squared error of this model is: 109.4837
Mean squared error of this model is: 108.3797
Mean squared error of this model is: 109.4824
...

In [11]:
mean = np.around(MSEs.mean(), decimals=4).astype(str)
std = np.around(MSEs.std(), decimals=4).astype(str)
print('The mean of the mean squared errors is ' + mean + ', and the standard deviation of the mean squared error is ' + std)

The mean of the mean squared errors is 394.2581, and the standard deviation of the mean squared error is 623.0284


Mean: 394.2581
Standard Deviation: 623.0284
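
In Keras, calling `fit` again on an already-trained model continues from its current weights, so repeated calls on one model measure accumulated training rather than run-to-run variation. If independent runs are wanted, one option is to build a fresh model inside every repetition. The sketch below shows that pattern with the standard library only; `repeat_runs`, `run_once`, and `fake_run` are hypothetical names, and `fake_run` is a random stand-in for "build, train, and score a new model".

```python
import random
import statistics


def repeat_runs(run_once, n_runs):
    """Call run_once() n_runs times and summarise the returned scores.

    run_once is expected to build a brand-new model, train it, and
    return its test MSE, so that no weights are shared between runs.
    """
    scores = [run_once() for _ in range(n_runs)]
    # pstdev is the population standard deviation, like np.std's default.
    return scores, statistics.fmean(scores), statistics.pstdev(scores)


def fake_run():
    """Stand-in for one independent training run (not a real model)."""
    return 100.0 + random.uniform(-5.0, 5.0)


random.seed(0)
scores, mean_mse, std_mse = repeat_runs(fake_run, 50)
print(len(scores), round(mean_mse, 4), round(std_mse, 4))
```

With the notebook's own functions, `run_once` could be `lambda: train_regression_model(regression_model(predictors, 'relu', 'adam', 'mean_squared_error'), predictors_train, target_train, predictors_test, target_test, 50)`, which creates a new network on every call.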

## Training with normalized data

In [17]:
# normalize data
predictors_norm = (predictors - predictors.mean()) / predictors.std()
predictors_norm_train, predictors_norm_test, target_train, target_test = train_test_split(predictors_norm, target, test_size=0.3, random_state=5)

# train data
model = regression_model(predictors_norm, 'relu', 'adam', 'mean_squared_error')
train_regression_model(model, predictors_norm_train, target_train, predictors_norm_test, target_test, 50)

Mean squared error of this model is: 312.6012


312.60124277852447
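
The cell above computes the mean and standard deviation over the full dataset before splitting. A common alternative is to estimate those statistics from the training rows only and reuse them on the test rows, so that no information from the test set reaches training. The sketch below illustrates this with plain lists; `fit_standardizer` and `standardize` are hypothetical helper names, and the toy rows reuse the Cement and Water values from the table. `statistics.stdev` is the sample standard deviation, which is also the default of pandas `std()`.

```python
import statistics


def fit_standardizer(rows):
    """Return per-column means and sample standard deviations."""
    columns = list(zip(*rows))
    means = [statistics.fmean(col) for col in columns]
    stds = [statistics.stdev(col) for col in columns]
    return means, stds


def standardize(rows, means, stds):
    """Apply z-score scaling with previously fitted statistics.

    A column with zero standard deviation would divide by zero;
    this sketch assumes every column varies.
    """
    return [
        [(value - m) / s for value, m, s in zip(row, means, stds)]
        for row in rows
    ]


# Toy data: (Cement, Water) pairs taken from the table above.
train_rows = [[540.0, 162.0], [332.5, 228.0], [198.6, 192.0]]
test_rows = [[332.5, 228.0]]

means, stds = fit_standardizer(train_rows)       # statistics from train only
train_norm = standardize(train_rows, means, stds)
test_norm = standardize(test_rows, means, stds)  # reuse the train statistics
```

After scaling, every training column has mean 0 and sample standard deviation 1, while the test rows are only approximately centred because they are scaled with statistics they did not contribute to.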

In [18]:
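# call train_regression_model 50 times on the same model; fit() resumes from the
# current weights, so each call adds 50 more epochs to the already-trained model
# get the mean and the standard deviation of the test MSEs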
MSEs = []
for i in range(50):
    MSEs.append(train_regression_model(model, predictors_norm_train, target_train, predictors_norm_test, target_test, 50))
MSEs = np.array(MSEs)
mean = np.around(MSEs.mean(), decimals=4).astype(str)
std = np.around(MSEs.std(), decimals=4).astype(str)
print('The mean of the mean squared errors is ' + mean + ', and the standard deviation of the mean squared error is ' + std)

Mean squared error of this model is: 164.3544
Mean squared error of this model is: 131.3273
Mean squared error of this model is: 104.4418
Mean squared error of this model is: 84.7311
Mean squared error of this model is: 70.6504
Mean squared error of this model is: 62.528
Mean squared error of this model is: 58.0033
Mean squared error of this model is: 55.2918
Mean squared error of this model is: 53.5045
Mean squared error of this model is: 52.0842
Mean squared error of this model is: 51.1193
Mean squared error of this model is: 50.2552
Mean squared error of this model is: 49.2906
Mean squared error of this model is: 48.1269
Mean squared error of this model is: 47.4974
Mean squared error of this model is: 47.0585
Mean squared error of this model is: 46.7353
Mean squared error of this model is: 46.6137
Mean squared error of this model is: 45.977
Mean squared error of this model is: 45.7227
Mean squared error of this model is: 45.2664
Mean squared error of this model is: 45.2073
...

Mean: 52.6069 Standard Deviation: 22.5591

## Increase in epochs

In [19]:
# call train_regression_model 50 more times on the same model, now with 100 epochs per call
# get the mean and the standard deviation of the test MSEs
MSEs = []
for i in range(50):
    MSEs.append(train_regression_model(model, predictors_norm_train, target_train, predictors_norm_test, target_test, 100))
MSEs = np.array(MSEs)
mean = np.around(MSEs.mean(), decimals=4).astype(str)
std = np.around(MSEs.std(), decimals=4).astype(str)
print('The mean of the mean squared errors is ' + mean + ', and the standard deviation of the mean squared error is ' + std)

Mean squared error of this model is: 43.0065
Mean squared error of this model is: 43.4582
Mean squared error of this model is: 43.3447
Mean squared error of this model is: 43.0557
Mean squared error of this model is: 43.0397
Mean squared error of this model is: 42.8587
Mean squared error of this model is: 42.648
Mean squared error of this model is: 42.3821
Mean squared error of this model is: 42.3071
Mean squared error of this model is: 42.3671
Mean squared error of this model is: 41.9
Mean squared error of this model is: 42.1418
Mean squared error of this model is: 42.0712
Mean squared error of this model is: 41.5343
Mean squared error of this model is: 41.5814
Mean squared error of this model is: 41.7064
Mean squared error of this model is: 42.043
Mean squared error of this model is: 41.8426
Mean squared error of this model is: 41.9537
Mean squared error of this model is: 41.8461
Mean squared error of this model is: 41.8392
Mean squared error of this model is: 41.862
...

Mean: 41.8942 Standard Deviation: 0.5517

## Increase in hidden layers

In [20]:
# build a neural network with three hidden layers
def regression_model(features, activation, optimizer, loss):
    model = Sequential()
    
    n_cols = features.shape[1]
    
    model.add(Dense(10, activation=activation, input_shape=(n_cols,)))
    model.add(Dense(10, activation=activation))
    model.add(Dense(10, activation=activation))
    model.add(Dense(1))
    
    model.compile(optimizer=optimizer, loss=loss)
    return model
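
To see how much capacity the extra layers add, the number of trainable parameters of a stack of dense layers can be counted by hand: each layer has one weight per input-unit pair plus one bias per unit. The sketch below assumes the 8 input features listed at the top; `dense_param_count` is a hypothetical helper name.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases for consecutive dense layers.

    layer_sizes = [n_inputs, units_layer_1, ..., units_output]
    """
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# One hidden layer: (8*10 + 10) + (10*1 + 1)
one_hidden = dense_param_count([8, 10, 1])

# Three hidden layers: (8*10 + 10) + 2 * (10*10 + 10) + (10*1 + 1)
three_hidden = dense_param_count([8, 10, 10, 10, 1])

print(one_hidden, three_hidden)  # 101 321
```

Under that assumption the three-hidden-layer network has about three times as many parameters as the one-hidden-layer network, which is still small for a tabular dataset.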

In [21]:
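# call train_regression_model 50 times, 50 epochs per call; get the mean and the standard deviation
# `model` still refers to the network built earlier; call regression_model(...) again
# before this loop to train the three-hidden-layer version defined above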
MSEs = []
for i in range(50):
    MSEs.append(train_regression_model(model, predictors_norm_train, target_train, predictors_norm_test, target_test, 50))
MSEs = np.array(MSEs)
mean = np.around(MSEs.mean(), decimals=4).astype(str)
std = np.around(MSEs.std(), decimals=4).astype(str)
print('The mean of the mean squared errors is ' + mean + ', and the standard deviation of the mean squared error is ' + std)

Mean squared error of this model is: 42.0781
Mean squared error of this model is: 41.8161
Mean squared error of this model is: 41.846
Mean squared error of this model is: 41.9117
Mean squared error of this model is: 42.1016
Mean squared error of this model is: 41.9591
Mean squared error of this model is: 41.7605
Mean squared error of this model is: 41.7462
Mean squared error of this model is: 41.9702
Mean squared error of this model is: 41.653
Mean squared error of this model is: 41.9482
Mean squared error of this model is: 41.8665
Mean squared error of this model is: 42.0839
Mean squared error of this model is: 41.9461
Mean squared error of this model is: 41.6541
Mean squared error of this model is: 41.7352
Mean squared error of this model is: 41.9266
Mean squared error of this model is: 41.7832
Mean squared error of this model is: 42.112
Mean squared error of this model is: 42.1559
Mean squared error of this model is: 41.873
Mean squared error of this model is: 41.9233
...

Mean: 42.0072 Standard Deviation: 0.1762
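
Because `fit` resumes from the current weights and the `model` object created in the normalization section is reused by every later loop, the MSE values in these sections reflect accumulated training. The sketch below tallies the epochs that model has seen after each step, assuming the cells from the normalization section onward were each run once in the order shown; the labels are descriptive only.

```python
# (step, number of fit calls, epochs per call), in notebook order
fit_calls = [
    ("initial fit on normalized data", 1, 50),
    ("50 repeats at 50 epochs", 50, 50),
    ("50 repeats at 100 epochs", 50, 100),
    ("50 repeats after redefining regression_model", 50, 50),
]

total_epochs = 0
cumulative = []
for step, calls, epochs in fit_calls:
    total_epochs += calls * epochs
    cumulative.append((step, total_epochs))

for step, seen in cumulative:
    print(f"{step}: {seen} epochs so far")
# totals after each step: 50, 2550, 7550, 10050
```

This is consistent with the printed MSE values, which drop quickly during the first repeats and then settle near 42.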

### Conclusion
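* Data normalization produced the largest improvement: the mean MSE fell from 394.2581 to 52.6069, and its standard deviation fell from 623.0284 to 22.5591.
* Raising the epochs per run from 50 to 100 lowered the mean MSE from 52.6069 to 41.8942. The runs that followed the three-hidden-layer definition reached a mean MSE of 42.0072 at 50 epochs per run, which is also below 52.6069.
* These later figures should be read with care: `model` was never rebuilt between the loops, so they include training accumulated in earlier cells, and the three-hidden-layer network was defined but not instantiated.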