# Predicting Concrete Compression Strength using Regression

### *A. Build a baseline model*

**imports**

In [0]:
from sklearn.model_selection import train_test_split
import numpy as np
import pandas as pd
import keras
from keras.layers import Dense
from google.colab import output

Using TensorFlow backend.


**Getting Data Set**

In [0]:
X = pd.read_csv('https://cocl.us/concrete_data')
X.head()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age,Strength
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28,79.99
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28,61.89
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270,40.27
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365,41.05
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360,44.3


**Separating Features and Targets**

In [0]:
Y = X.pop('Strength')
print('Features:\n', X.head())
print('Targets:\n', Y.head())

Features:
    Cement  Blast Furnace Slag  Fly Ash  ...  Coarse Aggregate  Fine Aggregate  Age
0   540.0                 0.0      0.0  ...            1040.0           676.0   28
1   540.0                 0.0      0.0  ...            1055.0           676.0   28
2   332.5               142.5      0.0  ...             932.0           594.0  270
3   332.5               142.5      0.0  ...             932.0           594.0  365
4   198.6               132.4      0.0  ...             978.4           825.5  360

[5 rows x 8 columns]
Targets:
 0    79.99
1    61.89
2    40.27
3    41.05
4    44.30
Name: Strength, dtype: float64


In [0]:
# saving input shape for later use in defining first layer of ANN
input_shape = X.shape[1]

***A generic functions for all parts of exercise***

In [0]:
def create_model(hidden_layers):
    model = keras.models.Sequential()
    model.add(Dense(10, activation='relu', input_shape=(input_shape,)))
    for n in range(hidden_layers):
        model.add(Dense(10, activation='relu'))
    model.add(Dense(1))

    model.compile(optimizer='adam', loss='mean_squared_error')
    return model

**Getting results for part A**

In [0]:
losses = []

for _ in range(50):
    x_train, x_test, y_train, y_test = train_test_split(X, Y, test_size=0.30)
    model = create_model(1) # 1 hidden layer
    model.fit(x_train, y_train, epochs=50)
    losses.append(model.evaluate(x_test, y_test))

output.clear()
mse = np.array(losses)
print("Mean of MSE:", mse.mean())
print("Standard Deviation of MSE:", mse.std())

Mean of MSE: 149.6971592448213
Standard Deviation of MSE: 86.6208041117065


### *B. Normalize the data*
Repeat Part A but use a normalized version of the data. Recall that one way to normalize the data is by subtracting the mean from the individual predictors and dividing by the standard deviation.

In [0]:

X_norm = (X - X.mean()) / X.std()
X_norm.head()

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age
0,2.476712,-0.856472,-0.846733,-0.916319,-0.620147,0.862735,-1.217079,-0.279597
1,2.476712,-0.856472,-0.846733,-0.916319,-0.620147,1.055651,-1.217079,-0.279597
2,0.491187,0.79514,-0.846733,2.174405,-1.038638,-0.526262,-2.239829,3.55134
3,0.491187,0.79514,-0.846733,2.174405,-1.038638,-0.526262,-2.239829,5.055221
4,-0.790075,0.678079,-0.846733,0.488555,-1.038638,0.070492,0.647569,4.976069


**results for part B**

In [0]:
losses = []

for _ in range(50):
    x_train, x_test, y_train, y_test = train_test_split(X_norm, Y, test_size=0.30)
    model = create_model(1)
    model.fit(x_train, y_train, epochs=50)
    losses.append(model.evaluate(x_test, y_test))

output.clear()
mse = np.array(losses)
print("Mean of MSE:", mse.mean())
print("Standard Deviation of MSE:", mse.std())

Mean of MSE: 149.92144518312125
Standard Deviation of MSE: 11.844941022757316


### *C. Increase the number of epochs*
Repeat Part B but use 100 epochs this time for training.

In [0]:
losses = []

for _ in range(50):
    x_train, x_test, y_train, y_test = train_test_split(X_norm, Y, test_size=0.30)
    model = create_model(1)
    model.fit(x_train, y_train, epochs=100) # epochs to 100
    losses.append(model.evaluate(x_test, y_test))

output.clear()
mse = np.array(losses)
print("Mean of MSE:", mse.mean())
print("Standard Deviation of MSE:", mse.std())

Mean of MSE: 119.9896006555156
Standard Deviation of MSE: 10.78721577362423


### *D. Increase the number of hidden layers*
Repeat part B but use a neural network with the following instead:

- Three hidden layers, each of 10 nodes and ReLU activation function

In [0]:
losses = []

for _ in range(50):
    x_train, x_test, y_train, y_test = train_test_split(X_norm, Y, test_size=0.30)
    model = create_model(3)   # 3 hidden layers 
    model.fit(x_train, y_train, epochs=50)
    losses.append(model.evaluate(x_test, y_test))

output.clear()
mse = np.array(losses)
print("Mean of MSE:", mse.mean())
print("Standard Deviation of MSE:", mse.std())

Mean of MSE: 112.32674698740148
Standard Deviation of MSE: 22.74569044312257
