# Concrete Strength Estimator Regression model using Keras  

This project builds a regression model with the Keras deep learning library to model concrete compressive strength data. I then experimented with normalizing the inputs, increasing the number of training epochs, and adding hidden layers, and recorded how each change affected the model's performance.

The project is divided into four sections (A, B, C and D, described below).

In every section, the model is trained on 70% of the data and tested on the remaining 30%.

Sections A, B and C use a neural network with 1 hidden layer of 10 nodes.

Section D uses a neural network with 3 hidden layers of 10 nodes each.
 
The predictors in the data of concrete strength include:
1.	Cement
2.	Blast Furnace Slag
3.	Fly Ash
4.	Water
5.	Superplasticizer
6.	Coarse Aggregate
7.	Fine Aggregate
8.	Age

The target in the data is 'Strength'.


# Importing Libraries

In [2]:
import pandas as pd
import numpy as np
import keras
import statistics
from keras.models import Sequential
from keras.layers import Dense
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

Using TensorFlow backend.


# Importing, splitting and normalizing Data

In [3]:
# Step 2.1 - Import csv file
concrete_data = pd.read_csv('https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0101EN/labs/data/concrete_data.csv')
# Step 2.2 - Split
concrete_data_columns = concrete_data.columns
predictors = concrete_data[concrete_data_columns[concrete_data_columns != 'Strength']]
target = concrete_data['Strength']
# Step 2.3 - Normalize
predictors_norm = (predictors - predictors.mean()) / predictors.std()
n_cols = predictors_norm.shape[1]
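
The normalization above takes the mean and standard deviation from the full dataset, before any train/test split. A common alternative is to compute these statistics on the training rows only and reuse them for the test rows, so the test set does not influence the scaling. The following is a minimal sketch of that idea using only the standard library; `fit_zscore`, `apply_zscore` and the toy rows are hypothetical names and values for illustration, not part of the notebook. Like pandas' `.std()`, `statistics.stdev` computes the sample standard deviation.

```python
import statistics


def fit_zscore(rows):
    """Return per-column means and sample standard deviations."""
    columns = list(zip(*rows))
    means = [statistics.mean(col) for col in columns]
    stds = [statistics.stdev(col) for col in columns]
    return means, stds


def apply_zscore(rows, means, stds):
    """Scale every row with previously fitted means and standard deviations."""
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in rows]


# Hypothetical toy rows with two predictor columns (not the concrete data)
train_rows = [[100.0, 1.0], [200.0, 2.0], [300.0, 3.0]]
test_rows = [[400.0, 4.0]]

# Statistics come from the training rows only and are reused for the test rows
means, stds = fit_zscore(train_rows)
train_norm = apply_zscore(train_rows, means, stds)
test_norm = apply_zscore(test_rows, means, stds)
print(train_norm)
print(test_norm)
```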

# Neural Network (1 Hidden Layer, 10 Nodes)

In [4]:
def regression_model_One():
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(n_cols,)))
    model.add(Dense(1))
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model
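
To make the architecture concrete, the sketch below spells out what a hidden `Dense` layer with ReLU followed by a single linear output unit computes for one input row. It is a plain-Python illustration, not how Keras implements it; the `dense` helper, the weight layout (one list of input weights per unit) and all numeric values are hypothetical.

```python
def relu(x):
    """Rectified linear unit: negative values are clipped to zero."""
    return max(0.0, x)


def dense(inputs, weights, biases, activation=None):
    """Fully connected layer; weights[j] holds the input weights of unit j."""
    outputs = []
    for unit_weights, bias in zip(weights, biases):
        z = sum(w * x for w, x in zip(unit_weights, inputs)) + bias
        outputs.append(activation(z) if activation else z)
    return outputs


# Hypothetical tiny network: 2 inputs -> 2 hidden ReLU units -> 1 linear output
hidden_w = [[1.0, -1.0], [0.5, 0.5]]
hidden_b = [0.0, -1.0]
out_w = [[2.0, 3.0]]
out_b = [1.0]


def predict(x):
    hidden = dense(x, hidden_w, hidden_b, activation=relu)
    return dense(hidden, out_w, out_b)[0]


print(predict([3.0, 1.0]))
print(predict([1.0, 3.0]))  # first hidden unit is clipped to zero by ReLU
```

The output layer has no activation, which matches `Dense(1)` in the model above and is the usual choice for regression.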

# Section A : Train & Test (Non-normalized Data, 30% Test, 50 Epochs, 50 Fits)

In [5]:
model_One = regression_model_One()
error_list = []
for x in range(50):
    # Split Data into Test and Train
    X_train, X_test, y_train, y_test = train_test_split(predictors, target, test_size = 0.3)
    # Run Neural Network
    y = model_One.fit(X_train, y_train, epochs=50, verbose=0)
    # Find predictions, get error, add to list
    y_pred = model_One.predict(X_test)
    errors = mean_squared_error(y_test, y_pred)
    error_list.append(errors)
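
Note that the model in this loop is created once, outside the loop. Each call to `fit` in Keras continues from the current weights, so later iterations train further on a model that has already seen rows now in the test split. An alternative protocol builds a fresh model for every split. The sketch below shows that protocol in plain Python; `repeated_holdout` and `MeanRegressor` are hypothetical stand-ins (the regressor simply predicts the training mean), and in the notebook the same idea would amount to calling `regression_model_One()` inside the loop.

```python
import random
import statistics


class MeanRegressor:
    """Hypothetical stand-in model: predicts the mean of the training targets."""

    def fit(self, X, y):
        self.mean_ = statistics.mean(y)
        return self

    def predict(self, X):
        return [self.mean_ for _ in X]


def mean_squared_error(y_true, y_pred):
    return statistics.mean([(t - p) ** 2 for t, p in zip(y_true, y_pred)])


def repeated_holdout(build_model, X, y, n_repeats=50, test_size=0.3, seed=0):
    """Evaluate a freshly built model on n_repeats random train/test splits."""
    rng = random.Random(seed)
    n_test = max(1, round(len(X) * test_size))
    errors = []
    for _ in range(n_repeats):
        indices = list(range(len(X)))
        rng.shuffle(indices)
        test_idx, train_idx = indices[:n_test], indices[n_test:]
        model = build_model()  # fresh model: nothing carried over from earlier splits
        model.fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        y_pred = model.predict([X[i] for i in test_idx])
        errors.append(mean_squared_error([y[i] for i in test_idx], y_pred))
    return errors


# Hypothetical toy data, not the concrete dataset
X = [[float(i)] for i in range(20)]
y = [2.0 * i for i in range(20)]
errors = repeated_holdout(MeanRegressor, X, y, n_repeats=50)
print(round(statistics.mean(errors), 2), round(statistics.stdev(errors), 2))
```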



# Analysing Error Data

In [6]:
mean_error_One = statistics.mean(error_list)
SD_error_One = statistics.stdev(error_list)
print("Average of the errors = ", round(mean_error_One, 2))
print("Standard Deviation of errors = ", round(SD_error_One, 2))

Average of the errors =  132.22
Standard Deviation of errors =  120.65
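
Because the metric is mean squared error, the figures above are in squared units of Strength. Taking the square root gives a root mean squared error in the same units as the target, which can be easier to interpret. A small sketch using the average reported above (the variable names are illustrative; note that the square root of the average MSE is not the same as the average of the per-fit RMSE values):

```python
import math

mean_mse = 132.22  # Section A average of the errors, as printed above
rmse_of_mean = math.sqrt(mean_mse)
print(round(rmse_of_mean, 2))
```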


# Section B : Train & Test (Normalized Data, 30% Test, 50 Epochs, 50 Fits)

In [7]:
model_Two = regression_model_One()
error_list_Two = []
for x in range(50):
    # Split Data into Test and Train
    X_train, X_test, y_train, y_test = train_test_split(predictors_norm, target, test_size = 0.3)
    # Run Neural Network
    y = model_Two.fit(X_train, y_train, epochs=50, verbose=0)
    # Find predictions, get error, add to list
    y_pred = model_Two.predict(X_test)
    errors = mean_squared_error(y_test, y_pred)
    error_list_Two.append(errors)

# Analysing Error Data

In [9]:
mean_error_Two = statistics.mean(error_list_Two)
SD_error_Two = statistics.stdev(error_list_Two)
#print(error_list_Two)
print("Average of the errors = ", round(mean_error_Two, 2))
print("Standard Deviation of errors = ", round(SD_error_Two, 2))

Average of the errors =  59.94
Standard Deviation of errors =  37.4


# Section C : Train & Test (Normalized Data, 30% Test, 100 Epochs, 50 Fits)

In [10]:
model_Three = regression_model_One()
error_list_Three = []
for x in range(50):
    # Split Data into Test and Train
    X_train, X_test, y_train, y_test = train_test_split(predictors_norm, target, test_size = 0.3)
    # Run Neural Network
    y = model_Three.fit(X_train, y_train, epochs=100, verbose=0)
    # Find predictions, get error, add to list
    y_pred = model_Three.predict(X_test)
    errors = mean_squared_error(y_test, y_pred)
    error_list_Three.append(errors)

# Analysing Error Data

In [11]:
mean_error_Three = statistics.mean(error_list_Three)
SD_error_Three = statistics.stdev(error_list_Three)
#print(error_list_Three)
print("Average of the errors = ", round(mean_error_Three, 2))
print("Standard Deviation of errors = ", round(SD_error_Three, 2))

Average of the errors =  36.67
Standard Deviation of errors =  23.14


# Section D : Neural Network (3 Hidden Layers, 10 Nodes)  

In [12]:
def regression_model_Two():
    model = Sequential()
    model.add(Dense(10, activation='relu', input_shape=(n_cols,)))
    # input_shape is only needed on the first layer; later layers infer it
    model.add(Dense(10, activation='relu'))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(1))
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model
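
Adding hidden layers increases the number of trainable parameters. Each dense layer has one weight per input-output pair plus one bias per output unit, so the totals for the two architectures can be worked out by hand. The sketch below does that arithmetic; `dense_param_count` is a hypothetical helper, and `n_inputs = 8` is an assumption about the number of predictor columns (`n_cols` in the notebook).

```python
def dense_param_count(layer_sizes):
    """Count weights and biases for layer_sizes = [n_inputs, hidden..., n_outputs]."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


n_inputs = 8  # assumption: number of predictor columns (n_cols in the notebook)
one_hidden = dense_param_count([n_inputs, 10, 1])
three_hidden = dense_param_count([n_inputs, 10, 10, 10, 1])
print(one_hidden, three_hidden)
```

In a Keras session, `model.summary()` reports the same kind of per-layer counts for the actual models.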

# Train & Test (Normalized Data, 30% Test, 50 Epochs, 50 Fits)

In [13]:
model_Four = regression_model_Two()
error_list_Four = []
for x in range(50):
    # Split Data into Test and Train
    X_train, X_test, y_train, y_test = train_test_split(predictors_norm, target, test_size = 0.3)
    # Run Neural Network
    y = model_Four.fit(X_train, y_train, epochs=50, verbose=0)
    # Find predictions, get error, add to list
    y_pred = model_Four.predict(X_test)
    errors = mean_squared_error(y_test, y_pred)
    error_list_Four.append(errors)

# Analysing Error Data

In [14]:
mean_error_Four = statistics.mean(error_list_Four)
SD_error_Four = statistics.stdev(error_list_Four)
#print(error_list_Four)
print("Average of the errors = ", round(mean_error_Four, 2))
print("Standard Deviation of errors = ", round(SD_error_Four, 2))

Average of the errors =  32.36
Standard Deviation of errors =  18.07


# Final Report

In [15]:
print("When the data was normalized (Section B): \n Average error changed from ", round(mean_error_One, 2), " to ", round(mean_error_Two, 2), "\n Std Dev changed from ", round(SD_error_One, 2)," to ", round(SD_error_Two, 2))
print("\nWhen the epochs were increased to 100 (Section C): \n Average error changed from ", round(mean_error_Two, 2), " to ", round(mean_error_Three, 2), "\n Std Dev changed from ", round(SD_error_Two, 2)," to ", round(SD_error_Three, 2))
print("\nWhen more hidden layers were added (Section D): \n Average error changed from ", round(mean_error_Two, 2), " to ", round(mean_error_Four, 2), "\n Std Dev changed from ", round(SD_error_Two, 2), " to ", round(SD_error_Four, 2))

When the data was normalized (Section B): 
 Average error changed from  132.22  to  59.94 
 Std Dev changed from  120.65  to  37.4

When the epochs were increased to 100 (Section C): 
 Average error changed from  59.94  to  36.67 
 Std Dev changed from  37.4  to  23.14

When more hidden layers were added (Section D): 
 Average error changed from  59.94  to  32.36 
 Std Dev changed from  37.4  to  18.07
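
The comparisons in the report can also be expressed as relative changes, which makes the three experiments easier to compare at a glance. The sketch below recomputes them from the averages printed above; `relative_change` and the `results` dictionary are illustrative names, and the numbers are copied from the notebook output rather than recomputed from the data.

```python
def relative_change(before, after):
    """Percentage change from before to after (negative means a reduction)."""
    return (after - before) / before * 100.0


# Average errors (MSE) per section, as printed in the report above
results = {"A": 132.22, "B": 59.94, "C": 36.67, "D": 32.36}

# Each pair compares a baseline section with the section that changes one factor
comparisons = [("A", "B"), ("B", "C"), ("B", "D")]
for base, variant in comparisons:
    change = relative_change(results[base], results[variant])
    print(f"{base} -> {variant}: {change:.1f}%")
```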
