## Concrete Compressive Strength Prediction


Dataset available at https://archive.ics.uci.edu/ml/datasets/Concrete+Compressive+Strength

OR

https://github.com/ramsha275/ML_Datasets/blob/main/compresive_strength_concrete.csv


Concrete is one of the most important materials in Civil Engineering. Knowing the compressive strength of concrete is very important when constructing a building or a bridge. The Compressive Strength of Concrete is a highly nonlinear function of ingredients used in making it and their characteristics. Thus, using Deep Learning to predict the Strength could be useful in generating a combination of ingredients which result in high Strength.

This notebook demonstrates the use of Deep Learning to predict Concrete Compressive Strength.

### Problem Statement
Predicting Compressive Strength of Concrete given its age and quantitative measurements of ingredients.

### Data Description

Data is obtained from UCI Machine Learning Repository.
https://archive.ics.uci.edu/ml/datasets/Concrete+Compressive+Strength

* Number of instances - 1030
* Number of Attributes - 9
  * Attribute breakdown - 8 quantitative inputs, 1 quantitative output

#### Attribute information
##### Inputs
* Cement
* Blast Furnace Slag
* Fly Ash
* Water
* Superplasticizer
* Coarse Aggregate
* Fine Aggregate

All above features measured in kg/$m^3$

* Age (in days)

##### Output
* Concrete Compressive Strength (Mpa)

# WORKFLOW :
1.Load Data

2.Check Missing Values ( If Exist ; Fill each record with mean of its feature )

3.Standardized the Input Variables. Hint: Centeralized the data

4.Split into 50% Training(Samples,Labels) , 30% Test(Samples,Labels) and 20% Validation Data(Samples,Labels).

5.Model : input Layer (No. of features ), 3 hidden layers including 10,8,6 unit & Output Layer with activation function relu/tanh (check by experiment).

6.Compilation Step (Note : Its a Regression problem , select loss , metrics according to it)

7.Train the Model with Epochs (100) and validate it
If the model gets overfit tune your model by changing the units , No. of layers , activation function , epochs , add dropout layer or add Regularizer according to the need .

8.Evaluation Step

9.Prediction




In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

%matplotlib inline

##### Loading the Data 

In [3]:
data = pd.read_csv("compresive_strength_concrete.csv")

In [4]:
len(data)

1030

In [5]:
data.head()

Unnamed: 0,Cement (component 1)(kg in a m^3 mixture),Blast Furnace Slag (component 2)(kg in a m^3 mixture),Fly Ash (component 3)(kg in a m^3 mixture),Water (component 4)(kg in a m^3 mixture),Superplasticizer (component 5)(kg in a m^3 mixture),Coarse Aggregate (component 6)(kg in a m^3 mixture),Fine Aggregate (component 7)(kg in a m^3 mixture),Age (day),"Concrete compressive strength(MPa, megapascals)"
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28,79.99
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28,61.89
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270,40.27
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365,41.05
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360,44.3


Simplifying Column names, since they appear to be too lengthy.

In [6]:
new_col_names = ["Cement", "BlastFurnaceSlag", "FlyAsh", "Water", "Superplasticizer",
                 "CoarseAggregate", "FineAggregate", "Age", "CC_Strength"]
curr_col_names = list(data.columns)

mapper = {}
for i, name in enumerate(curr_col_names):
    mapper[name] = new_col_names[i]

data = data.rename(columns=mapper)

In [7]:
data.head()

Unnamed: 0,Cement,BlastFurnaceSlag,FlyAsh,Water,Superplasticizer,CoarseAggregate,FineAggregate,Age,CC_Strength
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28,79.99
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28,61.89
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270,40.27
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365,41.05
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360,44.3


###### Checking for 'null' values

In [8]:
data.isna().sum()

Cement              0
BlastFurnaceSlag    0
FlyAsh              0
Water               0
Superplasticizer    0
CoarseAggregate     0
FineAggregate       0
Age                 0
CC_Strength         0
dtype: int64

There are no null values in the data.

### Data Preprocessing

Separating Input Features and Target Variable. 

In [9]:
X = data.iloc[:,:-1]         # Features - All columns but last
y = data.iloc[:,-1]          # Target - Last Column

##### Splitting data into Training ,Validation and Test splits. 

In [10]:

train_data = data.iloc[0:515]
train_targets = data.iloc[0:515]

val_data = data.iloc[515:721]
val_targets = data.iloc[515:721]

test_data = data.iloc[721:]
test_targets =data.iloc[721:]

In [12]:
print(train_data.shape)
print(train_targets.shape)
print('#'*20)
print(val_data.shape)
print(val_targets.shape)
print('#'*20)
print(test_data.shape)
print(test_targets.shape)

(515, 9)
(515, 9)
####################
(206, 9)
(206, 9)
####################
(309, 9)
(309, 9)


In [13]:
mean = train_data.mean()
std = train_data.std()

train_data -= mean
train_data /= std

val_data -= mean
val_data /= std

test_data -= mean
test_data /= std

In [16]:
print(train_data.head())
print(val_data.head())
print(test_data.head())

     Cement  BlastFurnaceSlag    FlyAsh  ...  FineAggregate       Age  CC_Strength
0  2.317058         -0.843539 -1.114583  ...      -1.197012 -0.358601     2.238902
1  2.317058         -0.843539 -1.114583  ...      -1.197012 -0.358601     1.163579
2  0.331683          1.031962 -1.114583  ...      -2.149023  3.142022    -0.120868
3  0.331683          1.031962 -1.114583  ...      -2.149023  4.516233    -0.074528
4 -0.949482          0.899032 -1.114583  ...       0.538667  4.443906     0.118555

[5 rows x 9 columns]
       Cement  BlastFurnaceSlag    FlyAsh  ...  FineAggregate       Age  CC_Strength
515 -0.916950         -0.698764  1.134061  ...       0.254225 -0.358601    -1.208073
516 -0.916950         -0.698764  1.134061  ...       0.254225 -0.720236    -1.928123
517 -0.916950         -0.698764  1.134061  ...       0.254225 -0.662374    -1.618003
518 -0.916950         -0.698764  1.134061  ...       0.254225  0.046430    -1.132028
519 -0.132368         -0.646118  1.134061  ...       0.

In [17]:
from keras import models
from keras import layers

def build_model():
    
    model = models.Sequential()
    
    model.add(layers.Dense(10, activation='relu', input_shape=(train_data.shape[1],)))
    
    model.add(layers.Dense(8, activation='relu'))
        
    model.add(layers.Dense(6, activation='relu'))
    
    model.add(layers.Dense(1))
    
    model.compile(optimizer='Adam', loss='mse', metrics=['mae'])
    
    
    return model

model = build_model()

In [18]:
history = model.fit(train_data, train_targets, validation_data = (val_data, val_targets), epochs=100, verbose=1)

Epoch 1/100
Epoch 2/100
Epoch 3/100
Epoch 4/100
Epoch 5/100
Epoch 6/100
Epoch 7/100
Epoch 8/100
Epoch 9/100
Epoch 10/100
Epoch 11/100
Epoch 12/100
Epoch 13/100
Epoch 14/100
Epoch 15/100
Epoch 16/100
Epoch 17/100
Epoch 18/100
Epoch 19/100
Epoch 20/100
Epoch 21/100
Epoch 22/100
Epoch 23/100
Epoch 24/100
Epoch 25/100
Epoch 26/100
Epoch 27/100
Epoch 28/100
Epoch 29/100
Epoch 30/100
Epoch 31/100
Epoch 32/100
Epoch 33/100
Epoch 34/100
Epoch 35/100
Epoch 36/100
Epoch 37/100
Epoch 38/100
Epoch 39/100
Epoch 40/100
Epoch 41/100
Epoch 42/100
Epoch 43/100
Epoch 44/100
Epoch 45/100
Epoch 46/100
Epoch 47/100
Epoch 48/100
Epoch 49/100
Epoch 50/100
Epoch 51/100
Epoch 52/100
Epoch 53/100
Epoch 54/100
Epoch 55/100
Epoch 56/100
Epoch 57/100
Epoch 58/100
Epoch 59/100
Epoch 60/100
Epoch 61/100
Epoch 62/100
Epoch 63/100
Epoch 64/100
Epoch 65/100
Epoch 66/100
Epoch 67/100
Epoch 68/100
Epoch 69/100
Epoch 70/100
Epoch 71/100
Epoch 72/100
Epoch 73/100
Epoch 74/100
Epoch 75/100
Epoch 76/100
Epoch 77/100
Epoch 78

In [19]:
predictions = model.predict(test_data)

In [20]:

predictions.shape

(309, 1)

In [21]:
predictions = predictions.reshape(309)


In [22]:
y_pred = model.predict(train_data)
print(y_pred)

[[367.08377]
 [299.25397]
 [255.79199]
 [288.51285]
 [294.82928]
 [262.7741 ]
 [284.13538]
 [255.3901 ]
 [266.46048]
 [263.06268]
 [230.56143]
 [249.33511]
 [271.88394]
 [254.7615 ]
 [270.0113 ]
 [274.9632 ]
 [276.90433]
 [322.8943 ]
 [254.33006]
 [264.04456]
 [256.1012 ]
 [282.32135]
 [313.0055 ]
 [291.75494]
 [323.48938]
 [306.56137]
 [255.43625]
 [277.77228]
 [258.86002]
 [271.2251 ]
 [312.42242]
 [298.07413]
 [237.44489]
 [280.62796]
 [306.15543]
 [285.72623]
 [261.46283]
 [249.5197 ]
 [260.14844]
 [274.31784]
 [272.53165]
 [292.03833]
 [303.80212]
 [288.46353]
 [257.36697]
 [265.61533]
 [233.07109]
 [251.02689]
 [268.8725 ]
 [261.0357 ]
 [245.3342 ]
 [259.1324 ]
 [266.21924]
 [268.99432]
 [302.98633]
 [275.56952]
 [298.82895]
 [284.27084]
 [269.39493]
 [250.01721]
 [291.8511 ]
 [276.335  ]
 [257.62155]
 [274.1215 ]
 [264.05807]
 [301.88394]
 [334.7392 ]
 [256.22717]
 [258.33865]
 [335.1287 ]
 [258.4356 ]
 [291.9741 ]
 [265.65872]
 [256.01022]
 [338.19742]
 [267.07343]
 [318.80328]

In [23]:
model.evaluate(train_data, train_targets)



[118447.546875, 280.9512634277344]

In [24]:
model.evaluate(test_data, test_targets)



[114909.6171875, 267.9645080566406]