# Assignment: Concrete Compressive Strength Problem


### Abstract: 

Concrete is the most important material in civil engineering. Its compressive strength (the capacity of the concrete to bear load) is a highly nonlinear function of age and ingredients.  <br><br>

<table border="1"  cellpadding="6" bordercolor="red">
	<tbody>
        <tr>
		<td bgcolor="#DDEEFF"><p class="normal"><b>Data Set Characteristics:&nbsp;&nbsp;</b></p></td>
		<td><p class="normal">Multivariate</p></td>
		<td bgcolor="#DDEEFF"><p class="normal"><b>Number of Instances:</b></p></td>
		<td><p class="normal">1030</p></td>
		<td bgcolor="#DDEEFF"><p class="normal"><b>Area:</b></p></td>
		<td><p class="normal">Physical</p></td>
        </tr>
     </tbody>
    </table>
<table border="1" cellpadding="6">
    <tbody>
        <tr>
            <td bgcolor="#DDEEFF"><p class="normal"><b>Attribute Characteristics:</b></p></td>
            <td><p class="normal">Real</p></td>
            <td bgcolor="#DDEEFF"><p class="normal"><b>Number of Attributes:</b></p></td>
            <td><p class="normal">9</p></td>
            <td bgcolor="#DDEEFF"><p class="normal"><b>Date Donated</b></p></td>
            <td><p class="normal">2007-08-03</p></td>
        </tr>
     </tbody>
    </table>
<table border="1" cellpadding="6">	
    <tbody>
    <tr>
		<td bgcolor="#DDEEFF"><p class="normal"><b>Associated Tasks:</b></p></td>
		<td><p class="normal">Regression</p></td>
		<td bgcolor="#DDEEFF"><p class="normal"><b>Missing Values?</b></p></td>
		<td><p class="normal">N/A</p></td>
		<td bgcolor="#DDEEFF"><p class="normal"><b>Number of Web Hits:</b></p></td>
		<td><p class="normal">231464</p></td>
	</tr>
    </tbody>
    </table>

### Description:

| Feature Name | Data Type | Measurement | Description |
| -- | -- | -- | -- |
| Cement (component 1) | quantitative | kg in a m³ mixture | Input Variable |
| Blast Furnace Slag (component 2) | quantitative | kg in a m³ mixture | Input Variable |
| Fly Ash (component 3) | quantitative | kg in a m³ mixture | Input Variable |
| Water (component 4) | quantitative | kg in a m³ mixture | Input Variable |
| Superplasticizer (component 5) | quantitative | kg in a m³ mixture | Input Variable |
| Coarse Aggregate (component 6) | quantitative | kg in a m³ mixture | Input Variable |
| Fine Aggregate (component 7) | quantitative | kg in a m³ mixture | Input Variable |
| Age | quantitative | Day (1~365) | Input Variable |
| Concrete compressive strength | quantitative | MPa | Output Variable |

### Workflow:
- Load the data.
- Check for missing values (if any exist, fill each one with the mean of its feature).
- Standardize the input variables. **Hint**: center the data (subtract each feature's mean).
- Split into 50% training (samples, labels), 30% test (samples, labels), and 20% validation (samples, labels) data.
- Model: an input layer (one unit per feature), 3 hidden layers with 10, 8, and 6 units using relu or tanh activation (choose by experiment), and an output layer.
- Compilation step (note: this is a regression problem, so select the loss and metrics accordingly).
- Train the model for 100 epochs and validate it.
- If the model overfits, tune it by changing the number of units, the number of layers, the activation function, or the number of epochs, or by adding a dropout layer or a regularizer as needed.
- Evaluation step.
- Prediction.
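The missing-value step in the workflow can be sketched as follows. This is a minimal illustration on a toy frame, assuming pandas is available; the column names and values are hypothetical and are not taken from the concrete dataset.

```python
import numpy as np
import pandas as pd

# Toy frame with hypothetical column names and values (not the real dataset).
toy = pd.DataFrame({
    "cement": [540.0, np.nan, 332.5, 198.6],
    "water": [162.0, 162.0, np.nan, 192.0],
    "age": [28.0, 28.0, 270.0, 360.0],
})

# Count the missing entries in each column.
missing_per_column = toy.isna().sum()
print(missing_per_column)

# Replace every missing entry with the mean of its own column.
filled = toy.fillna(toy.mean(numeric_only=True))
print(filled)
```

If `isna().sum()` reports zero for every column, as `data.info()` suggests for this dataset, the fill step changes nothing and can be skipped.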


# Load Data:
[Click Here to Download DataSet](https://github.com/ramsha275/ML_Datasets/blob/main/compresive_strength_concrete.csv)

In [2]:
import numpy as np 
import pandas as pd
import tensorflow as tf 
from tensorflow.keras import models,layers,optimizers

In [3]:
data=pd.read_csv("data/compresive_strength_concrete.csv")
print(data.info())
print(data)

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 1030 entries, 0 to 1029
Data columns (total 9 columns):
 #   Column                                                 Non-Null Count  Dtype  
---  ------                                                 --------------  -----  
 0   Cement (component 1)(kg in a m^3 mixture)              1030 non-null   float64
 1   Blast Furnace Slag (component 2)(kg in a m^3 mixture)  1030 non-null   float64
 2   Fly Ash (component 3)(kg in a m^3 mixture)             1030 non-null   float64
 3   Water  (component 4)(kg in a m^3 mixture)              1030 non-null   float64
 4   Superplasticizer (component 5)(kg in a m^3 mixture)    1030 non-null   float64
 5   Coarse Aggregate  (component 6)(kg in a m^3 mixture)   1030 non-null   float64
 6   Fine Aggregate (component 7)(kg in a m^3 mixture)      1030 non-null   float64
 7   Age (day)                                              1030 non-null   int64  
 8   Concrete compressive strength(MPa, megapascals)  

In [30]:
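# Shuffle the rows, then cut at 50% and 70%: 50% train, 20% validation, 30% test.
n_rows = len(data)
train, val, test = np.split(data.sample(frac=1), [int(.5 * n_rows), int(.7 * n_rows)])

# Standardize using statistics computed on the training set only.
# Note: every column is scaled, including the target, so the labels and
# predictions below are in standardized units rather than MPa.
mean = train.mean(axis=0)
train -= mean
std = train.std(axis=0)
train /= std

test -= mean
test /= std
val -= mean
val /= std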

print(train.info())
print(test.info())
print(val.info())


<class 'pandas.core.frame.DataFrame'>
Int64Index: 515 entries, 837 to 450
Data columns (total 9 columns):
 #   Column                                                 Non-Null Count  Dtype  
---  ------                                                 --------------  -----  
 0   Cement (component 1)(kg in a m^3 mixture)              515 non-null    float64
 1   Blast Furnace Slag (component 2)(kg in a m^3 mixture)  515 non-null    float64
 2   Fly Ash (component 3)(kg in a m^3 mixture)             515 non-null    float64
 3   Water  (component 4)(kg in a m^3 mixture)              515 non-null    float64
 4   Superplasticizer (component 5)(kg in a m^3 mixture)    515 non-null    float64
 5   Coarse Aggregate  (component 6)(kg in a m^3 mixture)   515 non-null    float64
 6   Fine Aggregate (component 7)(kg in a m^3 mixture)      515 non-null    float64
 7   Age (day)                                              515 non-null    float64
 8   Concrete compressive strength(MPa, megapascals)  
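
The cell above scales every column, target included. As an alternative, here is a NumPy-only sketch that standardizes only the inputs, as the workflow asks, using training-set statistics. The data is synthetic and the variable names are illustrative assumptions; only the row count, the feature count, and the 50/20/30 cut points mirror the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed so the split is reproducible

# Synthetic stand-in for the dataset: 1030 rows, 8 features, 1 target.
n_rows, n_features = 1030, 8
features = rng.normal(loc=100.0, scale=25.0, size=(n_rows, n_features))
target = rng.normal(loc=35.0, scale=15.0, size=n_rows)

# Shuffle the row indices once, then cut at 50% and 70%.
order = rng.permutation(n_rows)
train_idx, val_idx, test_idx = np.split(
    order, [int(0.5 * n_rows), int(0.7 * n_rows)]
)

# Statistics come from the training rows only, so nothing leaks
# from the validation or test rows into the scaling.
# NumPy's std defaults to ddof=0, whereas pandas uses ddof=1.
mean = features[train_idx].mean(axis=0)
std = features[train_idx].std(axis=0)

train_x = (features[train_idx] - mean) / std
val_x = (features[val_idx] - mean) / std
test_x = (features[test_idx] - mean) / std

# The target is left in its original units.
train_y, val_y, test_y = target[train_idx], target[val_idx], target[test_idx]

print(train_x.shape, val_x.shape, test_x.shape)
```

Leaving the target unscaled means the loss and predictions are reported directly in MPa, at the cost of larger loss values during training.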

In [33]:
trainX=train.iloc[:,:-1]
trainY=train.iloc[:,-1]

testX=test.iloc[:,:-1]
testY=test.iloc[:,-1]

valX=val.iloc[:,:-1]
valY=val.iloc[:,-1]

trainX.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 515 entries, 837 to 450
Data columns (total 8 columns):
 #   Column                                                 Non-Null Count  Dtype  
---  ------                                                 --------------  -----  
 0   Cement (component 1)(kg in a m^3 mixture)              515 non-null    float64
 1   Blast Furnace Slag (component 2)(kg in a m^3 mixture)  515 non-null    float64
 2   Fly Ash (component 3)(kg in a m^3 mixture)             515 non-null    float64
 3   Water  (component 4)(kg in a m^3 mixture)              515 non-null    float64
 4   Superplasticizer (component 5)(kg in a m^3 mixture)    515 non-null    float64
 5   Coarse Aggregate  (component 6)(kg in a m^3 mixture)   515 non-null    float64
 6   Fine Aggregate (component 7)(kg in a m^3 mixture)      515 non-null    float64
 7   Age (day)                                              515 non-null    float64
dtypes: float64(8)
memory usage: 36.2 KB


In [34]:
n_featureCols = trainX.shape[1]  # number of input features
n_featureCols

8

In [35]:
trainX.head()
trainY.head()
print(trainX.shape)
print(testX.shape)
print(valX.shape)

(515, 8)
(309, 8)
(206, 8)


In [45]:
model = models.Sequential()
# Input width is taken directly from the number of feature columns.
model.add(layers.Dense(32, activation='relu', input_shape=(trainX.shape[1],)))
model.add(layers.Dense(16, activation='relu'))
model.add(layers.Dense(12, activation='relu'))
# No activation on the output layer: regression predicts an unbounded value.
model.add(layers.Dense(1))
# MSE as the loss and MAE as the reported metric, both suited to regression.
model.compile(optimizer='rmsprop', loss='mse', metrics=['mae'])
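
To make the compile step concrete, the `mse` loss and `mae` metric can be written out by hand. The sketch below uses plain NumPy with hypothetical targets and predictions; it illustrates the two formulas and is not the Keras implementation.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error: the average of the squared residuals."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

def mae(y_true, y_pred):
    """Mean absolute error: the average of the absolute residuals."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean(np.abs(y_true - y_pred)))

# Hypothetical targets and predictions, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.5, 2.0, 2.0, 5.0]

print(mse(y_true, y_pred))  # 0.5625
print(mae(y_true, y_pred))  # 0.625
```

MSE penalizes large errors more heavily because residuals are squared, while MAE is in the same units as the target and is easier to read as a typical error size.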

In [46]:
history=model.fit(trainX,trainY,epochs=200,batch_size=64,validation_data=(valX,valY))

Epoch 50/200
...
Epoch 124/200


In [51]:
results = model.evaluate(testX, testY, batch_size=128)



In [52]:
yPred=model.predict(testX)
print(yPred[0])
print(testY.iloc[0])
print(yPred[10])
print(testY.iloc[10])
print(yPred[20])
print(testY.iloc[20])

[-0.9872216]
-0.8274038871300198
[-0.28553846]
0.2946883792106291
[-1.4426808]
-1.3048768832673059
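
Because the target column was standardized together with the inputs, the predictions above are in standardized units. A minimal sketch of converting them back to MPa follows. The target mean and standard deviation used here are hypothetical placeholders; in the notebook they would be the last entries of `mean` and `std`.

```python
import numpy as np

# Hypothetical training-set statistics of the target column, in MPa.
# In the notebook these would be mean.iloc[-1] and std.iloc[-1].
target_mean_mpa = 35.0
target_std_mpa = 16.0

# Standardized predictions, copied from the output printed above.
pred_standardized = np.array([-0.9872216, -0.28553846, -1.4426808])

# Invert z = (y - mean) / std to get y = z * std + mean.
pred_mpa = pred_standardized * target_std_mpa + target_mean_mpa
print(pred_mpa)
```

The same inverse transform applies to `testY`, so predicted and actual strengths can be compared in physical units.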
