
## Building a regression model with Keras, and training and testing it

The dataset describes the compressive strength of different samples of concrete, based on the amounts of the different ingredients used to make them and the age of each sample.

The ingredients include:

* Cement
* Blast Furnace Slag
* Fly Ash
* Water
* Superplasticizer
* Coarse aggregate
* Fine aggregate

### Importing libraries

In [2]:
import pandas as pd
import numpy as np
import keras
import warnings
warnings.simplefilter('ignore', FutureWarning)
import os
os.environ['TF_ENABLE_ONEDNN_OPTS']='0'
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'

### Downloading and cleaning the dataset

In [3]:
url = 'https://s3-api.us-geo.objectstorage.softlayer.net/cf-courses-data/CognitiveClass/DL0101EN/labs/data/concrete_data.csv'
concreteData = pd.read_csv(url)
concreteData.head()

| | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age | Strength |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1040.0 | 676.0 | 28 | 79.99 |
| 1 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1055.0 | 676.0 | 28 | 61.89 |
| 2 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 270 | 40.27 |
| 3 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 365 | 41.05 |
| 4 | 198.6 | 132.4 | 0.0 | 192.0 | 0.0 | 978.4 | 825.5 | 360 | 44.3 |


In [5]:
concreteData.shape

(1030, 9)

In [6]:
concreteData.describe()

| | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age | Strength |
|---|---|---|---|---|---|---|---|---|---|
| count | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 |
| mean | 281.167864 | 73.895825 | 54.18835 | 181.567282 | 6.20466 | 972.918932 | 773.580485 | 45.662136 | 35.817961 |
| std | 104.506364 | 86.279342 | 63.997004 | 21.354219 | 5.973841 | 77.753954 | 80.17598 | 63.169912 | 16.705742 |
| min | 102.0 | 0.0 | 0.0 | 121.8 | 0.0 | 801.0 | 594.0 | 1.0 | 2.33 |
| 25% | 192.375 | 0.0 | 0.0 | 164.9 | 0.0 | 932.0 | 730.95 | 7.0 | 23.71 |
| 50% | 272.9 | 22.0 | 0.0 | 185.0 | 6.4 | 968.0 | 779.5 | 28.0 | 34.445 |
| 75% | 350.0 | 142.95 | 118.3 | 192.0 | 10.2 | 1029.4 | 824.0 | 56.0 | 46.135 |
| max | 540.0 | 359.4 | 200.1 | 247.0 | 32.2 | 1145.0 | 992.6 | 365.0 | 82.6 |


In [7]:
#checking for null values
concreteData.isnull().sum()

| | 0 |
|---|---|
| Cement | 0 |
| Blast Furnace Slag | 0 |
| Fly Ash | 0 |
| Water | 0 |
| Superplasticizer | 0 |
| Coarse Aggregate | 0 |
| Fine Aggregate | 0 |
| Age | 0 |
| Strength | 0 |


### Splitting the data into dependent (target) and independent (features) sets

In [8]:
concreteDataCols = concreteData.columns
concreteDataCols

Index(['Cement', 'Blast Furnace Slag', 'Fly Ash', 'Water', 'Superplasticizer',
       'Coarse Aggregate', 'Fine Aggregate', 'Age', 'Strength'],
      dtype='object')

In [14]:
features = concreteData.iloc[: , :-1]
target = concreteData.iloc[:,-1]

In [18]:
features.head()

| | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age |
|---|---|---|---|---|---|---|---|---|
| 0 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1040.0 | 676.0 | 28 |
| 1 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1055.0 | 676.0 | 28 |
| 2 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 270 |
| 3 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 365 |
| 4 | 198.6 | 132.4 | 0.0 | 192.0 | 0.0 | 978.4 | 825.5 | 360 |


In [19]:
target.head()

| | Strength |
|---|---|
| 0 | 79.99 |
| 1 | 61.89 |
| 2 | 40.27 |
| 3 | 41.05 |
| 4 | 44.3 |


In [20]:
#standardizing the features set
featuresNorm = (features - features.mean()) / features.std()
featuresNorm.head()

| | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age |
|---|---|---|---|---|---|---|---|---|
| 0 | 2.476712 | -0.856472 | -0.846733 | -0.916319 | -0.620147 | 0.862735 | -1.217079 | -0.279597 |
| 1 | 2.476712 | -0.856472 | -0.846733 | -0.916319 | -0.620147 | 1.055651 | -1.217079 | -0.279597 |
| 2 | 0.491187 | 0.79514 | -0.846733 | 2.174405 | -1.038638 | -0.526262 | -2.239829 | 3.55134 |
| 3 | 0.491187 | 0.79514 | -0.846733 | 2.174405 | -1.038638 | -0.526262 | -2.239829 | 5.055221 |
| 4 | -0.790075 | 0.678079 | -0.846733 | 0.488555 | -1.038638 | 0.070492 | 0.647569 | 4.976069 |
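
The standardization above uses the mean and standard deviation of all 1030 rows, including rows that `validation_split` later holds out. As a minimal sketch of one alternative, the statistics can be computed from the training rows only and then applied to every row. The toy array, the 7-row training cut, and the helper name `standardize_with_train_stats` are illustrative assumptions, not part of the notebook.

```python
import numpy as np

def standardize_with_train_stats(x, n_train):
    """Z-score every column using statistics from the first n_train rows only."""
    train = x[:n_train]
    mean = train.mean(axis=0)
    std = train.std(axis=0, ddof=1)  # ddof=1 matches the pandas default for .std()
    return (x - mean) / std, mean, std

# Toy data standing in for two feature columns (hypothetical values).
rng = np.random.default_rng(0)
toy_features = rng.normal(loc=[280.0, 180.0], scale=[100.0, 20.0], size=(10, 2))

toy_norm, train_mean, train_std = standardize_with_train_stats(toy_features, n_train=7)

# The training rows end up with zero mean and unit standard deviation;
# the held-out rows are scaled with the same statistics.
print(toy_norm[:7].mean(axis=0))
print(toy_norm[:7].std(axis=0, ddof=1))
```

Whether this matters in practice depends on the data; on a dataset of this size the difference in the statistics is likely to be small.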


In [22]:
features.shape

(1030, 8)

In [23]:
numOfFeatures = features.shape[1]
numOfFeatures

8

### Importing Keras packages

In [24]:
from keras.models import Sequential
from keras.layers import Dense
from keras.layers import Input

### Building the neural network

The neural network will contain:

* 1 input layer
* 2 hidden layers with ReLU activation
* 1 output layer

In [25]:
def regression_model():
  model = Sequential()
  model.add(Input(shape=(numOfFeatures,))) ## input layer of 8 neurons
  model.add(Dense(50, activation='relu')) ## 1st hidden layer of 50 neurons
  model.add(Dense(50, activation='relu')) ## 2nd hidden layer of 50 neurons
  model.add(Dense(1)) ## output layer: 1 neuron, no activation (linear), for regression

  # compiling the model
  model.compile(optimizer='adam', loss = 'mean_squared_error')
  return model
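
To make the layer stack concrete, here is a rough NumPy sketch of what a forward pass through an 8 → 50 → 50 → 1 network computes: a matrix multiply plus bias per layer, with ReLU on the hidden layers and no activation on the output. The random weights are placeholders, not trained values, and `relu` and `forward` are hypothetical helpers rather than Keras functions.

```python
import numpy as np

def relu(z):
    """Element-wise max(z, 0)."""
    return np.maximum(z, 0.0)

def forward(x, layers):
    """Apply dense layers in order; ReLU on every layer except the last."""
    h = x
    for i, (w, b) in enumerate(layers):
        h = h @ w + b
        if i < len(layers) - 1:
            h = relu(h)
    return h

rng = np.random.default_rng(0)
sizes = [8, 50, 50, 1]  # inputs, two hidden layers, one output
layers = [
    (rng.normal(scale=0.1, size=(n_in, n_out)), np.zeros(n_out))
    for n_in, n_out in zip(sizes[:-1], sizes[1:])
]

x = rng.normal(size=(1, 8))  # one standardized sample (made-up values)
y = forward(x, layers)
print(y.shape)  # (1, 1)
```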

In [26]:
model = regression_model()

### Training and testing the model

In [27]:
model.fit(featuresNorm, target, validation_split=0.3, epochs=100, verbose = 2)

Epoch 1/100
23/23 - 3s - 138ms/step - loss: 1669.6793 - val_loss: 1171.7664
Epoch 2/100
23/23 - 0s - 7ms/step - loss: 1576.1205 - val_loss: 1073.1741
Epoch 3/100
23/23 - 0s - 13ms/step - loss: 1404.3477 - val_loss: 903.8058
Epoch 4/100
23/23 - 0s - 8ms/step - loss: 1115.6125 - val_loss: 662.4260
Epoch 5/100
23/23 - 0s - 11ms/step - loss: 737.6196 - val_loss: 405.1009
Epoch 6/100
23/23 - 0s - 8ms/step - loss: 408.6606 - val_loss: 222.8067
Epoch 7/100
23/23 - 0s - 7ms/step - loss: 258.1481 - val_loss: 164.1157
Epoch 8/100
23/23 - 0s - 13ms/step - loss: 223.9626 - val_loss: 152.5854
Epoch 9/100
23/23 - 0s - 14ms/step - loss: 208.4768 - val_loss: 148.8394
Epoch 10/100
23/23 - 0s - 12ms/step - loss: 198.5166 - val_loss: 148.4975
Epoch 11/100
23/23 - 0s - 19ms/step - loss: 189.5382 - val_loss: 146.6809
Epoch 12/100
23/23 - 1s - 25ms/step - loss: 183.1068 - val_loss: 146.5950
Epoch 13/100
23/23 - 0s - 12ms/step - loss: 177.1673 - val_loss: 142.1675
Epoch 14/100
23/23 - 0s - 17ms/step - loss: 

<keras.src.callbacks.history.History at 0x7f9f2f28dd10>
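
The `23/23` in the log is the number of batches per epoch. A rough sketch of the arithmetic follows, assuming the Keras default batch size of 32 and that `validation_split` sets aside the last fraction of the rows before any shuffling. `steps_per_epoch` is a hypothetical helper written for this illustration, not a Keras function.

```python
import math

def steps_per_epoch(n_samples, validation_split, batch_size=32):
    """Number of training batches per epoch after holding out the validation rows."""
    n_train = math.floor(n_samples * (1.0 - validation_split))
    return math.ceil(n_train / batch_size)

print(steps_per_epoch(1030, 0.3))  # 23
```

Because the held-out rows come from the end of the file, any ordering in the CSV carries over into the validation set; shuffling the rows beforehand is one way to avoid that.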

### Variations

This variation uses 5 hidden layers, each with 50 nodes and ReLU activation, followed by a single output node. The model is again compiled with the Adam optimizer.

In [30]:
def test_regression_model():
  model = Sequential()
  model.add(Input(shape=(numOfFeatures,)))
  model.add(Dense(50, activation='relu'))
  model.add(Dense(50, activation='relu'))
  model.add(Dense(50, activation='relu'))
  model.add(Dense(50, activation='relu'))
  model.add(Dense(50, activation='relu'))
  model.add(Dense(1))

  model.compile(optimizer='adam', loss='mean_squared_error')
  return model

In [31]:
test_model = test_regression_model()
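
For a sense of how much larger the deeper network is, the trainable parameters of a stack of dense layers can be counted by hand: each layer has `inputs * outputs` weights plus `outputs` biases. This is a small sketch with a hypothetical helper, `count_dense_params`; `model.summary()` in Keras reports the same totals.

```python
def count_dense_params(layer_sizes):
    """Weights plus biases for consecutive fully connected layers."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])
    )

two_hidden = count_dense_params([8, 50, 50, 1])
five_hidden = count_dense_params([8, 50, 50, 50, 50, 50, 1])
print(two_hidden, five_hidden)  # 3051 10701
```

With only 1030 rows available, the deeper model has roughly ten parameters per sample, which is one reason to watch the validation loss closely.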

In [32]:
test_model.fit(featuresNorm, target, validation_split=0.3, epochs=100, verbose =2)

Epoch 1/100
23/23 - 3s - 116ms/step - loss: 1621.0941 - val_loss: 1054.3682
Epoch 2/100
23/23 - 0s - 12ms/step - loss: 950.4409 - val_loss: 196.9261
Epoch 3/100
23/23 - 0s - 12ms/step - loss: 295.6071 - val_loss: 218.7522
Epoch 4/100
23/23 - 0s - 14ms/step - loss: 248.2499 - val_loss: 174.6168
Epoch 5/100
23/23 - 0s - 18ms/step - loss: 216.3423 - val_loss: 159.0069
Epoch 6/100
23/23 - 1s - 23ms/step - loss: 198.3630 - val_loss: 160.8890
Epoch 7/100
23/23 - 0s - 22ms/step - loss: 184.6168 - val_loss: 151.3741
Epoch 8/100
23/23 - 0s - 8ms/step - loss: 168.6210 - val_loss: 141.6033
Epoch 9/100
23/23 - 0s - 12ms/step - loss: 154.7256 - val_loss: 143.4797
Epoch 10/100
23/23 - 0s - 7ms/step - loss: 139.5005 - val_loss: 144.0588
Epoch 11/100
23/23 - 0s - 14ms/step - loss: 126.7549 - val_loss: 143.7778
Epoch 12/100
23/23 - 0s - 13ms/step - loss: 115.0425 - val_loss: 141.8217
Epoch 13/100
23/23 - 0s - 12ms/step - loss: 103.3772 - val_loss: 147.3150
Epoch 14/100
23/23 - 0s - 13ms/step - loss: 93

<keras.src.callbacks.history.History at 0x7f9f2f0aa0d0>

In [33]:
test_model2 = test_regression_model()

In [34]:
test_model2.fit(featuresNorm, target, validation_split=0.1, epochs=100)

Epoch 1/100
29/29 ━━━━━━━━━━━━━━━━━━━━ 3s 12ms/step - loss: 1559.8628 - val_loss: 832.7778
Epoch 2/100
29/29 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - loss: 788.3831 - val_loss: 218.6338
Epoch 3/100
29/29 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - loss: 217.7903 - val_loss: 202.1262
Epoch 4/100
29/29 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 197.5913 - val_loss: 191.9313
Epoch 5/100
29/29 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step - loss: 185.3894 - val_loss: 177.7802
Epoch 6/100
29/29 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 173.4704 - val_loss: 188.7988
Epoch 7/100
29/29 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step - loss: 161.4094 - val_loss: 154.0125
Epoch 8/100
29/29 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step - loss: 138.0950 - val_loss: 152.0957
Epoch 9/100

<keras.src.callbacks.history.History at 0x7f9f27de5c90>

Based on the results, we notice that:

- Adding more hidden layers increases the model's capacity to learn and represent complex relationships in the data, so it fits the training data faster: at epoch 13 the 5-hidden-layer model has a training loss of about 103, against about 177 for the 2-hidden-layer model. In the epochs shown, however, the validation loss is similar for both (about 147 and 142), so the extra capacity has not yet translated into better predictions on held-out data.
- Reducing the share of data set aside for validation (from 30% to 10%) gives the model more examples to learn from, which can improve overall performance. The validation loss is then computed on fewer samples, so it is a noisier estimate and is not directly comparable with the 30% runs.
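
The losses in the logs are mean squared errors, so they are in squared units of `Strength`. Taking the square root puts them back on the same scale as the target, which makes them easier to judge. The sketch below does this for two validation losses copied from the epoch-13 lines of the logs above; the dictionary and its labels are only for illustration.

```python
import math

# Validation MSE values copied from the epoch-13 log lines above.
val_mse = {
    "2 hidden layers": 142.1675,
    "5 hidden layers": 147.3150,
}

# RMSE is in the same units as the Strength column.
val_rmse = {name: math.sqrt(mse) for name, mse in val_mse.items()}

for name, rmse in val_rmse.items():
    print(f"{name}: RMSE of about {rmse:.1f}")
```

For reference, `describe()` reports a standard deviation of about 16.7 for `Strength`, so at this early point in training both models are already doing better than always predicting the mean.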

In [53]:
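# one raw sample in the same column order as features;
# named 'sample' to avoid shadowing Python's built-in input()
sample = np.array([300, 0, 0, 180, 0, 1000, 700, 30])
sample.shape

(8,)

In [54]:
inputData = sample.reshape(1,-1)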
inputData.shape

(1, 8)

In [55]:
#standardizing the input
inputDataNorm = (inputData - features.mean().values) / features.std().values
inputDataNorm

array([[ 0.18020085, -0.85647182, -0.8467326 , -0.07339447, -1.03863825,
         0.34829184, -0.91773727, -0.24793664]])
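
A new sample must be scaled with the same per-column mean and standard deviation that were used for the training features, not with statistics of the sample itself. As a self-contained check, the sketch below hard-codes the `mean` and `std` rows from the `describe()` output and reproduces the array above to within rounding. `standardize_sample` is a hypothetical helper introduced for this illustration.

```python
import numpy as np

# Per-column statistics copied from the describe() output
# (column order: Cement, Blast Furnace Slag, Fly Ash, Water,
#  Superplasticizer, Coarse Aggregate, Fine Aggregate, Age).
feature_mean = np.array([281.167864, 73.895825, 54.18835, 181.567282,
                         6.20466, 972.918932, 773.580485, 45.662136])
feature_std = np.array([104.506364, 86.279342, 63.997004, 21.354219,
                        5.973841, 77.753954, 80.17598, 63.169912])

def standardize_sample(raw_sample):
    """Scale one raw sample with the training statistics; returns shape (1, 8)."""
    row = np.asarray(raw_sample, dtype=float).reshape(1, -1)
    return (row - feature_mean) / feature_std

new_sample_norm = standardize_sample([300, 0, 0, 180, 0, 1000, 700, 30])
print(np.round(new_sample_norm, 4))
```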

In [56]:
prediction = model.predict(inputDataNorm)
prediction

1/1 ━━━━━━━━━━━━━━━━━━━━ 0s 83ms/step


array([[29.260757]], dtype=float32)

In [57]:
prediction.shape

(1, 1)

In [60]:
print(prediction[0][0])

29.260757


In [67]:
print(f'Predicted compressive strength for the feature inputs\n{inputData}\nis: {prediction[0][0]}')

Predicted compressive strength for the feature inputs
[[ 300    0    0  180    0 1000  700   30]]
is: 29.260757446289062
