### Importing libraries

In [35]:
import keras 
from keras.models import Sequential 
from keras.layers import Dense 
from sklearn.metrics import mean_squared_error

In [36]:
import pandas as pd 
import numpy as np 
from sklearn.model_selection import train_test_split

In [60]:
#loading dataset 
df= pd.read_csv("concrete_data.csv")
df

Unnamed: 0,Cement,Blast Furnace Slag,Fly Ash,Water,Superplasticizer,Coarse Aggregate,Fine Aggregate,Age,Strength
0,540.0,0.0,0.0,162.0,2.5,1040.0,676.0,28,79.99
1,540.0,0.0,0.0,162.0,2.5,1055.0,676.0,28,61.89
2,332.5,142.5,0.0,228.0,0.0,932.0,594.0,270,40.27
3,332.5,142.5,0.0,228.0,0.0,932.0,594.0,365,41.05
4,198.6,132.4,0.0,192.0,0.0,978.4,825.5,360,44.30
...,...,...,...,...,...,...,...,...,...
1025,276.4,116.0,90.3,179.6,8.9,870.1,768.3,28,44.28
1026,322.2,0.0,115.6,196.0,10.4,817.9,813.4,28,31.18
1027,148.5,139.4,108.6,192.7,6.1,892.4,780.0,28,23.70
1028,159.1,186.7,0.0,175.6,11.3,989.6,788.9,28,32.77


In [38]:
#checking for null values
df.isnull().sum()

Cement                0
Blast Furnace Slag    0
Fly Ash               0
Water                 0
Superplasticizer      0
Coarse Aggregate      0
Fine Aggregate        0
Age                   0
Strength              0
dtype: int64

In [39]:
#checking for duplicates
df.duplicated()

0       False
1       False
2       False
3       False
4       False
        ...  
1025    False
1026    False
1027    False
1028    False
1029    False
Length: 1030, dtype: bool

In [40]:
df.columns

Index(['Cement', 'Blast Furnace Slag', 'Fly Ash', 'Water', 'Superplasticizer',
       'Coarse Aggregate', 'Fine Aggregate', 'Age', 'Strength'],
      dtype='object')

## Step A: Simple Neural Network (No Data Normalization):

In [42]:
# Initialize variables to store mean squared errors
mse_list = []

In [43]:
# Repeat the process 50 times
for _ in range(50):
    # Step 1: Randomly split the data into training and test datasets
    X = df.drop(columns=['Strength'])  # Features
    y = df['Strength']  # Target variable

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)

    # Step 2: Build the neural network model
    model = Sequential()
    model.add(Dense(10, input_dim=X_train.shape[1], activation='relu'))
    model.add(Dense(1))  # Output layer (1 node for regression)
    model.compile(loss='mean_squared_error', optimizer='adam')

    # Step 3: Train the model
    model.fit(X_train, y_train, epochs=50)

    # Step 4: Evaluate the model and compute the mean squared error
    y_pred = model.predict(X_test)
    mse = mean_squared_error(y_test, y_pred)
    mse_list.append(mse)

Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/50
Epoch 36/50
Epoch 37/50
Epoch 38/50
Epoch 39/50
Epoch 40/50
Epoch 41/50
Epoch 42/50
Epoch 43/50
Epoch 44/50
Epoch 45/50
Epoch 46/50
Epoch 47/50
Epoch 48/50
Epoch 49/50
Epoch 50/50
Epoch 1/50
Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50
Epoch 34/50
Epoch 35/5

In [44]:
# Step 5: Report the mean and standard deviation of the mean squared errors
mean_mse = np.mean(mse_list)
std_mse = np.std(mse_list)
#Mean of Mean Squared Errors: This represents the average performance of the model across the 50 runs.
print(f"Mean of Mean Squared Errors: {mean_mse}")
#Standard Deviation of Mean Squared Errors: This indicates the variability or spread in the model's performance
print(f"Standard Deviation of Mean Squared Errors: {std_mse}")

Mean of Mean Squared Errors: 367.335202941597
Standard Deviation of Mean Squared Errors: 336.742453361398


A smaller mean MSE indicates better model performance, as it means the model's predictions are closer to the actual values on average. Conversely, a larger mean MSE suggests that the model's predictions are less accurate.

The standard deviation of the MSEs gives an idea of how consistent the model's performance is across the 50 runs. A smaller standard deviation implies that the model's performance is relatively stable, while a larger standard deviation suggests more variability in performance.
In our case :  the high standard deviation suggests a large variability in model performance.

## Step B:Data Normalization with StandardScaler:

In [45]:
# data standardization with  sklearn
from sklearn.preprocessing import StandardScaler

In [48]:
# copy of datasets
df1=df.copy()

In [50]:
# Repeat the process 50 times
for _ in range(50):
    # Step 1: Randomly split the data into training and test datasets
    X = df1.drop(columns=['Strength'])  # Features
    y = df1['Strength']  # Target variable

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    # Step 2: Normalize the data using StandardScaler
    scaler = StandardScaler()
    X_train_normalized = scaler.fit_transform(X_train)
    X_test_normalized = scaler.transform(X_test)

    # Step 3: Build the neural network model
    model = Sequential()
    model.add(Dense(10, input_dim=X_train_normalized.shape[1], activation='relu'))
    model.add(Dense(1))  # Output layer (1 node for regression)
    model.compile(loss='mean_squared_error', optimizer='adam')

    # Step 4: Train the model
    model.fit(X_train_normalized, y_train, epochs=50, verbose=0)

    # Step 5: Evaluate the model and compute the mean squared error on the test data
    y_pred = model.predict(X_test_normalized)
    mse = mean_squared_error(y_test, y_pred)
    mse_list.append(mse)



In [51]:
# Step 6: Report the mean and standard deviation of the mean squared errors
mean_mse = np.mean(mse_list)
std_mse = np.std(mse_list)
print(f"Mean of Mean Squared Errors with StandardScaler: {mean_mse}")
print(f"Standard Deviation of Mean Squared Errors with StandardScaler: {std_mse}")

Mean of Mean Squared Errors with StandardScaler: 357.2921463608288
Standard Deviation of Mean Squared Errors with StandardScaler: 248.13146537486404


Because in step B we applied data normalization, this led to a slight improvement in the mean MSE compared to scenario A.

## Step C :Data Normalization with StandardScaler and 100 Epochs:

In [53]:
# copy of datasets
dfC=df.copy()

In [54]:
# Repeat the process 50 times
for _ in range(50):
    # Step 1: Randomly split the data into training and test datasets
    X = dfC.drop(columns=['Strength'])  # Features
    y = dfC['Strength']  # Target variable

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    # Step 2: Normalize the data using StandardScaler
    scaler = StandardScaler()
    X_train_normalized = scaler.fit_transform(X_train)
    X_test_normalized = scaler.transform(X_test)

    # Step 3: Build the neural network model
    model = Sequential()
    model.add(Dense(10, input_dim=X_train_normalized.shape[1], activation='relu'))
    model.add(Dense(1))  # Output layer (1 node for regression)
    model.compile(loss='mean_squared_error', optimizer='adam')

    # Step 4: Train the model
    model.fit(X_train_normalized, y_train, epochs=100, verbose=0)

    # Step 5: Evaluate the model and compute the mean squared error on the test data
    y_pred = model.predict(X_test_normalized)
    mse = mean_squared_error(y_test, y_pred)
    mse_list.append(mse)



In [58]:
# Step 6: Report the mean and standard deviation of the mean squared errors
mean_mse = np.mean(mse_list)
std_mse = np.std(mse_list)
print(f"Mean of Mean Squared Errors with StandardScaler: {mean_mse}")
print(f"Standard Deviation of Mean Squared Errors with StandardScaler: {std_mse}")

Mean of Mean Squared Errors with StandardScaler: 248.12898062601934
Standard Deviation of Mean Squared Errors with StandardScaler: 207.25643516821864


In step C, we kept the data normalization using StandardScaler but increased the number of training epochs to 100. This change resulted in a significant improvement in the mean MSE compared to scenario B.

## Step D:Data Normalization with StandardScaler and 3 Hidden Layers of 10 Nodes (Relu Activation)

In [56]:
# copy of datasets
dfD=df.copy()

In [57]:
# Repeat the process 50 times
for _ in range(50):
    # Step 1: Randomly split the data into training and test datasets
    X = dfD.drop(columns=['Strength'])  # Features
    y = dfD['Strength']  # Target variable

    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

    # Step 2: Normalize the data using StandardScaler
    scaler = StandardScaler()
    X_train_normalized = scaler.fit_transform(X_train)
    X_test_normalized = scaler.transform(X_test)

    # Step 3: Build the neural network model
    model = Sequential()
    model.add(Dense(10, input_dim=X_train_normalized.shape[1], activation='relu'))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(1))  # Output layer (1 node for regression)
    model.compile(loss='mean_squared_error', optimizer='adam')

    # Step 4: Train the model
    model.fit(X_train_normalized, y_train, epochs=50, verbose=0)

    # Step 5: Evaluate the model and compute the mean squared error on the test data
    y_pred = model.predict(X_test_normalized)
    mse = mean_squared_error(y_test, y_pred)
    mse_list.append(mse)



In [59]:
# Step 6: Report the mean and standard deviation of the mean squared errors
mean_mse = np.mean(mse_list)
std_mse = np.std(mse_list)
print(f"Mean of Mean Squared Errors with StandardScaler: {mean_mse}")
print(f"Standard Deviation of Mean Squared Errors with StandardScaler: {std_mse}")

Mean of Mean Squared Errors with StandardScaler: 248.12898062601934
Standard Deviation of Mean Squared Errors with StandardScaler: 207.25643516821864


In step D, we kept the data normalization using StandardScaler but modified the neural network architecture by adding three hidden layers. The mean MSE remained the same as in scenario C, and the standard deviation also stayed consistent. This suggests that increasing the complexity of the neural network architecture did not lead to further improvements in mean MSE for this particular dataset.

## Summary

#### Normalizing the data (scenario B) improved the model's performance compared to not normalizing it (scenario A).
#### -Increasing the number of training epochs (scenario C) led to a significant improvement in the mean MSE.
#### -Adding complexity to the neural network architecture (scenario D) did not further improve the mean MSE, suggesting that the previous architecture with 100 epochs (scenario C) was sufficient for this task.
#### -Scenario C, with data normalization and 100 epochs, appears to be the most effective among the scenarios considered, as it achieved the lowest mean MSE and a relatively low standard deviation, indicating stable and improved performance.