# Concrete Strength Prediction Analysis

In this analysis, we'll explore the prediction of concrete strength using neural networks. We'll cover the following parts:

**Part A: Baseline Model**
In this part, we'll build a baseline neural network model with one hidden layer and evaluate its performance.

**Part B: Normalization and Comparison**
In this part, we'll normalize the data and compare the mean squared errors with the baseline model.

**Part C: Increase Epochs and Comparison**
Here, we'll increase the number of training epochs and compare the mean squared errors with the previous steps.

**Part D: Three Hidden Layers and Comparison**
Finally, we'll use a neural network with three hidden layers and compare its performance to the previous models.

## Analysis and Conclusion

After each part, we'll compare its results with those of the earlier parts and draw conclusions.


## Import Libraries ##

In [1]:
# Import necessary libraries

import pandas as pd
import numpy as np

from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from keras.models import Sequential
from keras.layers import Dense
from sklearn.preprocessing import StandardScaler



## Part A: Baseline Model

In [2]:
#Load the csv file

df = pd.read_csv('concrete_data.csv')
df.head()

   Cement  Blast Furnace Slag  Fly Ash  Water  Superplasticizer  Coarse Aggregate  Fine Aggregate  Age  Strength
0   540.0                 0.0      0.0  162.0               2.5            1040.0           676.0   28     79.99
1   540.0                 0.0      0.0  162.0               2.5            1055.0           676.0   28     61.89
2   332.5               142.5      0.0  228.0               0.0             932.0           594.0  270     40.27
3   332.5               142.5      0.0  228.0               0.0             932.0           594.0  365     41.05
4   198.6               132.4      0.0  192.0               0.0             978.4           825.5  360     44.30


In [3]:
# Display the count of missing values for each column
df.isnull().sum()

Cement                0
Blast Furnace Slag    0
Fly Ash               0
Water                 0
Superplasticizer      0
Coarse Aggregate      0
Fine Aggregate        0
Age                   0
Strength              0
dtype: int64

In [4]:
# Split the data into features (X) and target (y)
X = df.drop(columns=['Strength'])
y = df['Strength']

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)


In [5]:
# Build the neural network model
model = Sequential()
model.add(Dense(10, input_dim=X_train.shape[1], activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mean_squared_error')

# Train the model on the training data
model.fit(X_train, y_train, epochs=50)

# Predict using the trained model
predictions = model.predict(X_test)

# Calculate the Mean Squared Error (MSE)
mse = mean_squared_error(y_test, predictions)

Epoch 1/50
...
Epoch 50/50


In [6]:
# Create a formatted box for MSE output


mse_box = f"""
******************************
*     Mean Squared Error     *
*                            *
*        {mse:.4f}           *
*                            *
******************************
"""

print(mse_box)



******************************
*     Mean Squared Error     *
*                            *
*        329.1592           *
*                            *
******************************
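
The box above is misaligned because the template pads the value with a fixed number of spaces, so the right border moves whenever the number of digits changes. A minimal sketch of one way to keep the borders aligned is to center the text with a format specifier. The helper name `format_mse_box` and its `width` parameter are hypothetical and not part of the original notebook.

```python
def format_mse_box(mse: float, title: str = "Mean Squared Error", width: int = 30) -> str:
    """Return a star-bordered box whose lines all have the same width."""
    inner = width - 2  # characters available between the two border stars
    border = "*" * width
    blank = f"*{'':{inner}}*"
    lines = [
        border,
        f"*{title:^{inner}}*",    # '^' centers the title in the inner width
        blank,
        f"*{mse:^{inner}.4f}*",   # value centered and rounded to 4 decimals
        blank,
        border,
    ]
    return "\n".join(lines)


print(format_mse_box(329.1592))
```

Because the padding is computed from the formatted value, the same helper stays aligned for larger values such as those printed in the repeated runs.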



In [7]:
# Create an empty list to store MSE values
mse_list = []

# Repeat the process 50 times
for _ in range(50):
    X = df.drop(columns=['Strength'])
    y = df['Strength']
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
    
    # Build and compile the model
    model = Sequential()
    model.add(Dense(10, input_dim=X_train.shape[1], activation='relu'))
    model.add(Dense(1))
    model.compile(optimizer='adam', loss='mean_squared_error')
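    # Train the model (verbose=0 suppresses the per-epoch log)
    model.fit(X_train, y_train, epochs=50, verbose=0)
    
    # NOTE: fit is called a second time here. Keras resumes from the current
    # weights, so each baseline model is trained for 100 epochs in total, not 50.
    # The results reported below include this second call.
    model.fit(X_train, y_train, epochs=50, verbose=0)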
    
    # Predict using the trained model
    predictions = model.predict(X_test)
    
    # Calculate and store the MSE value
    mse = mean_squared_error(y_test, predictions)
    mse_list.append(mse)
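
In the loop above, `random_state=42` is fixed, so every repetition uses the same train/test partition. The spread of the 50 MSE values therefore reflects only the randomness in training (weight initialization and batch shuffling). If the intent is to also vary the split, the seed can be tied to the repetition index. The sketch below illustrates the idea with NumPy only; `split_indices` is a hypothetical helper, and with scikit-learn the equivalent change would be passing a different `random_state` to `train_test_split` on each repetition.

```python
import numpy as np


def split_indices(n_samples: int, test_size: float, seed: int):
    """Shuffle row indices with the given seed and cut them into train/test parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    n_test = int(round(n_samples * test_size))
    return order[n_test:], order[:n_test]


# A fixed seed reproduces the same held-out rows every time ...
_, test_a = split_indices(100, test_size=0.3, seed=42)
_, test_b = split_indices(100, test_size=0.3, seed=42)
print(np.array_equal(test_a, test_b))

# ... while a seed tied to the repetition index changes them.
for run in range(3):
    train_idx, test_idx = split_indices(100, test_size=0.3, seed=run)
    print(run, len(train_idx), len(test_idx), np.sort(test_idx)[:5])
```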





In [8]:
# Print the list of MSE values for each iteration
for i, mse in enumerate(mse_list):
    mse_box = f"""
    *******************************
    *   Mean Squared Error {i+1:2d}     *
    *                             *
    *         {mse:.4f}            *
    *                             *
    *******************************
    """
    print(mse_box)


    *******************************
    *   Mean Squared Error  1     *
    *                             *
    *         98.0430            *
    *                             *
    *******************************
    

    *******************************
    *   Mean Squared Error  2     *
    *                             *
    *         2317.5540            *
    *                             *
    *******************************
    

    *******************************
    *   Mean Squared Error  3     *
    *                             *
    *         80.6004            *
    *                             *
    *******************************
    

    *******************************
    *   Mean Squared Error  4     *
    *                             *
    *         114.4262            *
    *                             *
    *******************************
    

    *******************************
    *   Mean Squared Error  5     *
    *                             *
    

In [9]:
# Calculate the mean and standard deviation of the MSE values
mean_mse = np.mean(mse_list)
std_mse = np.std(mse_list)

# Report the mean and standard deviation
print(f"Mean of Mean Squared Errors: {mean_mse:.4f}")
print(f"Standard Deviation of Mean Squared Errors: {std_mse:.4f}")

Mean of Mean Squared Errors: 198.8528
Standard Deviation of Mean Squared Errors: 361.5544
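
The standard deviation above is larger than the mean, which usually points to a few extreme runs rather than uniformly noisy ones. As a rough illustration, the sketch below compares the mean and the median for the four per-run values that are visible in the printed output (the remaining runs are not shown, so these numbers are not the statistics of all 50 runs).

```python
import statistics

# The four per-run MSE values visible in the printed output above.
visible_mse = [98.0430, 2317.5540, 80.6004, 114.4262]

visible_mean = statistics.fmean(visible_mse)
visible_median = statistics.median(visible_mse)
visible_std = statistics.pstdev(visible_mse)  # population std, like np.std's default

print(f"mean   = {visible_mean:.4f}")
print(f"median = {visible_median:.4f}")
print(f"std    = {visible_std:.4f}")

# Dropping the single largest run shows how strongly it drives the mean.
without_max = sorted(visible_mse)[:-1]
print(f"mean without largest run = {statistics.fmean(without_max):.4f}")
```

Reporting the median alongside the mean is one way to make a summary less sensitive to a single diverging run.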


## Part B: Normalization and Comparison

In [10]:
# Normalize the data
scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.3, random_state=42)

# Build the neural network model
model = Sequential()
model.add(Dense(10, input_dim=X_train.shape[1], activation='relu'))
model.add(Dense(1))
model.compile(optimizer='adam', loss='mean_squared_error')

# Train the model on the training data
model.fit(X_train, y_train, epochs=50, verbose=0)

# Predict using the trained model
predictions = model.predict(X_test)

# Calculate the Mean Squared Error (MSE)
mse_normalized = mean_squared_error(y_test, predictions)

# Compare the mean baseline MSE over 50 runs with this single normalized run
print(f"Mean Squared Error (Before Normalization): {mean_mse:.4f}")
print(f"Mean Squared Error (After Normalization): {mse_normalized:.4f}")

Mean Squared Error (Before Normalization): 198.8528
Mean Squared Error (After Normalization): 514.8816


In [11]:
# Create an empty list to store MSE values for normalized data
normalized_mse_list = []

# Repeat the process 50 times
for _ in range(50):
    # Normalize the data
    scaler = StandardScaler()
    X_scaled = scaler.fit_transform(X)
    
    # Split the data into training and test sets
    X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.3, random_state=42)
    
    # Build and compile the model
    model = Sequential()
    model.add(Dense(10, input_dim=X_train.shape[1], activation='relu'))
    model.add(Dense(1))
    model.compile(optimizer='adam', loss='mean_squared_error')
    
    # Train the model (suppress verbose output)
    model.fit(X_train, y_train, epochs=50, verbose=0)
    
    # Predict using the trained model
    predictions = model.predict(X_test)
    
    # Calculate and store the MSE value
    mse_normalized = mean_squared_error(y_test, predictions)
    normalized_mse_list.append(mse_normalized)

# Calculate the mean of the mean squared errors for normalized data
mean_normalized_mse = np.mean(normalized_mse_list)





In [12]:
# Print the list of MSE values for each iteration
for i, mse in enumerate(normalized_mse_list):
    mse_box = f"""
    *******************************
    *   Mean Squared Error {i+1:2d}     *
    *                             *
    *         {mse:.4f}            *
    *                             *
    *******************************
    """
    print(mse_box)


    *******************************
    *   Mean Squared Error  1     *
    *                             *
    *         243.5635            *
    *                             *
    *******************************
    

    *******************************
    *   Mean Squared Error  2     *
    *                             *
    *         454.7247            *
    *                             *
    *******************************
    

    *******************************
    *   Mean Squared Error  3     *
    *                             *
    *         354.8817            *
    *                             *
    *******************************
    

    *******************************
    *   Mean Squared Error  4     *
    *                             *
    *         296.6535            *
    *                             *
    *******************************
    

    *******************************
    *   Mean Squared Error  5     *
    *                             *
   

In [13]:
# Calculate the mean and standard deviation of the normalized MSE values
mean_normalized_mse = np.mean(normalized_mse_list)
std_normalized_mse = np.std(normalized_mse_list)

# Report the results
print(f"Mean of Normalized Mean Squared Errors: {mean_normalized_mse:.4f}")
print(f"Standard Deviation of Normalized Mean Squared Errors: {std_normalized_mse:.4f}")

Mean of Normalized Mean Squared Errors: 344.0504
Standard Deviation of Normalized Mean Squared Errors: 90.9970


## Comparison of Mean Squared Errors

### Part A (Non-Normalized Data)
- Mean of Mean Squared Errors: 198.8528
- Standard Deviation of Mean Squared Errors: 361.5544

### Part B (Normalized Data)
- Mean of Normalized Mean Squared Errors: 344.0504
- Standard Deviation of Normalized Mean Squared Errors: 90.9970

Normalizing the inputs changed both the level and the spread of the errors. The mean MSE rose from 198.8528 to 344.0504, so on average the normalized model's predictions are further from the actual values. The standard deviation, in contrast, fell from 361.5544 to 90.9970, so the results are far more consistent from run to run. The baseline's large spread is driven by occasional outlier runs, such as run 2 with an MSE of 2317.5540. The comparison is not like-for-like, however: the Part A loop calls `model.fit` twice, so each baseline model is trained for 100 epochs in total, whereas each Part B model is trained for 50.
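
One further caveat applies to Part B: the cells call `scaler.fit_transform(X)` on the full dataset before splitting, so the test rows contribute to the scaling statistics. A minimal sketch of the leak-free alternative is shown below, using NumPy and made-up data in place of the real features; `standardize_train_test` is a hypothetical helper. With scikit-learn, the same effect is obtained by calling `scaler.fit` on the training rows only and then `scaler.transform` on both sets.

```python
import numpy as np


def standardize_train_test(X_train: np.ndarray, X_test: np.ndarray):
    """Scale both sets using the mean and std of the training rows only."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0] = 1.0  # avoid dividing by zero for constant columns
    return (X_train - mean) / std, (X_test - mean) / std


# Made-up data standing in for the eight concrete features (not the real dataset).
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=300.0, scale=50.0, size=(100, 8))
X_train_demo, X_test_demo = X_demo[:70], X_demo[70:]

X_train_s, X_test_s = standardize_train_test(X_train_demo, X_test_demo)
print(X_train_s.mean(axis=0).round(6))  # training columns are centered
print(X_test_s.mean(axis=0).round(3))   # test columns are close to, not exactly, zero
```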

## Part C: Increase Epochs and Comparison



In [14]:


# Split the data into features (X) and target (y)
X = df.drop(columns=['Strength'])
y = df['Strength']

# Create an empty list to store normalized MSE values
normalized_mse_list_100_epochs = []

# Repeat the process 50 times
for _ in range(50):
    # Normalize the data
    scaler = StandardScaler()
    X_scaled = scaler.fit_transform(X)
    
    # Split the data into training and test sets
    X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.3, random_state=42)
    
    # Build and compile the model
    model = Sequential()
    model.add(Dense(10, input_dim=X_train.shape[1], activation='relu'))
    model.add(Dense(1))
    model.compile(optimizer='adam', loss='mean_squared_error')
    
    # Train the model with 100 epochs (suppress verbose output)
    model.fit(X_train, y_train, epochs=100, verbose=0)
    
    # Predict using the trained model
    predictions = model.predict(X_test)
    
    # Calculate and store the normalized MSE value
    mse_normalized_100_epochs = mean_squared_error(y_test, predictions)
    normalized_mse_list_100_epochs.append(mse_normalized_100_epochs)

# Calculate the mean of the normalized MSE values for 100 epochs
mean_normalized_mse_100_epochs = np.mean(normalized_mse_list_100_epochs)





In [15]:
# Compare the mean squared errors
print(f"Mean of Normalized Mean Squared Errors (50 Epochs): {mean_normalized_mse:.4f}")
print(f"Mean of Normalized Mean Squared Errors (100 Epochs): {mean_normalized_mse_100_epochs:.4f}")


Mean of Normalized Mean Squared Errors (50 Epochs): 344.0504
Mean of Normalized Mean Squared Errors (100 Epochs): 157.6991


## Analysis of Mean Squared Errors with Different Epochs

### Part B (50 Epochs)
- Mean of Normalized Mean Squared Errors: 344.0504

### Part C (100 Epochs)
- Mean of Normalized Mean Squared Errors: 157.6991

Increasing the number of epochs from 50 to 100 lowered the mean MSE on normalized data from 344.0504 to 157.6991, a reduction of roughly 54%. This suggests that the model had not yet converged after 50 epochs and that the additional training brought its test-set predictions closer to the actual values.
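
To illustrate why more epochs can help a model that has not yet converged, the sketch below trains a toy linear model with full-batch gradient descent on synthetic data and reports the training MSE after different numbers of epochs. It is not the Keras network used above (which uses the Adam optimizer and mini-batches); the function name, learning rate, and data are assumptions chosen for illustration.

```python
import numpy as np


def train_linear_model(X, y, epochs, lr=0.01):
    """Full-batch gradient descent on MSE for a linear model; returns the final MSE."""
    n_samples, n_features = X.shape
    weights = np.zeros(n_features)
    bias = 0.0
    for _ in range(epochs):
        residual = X @ weights + bias - y
        weights -= lr * (2.0 / n_samples) * (X.T @ residual)  # gradient w.r.t. weights
        bias -= lr * 2.0 * residual.mean()                    # gradient w.r.t. bias
    return float(np.mean((X @ weights + bias - y) ** 2))


# Synthetic, standardized inputs and a noisy linear target (illustrative only).
rng = np.random.default_rng(0)
X_toy = rng.normal(size=(200, 3))
y_toy = X_toy @ np.array([3.0, -2.0, 1.0]) + 35.0 + rng.normal(scale=0.5, size=200)

for epochs in (50, 100, 1000):
    print(epochs, round(train_linear_model(X_toy, y_toy, epochs), 4))
```

The toy model measures training error only. On held-out data, additional epochs can eventually stop helping or lead to overfitting, so the improvement seen in Part C should not be assumed to continue indefinitely.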


## Part D: Three Hidden Layers and Comparison

In [16]:

# Create an empty list to store normalized MSE values
normalized_mse_list_three_layers = []


# Repeat the process 50 times
for _ in range(50):
    # Normalize the data
    scaler = StandardScaler()
    X_scaled = scaler.fit_transform(X)
    
    # Split the data into training and test sets
    X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.3, random_state=42)
    
    # Build and compile the model with three hidden layers and ReLU activation
    model = Sequential()
    model.add(Dense(10, input_dim=X_train.shape[1], activation='relu'))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(10, activation='relu'))
    model.add(Dense(1))
    model.compile(optimizer='adam', loss='mean_squared_error')
    
    # Train the model (suppress verbose output)
    model.fit(X_train, y_train, epochs=50, verbose=0)
    
    # Predict using the trained model
    predictions = model.predict(X_test)
    
    # Calculate and store the normalized MSE value
    mse_normalized_three_layers = mean_squared_error(y_test, predictions)
    normalized_mse_list_three_layers.append(mse_normalized_three_layers)

# Calculate the mean of the normalized MSE values for three hidden layers
mean_normalized_mse_three_layers = np.mean(normalized_mse_list_three_layers)





In [17]:
# Compare the mean squared errors
print(f"Mean of Normalized Mean Squared Errors: {mean_normalized_mse:.4f}")
print(f"Mean of Normalized Mean Squared Errors (Three Layers): {mean_normalized_mse_three_layers:.4f}")

Mean of Normalized Mean Squared Errors: 344.0504
Mean of Normalized Mean Squared Errors (Three Layers): 126.8933
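
MSE is expressed in squared units of the target, which makes the values hard to read at a glance. As a small aid, the sketch below collects the mean MSE values reported in the outputs above and prints their square roots, which are on the same scale as the `Strength` column. Note that the square root of a mean MSE is not identical to the mean of per-run RMSE values; it is used here only as a rough guide.

```python
import math

# Mean test-set MSE per configuration, copied from the outputs above.
mean_mse_by_part = {
    "Part A (raw features)": 198.8528,
    "Part B (normalized, 50 epochs)": 344.0504,
    "Part C (normalized, 100 epochs)": 157.6991,
    "Part D (normalized, three hidden layers)": 126.8933,
}

for name, mse in mean_mse_by_part.items():
    # The square root puts the error back on the scale of the Strength column.
    print(f"{name:<42} MSE = {mse:9.4f}   sqrt(MSE) = {math.sqrt(mse):6.2f}")
```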


## Analysis of Mean Normalized Mean Squared Errors

### Model with One Hidden Layer
- Mean of Normalized Mean Squared Errors: 344.0504

### Model with Three Hidden Layers
- Mean of Normalized Mean Squared Errors: 126.8933

The model with three hidden layers reduced the mean MSE on normalized data from 344.0504 (one hidden layer) to 126.8933, a reduction of roughly 63%, with both models trained for 50 epochs. This suggests that the deeper network approximates the underlying patterns in the data better and produces more accurate predictions on the test set. The additional hidden layers allow the model to capture more complex relationships between the features and the target.
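
To make the difference in capacity concrete, the sketch below counts the trainable parameters of the two architectures. Each fully connected layer has one weight per input-unit pair plus one bias per unit. `dense_param_count` is a hypothetical helper written for this illustration, and the input size of 8 is taken from the eight predictor columns listed in the missing-value check.

```python
def dense_param_count(n_inputs: int, layer_units: list[int]) -> int:
    """Count weights and biases in a stack of fully connected layers."""
    total = 0
    fan_in = n_inputs
    for units in layer_units:
        total += fan_in * units + units  # weight matrix plus one bias per unit
        fan_in = units
    return total


n_features = 8  # eight predictor columns; Strength is the target
one_hidden = dense_param_count(n_features, [10, 1])
three_hidden = dense_param_count(n_features, [10, 10, 10, 1])
print(one_hidden, three_hidden)  # 101 321
```

The deeper model has roughly three times as many parameters, so part of its advantage may come from added capacity in general rather than from depth specifically.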
