# Regression Model in Keras

## **A. Build a baseline model**

### Downloading the dataset

The data set is read into a pandas DataFrame.

In [4]:
import pandas as pd
import numpy as np

# Load the data
concrete_data = pd.read_csv('https://cocl.us/concrete_data')
concrete_data.head()

| | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age | Strength |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1040.0 | 676.0 | 28 | 79.99 |
| 1 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1055.0 | 676.0 | 28 | 61.89 |
| 2 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 270 | 40.27 |
| 3 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 365 | 41.05 |
| 4 | 198.6 | 132.4 | 0.0 | 192.0 | 0.0 | 978.4 | 825.5 | 360 | 44.3 |


### Splitting the data into predictors and target

The target variable is the concrete sample strength (the `Strength` column). The predictors are all other columns in the data set.

In [2]:
concrete_data.columns  #To show the columns of the data set.

Index(['Cement', 'Blast Furnace Slag', 'Fly Ash', 'Water', 'Superplasticizer',
       'Coarse Aggregate', 'Fine Aggregate', 'Age', 'Strength'],
      dtype='object')

In [3]:
concrete_data_columns = concrete_data.columns

predictors = concrete_data[concrete_data_columns[concrete_data_columns != 'Strength']] # All columns except 'Strength' as predictors
target = concrete_data['Strength'] # 'Strength' column as the target variable

In [5]:
# Number of predictor columns, used as the input dimension of the model
n_cols = predictors.shape[1]

## Building the Neural Network

A function defines the regression model with the following characteristics:

- One hidden layer of 10 nodes with a ReLU activation function.

- The adam optimizer and the mean squared error as the loss function.

The model is then trained and evaluated in the following steps:

1. Randomly split the data into training and test sets, holding out 30% of the data for testing. You can use the `train_test_split` helper function from Scikit-learn.

2. Train the model on the training data using 50 epochs.

3. Evaluate the model on the test data and compute the mean squared error between the predicted concrete strength and the actual concrete strength. You can use the mean_squared_error function from Scikit-learn.
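To make the architecture described above concrete, the following is a minimal sketch of the computation such a network performs for one sample, written without Keras. The function names and the random weights are illustrative assumptions (a trained Keras model learns its own weights), so the printed prediction is meaningless; only the structure matters: a ReLU hidden layer followed by a linear output.

```python
import random


def relu(x):
    """Rectified linear unit: pass positive values, clamp negatives to zero."""
    return x if x > 0.0 else 0.0


def forward(features, hidden_weights, hidden_biases, output_weights, output_bias):
    """Prediction of a one-hidden-layer ReLU regression network for one sample."""
    hidden = [
        relu(sum(w * x for w, x in zip(weights, features)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    # Linear output unit: no activation function for regression
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


# Hypothetical, untrained weights: 8 inputs -> 10 hidden nodes -> 1 output
rng = random.Random(0)
n_inputs, n_hidden = 8, 10
hidden_weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs)] for _ in range(n_hidden)]
hidden_biases = [0.0] * n_hidden
output_weights = [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)]
output_bias = 0.0

# Predictor values of the first row of the data set
sample = [540.0, 0.0, 0.0, 162.0, 2.5, 1040.0, 676.0, 28.0]
prediction = forward(sample, hidden_weights, hidden_biases, output_weights, output_bias)
print(prediction)
```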

In [22]:
import keras
from keras.models import Sequential
from keras.layers import Dense, Input
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

# Defining the regression model
def regression_model():
    model = Sequential()
    # Add an Input layer with the shape of the input data
    model.add(Input(shape=(n_cols,))) 
    # Add a Dense layer with 10 units and ReLU activation
    model.add(Dense(10, activation='relu'))
    # Add an output Dense layer with 1 unit (no activation function for regression)
    model.add(Dense(1))
    # Compile the model with Adam optimizer and mean squared error loss function
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model

# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(predictors, target, test_size=0.3, random_state=42)

# Display the shapes of the resulting datasets
print(f"X_train shape: {X_train.shape}")
print(f"X_test shape: {X_test.shape}")
print(f"y_train shape: {y_train.shape}")
print(f"y_test shape: {y_test.shape}")

# Create the model
model = regression_model()

# Train the model on the training data for 50 epochs
model.fit(X_train, y_train, epochs=50, verbose=1)

# Evaluate the model on the test data to get the loss
loss = model.evaluate(X_test, y_test)
print(f"Test loss: {loss}")

# Make predictions on the test data
y_pred = model.predict(X_test)

# Compute the mean squared error between the predicted and actual values
mse = mean_squared_error(y_test, y_pred)
print(f"Mean Squared Error: {mse}")

X_train shape: (721, 8)
X_test shape: (309, 8)
y_train shape: (721,)
y_test shape: (309,)
Epoch 1/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 2s 1ms/step - loss: 78627.0625
Epoch 2/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 39949.0195
Epoch 3/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 19285.8418
Epoch 4/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 7805.3237
Epoch 5/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 4317.0840
Epoch 6/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 3420.1628
Epoch 7/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 3061.2063
Epoch 8/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 2666.8420
Epoch 9/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step

Once the model has been evaluated on the test data, the **Mean Squared Error is 192.08** for the presented run.
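For reference, the mean squared error is the average of the squared differences between actual and predicted values. The sketch below computes it by hand; `mean_squared_error_manual` is a hypothetical helper written for illustration, and the predicted values are invented, not model output. Only the three actual strengths come from the first rows of the data set.

```python
def mean_squared_error_manual(actual, predicted):
    """Average of squared differences between actual and predicted values."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


# Strength values of the first three rows of the data set
actual = [79.99, 61.89, 40.27]
# Hypothetical predictions, chosen only to illustrate the arithmetic
predicted = [70.0, 60.0, 45.0]

mse = mean_squared_error_manual(actual, predicted)
print(f"Mean Squared Error: {mse:.2f}")
```

Because the errors are squared, the MSE is expressed in squared units of the target; taking its square root gives an error on the same scale as the strength values.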

4. Repeat steps 1 - 3, 50 times, i.e., create a list of 50 mean squared errors.

5. Report the mean and the standard deviation of the mean squared errors.
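Step 4 asks for steps 1 to 3, including the random split, to be repeated. The loop that follows reuses the single split created earlier, so only the weight initialization changes between runs. If each repetition should also draw its own split, one option is to vary the seed per iteration. The sketch below shows the idea with the standard library only; `split_indices` is a hypothetical helper, not a scikit-learn function, and the row count of 1030 is inferred from the 721 + 309 shapes printed above.

```python
import random


def split_indices(n_rows, test_fraction, seed):
    """Return (train_indices, test_indices) for one seeded random split."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = round(n_rows * test_fraction)
    return indices[n_test:], indices[:n_test]


n_rows = 1030  # 721 training rows + 309 test rows

# One independent split per repetition, each with its own seed
splits = [split_indices(n_rows, 0.3, seed) for seed in range(50)]

train_idx, test_idx = splits[0]
print(len(train_idx), len(test_idx))
```

With scikit-learn, the same effect can be obtained by calling `train_test_split` inside the loop and passing a different `random_state` each time.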

In [23]:
# Initialize a list to store the mean squared errors
mse_list = []

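# Repeat the process 50 times
# Note: the train/test split created above (random_state=42) is reused in every
# iteration, so only the random weight initialization differs between runs
for _ in range(50):
    
    # Create and train a fresh model
    model = regression_model()
    model.fit(X_train, y_train, epochs=50, verbose=0)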
    
    # Predict on the test set
    y_pred = model.predict(X_test)
    
    # Compute the mean squared error
    mse = mean_squared_error(y_test, y_pred)
    mse_list.append(mse)


# Compute the mean and standard deviation of the mean squared errors
mean_mse = np.mean(mse_list)
std_mse = np.std(mse_list)


# Print the results
print(f"Mean of Mean Squared Errors: {mean_mse}")
print(f"Standard Deviation of Mean Squared Errors: {std_mse}")

10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━

#### The calculated Mean of Mean Squared Errors is 402.04
#### The calculated Standard Deviation of Mean Squared Errors is 402.36

Both the mean and the standard deviation of the MSEs are high, and the standard deviation is about as large as the mean itself. The test error therefore varies strongly from one training run to the next, which indicates that the baseline model's performance is unstable.
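A standard deviation as large as the mean can arise when a few runs end with a very large error. The sketch below illustrates this with the standard library; `summarize` is a hypothetical helper and the example values are invented for illustration, not taken from the notebook's runs. It uses the population standard deviation, which is the same convention as the default of `np.std`.

```python
import statistics


def summarize(mse_values):
    """Return mean, population standard deviation, and their ratio."""
    mean = statistics.fmean(mse_values)
    std = statistics.pstdev(mse_values)  # divides by n, like np.std by default
    return mean, std, std / mean


# Hypothetical MSE values: three ordinary runs and one run that trained poorly
example_mses = [120.0, 150.0, 180.0, 950.0]
mean, std, ratio = summarize(example_mses)
print(f"mean={mean:.2f}, std={std:.2f}, std/mean={ratio:.2f}")

# The same ratio for the values reported for the baseline model
reported_ratio = 402.36 / 402.04
print(f"reported std/mean={reported_ratio:.3f}")
```

Inspecting the individual entries of `mse_list`, not only their mean, would show whether the spread in the baseline results comes from a handful of such runs.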

## **B. Normalize the data**

Repeat Part A but use a normalized version of the data. Recall that one way to normalize the data is by subtracting the mean from the individual predictors and dividing by the standard deviation.

The first step is to normalize each predictor column by subtracting its mean and dividing by its standard deviation.

In [18]:
predictors_norm = (predictors - predictors.mean()) / predictors.std()
predictors_norm.head()

| | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age |
|---|---|---|---|---|---|---|---|---|
| 0 | 2.476712 | -0.856472 | -0.846733 | -0.916319 | -0.620147 | 0.862735 | -1.217079 | -0.279597 |
| 1 | 2.476712 | -0.856472 | -0.846733 | -0.916319 | -0.620147 | 1.055651 | -1.217079 | -0.279597 |
| 2 | 0.491187 | 0.79514 | -0.846733 | 2.174405 | -1.038638 | -0.526262 | -2.239829 | 3.55134 |
| 3 | 0.491187 | 0.79514 | -0.846733 | 2.174405 | -1.038638 | -0.526262 | -2.239829 | 5.055221 |
| 4 | -0.790075 | 0.678079 | -0.846733 | 0.488555 | -1.038638 | 0.070492 | 0.647569 | 4.976069 |
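The normalization applied to each column can be sketched on a single list of numbers. The example below uses the five Cement values shown at the start of this notebook; `z_score` is a hypothetical helper. It uses the sample standard deviation (dividing by n - 1), which is what pandas' `.std()` computes by default. The results differ from the table above because the notebook uses the mean and standard deviation of the full column, not of five rows.

```python
import statistics


def z_score(values):
    """Subtract the mean and divide by the sample standard deviation."""
    mean = statistics.fmean(values)
    std = statistics.stdev(values)  # sample standard deviation (n - 1)
    return [(v - mean) / std for v in values]


# Cement values of the first five rows of the data set
cement = [540.0, 540.0, 332.5, 332.5, 198.6]
cement_norm = z_score(cement)
print([round(v, 3) for v in cement_norm])
```

After this transformation the values have mean zero and unit standard deviation, so all predictors end up on a comparable scale.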


In [27]:
# Split the normalized data into training and test sets
Xnorm_train, Xnorm_test, y2_train, y2_test = train_test_split(predictors_norm, target, test_size=0.3, random_state=42)

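# Train the model on the normalized training data for 50 epochs
# Note: `model` is not re-created in this cell, so this call continues training
# the existing instance; call regression_model() first to start from new weights
model.fit(Xnorm_train, y2_train, epochs=50, verbose=1) 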

# Make predictions on the test data
ynormed_pred = model.predict(Xnorm_test)

# Compute the mean squared error between the predicted and actual values
mse = mean_squared_error(y2_test, ynormed_pred)
print(f"Mean Squared Error: {mse}")

Epoch 1/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 115.6469
Epoch 2/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1000us/step - loss: 113.9235
Epoch 3/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 112.7434
Epoch 4/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 114.0084
Epoch 5/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 951us/step - loss: 107.2537
Epoch 6/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 954us/step - loss: 125.1565
Epoch 7/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 980us/step - loss: 116.1303
Epoch 8/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 970us/step - loss: 106.4720
Epoch 9/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 109.1621
Epoch 10/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1m

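Once the model has been evaluated on the test data, the **Mean Squared Error is 102.31** for the presented run, well below the 192.08 obtained in the single run on unnormalized data. The training log already starts at a loss of about 115, which suggests that the reused `model` had been trained before this cell ran, so this figure is not from a freshly initialized model.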

Training the model on normalized data tends to give a lower loss (MSE) than training on unnormalized data. Normalization typically means scaling each feature to a mean of zero and a standard deviation of one, or scaling it to a specific range (e.g., 0 to 1). Here, the first approach helped improve the performance of the model.
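In this notebook the mean and standard deviation are computed on the full data set before it is split. A common variant, shown here only as an assumption-laden sketch and not as what the notebook does, is to estimate them on the training rows alone and reuse them for the test rows, so that no information from the test set enters the preprocessing. `fit_scaler` and `apply_scaler` are hypothetical helpers, and the five Water values from the first rows of the data set are split arbitrarily into four training rows and one test row.

```python
import statistics


def fit_scaler(train_values):
    """Estimate mean and sample standard deviation from training data only."""
    return statistics.fmean(train_values), statistics.stdev(train_values)


def apply_scaler(values, mean, std):
    """Normalize values with previously estimated statistics."""
    return [(v - mean) / std for v in values]


# Water values of the first five rows, split arbitrarily for illustration
train_water = [162.0, 162.0, 228.0, 228.0]
test_water = [192.0]

mean, std = fit_scaler(train_water)
train_norm = apply_scaler(train_water, mean, std)
test_norm = apply_scaler(test_water, mean, std)  # reuses the training statistics
print(train_norm, test_norm)
```

With pandas, the equivalent would be to call `train_test_split` first and then compute `.mean()` and `.std()` on the training predictors only.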

With the normalized data, the following steps are performed:

4. Repeat steps 1 - 3, 50 times, i.e., create a list of 50 mean squared errors.

5. Report the mean and the standard deviation of the mean squared errors.

In [28]:
# Initialize a list to store the mean squared errors
norm_mse_list = []

# Repeat the process 50 times
for _ in range(50):
    
    # Create and train the model
    model = regression_model()
    model.fit(Xnorm_train, y2_train, epochs=50, verbose=0)
    
    # Predict on the test set
    y2_pred = model.predict(Xnorm_test)
    
    # Compute the mean squared error
    mse2 = mean_squared_error(y2_test, y2_pred)
    norm_mse_list.append(mse2)


# Compute the mean and standard deviation of the mean squared errors
mean_mse = np.mean(norm_mse_list)
std_mse = np.std(norm_mse_list)


# Print the results
print(f"Mean of Mean Squared Errors: {mean_mse}")
print(f"Standard Deviation of Mean Squared Errors: {std_mse}")

10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 7ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━

### How does the mean of the mean squared errors compare to that from Step A?

Considering the initial results:

Step A: 
- Mean of Mean Squared Errors: 402.04
- Standard Deviation of Mean Squared Errors: 402.36

Normalizing the data gives the following:

- Mean of Mean Squared Errors: 345.49
- Standard Deviation of Mean Squared Errors: 112.67

The mean MSE decreased from 402.04 to 345.49 after normalizing the data, a reduction of about 14%, so the model's average predictive accuracy improved moderately. The standard deviation of the MSE dropped much more, from 402.36 to 112.67 (about 72%), which means the model's performance became far more consistent across training runs. The main benefit of normalization here is therefore more stable and reliable training.

## **C. Increase the number of epochs**

Repeat Part B but use 100 epochs this time for training.

How does the mean of the mean squared errors compare to that from Step B?

- First, the model will be trained with 100 epochs.

In [30]:
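# Train the model on the normalized training data for 100 epochs
# Note: `model` is not re-created in this cell, so these 100 epochs are added
# on top of the training the existing instance has already received
model.fit(Xnorm_train, y2_train, epochs=100, verbose=1) 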

# Make predictions on the test data
ynormed_pred = model.predict(Xnorm_test)

# Compute the mean squared error between the predicted and actual values
mse = mean_squared_error(y2_test, ynormed_pred)
print(f"Mean Squared Error: {mse}")

Epoch 1/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 111.0833
Epoch 2/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 114.7939
Epoch 3/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 126.2253
Epoch 4/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 103.8494
Epoch 5/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 111.2231
Epoch 6/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1000us/step - loss: 109.0545
Epoch 7/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 116.1826
Epoch 8/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 108.4334
Epoch 9/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1000us/step - loss: 106.1133
Epoch 10/100
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s

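Once the model has been evaluated on the test data, the **Mean Squared Error is 75.67** for the presented run, an improvement over the 102.31 of the 50-epoch run that suggests better accuracy. Since the log starts at a loss of about 111, this run appears to continue training an existing model instead of starting from new weights.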

Now, the following steps are performed:

4. Repeat steps 1 - 3, 50 times, i.e., create a list of 50 mean squared errors (100 epochs each).

5. Report the mean and the standard deviation of the mean squared errors.

In [31]:
# Initialize a list to store the mean squared errors
norm_mse_list = []

# Repeat the process 50 times
for _ in range(50):
    
    # Create and train the model
    model = regression_model()
    model.fit(Xnorm_train, y2_train, epochs=100, verbose=0)
    
    # Predict on the test set
    y2_pred = model.predict(Xnorm_test)
    
    # Compute the mean squared error
    mse2 = mean_squared_error(y2_test, y2_pred)
    norm_mse_list.append(mse2)


# Compute the mean and standard deviation of the mean squared errors
mean_mse = np.mean(norm_mse_list)
std_mse = np.std(norm_mse_list)


# Print the results
print(f"Mean of Mean Squared Errors: {mean_mse}")
print(f"Standard Deviation of Mean Squared Errors: {std_mse}")

10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 4ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━

### How does the mean of the mean squared errors compare to that from Step A and Step B?

Considering the results from steps A, B and C:

Step A: 
- Mean of Mean Squared Errors: 402.04
- Standard Deviation of Mean Squared Errors: 402.36

Step B (Normalized data, 50 epochs): 
- Mean of Mean Squared Errors: 345.49
- Standard Deviation of Mean Squared Errors: 112.67

Step C (Normalized data, 100 epochs): 
- Mean of Mean Squared Errors: 160.89
- Standard Deviation of Mean Squared Errors: 17.36

Data normalization and longer training had a significant impact on both accuracy and consistency. The mean MSE decreased from 402.04 (Step A) to 345.49 (Step B) and further to 160.89 (Step C), indicating that the model's predictions became more accurate.

The standard deviation of the MSE decreased from 402.36 (Step A) to 112.67 (Step B) and then to 17.36 (Step C). This reduction in variability shows that the model's performance became more stable and reliable.
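The size of each improvement can be expressed as a relative reduction. The short sketch below recomputes these percentages from the values reported above; `relative_reduction` is a hypothetical helper introduced for illustration.

```python
# Mean and standard deviation of the MSE as reported for each step
results = {
    "A": {"mean": 402.04, "std": 402.36},
    "B": {"mean": 345.49, "std": 112.67},
    "C": {"mean": 160.89, "std": 17.36},
}


def relative_reduction(before, after):
    """Fraction by which a value decreased, relative to its starting point."""
    return (before - after) / before


for prev, curr in [("A", "B"), ("B", "C")]:
    mean_drop = relative_reduction(results[prev]["mean"], results[curr]["mean"])
    std_drop = relative_reduction(results[prev]["std"], results[curr]["std"])
    print(f"Step {prev} -> Step {curr}: mean MSE down {mean_drop:.1%}, std down {std_drop:.1%}")
```

Seen this way, doubling the number of epochs reduced the mean MSE considerably more than normalization alone did, while both steps reduced the spread substantially.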

## **D. Increase the number of hidden layers**

Repeat part B but use a neural network with the following instead:

- Three hidden layers, each of 10 nodes and ReLU activation function.

How does the mean of the mean squared errors compare to that from Step B?

First, the original neural network is modified.

In [34]:
# Defining the regression model
def regression_model():
    model = Sequential()
    # Add an Input layer with the shape of the input data
    model.add(Input(shape=(n_cols,))) 
    # Add a Dense layer with 10 units and ReLU activation
    model.add(Dense(10, activation='relu'))
    # Add a second Dense layer 
    model.add(Dense(10, activation='relu'))
    # Add a third Dense layer 
    model.add(Dense(10, activation='relu'))
    # Add an output Dense layer with 1 unit (no activation function for regression)
    model.add(Dense(1))
    # Compile the model with Adam optimizer and mean squared error loss function
    model.compile(optimizer='adam', loss='mean_squared_error')
    return model
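To give a sense of how much capacity the two extra hidden layers add, the number of trainable parameters of a stack of fully connected layers can be counted by hand: each layer has one weight per input-output pair plus one bias per unit. The sketch below assumes 8 input columns, as shown by the shape of `X_train`; `dense_parameter_count` is a hypothetical helper, not a Keras function.

```python
def dense_parameter_count(n_inputs, layer_units):
    """Count weights and biases of a stack of fully connected layers."""
    total = 0
    previous = n_inputs
    for units in layer_units:
        total += previous * units + units  # weight matrix plus bias vector
        previous = units
    return total


# Baseline: 8 inputs -> 10 hidden units -> 1 output
baseline_params = dense_parameter_count(8, [10, 1])
# Deeper model: 8 inputs -> 10 -> 10 -> 10 hidden units -> 1 output
deeper_params = dense_parameter_count(8, [10, 10, 10, 1])

print(baseline_params, deeper_params)
```

In Keras, `model.summary()` reports the same counts for a built model.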

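Next, the model is trained on the normalized data for 50 epochs.

In [38]:
# Train the model on the normalized training data for 50 epochs
# Note: `model` is not re-created in this cell, so this call trains the existing
# instance; call regression_model() first to build the three-hidden-layer network
model.fit(Xnorm_train, y2_train, epochs=50, verbose=1) 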

# Make predictions on the test data
ynormed_pred = model.predict(Xnorm_test)

# Compute the mean squared error between the predicted and actual values
mse = mean_squared_error(y2_test, ynormed_pred)
print(f"Mean Squared Error: {mse}")

Epoch 1/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 972us/step - loss: 72.4482
Epoch 2/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 69.8877
Epoch 3/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 74.5409
Epoch 4/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 71.0963
Epoch 5/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 909us/step - loss: 70.7329
Epoch 6/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 948us/step - loss: 81.6221
Epoch 7/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 954us/step - loss: 73.8518
Epoch 8/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 936us/step - loss: 73.1839
Epoch 9/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1000us/step - loss: 67.6904
Epoch 10/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1000us/s

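Once the model has been evaluated on the test data, the **Mean Squared Error is 76.16** for the presented run. This is lower than the 102.31 of the 50-epoch run in Part B and similar to the 75.67 of the 100-epoch run in Part C. Because the training cell does not re-create `model` and the log starts at a loss of about 72, this single run may reflect further training of an existing model and not a newly built three-hidden-layer network. The 50-run experiment that follows builds a fresh model each time and is the more reliable comparison.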

Now, the following steps are performed:

4. Repeat steps 1 - 3, 50 times, i.e., create a list of 50 mean squared errors.

5. Report the mean and the standard deviation of the mean squared errors.

In [39]:
# Initialize a list to store the mean squared errors
norm_mse_list = []

# Repeat the process 50 times
for _ in range(50):
    
    # Create and train the model
    model = regression_model()
    model.fit(Xnorm_train, y2_train, epochs=50, verbose=0)
    
    # Predict on the test set
    y2_pred = model.predict(Xnorm_test)
    
    # Compute the mean squared error
    mse2 = mean_squared_error(y2_test, y2_pred)
    norm_mse_list.append(mse2)


# Compute the mean and standard deviation of the mean squared errors
mean_mse = np.mean(norm_mse_list)
std_mse = np.std(norm_mse_list)


# Print the results
print(f"Mean of Mean Squared Errors: {mean_mse}")
print(f"Standard Deviation of Mean Squared Errors: {std_mse}")

10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 8ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 10ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 1s 128ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 9ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 5ms/step
10/10 ━━━━━━━━━━━━━━━━━━━━

### How does the mean of the mean squared errors compare to that from Step B?

Considering the results from Step B and Step D:

Step B (Normalized data, 1 hidden layer (10 nodes) - 50 epochs): 
- Mean of Mean Squared Errors: 345.49
- Standard Deviation of Mean Squared Errors: 112.67

Step D (Normalized data, 3 hidden layers (10 nodes each) - 50 epochs): 
- Mean of Mean Squared Errors: 125.98
- Standard Deviation of Mean Squared Errors: 11.68

Adding more hidden layers substantially improved the model: the mean MSE fell from 345.49 to 125.98 and the standard deviation from 112.67 to 11.68. The deeper network was able to capture more complex patterns in the data, which suggests that model depth matters for predictive performance on this data set.

