#### D. Increase the number of hidden layers (5 marks)

Repeat part B but use a neural network with the following instead:

- Three hidden layers, each of 10 nodes and ReLU activation function.

How does the mean of the mean squared errors compare to that from Step B?

In [1]:
# Import necessary libraries
import numpy as np  # For numerical operations
import pandas as pd  # For data manipulation

from sklearn.metrics import mean_squared_error  # To calculate the Mean Squared Error
from sklearn.model_selection import train_test_split  # To split the dataset into training and test sets
from sklearn.preprocessing import StandardScaler  # To normalize the data
from tensorflow.keras.layers import Dense  # To define the layers of the neural network
from tensorflow.keras.models import Sequential  # To build the neural network
from tensorflow.keras.optimizers import Adam  # Optimizer for training the model

In [2]:
# Load the dataset
url = "concrete_data.csv"  # Path to the dataset
data = pd.read_csv(url)  # Load the dataset into a pandas DataFrame

In [3]:
# Split data into predictors (X) and target variable (y)
X = data.drop("Strength", axis=1)  # Features/predictors (all columns except "Strength")
y = data["Strength"]  # Target variable ("Strength")

In [4]:
# Part A: Build Baseline Model
def baseline_model(X, y):
    """
    Builds and evaluates a baseline regression model using Keras.
    The model:
    - Has one hidden layer with 10 nodes and ReLU activation.
    - Uses the Adam optimizer and mean squared error loss function.
    The process is repeated 50 times, and the mean and standard deviation of the MSEs are computed.

    Parameters:
    X: Features (predictors)
    y: Target variable (concrete strength)

    Returns:
    Mean and standard deviation of the MSEs from 50 iterations.
    """
    mse_list = []  # List to store MSEs from each iteration

    # Repeat the process 50 times
    for _ in range(50):
        # Split the dataset into training (70%) and testing (30%) sets
        X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=np.random.randint(0, 100))
        
        # Build the neural network model
        model = Sequential([
            Dense(10, activation='relu', input_shape=(X_train.shape[1],)),  # Hidden layer with 10 nodes
            Dense(1)  # Output layer with a single node (for regression)
        ])
        
        # Compile the model using Adam optimizer and mean squared error loss
        model.compile(optimizer=Adam(), loss='mean_squared_error')
        
        # Train the model on the training data for 50 epochs
        model.fit(X_train, y_train, epochs=50, verbose=0)
        
        # Evaluate the model on the test data
        y_pred = model.predict(X_test, verbose=0)  # Predict on the test set
        mse = mean_squared_error(y_test, y_pred)  # Calculate Mean Squared Error
        mse_list.append(mse)  # Append the MSE to the list
    
    # Return the mean and standard deviation of the MSEs
    return np.mean(mse_list), np.std(mse_list)

# Part B: Normalize the data
def normalized_model(X, y):
    """
    Builds and evaluates a regression model using normalized data.
    The normalization process:
    - Subtracts the mean and divides by the standard deviation for each feature.
    The rest of the process is the same as the baseline model, with 50 iterations.

    Parameters:
    X: Features (predictors)
    y: Target variable (concrete strength)

    Returns:
    Mean and standard deviation of the MSEs from 50 iterations with normalized data.
    """
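    # Normalize the data using StandardScaler
    # Note: the scaler is fit on the full dataset before the train/test split,
    # so the test rows contribute to the mean and standard deviation.
    scaler = StandardScaler()  # Initialize the scaler
    X_normalized = scaler.fit_transform(X)  # Fit the scaler to X and transform it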
    
    # Call the baseline model function with the normalized data
    return baseline_model(X_normalized, y)

# Part D: Increase the number of hidden layers
def increased_layers_model(X, y):
    """
    Builds and evaluates a regression model with three hidden layers.
    The model:
    - Has three hidden layers, each with 10 nodes and ReLU activation.
    - Uses the Adam optimizer and mean squared error loss function.
    The dataset is normalized before training, and the process is repeated 50 times.

    Parameters:
    X: Features (predictors)
    y: Target variable (concrete strength)

    Returns:
    Mean and standard deviation of the MSEs from 50 iterations with increased hidden layers.
    """
    mse_list = []  # List to store MSEs from each iteration

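    # Normalize each feature to zero mean and unit variance
    # Note: the scaler is fit on the full dataset before the train/test split,
    # so the test rows contribute to the mean and standard deviation.
    scaler = StandardScaler()  # Initialize the StandardScaler
    X_normalized = scaler.fit_transform(X)  # Fit and transform X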

    # Repeat the process 50 times for robust evaluation
    for _ in range(50):
        # Split the normalized dataset into training and testing sets (70% train, 30% test)
        X_train, X_test, y_train, y_test = train_test_split(
            X_normalized, y, test_size=0.3, random_state=np.random.randint(0, 100)
        )
        
        # Build the neural network model with three hidden layers
        model = Sequential([
            Dense(10, activation='relu', input_shape=(X_train.shape[1],)),  # First hidden layer with 10 nodes
            Dense(10, activation='relu'),  # Second hidden layer with 10 nodes
            Dense(10, activation='relu'),  # Third hidden layer with 10 nodes
            Dense(1)  # Output layer with a single node (for regression task)
        ])
        
        # Compile the model using Adam optimizer and mean squared error loss function
        model.compile(optimizer=Adam(), loss='mean_squared_error')
        
        # Train for 50 epochs, as in Part B, so that depth is the only change.
        # The extra layers add capacity, which can help the model fit non-linear patterns.
        model.fit(X_train, y_train, epochs=50, verbose=0)
        
        # Evaluate the model on the test data
        y_pred = model.predict(X_test, verbose=0)  # Predict on the test set
        mse = mean_squared_error(y_test, y_pred)  # Calculate the Mean Squared Error
        mse_list.append(mse)  # Append the MSE to the list
    
    # Return the mean and standard deviation of the MSEs
    return np.mean(mse_list), np.std(mse_list)

In [5]:
# Call the function to run the model with increased hidden layers and capture the results
mean_d, std_d = increased_layers_model(X, y)

In [6]:
print("Part D - Increased Hidden Layers: Mean MSE =", mean_d, "Std MSE =", std_d)

Part D - Increased Hidden Layers: Mean MSE = 129.39389660207763 Std MSE = 13.736224909535


### **Comparison of Results:**

| **Model** | **Mean MSE** | **Std MSE** | **Observations** |
|-----------|--------------|-------------|------------------|
| **Part A - Baseline** | 283.68 | 236.16 | Unnormalized data, one hidden layer, 50 epochs. Std MSE is almost as large as the Mean MSE, so results vary widely between iterations. |
| **Part B - Normalized** | 369.76 | 93.85 | Normalized data, otherwise as Part A. Mean MSE is higher than in Part A after 50 epochs, but Std MSE falls by more than half, so results are more consistent. |
| **Part C - Increased Epochs** | 170.80 | 19.65 | Training for 100 epochs lowers Mean MSE well below Parts A and B and sharply reduces Std MSE. |
| **Part D - Increased Hidden Layers** | 129.39 | 13.74 | Three hidden layers, 50 epochs, normalized data. Lowest Mean MSE and lowest Std MSE of the four configurations. |
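
The question for this part asks how the mean MSE compares with Part B. A minimal sketch of that comparison, using only the Mean MSE values copied from the table above (the dictionary keys and the helper name `percent_change` are illustrative, not part of the notebook):

```python
# Mean MSE values copied from the comparison table.
mean_mse = {
    "A (baseline)": 283.68,
    "B (normalized)": 369.76,
    "C (100 epochs)": 170.80,
    "D (3 hidden layers)": 129.39,
}

reference = mean_mse["B (normalized)"]  # Part B is the reference for Part D


def percent_change(value, reference):
    """Signed percentage change of `value` relative to `reference`."""
    return 100.0 * (value - reference) / reference


changes = {name: percent_change(v, reference) for name, v in mean_mse.items()}

for name, pct in changes.items():
    print(f"Part {name}: {pct:+.1f}% vs Part B")
```

On these numbers, Part D's mean MSE is about 65% lower than Part B's, and Part C's is about 54% lower.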

---

### **Conclusions:**

1. **Baseline Model (Part A):**
   - The baseline model achieves a **Mean MSE of 283.68**, but its high **Std MSE of 236.16** reflects inconsistent performance.
   - This suggests the need for further optimization (e.g., normalization, training adjustments, or architecture changes).

2. **Normalized Model (Part B):**
   - Normalizing the data leads to a **higher Mean MSE (369.76)** but a much **lower Std MSE (93.85)**.
   - The trade-off indicates that normalization improves model consistency but may require other adjustments (like epochs or layers) to achieve better accuracy.

3. **Increased Epochs (Part C):**
   - Training for 100 epochs reduces the **Mean MSE to 170.80** and the **Std MSE to 19.65**.
   - This suggests that the model was under-trained at 50 epochs: with more epochs it fits the data more closely, improving both accuracy and stability.

4. **Increased Hidden Layers (Part D):**
   - Using three hidden layers instead of one gives the **lowest Mean MSE (129.39)** and the **lowest Std MSE (13.74)** of the four configurations.
   - Compared with Part B (Mean MSE 369.76), which differs only in depth, the mean of the MSEs is about 65% lower. This suggests that the deeper network captures non-linear relationships in the data that a single hidden layer does not fit within 50 epochs.
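
To make "model depth" concrete, the sketch below counts the trainable weights and biases in the one-hidden-layer and three-hidden-layer architectures. The helper `dense_param_count` is hypothetical, and `n_features = 8` is an assumption about the number of predictor columns; in the notebook the actual value is `X.shape[1]`.

```python
def dense_param_count(n_inputs, hidden_sizes, n_outputs=1):
    """Count weights and biases in a stack of fully connected layers."""
    total = 0
    previous = n_inputs
    for units in list(hidden_sizes) + [n_outputs]:
        # One weight per (input, unit) pair plus one bias per unit
        total += previous * units + units
        previous = units
    return total


n_features = 8  # assumption: number of predictor columns (X.shape[1] in the notebook)

one_layer = dense_param_count(n_features, [10])             # Part B architecture
three_layers = dense_param_count(n_features, [10, 10, 10])  # Part D architecture

print("One hidden layer:   ", one_layer, "parameters")
print("Three hidden layers:", three_layers, "parameters")
```

Under that assumption the deeper network has roughly three times as many parameters (321 versus 101), which is one way to quantify the added capacity. For a built Keras model, `model.count_params()` reports the same kind of total.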

---

### **Overall Summary:**

- **Accuracy (Mean MSE):**  
  Mean MSE rises from Part A (283.68) to Part B (369.76), then falls in Part C (170.80) and again in Part D (129.39), the lowest of the four. Relative to Part B, both longer training and additional hidden layers improve predictive accuracy.

- **Stability (Std MSE):**  
  The standard deviation of the MSE falls at every step, from 236.16 in Part A to 13.74 in Part D. This suggests that normalization, longer training, and greater depth each make the results more consistent across random train/test splits.

- **Best Model:**  
  **Part D (Increased Hidden Layers)** has both the lowest Mean MSE and the lowest Std MSE, making it the most effective of the four configurations tested.
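
As a rough check on how large the gap between Part B and Part D is relative to run-to-run noise, the sketch below computes the standard error of each mean from the reported Std MSE and the 50 repetitions. It treats the 50 runs as independent, which is only approximately true here because the split seeds are drawn from a range of 100 values and can repeat; the variable names are illustrative.

```python
import math

n_runs = 50  # repetitions per configuration, as in the functions above

# (Mean MSE, Std MSE) copied from the comparison table
part_b = (369.76, 93.85)
part_d = (129.39, 13.74)


def standard_error(std, n):
    """Standard error of a mean estimated from n runs."""
    return std / math.sqrt(n)


se_b = standard_error(part_b[1], n_runs)
se_d = standard_error(part_d[1], n_runs)

difference = part_b[0] - part_d[0]
se_difference = math.sqrt(se_b**2 + se_d**2)  # assumes the two sets of runs are independent
ratio = difference / se_difference

print(f"Difference in mean MSE: {difference:.2f}")
print(f"Standard error of the difference: {se_difference:.2f}")
print(f"Difference / standard error: {ratio:.1f}")
```

The difference is many times larger than its standard error, so under these assumptions the improvement from Part B to Part D is unlikely to be an artifact of the random splits.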

---

### **Recommendations:**

- Use **normalized data**, train for more epochs, and use a deeper model (multiple hidden layers); in these experiments each change lowered either the error or its variability.
- Future experiments could tune further hyperparameters (e.g., number of layers, nodes per layer, learning rate) or add regularization.
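
One refinement to the normalization step is to compute the mean and standard deviation from the training rows only and reuse them for the test rows, so that test data does not influence preprocessing. The sketch below illustrates this with NumPy on a hypothetical random array (`X_demo` is a stand-in, not the concrete dataset); it is an alternative to the procedure used in the functions above, not a description of it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical stand-in for the predictors: 100 rows, 8 columns.
X_demo = rng.normal(loc=50.0, scale=10.0, size=(100, 8))

# 70/30 split by shuffled row index
indices = rng.permutation(len(X_demo))
n_train = int(0.7 * len(X_demo))
X_train = X_demo[indices[:n_train]]
X_test = X_demo[indices[n_train:]]

# Statistics come from the training rows only
train_mean = X_train.mean(axis=0)
train_std = X_train.std(axis=0)

# The same training statistics are applied to both splits
X_train_scaled = (X_train - train_mean) / train_std
X_test_scaled = (X_test - train_mean) / train_std

print(X_train_scaled.mean(axis=0).round(6))  # approximately 0 for every column
print(X_test_scaled.mean(axis=0).round(3))   # close to 0, but not exactly
```

With scikit-learn, the equivalent is to call `StandardScaler().fit(X_train)` inside the loop after `train_test_split`, then `transform` both `X_train` and `X_test`.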