# Concrete Network
Assignment submission for Introduction to Deep Learning

**Part B.** Normalize the data

*Task:* Repeat the task from Part A. but with normalized data.

*Reminder:* In Part A. we were asked to build a neural network with
- one hidden layer of 10 nodes and ReLU activation functions
- **adam** optimizer and **mean squared error** as loss function

## Preparation: Loading modules and data
We install required packages, load all required modules, load the data,
and split it into a feature dataframe `features` and a target series
`target`.

In [1]:
# For compatibility, we install the required packages in the same
# version as used in the course labs.
%pip install numpy==2.0.2
%pip install pandas==2.2.2
%pip install tensorflow_cpu==2.18.0
%pip install scikit-learn==1.6.1

Note: you may need to restart the kernel to use updated packages.


We import the required Python modules and load the concrete data from
the specified URL.

In [2]:
import numpy as np
import pandas as pd
from tensorflow import keras
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error
from numpy.typing import ArrayLike
from IPython.display import clear_output

class ClearDisplay(keras.callbacks.Callback):
    """A simple custom Keras callback that clears the cell output at
    the start of each epoch, so only the current epoch is displayed."""

    def on_epoch_begin(self, epoch, logs=None):
        clear_output()

filepath='https://cocl.us/concrete_data'
concrete_data = pd.read_csv(filepath)

concrete_data.describe()



| | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age | Strength |
|---|---|---|---|---|---|---|---|---|---|
| count | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 |
| mean | 281.167864 | 73.895825 | 54.18835 | 181.567282 | 6.20466 | 972.918932 | 773.580485 | 45.662136 | 35.817961 |
| std | 104.506364 | 86.279342 | 63.997004 | 21.354219 | 5.973841 | 77.753954 | 80.17598 | 63.169912 | 16.705742 |
| min | 102.0 | 0.0 | 0.0 | 121.8 | 0.0 | 801.0 | 594.0 | 1.0 | 2.33 |
| 25% | 192.375 | 0.0 | 0.0 | 164.9 | 0.0 | 932.0 | 730.95 | 7.0 | 23.71 |
| 50% | 272.9 | 22.0 | 0.0 | 185.0 | 6.4 | 968.0 | 779.5 | 28.0 | 34.445 |
| 75% | 350.0 | 142.95 | 118.3 | 192.0 | 10.2 | 1029.4 | 824.0 | 56.0 | 46.135 |
| max | 540.0 | 359.4 | 200.1 | 247.0 | 32.2 | 1145.0 | 992.6 | 365.0 | 82.6 |


As seen before, the data is already clean, so we can proceed to isolate
**Strength** as our target.

In [3]:
target_name = 'Strength'
target = concrete_data[target_name]
features = concrete_data.drop(target_name, axis=1)
features.head()

| | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age |
|---|---|---|---|---|---|---|---|---|
| 0 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1040.0 | 676.0 | 28 |
| 1 | 540.0 | 0.0 | 0.0 | 162.0 | 2.5 | 1055.0 | 676.0 | 28 |
| 2 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 270 |
| 3 | 332.5 | 142.5 | 0.0 | 228.0 | 0.0 | 932.0 | 594.0 | 365 |
| 4 | 198.6 | 132.4 | 0.0 | 192.0 | 0.0 | 978.4 | 825.5 | 360 |


## Build the model
(unchanged from Part A.)

In [4]:
def baseline_model(input_shape: tuple) -> keras.models.Sequential:
    """Build a baseline neural network with Keras.

    The model uses ReLU activation functions, adam optimizer and
    mean squared error as loss function.

    Args:
        input_shape (tuple) : The shape of the input data samples

    Returns:
        Model with one hidden layer of 10 nodes
    """
    model = keras.models.Sequential()
    model.add(keras.layers.Input(input_shape)) # input layer 
    model.add(keras.layers.Dense(10, activation='relu')) # hidden layer
    model.add(keras.layers.Dense(1)) # output layer

    model.compile(optimizer='adam', loss='mean_squared_error')
    return model
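
As a quick check on the architecture, the number of trainable parameters can be counted by hand: a dense layer with `n` inputs and `m` units has `n * m` weights plus `m` biases. The sketch below assumes the 8 feature columns shown in `features.head()`; `dense_params` is a hypothetical helper introduced here for illustration, not a Keras function.

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Number of weights plus biases in a fully connected layer."""
    return n_inputs * n_units + n_units


n_features = 8  # assumption: the 8 feature columns of the concrete data

hidden_params = dense_params(n_features, 10)  # hidden layer with 10 nodes
output_params = dense_params(10, 1)           # single output node
total_params = hidden_params + output_params

print(hidden_params, output_params, total_params)
```

The total should agree with the parameter count that `model.summary()` reports for this model.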


## 1. Split the data
We use `train_test_split` from *scikit-learn* for splitting our data.
Before doing so, however, we normalize both the target and feature data.
We apply a standard scaling, shifting values by the mean and dividing by
the standard deviation, such that the rescaled values have a mean of 0
and a standard deviation of 1. The `StandardScaler` from *scikit-learn*
does this out of the box.
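
The standard scaling described above can be written out by hand as `z = (x - mean) / std`. The following minimal sketch uses only the Python standard library and a handful of illustrative values; `standardize` and `unstandardize` are hypothetical helper names. It assumes the population standard deviation (dividing by `n` rather than `n - 1`), which is what `StandardScaler` uses.

```python
from statistics import fmean, pstdev


def standardize(values):
    """Return the z-scores plus the mean and population std used."""
    mu = fmean(values)
    sigma = pstdev(values)  # population std (divides by n)
    return [(v - mu) / sigma for v in values], mu, sigma


def unstandardize(z_scores, mu, sigma):
    """Invert the scaling: map z-scores back to the original units."""
    return [z * sigma + mu for z in z_scores]


sample = [67.87, 29.22, 17.58, 49.77, 41.72]  # illustrative values only
z, mu, sigma = standardize(sample)
restored = unstandardize(z, mu, sigma)

print([round(v, 2) for v in z])         # rescaled values: mean 0, std 1
print([round(v, 2) for v in restored])  # back on the original scale
```

The second helper mirrors what `inverse_transform` does for a fitted scaler.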

Note that we must scale target and features separately and retain the
scaler for the target so we can scale predicted values back to actual
concrete strength values later.

In [5]:
# Rescale the features:
X = StandardScaler().fit_transform(features)

# Rescale the target, keep a reference to the scaler to reverse the
# transformation later:
target_scaler = StandardScaler()
y = target_scaler.fit_transform(target.values.reshape(-1, 1))

# Split the data into training and testing data:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
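
The cell above fits the scalers on the full data set before splitting, so the test samples contribute to the mean and standard deviation. A common alternative, not used in this notebook, is to split first and compute the scaling statistics from the training portion only. The sketch below illustrates that ordering with the standard library and a synthetic single feature; all names and values are assumptions for illustration.

```python
import random
from statistics import fmean, pstdev

random.seed(0)
# Synthetic stand-in for one feature column (not the concrete data).
values = [random.uniform(100, 500) for _ in range(20)]

# 1. Split first: shuffle, then take 70 % for training and 30 % for testing.
shuffled = values[:]
random.shuffle(shuffled)
n_train = (7 * len(shuffled)) // 10
train, test = shuffled[:n_train], shuffled[n_train:]

# 2. Compute the scaling statistics from the training split only.
mu, sigma = fmean(train), pstdev(train)

# 3. Apply the same statistics to both splits.
train_z = [(v - mu) / sigma for v in train]
test_z = [(v - mu) / sigma for v in test]

print(f"train: mean {fmean(train_z):.2f}, std {pstdev(train_z):.2f}")
print(f"test:  mean {fmean(test_z):.2f}, std {pstdev(test_z):.2f}")
```

With scikit-learn, the same ordering would correspond to calling `fit` on the training split and `transform` on both splits. The test split then generally has a mean and standard deviation only close to, not exactly, 0 and 1.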

## 2. Train the model
We build our model and train it on the training data for 50 epochs.

In [6]:
model = baseline_model(X_train[0].shape) # initialize model
model.fit(X_train, y_train, epochs=50, callbacks=[ClearDisplay()])

Epoch 50/50
23/23 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.1818


<keras.src.callbacks.history.History at 0x7d5899afb1d0>
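
The `23/23` in the progress bar is the number of batches per epoch. A rough check, assuming the Keras default batch size of 32 (the `fit` call above does not set `batch_size`) and that `train_test_split` rounds the test share up:

```python
import math

n_samples = 1030                        # rows in the concrete data
n_test = math.ceil(n_samples * 3 / 10)  # test_size=0.3, rounded up
n_train = n_samples - n_test

batch_size = 32  # assumption: Keras default, not set explicitly above

steps_train = math.ceil(n_train / batch_size)
steps_test = math.ceil(n_test / batch_size)

print(n_train, n_test, steps_train, steps_test)
```

Under the same assumption, the second step count corresponds to the `10/10` shown when predicting on the test data.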

## 3. Evaluate the model
We evaluate the model on the test data and compute the mean squared
error between the predicted and actual values for the concrete strength.

In [7]:
y_pred = model.predict(X_test)
mse = mean_squared_error(y_test, y_pred)
print(f"The mean squared error for the normalized target is {mse:.4f}")

# Undo the rescaling of the StandardScaler to give the MSE in the same
# units (squared) as the concrete strength:
strength_true = target_scaler.inverse_transform(y_test)
strength_pred = target_scaler.inverse_transform(y_pred)
strength_mse = mean_squared_error(strength_true, strength_pred)
print(f"The MSE for the concrete strength is {strength_mse:.4f}")

10/10 ━━━━━━━━━━━━━━━━━━━━ 0s 3ms/step
The mean squared error for the normalized target is 0.2619
The MSE for the concrete strength is 73.0271
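
Because the MSE is in squared units, its square root is often easier to interpret, since it is in the same units as the strength column. A minimal sketch using the value printed above (which will differ from run to run because the split is random):

```python
import math

strength_mse = 73.0271  # value printed by the cell above
strength_rmse = math.sqrt(strength_mse)

print(f"The RMSE for the concrete strength is {strength_rmse:.2f}")
```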


Just like for Part A., let us look at the first 10 data points to get
an idea of how well the predictions match the true data. We do this
first for the normalized values and then for the original scale concrete
strength:

In [8]:
for test, pred in ((y_test, y_pred), (strength_true, strength_pred)):
    for i in range(10):
        print(f"{test[i].item():5.2f}, {pred[i].item():5.2f},", end=' ')
        print(f"{(test[i].item() - pred[i].item())**2:5.2f}")
    print()

 1.92,  0.50,  2.01
-0.40, -0.69,  0.09
-1.09, -1.07,  0.00
 0.84,  0.17,  0.44
 0.35,  0.01,  0.12
-1.01, -0.47,  0.29
-1.02, -0.96,  0.00
-1.24, -1.48,  0.06
-0.74, -0.36,  0.14
 0.31,  0.24,  0.01

67.87, 44.18, 561.24
29.22, 24.34, 23.83
17.58, 17.90,  0.10
49.77, 38.70, 122.53
41.72, 36.01, 32.58
19.01, 28.02, 81.18
18.75, 19.77,  1.05
15.09, 11.15, 15.53
23.52, 29.77, 39.08
40.93, 39.74,  1.41



The predictions are much more accurate than they were for the model
trained on the raw data without normalization.

## 4. Repeat 50 times
We create a list of 50 mean squared errors by running the
split-train-test cycle repeatedly. We calculate the MSE both on the
normalized target data and on the concrete strength obtained by
inverting the `StandardScaler`.

In [9]:
def single_cycle(X: ArrayLike, y: ArrayLike) -> tuple[float, float]:
    """Run a single cycle of splitting, training, and testing.
    
    Args:
        X : Feature data
        y : Target data

    Returns:
        The mean squared errors for the normalized target data and for
        the unnormalized concrete strength as a tuple
    """
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3)
    model = baseline_model(X_train[0].shape) # initialize model
    model.fit(X_train, y_train, epochs=50, verbose=0) # fit silently
    y_pred = model.predict(X_test, verbose=0) # predict silently
    strength_test = target_scaler.inverse_transform(y_test)
    strength_pred = target_scaler.inverse_transform(y_pred)
    return ( mean_squared_error(y_test, y_pred),
             mean_squared_error(strength_test, strength_pred) )

mses = [single_cycle(X, y) for _ in range(50)] # run 50 times
normalized_mses = [t[0] for t in mses]
strength_mses = [t[1] for t in mses]

## 5. Report mean and standard deviation
We calculate the mean and the standard deviation of both lists of mean
squared errors: for the normalized targets and for the concrete strength.

In [10]:
normalized_mses = np.array(normalized_mses) # convert to numpy array
strength_mses = np.array(strength_mses) # convert to numpy array
print("The calculated mean squared errors for the normalized targets have:")
print(f"  a mean value of         {normalized_mses.mean():.4f}")
print(f"  a standard deviation of {normalized_mses.std():.4f}")
print("The calculated mean squared errors for the concrete strength have:")
print(f"  a mean value of         {strength_mses.mean():.4f}")
print(f"  a standard deviation of {strength_mses.std():.4f}")


The calculated mean squared errors for the normalized targets have:
  a mean value of         0.2562
  a standard deviation of 0.0621
The calculated mean squared errors for the concrete strength have:
  a mean value of         71.4198
  a standard deviation of 17.3135
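
Note that `ndarray.std()` defaults to the population standard deviation (`ddof=0`). If the 50 runs are treated as a sample, the estimate with `ddof=1` is larger by a factor of `sqrt(n / (n - 1))`. A small sketch of the conversion, using the value printed above:

```python
import math

n_runs = 50
population_std = 17.3135  # strength MSE std printed above (ddof=0)

# Convert to the sample standard deviation (ddof=1).
sample_std = population_std * math.sqrt(n_runs / (n_runs - 1))

print(f"Sample standard deviation: {sample_std:.4f}")
```

In the notebook itself, the same number would come from `strength_mses.std(ddof=1)`. With 50 runs the difference is only about 1 %.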


## Discussion
As expected, the errors are much smaller for the normalized values: the
concrete strength has a mean of about 36 and a standard deviation of
about 16.7, so dividing by the standard deviation shrinks each squared
error by a factor of roughly 16.7² ≈ 280.
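
The link between the two error scales can be made precise: since the target scaler divides by the standard deviation σ of the strength, every squared error on the original scale is σ² times the normalized one, so `MSE_strength = σ² · MSE_normalized`. The sketch below checks this against the numbers reported above. It assumes that `StandardScaler` uses the population standard deviation, whereas `describe()` reports the sample standard deviation, hence the `(n - 1) / n` correction.

```python
n = 1030                # number of samples (count in describe())
sample_std = 16.705742  # Strength std from describe() (divides by n - 1)

# Population variance, as assumed for the target scaler.
variance = sample_std**2 * (n - 1) / n

# (normalized MSE, reported strength MSE) pairs from the outputs above.
reported = {
    "single run (section 3)": (0.2619, 73.0271),
    "mean over 50 runs": (0.2562, 71.4198),
}

for label, (mse_norm, mse_strength) in reported.items():
    rescaled = variance * mse_norm
    print(f"{label}: {rescaled:.2f} vs. reported {mse_strength:.2f}")
```

The rescaled and reported values agree up to the rounding of the printed normalized MSEs.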

What we want to compare, however, is the mean squared error for the
concrete strength rescaled to its original size. Here we find a mean
squared error of about 71 on average, compared to about 300 in Part A.,
where we worked without normalization. Hence we confirm, as the small
sample of predictions considered under 3. already suggested, that the
neural network model works considerably better on the normalized data.

For example, in Part A. we found predictions of 99, far above the
maximum value in the concrete data set, when the true concrete strength
was 47, just above the 75th percentile. The predictions obtained from
the normalized data show no such severe outliers.

In [11]:
concrete_data.describe()

| | Cement | Blast Furnace Slag | Fly Ash | Water | Superplasticizer | Coarse Aggregate | Fine Aggregate | Age | Strength |
|---|---|---|---|---|---|---|---|---|---|
| count | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 | 1030.0 |
| mean | 281.167864 | 73.895825 | 54.18835 | 181.567282 | 6.20466 | 972.918932 | 773.580485 | 45.662136 | 35.817961 |
| std | 104.506364 | 86.279342 | 63.997004 | 21.354219 | 5.973841 | 77.753954 | 80.17598 | 63.169912 | 16.705742 |
| min | 102.0 | 0.0 | 0.0 | 121.8 | 0.0 | 801.0 | 594.0 | 1.0 | 2.33 |
| 25% | 192.375 | 0.0 | 0.0 | 164.9 | 0.0 | 932.0 | 730.95 | 7.0 | 23.71 |
| 50% | 272.9 | 22.0 | 0.0 | 185.0 | 6.4 | 968.0 | 779.5 | 28.0 | 34.445 |
| 75% | 350.0 | 142.95 | 118.3 | 192.0 | 10.2 | 1029.4 | 824.0 | 56.0 | 46.135 |
| max | 540.0 | 359.4 | 200.1 | 247.0 | 32.2 | 1145.0 | 992.6 | 365.0 | 82.6 |
