 ## D. Neural Network: Regression

Modify the developed architecture to model a regression task. Test your architecture on the **Airfoil Self-Noise** Data Set ('Airfoil.csv'). Save your solution as a separate notebook file with an appropriate filename.

**Note:**
1. Split the data into train/validation/test sets in a 70/15/15 ratio.
2. Use '777' as the random seed wherever one is needed.
3. Report appropriate error measures such as MSE and MAE.

More details on the dataset can be found at: https://archive.ics.uci.edu/ml/datasets/Airfoil+Self-Noise
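
The error measures named in the note can be computed directly with NumPy. The following is a minimal sketch, not part of the assignment code: the helper name `regression_metrics` and the toy values are assumptions made for illustration only.

```python
import numpy as np

def regression_metrics(y_true, y_pred):
    """Return MSE, RMSE and MAE for two equally sized 1-D arrays."""
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    errors = y_pred - y_true
    mse = float(np.mean(errors ** 2))          # mean squared error
    mae = float(np.mean(np.abs(errors)))       # mean absolute error
    return {"MSE": mse, "RMSE": float(np.sqrt(mse)), "MAE": mae}

# Toy values, not taken from the Airfoil data
metrics = regression_metrics([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
print(metrics)
```

MSE penalizes large errors more heavily than MAE, so reporting both gives a rough sense of whether a few large misses dominate the error.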

In [1]:
# Package imports (classification-only imports from the earlier parts are not needed here)
import numpy as np
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt

%matplotlib inline

np.random.seed(777)



## 1. Loading the dataset and preprocessing

In [2]:
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

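# Load the dataset
# NOTE: pd.read_csv treats the first row as a header by default; if 'Airfoil.csv'
# has no header row, pass header=None so the first sample is not dropped.
file_path = 'Airfoil.csv'
data = pd.read_csv(file_path)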

X = data.iloc[:, :-1]  # all columns except the last one
Y = data.iloc[:, -1]   # only the last column
X = X.values
Y = Y.values

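# 70% train; the remaining 30% is split evenly into validation and test (15% each)
X_train, X_temp, Y_train, Y_temp = train_test_split(X, Y, test_size=0.3, random_state=777)
X_val, X_test, Y_val, Y_test = train_test_split(X_temp, Y_temp, test_size=0.5, random_state=777)

# Fit the scaler on the training set only, then reuse its statistics for validation and test
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_val = scaler.transform(X_val)
X_test = scaler.transform(X_test)

# Transpose to (number of features, number of examples), the layout the network expects
X_train = X_train.T
X_val = X_val.T
X_test = X_test.T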
# Shapes of the datasets
print("Shapes of the datasets:")
print("X_train:", X_train.shape)
print("Y_train:", Y_train.shape)
print("X_val:", X_val.shape)
print("Y_val:", Y_val.shape)
print("X_test:", X_test.shape)
print("Y_test:", Y_test.shape)


Shapes of the datasets:
X_train: (5, 1051)
Y_train: (1051,)
X_val: (5, 225)
Y_val: (225,)
X_test: (5, 226)
Y_test: (226,)
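
To make the split arithmetic and the train-only scaling explicit, here is a NumPy-only sketch of the same idea. It is an illustration under assumptions: `split_and_standardize` is a hypothetical helper, the data is synthetic, and the row assignment will not match what `train_test_split` produces for the same seed.

```python
import numpy as np

def split_and_standardize(X, y, seed=777):
    """70/15/15 split, then standardization fitted on the training rows only."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(X.shape[0])
    n_train = int(0.70 * len(idx))
    n_val = int(0.15 * len(idx))
    train, val, test = np.split(idx, [n_train, n_train + n_val])

    # Statistics come from the training rows so nothing leaks from val/test
    mean = X[train].mean(axis=0)
    std = X[train].std(axis=0)
    std[std == 0] = 1.0  # guard against constant columns

    def scale(rows):
        return (X[rows] - mean) / std

    return (scale(train), y[train]), (scale(val), y[val]), (scale(test), y[test])

# Synthetic stand-in with the same row and feature counts as the printed shapes
rng = np.random.default_rng(0)
X_demo = rng.normal(loc=50.0, scale=10.0, size=(1502, 5))
y_demo = rng.normal(size=1502)
(train_X, train_y), (val_X, val_y), (test_X, test_y) = split_and_standardize(X_demo, y_demo)
print(train_X.shape, val_X.shape, test_X.shape)
```

Only the training columns end up with exactly zero mean and unit variance; the validation and test columns are close to that but not identical, which is expected.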


In [3]:
def model_architecture(X, Y):
    """
    Arguments:
    X -- input dataset of shape (input size, number of examples)
    Y -- labels of shape (output size, number of examples)
    
    Returns:
    n_x -- the size of the input layer
    n_h -- the size of the hidden layer
    n_y -- the size of the output layer
    """
    ### START CODE HERE ### 
    n_x = X.shape[0] # size of input layer
    n_h = 10
    n_y = 1 
    ### END CODE HERE ###
    return (n_x, n_h, n_y)
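
The layer sizes determine how many values the network has to learn. A small sketch of that count follows; `count_parameters` is a hypothetical helper, and the hidden sizes 4 and 8 are the ones tried in the experiments further down.

```python
def count_parameters(n_x, n_h, n_y):
    """Number of trainable values in a network with one hidden layer."""
    hidden = n_h * n_x + n_h   # W1 and b1
    output = n_y * n_h + n_y   # W2 and b2
    return hidden + output

small = count_parameters(5, 4, 1)   # 5 inputs, 4 hidden units, 1 output
large = count_parameters(5, 8, 1)   # 5 inputs, 8 hidden units, 1 output
print(small, large)
```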


In [4]:
def initialize_parameters(n_x, n_h, n_y):
    """
    Arguments:
    n_x -- size of the input layer
    n_h -- size of the hidden layer
    n_y -- size of the output layer (1 for a single regression target)

    Returns:
    params -- python dictionary containing your parameters:
                    W1 -- weight matrix of shape (n_h, n_x)
                    b1 -- bias vector of shape (n_h, 1)
                    W2 -- weight matrix of shape (n_y, n_h)
                    b2 -- bias vector of shape (n_y, 1)
    """

    # Fixed seed so the initial weights are reproducible (the reported results use 2 here, not 777)
    np.random.seed(2)

    ### START CODE HERE ###
    W1 = np.random.randn(n_h, n_x) * 0.01
    b1 = np.zeros((n_h, 1))
    W2 = np.random.randn(n_y, n_h) * 0.01
    b2 = np.zeros((n_y, 1))
    ### END CODE HERE ###

    assert (W1.shape == (n_h, n_x))
    assert (b1.shape == (n_h, 1))
    assert (W2.shape == (n_y, n_h))
    assert (b2.shape == (n_y, 1))

    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2}

    return parameters


In [5]:
def forward_propagation(X, parameters):
    """
    Arguments:
    X -- input data of size (n_x, m)
    parameters -- python dictionary containing your parameters (output of initialization function)
    
    Returns:
    A2 -- The output of the second activation (linear)
    cache -- a dictionary containing "Z1", "A1", "Z2" and "A2"
    """
    # Retrieve each parameter from the dictionary "parameters"
    W1 = parameters["W1"]
    b1 = parameters["b1"]
    W2 = parameters["W2"]
    b2 = parameters["b2"]
    
    # Implement Forward Propagation to calculate A2
    Z1 = np.dot(W1, X) + b1
    A1 = np.tanh(Z1)
    Z2 = np.dot(W2, A1) + b2
    A2 = Z2  # Linear activation function
    
    assert(A2.shape == (parameters["W2"].shape[0], X.shape[1]))
    
    cache = {"Z1": Z1,
             "A1": A1,
             "Z2": Z2,
             "A2": A2}
    
    return A2, cache
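
The forward pass coded above can be written compactly as follows. This is only a restatement of the code using the shapes from the docstrings, with $m$ examples; the symbol $\hat{Y}$ for the output `A2` is notation assumed here.

$$
\begin{aligned}
Z^{[1]} &= W^{[1]} X + b^{[1]}, & Z^{[1]} &\in \mathbb{R}^{n_h \times m} \\
A^{[1]} &= \tanh\left(Z^{[1]}\right) \\
Z^{[2]} &= W^{[2]} A^{[1]} + b^{[2]}, & Z^{[2]} &\in \mathbb{R}^{n_y \times m} \\
\hat{Y} &= A^{[2]} = Z^{[2]}
\end{aligned}
$$

The identity output activation is what turns the architecture into a regressor: the prediction is an unbounded real value.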


In [6]:
def compute_cost(A2, Y):
    """
    Arguments:
    A2 -- The output of the second activation, of shape (1, number of examples)
    Y -- "true" labels vector of shape (number of examples,)
       
    Returns:
    cost -- mean squared error cost
    """
    
    m = Y.size  # number of examples

    # Reshape Y to match the shape of A2
    Y = Y.reshape(A2.shape)
    
    # Compute the mean squared error cost
    cost = (1/m) * np.sum((A2 - Y)**2)
    
    cost = float(np.squeeze(cost))

    assert(isinstance(cost, float))
    
    return cost
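
The reshape of `Y` in `compute_cost` matters because of NumPy broadcasting. The toy sketch below (all values made up for illustration) shows that a row of predictions minus a flat label vector stays element-wise, while a row minus a column silently expands to every pair of entries and inflates the cost.

```python
import numpy as np

m = 4
A2 = np.arange(m, dtype=float).reshape(1, m)   # predictions as a row, shape (1, m)
Y_flat = np.arange(m, dtype=float)             # labels, shape (m,)
Y_col = Y_flat.reshape(m, 1)                   # labels as a column, shape (m, 1)

row_minus_flat = A2 - Y_flat   # broadcasts to (1, m): element-wise, as intended
row_minus_col = A2 - Y_col     # broadcasts to (m, m): all pairwise differences

# Predictions equal the labels here, so the correct MSE is zero
mse_ok = np.sum((A2 - Y_flat.reshape(A2.shape)) ** 2) / m
mse_wrong = np.sum(row_minus_col ** 2) / m

print(row_minus_flat.shape, row_minus_col.shape)
print(mse_ok, mse_wrong)
```

Reshaping the labels to the shape of the predictions, as the function does, removes the dependence on how the caller happened to store `Y`.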

In [7]:

def backprop(parameters, cache, X, Y):
    """
    Arguments:
    parameters -- python dictionary containing our parameters 
    cache -- a dictionary containing "Z1", "A1", "Z2" and "A2".
    X -- input data
    Y -- "true" labels
    
    Returns:
    grads -- python dictionary containing your gradients with respect to different parameters
    """
    m = X.shape[1]
    
    # First, retrieve W1 and W2 from the dictionary "parameters".
    ### START CODE HERE ### 
    W1 = parameters["W1"]
    W2 = parameters["W2"]
    ### END CODE HERE ###
        
    # Retrieve also A1 and A2 from dictionary "cache".
    ### START CODE HERE ### 
    A1 = cache["A1"]
    A2 = cache["A2"]
    ### END CODE HERE ###
    
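    # Backward propagation: calculate dW1, db1, dW2, db2.
    ### START CODE HERE ###
    # With the 1/m applied below, dZ2 = A2 - Y is the gradient of the half-MSE,
    # (1/(2m)) * sum((A2 - Y)**2). compute_cost reports the plain MSE, whose gradient
    # is twice as large; the constant factor only rescales the effective learning rate.
    # Y of shape (m,) broadcasts against A2 of shape (1, m).
    dZ2 = A2 - Y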
    dW2 = (1/m) * np.dot(dZ2, A1.T)
    db2 = (1/m) * np.sum(dZ2, axis=1, keepdims=True)
    dZ1 = np.dot(W2.T, dZ2) * (1 - np.power(A1, 2))
    dW1 = (1/m) * np.dot(dZ1, X.T)
    db1 = (1/m) * np.sum(dZ1, axis=1, keepdims=True)
    ### END CODE HERE ###
    
    grads = {"dW1": dW1,
             "db1": db1,
             "dW2": dW2,
             "db2": db2}
    
    return grads
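
One way to gain confidence in these formulas is a finite-difference gradient check. The sketch below is self-contained and re-implements the same equations on random toy data rather than calling the notebook functions; the helper names are hypothetical, and the cost is assumed to be the half-MSE, $\frac{1}{2m}\sum(\hat{y}-y)^2$, since that is the cost whose gradient `dZ2 = A2 - Y` corresponds to.

```python
import numpy as np

def forward(X, p):
    A1 = np.tanh(p["W1"] @ X + p["b1"])
    A2 = p["W2"] @ A1 + p["b2"]          # linear output
    return A1, A2

def half_mse(X, Y, p):
    _, A2 = forward(X, p)
    return float(np.sum((A2 - Y) ** 2) / (2 * X.shape[1]))

def analytic_grads(X, Y, p):
    # Same equations as the backprop function above
    m = X.shape[1]
    A1, A2 = forward(X, p)
    dZ2 = A2 - Y
    dZ1 = (p["W2"].T @ dZ2) * (1 - A1 ** 2)
    return {"W1": dZ1 @ X.T / m, "b1": dZ1.sum(axis=1, keepdims=True) / m,
            "W2": dZ2 @ A1.T / m, "b2": dZ2.sum(axis=1, keepdims=True) / m}

def numeric_grads(X, Y, p, eps=1e-6):
    # Central differences: perturb one entry at a time, then restore it
    grads = {}
    for name, value in p.items():
        g = np.zeros_like(value)
        for idx in np.ndindex(value.shape):
            original = value[idx]
            value[idx] = original + eps
            plus = half_mse(X, Y, p)
            value[idx] = original - eps
            minus = half_mse(X, Y, p)
            value[idx] = original
            g[idx] = (plus - minus) / (2 * eps)
        grads[name] = g
    return grads

rng = np.random.default_rng(777)
X = rng.normal(size=(5, 20))
Y = rng.normal(size=(1, 20))
params = {"W1": rng.normal(size=(3, 5)), "b1": np.zeros((3, 1)),
          "W2": rng.normal(size=(1, 3)), "b2": np.zeros((1, 1))}

analytic = analytic_grads(X, Y, params)
numeric = numeric_grads(X, Y, params)
max_diff = max(np.max(np.abs(analytic[k] - numeric[k])) for k in params)
print("largest absolute difference:", max_diff)
```

A difference near floating-point noise suggests the analytic gradients match the assumed cost; a large difference usually points at a shape or sign mistake.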


In [8]:
def update(parameters, grads, learning_rate = 0.01):
    """
    Arguments:
    parameters -- python dictionary containing your parameters 
    grads -- python dictionary containing your gradients 
    learning_rate -- The learning rate
    
    Returns:
    parameters -- python dictionary containing your updated parameters 
    """
    # Retrieve each parameter from the dictionary "parameters"
    ### START CODE HERE ### 
    W1 = parameters["W1"]
    b1 = parameters["b1"]
    W2 = parameters["W2"]
    b2 = parameters["b2"]
    ### END CODE HERE ###
    
    # Retrieve each gradient from the dictionary "grads"
    ### START CODE HERE ### 
    dW1 = grads["dW1"]
    db1 = grads["db1"]
    dW2 = grads["dW2"]
    db2 = grads["db2"]
    ## END CODE HERE ###
    
    # Update rule for each parameter
    ### START CODE HERE ### 
    W1 = W1 - learning_rate * dW1
    b1 = b1 - learning_rate * db1
    W2 = W2 - learning_rate * dW2
    b2 = b2 - learning_rate * db2
    ### END CODE HERE ###
    
    parameters = {"W1": W1,
                  "b1": b1,
                  "W2": W2,
                  "b2": b2}
    
    return parameters

In [9]:
def NeuralNetwork(X, Y, n_h, num_iterations = 10000, learning_rate = 0.01, print_cost=False):
    """
    Arguments:
    X -- dataset
    Y -- labels 
    n_h -- size of the hidden layer
    num_iterations -- Number of iterations in gradient descent loop
    learning_rate -- The learning rate
    print_cost -- if True, print the cost every 100 iterations
    
    Returns:
    parameters -- parameters learnt by the model. They can then be used to make predictions.
    """
    
    np.random.seed(3)
    n_x, _, n_y = model_architecture(X, Y)
    
    # Initialize parameters
    ### START CODE HERE ### 
    parameters = initialize_parameters(n_x, n_h, n_y)
    ### END CODE HERE ###
    
    # Loop (gradient descent)

    for i in range(0, num_iterations):
         
        ### START CODE HERE ### 
        # Forward propagation. Inputs: "X, parameters". Outputs: "A2, cache".
        A2, cache = forward_propagation(X, parameters)
        
        # Cost function. Inputs: "A2, Y". Outputs: "cost".
        cost = compute_cost(A2, Y)
 
        # Backpropagation. Inputs: "parameters, cache, X, Y". Outputs: "grads".
        grads = backprop(parameters, cache, X, Y)
 
        # Gradient descent parameter update. Inputs: "parameters, grads". Outputs: "parameters".
        parameters =  update(parameters, grads, learning_rate)
        
        ### END CODE HERE ###
        
        # Print the cost every 100 iterations
        if print_cost and i % 100 == 0:
            print ("Cost after iteration %i: %f" %(i, cost))

    return parameters

In [10]:
def predict(parameters, X):
    """
    Arguments:
    parameters -- python dictionary containing your parameters 
    X -- input data 
    
    Returns:
    predictions -- vector of predictions of our model
    """
    
    # Compute output using forward propagation
    A2, cache = forward_propagation(X, parameters)
    predictions = A2.flatten()  # Flatten to ensure it is one-dimensional
    return predictions


In [11]:
from sklearn.metrics import mean_squared_error
def run_experiment(n_h, num_iterations, learning_rate, X_train, Y_train, X_val, Y_val):
    """Train with the given hyperparameters and return the RMSE on the validation set."""
    parameters = NeuralNetwork(X_train, Y_train, n_h, num_iterations, learning_rate, print_cost=False)
    predictions_val = predict(parameters, X_val)
    rmse = np.sqrt(mean_squared_error(Y_val, predictions_val))
    return rmse

# Initialize an empty list to store the results
results = []

# Run experiments; the first three arguments are n_h, num_iterations and learning_rate
exp1_rmse = run_experiment(4, 10000, 0.01, X_train, Y_train, X_val, Y_val)
results.append(('Experiment 1', exp1_rmse))

exp2_rmse = run_experiment(8, 15000, 0.01, X_train, Y_train, X_val, Y_val)
results.append(('Experiment 2', exp2_rmse))

exp3_rmse = run_experiment(4, 10000, 0.005, X_train, Y_train, X_val, Y_val)
results.append(('Experiment 3', exp3_rmse))

exp4_rmse = run_experiment(8, 15000, 0.005, X_train, Y_train, X_val, Y_val)
results.append(('Experiment 4', exp4_rmse))

# Print results
print("Results on the validation set:")
for exp_result in results:
    print(f"{exp_result[0]}: RMSE = {exp_result[1]}")


Results on the validation set:
Experiment 1: RMSE = 3.962694025920766
Experiment 2: RMSE = 2.929347178469321
Experiment 3: RMSE = 4.696265341772368
Experiment 4: RMSE = 3.024782259392391
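
Picking the configuration for the final run amounts to taking the entry with the smallest validation RMSE. A short sketch, with the values copied from the printed output above:

```python
# Validation RMSE values copied from the printed output above
results = [
    ("Experiment 1", 3.962694025920766),
    ("Experiment 2", 2.929347178469321),
    ("Experiment 3", 4.696265341772368),
    ("Experiment 4", 3.024782259392391),
]

# The configuration with the lowest validation error is used for the final model
best_name, best_rmse = min(results, key=lambda item: item[1])
print(f"Best on validation: {best_name} (RMSE = {best_rmse:.3f})")
```

Selecting on the validation set keeps the test set untouched until the final evaluation.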


In [12]:
# Use the best hyperparameters from above (Experiment 2: n_h=8, 15000 iterations, learning rate 0.01)
parameters = NeuralNetwork(X_train, Y_train, 8, num_iterations=15000, learning_rate=0.01, print_cost=True)


Cost after iteration 0: 15637.592061
Cost after iteration 100: 28.018027
Cost after iteration 200: 19.195073
Cost after iteration 300: 18.851588
Cost after iteration 400: 18.721332
Cost after iteration 500: 18.650092
Cost after iteration 600: 18.597418
Cost after iteration 700: 18.560685
Cost after iteration 800: 18.535111
Cost after iteration 900: 18.516154
Cost after iteration 1000: 18.501854
Cost after iteration 1100: 18.490667
Cost after iteration 1200: 18.481374
Cost after iteration 1300: 18.473152
Cost after iteration 1400: 18.465431
Cost after iteration 1500: 18.457735
Cost after iteration 1600: 18.449572
Cost after iteration 1700: 18.440353
Cost after iteration 1800: 18.429275
Cost after iteration 1900: 18.415134
Cost after iteration 2000: 18.395922
Cost after iteration 2100: 18.367871
Cost after iteration 2200: 18.323430
Cost after iteration 2300: 18.250487
Cost after iteration 2400: 18.143069
Cost after iteration 2500: 18.010522
Cost after iteration 2600: 17.825806
... [output truncated]

## Evaluate the error measures on the training set

In [13]:
# Print training error measures (MSE and MAE)
predictions = predict(parameters, X_train)
print(predictions.shape)
print(X_train.shape)
mse = np.mean((predictions - Y_train) ** 2)
mae = np.mean(np.abs(predictions - Y_train))
print("Training measures")
print("MSE: ", mse)
print("MAE: ", mae)

(1051,)
(5, 1051)
Training measures
MSE:  7.121771702980565
MAE:  2.0876002136089764


## Evaluate the error measures on the test set

In [14]:
predictions_test = predict(parameters, X_test)
mseTest = np.mean((predictions_test - Y_test) ** 2)
maeTest = np.mean(np.abs(predictions_test - Y_test))
print("Test measures")
print("MSE: ", mseTest)
print("MAE: ", maeTest)

Test measures
MSE:  8.585186282112387
MAE:  2.3313471402948793
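
The experiments report RMSE while the final cells report MSE and MAE. Taking the square root of the printed MSE values puts the training and test errors on the same scale as the validation results. A small sketch, with the numbers copied from the outputs above:

```python
import math

# MSE values copied from the printed training and test measures
mse_train = 7.121771702980565
mse_test = 8.585186282112387

# RMSE is in the same units as the target, which makes it comparable to MAE
rmse_train = math.sqrt(mse_train)
rmse_test = math.sqrt(mse_test)
print(f"RMSE train: {rmse_train:.3f}, RMSE test: {rmse_test:.3f}")
```

The test RMSE comes out at roughly 2.93, close to the validation RMSE of Experiment 2, while the training RMSE is somewhat lower at roughly 2.67.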


References:
- https://www.coursera.org/learn/neural-networks-deep-learning
- http://scs.ryerson.ca/~aharley/neural-networks/
- http://cs231n.github.io/neural-networks-case-study/
- https://archive.ics.uci.edu/ml/datasets.php