
# Assignment 4: Neural Networks (30 marks)
### Due date: March 31 at 11:59pm
*Author: YOUR NAME*

For this assignment, you will be practicing using scikit-learn and TensorFlow to implement basic neural networks (MLP). You can use the given dataset below, or you can use the dataset you have selected for your project.

**Note: If you use the dataset from your project - this assignment is meant to be completed individually. You may work with your group members to complete this assignment, but the work you submit must be your own. Submitting identical assignments is a form of academic misconduct**

In [None]:
import numpy as np
import pandas as pd

## Part 1: Load your dataset (1 mark)

As stated above, you can use the dataset from your project. If you want to practice neural networks with a different dataset, you can use the energy dataset from Yellowbrick (https://www.scikit-yb.org/en/latest/api/datasets/energy.html)

In [None]:
!pip install yellowbrick



In [None]:
# Load dataset

from sklearn.model_selection import train_test_split

# Read the energy data from a local copy of ENB2012_data.xlsx and give the columns descriptive names
df = pd.read_excel('ENB2012_data.xlsx')
df.columns = ["relative compactness", "surface area", "wall area", "roof area", "overall height",
              "orientation", "glazing area", "glazing area distribution", "heating load", "cooling load"]



## Part 2: Process your dataset (5 marks)

In [None]:
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_absolute_error

In [None]:
# Check if there are any missing values - if yes, decide how to fill them

print(df.isnull().sum())

relative compactness         0
surface area                 0
wall area                    0
roof area                    0
overall height               0
orientation                  0
glazing area                 0
glazing area distribution    0
heating load                 0
cooling load                 0
dtype: int64


In [None]:
# Check the range of each feature - do you need to scale your data?
# Named feature_range to avoid shadowing Python's built-in range()
feature_range = df.describe().loc[['min', 'max']]
print(feature_range)

     relative compactness  surface area  wall area  roof area  overall height  \
min                  0.62         514.5      245.0     110.25             3.5   
max                  0.98         808.5      416.5     220.50             7.0   

     orientation  glazing area  glazing area distribution  heating load  \
min          2.0           0.0                        0.0          6.01   
max          5.0           0.4                        5.0         43.10   

     cooling load  
min         10.90  
max         48.03  


In [None]:
# Split your data into training and testing datasets (select random_state=0 and use default test_size)

features = [
   "relative compactness",
   "surface area",
   "wall area",
   "roof area",
   "overall height",
   "orientation",
   "glazing area",
   "glazing area distribution"
]
target = ["heating load", "cooling load"]

X, y = df[features], df[target]

X.shape
y.shape

(768, 2)

In [None]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
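
As a sketch of how `test_size` affects the split, the snippet below splits 768 placeholder rows (the row count reported above) once with scikit-learn's default `test_size` (0.25 when neither size is given) and once with `test_size=0.2`. The array `rows` is an illustrative stand-in, not the real data.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Placeholder for the 768 samples in the dataset (illustrative stand-in, not the real data)
rows = np.arange(768)

# Default split: test_size falls back to 0.25 when train_size is also unset
train_default, test_default = train_test_split(rows, random_state=0)

# Explicit 80/20 split
train_80, test_20 = train_test_split(rows, test_size=0.2, random_state=0)

print(len(train_default), len(test_default))
print(len(train_80), len(test_20))
```

The two settings leave different numbers of rows for training and testing, so MAE values from one split are not directly comparable with values from the other.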

In [None]:
# Implement scaling and/or encoding here if needed (2 marks for preprocessing properly or justifying why it isn't needed)


scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)

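All features in the energy efficiency dataset are numerical, but their ranges differ widely (for example, relative compactness spans 0.62 to 0.98 while surface area spans 514.5 to 808.5), so the features are standardized with `StandardScaler` before training the MLPs. The scaler is fit on the training set only and then applied to the test set.

Only the columns 'orientation' and 'glazing area distribution' hold categorical values, and these are already encoded numerically, so no further encoding is applied.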
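
One way to judge whether scaling is needed is to compare the width of each feature's range. The sketch below reuses the min and max values printed earlier in this notebook. The variable names are illustrative.

```python
import pandas as pd

# Min/max values copied from the describe() output earlier in the notebook
ranges = pd.DataFrame(
    {
        "relative compactness": [0.62, 0.98],
        "surface area": [514.5, 808.5],
        "wall area": [245.0, 416.5],
        "roof area": [110.25, 220.50],
        "overall height": [3.5, 7.0],
        "orientation": [2.0, 5.0],
        "glazing area": [0.0, 0.4],
        "glazing area distribution": [0.0, 5.0],
    },
    index=["min", "max"],
)

# Width of each feature's range, sorted from narrowest to widest
span = (ranges.loc["max"] - ranges.loc["min"]).sort_values()
print(span)

# Ratio between the widest and the narrowest feature range
spread_ratio = span.iloc[-1] / span.iloc[0]
print(round(spread_ratio, 1))
```

Features whose ranges differ by a large factor are usually worth standardizing before training an MLP, because gradient-based optimizers tend to behave better when inputs are on comparable scales.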

## Part 4: Implement MLP using scikit-learn (5 marks)

In [None]:
from sklearn.neural_network import MLPRegressor


In [None]:
# Test using default parameters (set max_iter=500 - for this assignment, don't worry about reaching convergence)


mlp1 = MLPRegressor(hidden_layer_sizes=(100,), activation='relu', solver='adam',alpha=0.001, max_iter=500, random_state=None)
mlp1.fit(X_train, y_train)




In [None]:
# Test using two hidden layers with 100 nodes each

mlp2 = MLPRegressor(hidden_layer_sizes=(100,100), activation='relu', solver='adam', max_iter=500, random_state=None)
mlp2.fit(X_train, y_train)




In [None]:
# Test using three hidden layers with 100 nodes each
mlp3 = MLPRegressor(hidden_layer_sizes=(100,100,100), activation='relu', solver='adam', max_iter=500, random_state=None)
mlp3.fit(X_train, y_train)
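
To get a feel for how much capacity each extra hidden layer adds, the following sketch counts the weights and biases of a fully connected network. The helper `count_mlp_parameters` is hypothetical (not part of scikit-learn), and the input and output sizes (8 and 2) are taken from the feature and target lists above.

```python
def count_mlp_parameters(n_inputs, hidden_layer_sizes, n_outputs):
    """Count weights and biases in a fully connected network."""
    layer_sizes = [n_inputs, *hidden_layer_sizes, n_outputs]
    total = 0
    for fan_in, fan_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix plus bias vector
    return total


# 8 input features and 2 targets, as in the feature/target lists above
for hidden in [(100,), (100, 100), (100, 100, 100)]:
    print(hidden, count_mlp_parameters(8, hidden, 2))
```

Each additional 100-node layer adds 10,100 parameters, which is one reason deeper configurations can fit the training data more closely.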


## Part 5: Implement MLP using TensorFlow (7 marks)

In [None]:
import tensorflow as tf

from tensorflow import keras
from tensorflow.keras import layers

Instead of scaling the data using a scikit-learn scaler, you can scale the data using a normalization layer.

In [None]:
# Define normalization layer and learn the per-feature mean and variance from the training data
normalizer = layers.Normalization()
normalizer.adapt(X_train)

A normalization layer normalizes the input data so that the mean and standard deviation are close to 0 and 1, respectively. This can help to increase neural network performance by making the optimisation process more reliable and efficient. 
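
The arithmetic behind such a layer can be sketched in NumPy. This is a simplified illustration, not the Keras implementation: it assumes the layer stores a per-feature mean and variance when adapted and then applies `(x - mean) / sqrt(variance)` to its inputs. The data here is synthetic.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic two-feature data on very different scales (illustrative only)
X = np.column_stack([
    rng.uniform(0.62, 0.98, size=200),
    rng.uniform(514.5, 808.5, size=200),
])

# "adapt" step: learn per-feature statistics from the data
mean = X.mean(axis=0)
var = X.var(axis=0)

# "call" step: apply the stored statistics to the input
X_norm = (X - mean) / np.sqrt(var)

print(X_norm.mean(axis=0).round(6))
print(X_norm.std(axis=0).round(6))
```

After the transformation, both columns have a mean close to 0 and a standard deviation close to 1, regardless of their original scale.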

Using `keras.Sequential`, implement an MLP with the same hidden layer setups as above:

In [None]:
# One hidden layer with 100 nodes and the relu activation function
# Compile the model with loss='mean_absolute_error' and optimizer=tf.keras.optimizers.Adam(0.001)
# Fit the model using validation_split=0.2, verbose=0 and epochs=100


modeltf1= tf.keras.Sequential([
    normalizer,
    tf.keras.layers.Dense(100, activation='relu'),
    tf.keras.layers.Dense(1, activation='linear')
])

optim=tf.keras.optimizers.Adam(0.001)

modeltf1.compile(optimizer=optim,loss='mean_absolute_error')

history=modeltf1.fit(X_train, y_train,validation_split=0.2, epochs=100,verbose=0)




In [None]:
# Repeat with two hidden layers with 100 nodes each and the relu activation function


modeltf2 = tf.keras.Sequential([
    normalizer,
    tf.keras.layers.Dense(100, activation='relu'),
    tf.keras.layers.Dense(100, activation='relu'),
    tf.keras.layers.Dense(1, activation='linear')
])

optim=tf.keras.optimizers.Adam(0.001)

modeltf2.compile(optimizer=optim,loss='mean_absolute_error')

history=modeltf2.fit(X_train, y_train,validation_split=0.2, epochs=100,verbose=0)




In [None]:
# Repeat with three hidden layers with 100 nodes each and the relu activation function

modeltf3 = tf.keras.Sequential([
    normalizer,
    tf.keras.layers.Dense(100, activation='relu'),
    tf.keras.layers.Dense(100, activation='relu'),
    tf.keras.layers.Dense(100, activation='relu'),
    tf.keras.layers.Dense(1, activation='linear')
])

optim=tf.keras.optimizers.Adam(0.001)

modeltf3.compile(optimizer=optim,loss='mean_absolute_error')

history=modeltf3.fit(X_train, y_train,validation_split=0.2, epochs=100,verbose=0)


## Part 6: Compare the accuracy of both methods (7 marks)

For this part, calculate the mean absolute error for each model and print in a table using pandas

In [None]:
# Calculate the MAE for the three scikit-learn tests


y_pred = mlp1.predict(X_test)

mae = mean_absolute_error(y_test, y_pred)
print("mean_absolute_error",mae)

y_pred = mlp2.predict(X_test)

mae = mean_absolute_error(y_test, y_pred)
print("mean_absolute_error",mae)

y_pred = mlp3.predict(X_test)

mae = mean_absolute_error(y_test, y_pred)
print("mean_absolute_error",mae)


mean_absolute_error 1.8501426908292982
mean_absolute_error 1.5010294029910742
mean_absolute_error 0.7818889824561874


In [None]:
# Calculate the MAE for the three TensorFlow tests

mae = modeltf1.evaluate(X_test, y_test, verbose=0)
print("mean_absolute_error",mae)

mae = modeltf2.evaluate(X_test, y_test, verbose=0)
print("mean_absolute_error",mae)

mae = modeltf3.evaluate(X_test, y_test, verbose=0)
print("mean_absolute_error",mae)

mean_absolute_error 1.9021259546279907
mean_absolute_error 1.887394666671753
mean_absolute_error 1.7154834270477295


In [None]:
# Print the results

results = {
    'Model': ['MAE_scikit-learn', 'MAE_tensor-flow'],
    'Model1': [1.8501, 1.9021],
    'Model2': [1.5010,1.8873],
    'Model3': [0.7818, 1.7154],
    
}

res = pd.DataFrame(results)


print(res)

              Model  Model1  Model2  Model3
0  MAE_scikit-learn  1.8501  1.5010  0.7818
1   MAE_tensor-flow  1.9021  1.8873  1.7154
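
As an alternative to typing rounded numbers by hand, the table could be built from the MAE values themselves. The sketch below uses the values printed by the two evaluation cells above. In the notebook the lists could instead be filled from the `mean_absolute_error` and `evaluate` calls. The list and label names are illustrative.

```python
import pandas as pd

# MAE values as printed by the evaluation cells above
sklearn_mae = [1.8501426908292982, 1.5010294029910742, 0.7818889824561874]
tensorflow_mae = [1.9021259546279907, 1.887394666671753, 1.7154834270477295]

table = pd.DataFrame(
    {"scikit-learn": sklearn_mae, "TensorFlow": tensorflow_mae},
    index=["1 hidden layer", "2 hidden layers", "3 hidden layers"],
).round(4)
print(table)

# Library with the lowest MAE overall, then the architecture that achieved it
best_library = table.min().idxmin()
best_architecture = table[best_library].idxmin()
print(best_library, best_architecture)
```

Letting pandas do the rounding avoids transcription slips, and `idxmin` identifies the best configuration directly.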


## Part 7: Questions (5 marks total)

### Question 1: Which model produced the least amount of error? (1 mark)

*ANSWER HERE*

The scikit-learn MLP with three hidden layers of 100 nodes each and the relu activation function produced the lowest mean absolute error (0.7818).

              Model  Model1  Model2  Model3
0  MAE_scikit-learn  1.8501  1.5010  0.7818
1   MAE_tensor-flow  1.9021  1.8873  1.7154

### Question 2: Why are the numbers different between the scikit-learn and TensorFlow methods when we used the same number of hidden layers and hidden units per layer? (2 marks)

*ANSWER HERE*

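Even though both libraries use the same number of hidden layers and neurons per layer, several things differ between the two setups, which leads to different MAEs:

- Training objective and procedure: both use the Adam optimizer, but `MLPRegressor` minimizes squared error and runs for up to 500 iterations, while the Keras models minimize mean absolute error for 100 epochs and hold out 20% of the training data for validation. No random seed is fixed in either library, so results also vary from run to run.
- Hyperparameters: settings such as the learning rate and regularization strength can have a significant effect on model performance. Here `mlp1` uses an L2 penalty of `alpha=0.001`, while the Keras models have no regularization. For a fair comparison, both libraries should use the same hyperparameter settings.
- Output layer: `MLPRegressor` sizes its output layer to the two target columns, while the Keras models end in `Dense(1)`, so they produce a single prediction for both heating load and cooling load.
- Preprocessing: if the data is scaled differently for scikit-learn and TensorFlow, this can also lead to different results.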
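
One concrete contributor in this notebook can be illustrated with NumPy: the targets have two columns (heating load and cooling load), while the Keras models above end in `Dense(1)`. The sketch assumes the loss broadcasts the single prediction across both target columns, as NumPy arithmetic does. The numbers are made up for illustration.

```python
import numpy as np

# Hypothetical targets: columns are heating load and cooling load
y_true = np.array([[10.0, 14.0],
                   [20.0, 26.0]])

# One prediction per row, shape (2, 1), as produced by a Dense(1) output layer
y_pred_single = np.array([[12.0],
                          [23.0]])

# Broadcasting compares the single prediction with both targets
mae_single = np.abs(y_true - y_pred_single).mean()

# Lower bound for any single-output model: half the mean gap between the targets
floor = np.abs(y_true[:, 0] - y_true[:, 1]).mean() / 2

# A two-output model can, in principle, match both targets exactly
y_pred_two = y_true.copy()
mae_two = np.abs(y_true - y_pred_two).mean()

print(mae_single, floor, mae_two)
```

With one output, the error cannot fall below half the average gap between the two targets, however well the network trains, whereas a model with one output per target has no such floor.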

### Question 3: Reflection (2 marks)
Include a sentence or two about:
- what you liked or disliked,
- found interesting, confusing, challenging, motivating
while working on this assignment.

*ANSWER HERE*

This assignment gives a basic picture of how to implement an MLP using scikit-learn and TensorFlow.

Computing and comparing the MAE for each model while changing the number of hidden layers and nodes shows how model performance is affected by the network architecture.

I found it a bit challenging to learn the hyperparameter settings for the TensorFlow models.

I think it would be helpful to have more lectures, notes, and examples on implementing MLPs with scikit-learn and TensorFlow, as well as more lectures on the in-depth concepts behind these libraries.
