# Assignment 3 (Part 1)

Part 1 of the third assignment is worth 25 points.

## Neural Networks

The neural networks used here are also known as multi-layer perceptrons (MLPs). For this assignment you will therefore use the `MLPClassifier` class from scikit-learn.

Take a look at the documentation to learn more about the default parameterisation of the `MLPClassifier` (which activation function it uses, which optimizer/solver it uses, the number and size of hidden layers, etc.):

https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html. 
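
As a complement to reading the documentation, the defaults can also be inspected directly on an instance. The following is a minimal sketch; the selection of parameter names printed here is just an illustrative subset.

```python
from sklearn.neural_network import MLPClassifier

# Create a classifier without arguments so that every setting is a default.
clf = MLPClassifier()
params = clf.get_params()

# Print the settings mentioned in the text (illustrative subset only).
for name in ("hidden_layer_sizes", "activation", "solver", "batch_size", "max_iter"):
    print(f"{name}: {params[name]}")
```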


This practical part of the assignment is divided into 2 main tasks:


*   Training a neural network on MNIST data (5 points)
*   Training a neural network on customer data (20 points)







### Task 1: Neural Network Classifier on MNIST 

In [None]:
# load required libraries
from sklearn.datasets import fetch_openml
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

The task is to classify handwritten digits from 0 to 9 (MNIST dataset). We have already seen this dataset in the previous assignment.

In [None]:
# download dataset from https://www.openml.org/ which contains many sample datasets for machine learning
X, y = fetch_openml('mnist_784', version=1, return_X_y=True)

**Recap**: The dataset contains 70000 examples, each with 784 values (pixels). The pixels are stored in a flat array but represent a 28 by 28 pixel gray-scale image. Values range from 0 to 255, the usual range of an 8-bit color channel (as in RGB). A value of 0 represents a black pixel and 255 a white pixel; values in between are shades of gray.
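
To make the flat-array layout concrete, here is a small sketch that uses a randomly generated stand-in for one MNIST row (the array `flat` is hypothetical, not real data). It shows the reshape to 28 by 28 and one common way to bring the values into the range 0 to 1.

```python
import numpy as np

# Hypothetical stand-in for one MNIST row: 784 integer values in [0, 255].
rng = np.random.default_rng(0)
flat = rng.integers(0, 256, size=784)

# Row-major reshape: the first 28 values become the top row of the image.
image = flat.reshape(28, 28)

# Dividing by the maximum possible value maps the intensities into [0, 1].
scaled = image / 255.0

print(image.shape, scaled.min(), scaled.max())
```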

In [None]:
# if we want to plot a single example we need to reshape the array
first_image = np.array(<IMAGE>, dtype='float').reshape((28, 28))
plt.imshow(first_image, cmap='gray')

#### Instructions

**You are expected to do:**


*   Data preparation: 
 *   Perform a 80/20 train/test split
 *   Perform feature scaling
*   Train the model
 *   Please use `MLPClassifier` from `sklearn.neural_network`
*   Evaluate the model performance
 *   Calculate the accuracy
 *   Plot the confusion matrix
 *   Additionally, plot some misclassified instances  (if there are any). You can use the plt.imshow() function as shown above
* Compare the model performance with the results of the softmax regression on MNIST in the previous assignment
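
The steps above can be sketched on a much smaller stand-in dataset. The example below assumes scikit-learn's built-in `load_digits` data (8 by 8 images) instead of MNIST so that it runs in a few seconds. The variable names, `random_state` and `max_iter` values are illustrative choices, not requirements of the assignment.

```python
import numpy as np
from sklearn.datasets import load_digits
from sklearn.metrics import accuracy_score, confusion_matrix
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import MinMaxScaler

# Small stand-in for MNIST (assumption: used here only to keep the demo fast).
X_small, y_small = load_digits(return_X_y=True)

# 80/20 split; stratify keeps the class proportions equal in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    X_small, y_small, test_size=0.2, random_state=42, stratify=y_small
)

# Fit the scaler on the training data only, then apply it to both parts.
scaler = MinMaxScaler()
X_train_s = scaler.fit_transform(X_train)
X_test_s = scaler.transform(X_test)

# Default architecture; max_iter is raised to give the solver room to converge.
mlp = MLPClassifier(max_iter=500, random_state=42)
mlp.fit(X_train_s, y_train)

# Evaluation: accuracy, confusion matrix, indices of misclassified test images.
y_pred = mlp.predict(X_test_s)
acc = accuracy_score(y_test, y_pred)
cm = confusion_matrix(y_test, y_pred)
wrong_idx = np.flatnonzero(y_pred != y_test)

print(f"accuracy: {acc:.3f}")
print(cm)
print("misclassified test indices:", wrong_idx)
```

The rows of `X_test` at `wrong_idx` can then be reshaped and passed to `plt.imshow()` in the same way as the single example shown earlier.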



### Task 2: Neural Network Classifier on Customer Data


In [1]:
# load required libraries
from sklearn.preprocessing import MinMaxScaler
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

#### Dataset 

This is a classic bank marketing dataset, originally uploaded to the UCI Machine Learning Repository, and contains >41k records. You can find more information about the features (attributes) on the official UCI website:
https://archive.ics.uci.edu/ml/datasets/bank+marketing

The dataset describes a marketing campaign of a financial institution. It can be analysed to find strategies for improving the bank's future marketing campaigns.

The target variable is called 'deposit' and indicates whether a person has subscribed to a term deposit (German: "Termineinlage", more information: https://www.investopedia.com/terms/t/termdeposit.asp).

---

Your task will be to train a neural network which will be used to predict if a person will subscribe to a term deposit.

In [None]:
# Import the data
data = pd.read_csv('https://raw.githubusercontent.com/schneiderson/ATIT2-22/main/sample_data/bank.csv')
X = data.iloc[:, :-1]
y = data.iloc[:, -1]
X.head()

#### Instructions

This task combines many different aspects of what we have discussed in class over the past weeks.

**You are expected to do:**

*   **Data exploration** (6 points):
 *   Check which features are available. 
 > *   Can some features directly be discarded?
 *   Check if data is messy (e.g. missing values)
 *   Check for correlation with target variable
 *   Look for outliers
 *   Class distribution
*   **Data preparation** (6 points):
 *   Perform some data cleaning e.g.
  >  *   Replace missing values
  >  *   Outlier handling
  >  *   Removal of duplicates
 *   Convert non-numeric features to numeric features
 *   Perform a 80/20 train/test split
 *   Perform feature scaling
 *   In case of class imbalance, think about how you want to deal with it. Please briefly explain your decision.
*   **Training and model evaluation** (8 points):
 *   Please use `MLPClassifier` from `sklearn.neural_network`
 > *   The model should use `hidden_layer_sizes=(10, 1)`, i.e. two hidden layers with 10 and 1 neurons (4 layers in total when the input and output layers are counted)
 > *   Set the batch_size to 64
 *   Evaluate the model performance
 > *   Calculate the accuracy and other metrics which might be helpful to evaluate the model's performance
 > *   Based on your findings, describe some measures you could take to improve the model's performance even further.
 *   Please train another model using one of the techniques we have discussed in the lectures and compare the performance to the performance achieved with the neural network.
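
The mechanics of the steps listed above can be sketched on a small synthetic table. Everything about the data below is hypothetical: the frame `toy`, its column names and its values are made up for illustration and do not reflect the real dataset. Median imputation and one-hot encoding are just one possible set of choices.

```python
import numpy as np
import pandas as pd
from sklearn.metrics import classification_report
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import MinMaxScaler

# Hypothetical toy data; column names are illustrative, not the real schema.
rng = np.random.default_rng(0)
n = 400
toy = pd.DataFrame({
    "age": rng.integers(18, 90, size=n).astype(float),
    "job": rng.choice(["admin", "technician", "retired"], size=n),
    "balance": rng.normal(1500, 800, size=n),
})
noise = rng.normal(0, 400, size=n)
toy["deposit"] = np.where(toy["balance"] + noise > 1500, "yes", "no")
toy.loc[::50, "age"] = np.nan  # inject a few missing values for the demo

# Exploration: missing values per column and class distribution.
print(toy.isna().sum())
print(toy["deposit"].value_counts(normalize=True))

# Cleaning: drop duplicate rows, fill missing ages with the median.
toy = toy.drop_duplicates()
toy["age"] = toy["age"].fillna(toy["age"].median())

# Encoding: one-hot encode the categorical column, map the target to 0/1.
X_toy = pd.get_dummies(toy.drop(columns="deposit"), columns=["job"], dtype=float)
y_toy = (toy["deposit"] == "yes").astype(int)

# 80/20 split, then scaling fitted on the training part only.
X_tr, X_te, y_tr, y_te = train_test_split(
    X_toy, y_toy, test_size=0.2, random_state=0, stratify=y_toy
)
scaler = MinMaxScaler()
X_tr_s = scaler.fit_transform(X_tr)
X_te_s = scaler.transform(X_te)

# Network with hidden_layer_sizes=(10, 1) and batch_size=64 as in the task text.
model = MLPClassifier(hidden_layer_sizes=(10, 1), batch_size=64,
                      max_iter=500, random_state=0)
model.fit(X_tr_s, y_tr)

# Precision, recall and F1 per class in addition to accuracy.
y_hat = model.predict(X_te_s)
print(classification_report(y_te, y_hat, zero_division=0))
```

The scaler is fitted on the training split only so that no information from the test split leaks into preprocessing.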


**For each decision you make, briefly explain your reasoning.**



---


#### Further tips for working on the assignment:

When analyzing the model's performance, please think about what the baseline performance of the task would be and if your model performs better or not. It is quite unlikely the model will get a perfect score with the given parametrization. You can try to improve the performance by varying several hyperparameters of the model (e.g. number of hidden layers and number of neurons in a hidden layer, batch_size, training epochs, etc.). 
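
One simple way to obtain a baseline is a classifier that always predicts the most frequent class. The sketch below uses scikit-learn's `DummyClassifier` on hypothetical labels (90 negatives, 10 positives), where such a baseline already reaches an accuracy of 0.9 without learning anything.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score

# Hypothetical imbalanced labels: 90 negatives and 10 positives.
y_demo = np.array([0] * 90 + [1] * 10)
X_demo = np.zeros((len(y_demo), 1))  # features are ignored by this strategy

# Always predict the majority class seen during fit.
baseline = DummyClassifier(strategy="most_frequent")
baseline.fit(X_demo, y_demo)

baseline_acc = accuracy_score(y_demo, baseline.predict(X_demo))
print(baseline_acc)
```

A trained model is only informative if it clearly beats this number, which is why accuracy alone can be misleading on imbalanced data.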

Please be aware that too many hidden layers and neurons and a large number of epochs will cause the model to train longer. If the model is too complex you might encounter time-outs in Colab.

If the number of epochs is too low, sklearn will show a warning that the model has not yet converged.
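
In `MLPClassifier` the number of epochs is controlled by the `max_iter` parameter (for the stochastic solvers `sgd` and `adam`). The following sketch provokes the warning on purpose with a very small `max_iter` and then checks how many epochs a longer run actually used. The `load_digits` data and the concrete values are assumptions chosen to keep the demo fast.

```python
import warnings

from sklearn.datasets import load_digits
from sklearn.exceptions import ConvergenceWarning
from sklearn.neural_network import MLPClassifier

# Small built-in dataset; pixel values range from 0 to 16, so divide to scale.
X_small, y_small = load_digits(return_X_y=True)
X_small = X_small / 16.0

# Record warnings while training with far too few epochs.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    MLPClassifier(max_iter=5, random_state=0).fit(X_small, y_small)

hit_limit = any(issubclass(w.category, ConvergenceWarning) for w in caught)

# With a larger budget, n_iter_ reports the epochs that were actually run.
clf = MLPClassifier(max_iter=500, random_state=0).fit(X_small, y_small)
print(hit_limit, clf.n_iter_)
```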