
# CIS 4321
# Dr. Mohammad Salehan
## Artificial Neural Networks
East-West Airlines has entered into a partnership with the wireless phone company Telcon to sell the latter's service
via direct mail. The file contains a sample of customer data, provided so that the analyst can develop a model to classify
East-West customers by whether they purchase a wireless phone service contract (target variable Phone_sale).


### Data Description
* ID#: Unique ID
* Topflight: Indicates whether flyer has attained elite "Topflight" status, 1 = yes, 0 = no
* Balance: Number of miles eligible for award travel
* Qual_miles: Number of miles counted as qualifying for Topflight status
* cc1_miles?: Has member earned miles with airline freq. flyer credit card in the past 12 months (1=Yes/0=No)?
* cc2_miles?: Has member earned miles with Rewards credit card in the past 12 months (1=Yes/0=No)?
* cc3_miles?: Has member earned miles with Small Business credit card in the past 12 months (1=Yes/0=No)?
* Bonus_miles: Number of miles earned from non-flight bonus transactions in the past 12 months
* Bonus_trans: Number of non-flight bonus transactions in the past 12 months
* Flight_miles_12mo: Number of flight miles in the past 12 months
* Flight_trans_12: Number of flight transactions in the past 12 months
* Online_12: Number of online purchases within the past 12 months
* Email: E-mail address on file (1=yes, 0=no)
* Club_member: Member of the airline's club (paid membership), 1=yes, 0=no
* Any_cc_miles_12mo: Dummy variable indicating whether member added miles on any credit card type within the past 12 months (1='Y', 0='N')
* Phone_sale: Dummy variable indicating whether member purchased Telcon service as a result of the direct mail campaign (1=sale, 0=no sale)

In [None]:
%matplotlib inline

from pathlib import Path

import pandas as pd
from sklearn.preprocessing import StandardScaler
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import train_test_split
import matplotlib.pylab as plt
from dmba import classificationSummary

In [None]:
df = pd.read_excel('EastWestAirlinesNN.xlsx', sheet_name='data')
df.shape

(4987, 16)

In [None]:
df.head(1)

| | ID# | Topflight | Balance | Qual_miles | cc1_miles? | cc2_miles? | cc3_miles? | Bonus_miles | Bonus_trans | Flight_miles_12mo | Flight_trans_12 | Online_12 | Email | Club_member | Any_cc_miles_12mo | Phone_sale |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1.0 | 0.0 | 28143.0 | 0.0 | 0.0 | 1.0 | 0.0 | 174.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 |


Here is the distribution of the two classes in the target column.

In [None]:
df['Phone_sale'].value_counts(normalize=True)

0.0    0.868606
1.0    0.131394
Name: Phone_sale, dtype: float64

1. What is the naive rule in this example?
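
The naive rule ignores every predictor and assigns all customers to the most frequent class. As a minimal sketch, assuming the class proportions printed above (the `class_share` dictionary is hand-copied from that output rather than read from the file), the benchmark accuracy can be worked out as:

```python
# Class proportions copied from the value_counts output above
class_share = {0.0: 0.868606, 1.0: 0.131394}

# The naive rule predicts the majority class for every record
naive_class = max(class_share, key=class_share.get)

# Its accuracy equals the share of the majority class
naive_accuracy = class_share[naive_class]

print(f"Naive rule: always predict {naive_class:.0f}")
print(f"Naive accuracy: {naive_accuracy:.4f}")
```

Any classifier built later needs to beat this figure on the validation partition to be worth using on accuracy alone.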

## Preprocessing

In [None]:
df = df.drop(columns=['ID#'])
df.dropna(inplace=True)
df.head(1)

| | Topflight | Balance | Qual_miles | cc1_miles? | cc2_miles? | cc3_miles? | Bonus_miles | Bonus_trans | Flight_miles_12mo | Flight_trans_12 | Online_12 | Email | Club_member | Any_cc_miles_12mo | Phone_sale |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.0 | 28143.0 | 0.0 | 0.0 | 1.0 | 0.0 | 174.0 | 1.0 | 0.0 | 0.0 | 0.0 | 1.0 | 0.0 | 1.0 | 0.0 |


## Partitioning

In [None]:
X = df.drop(columns=['Phone_sale'])
y = df['Phone_sale']
train_X, valid_X, train_y, valid_y = train_test_split(X, y, test_size=0.4, random_state=26)

## Normalization
2. Normalize the dataset using Z-score.

In [None]:
# Initialize the scaler
scaler = StandardScaler()

# Fit and transform training data
x_train_scaled = scaler.fit_transform(train_X)

x_train_scaled

array([[-0.45660141, -0.33084676, -0.18754907, ...,  0.74090375,
        -0.31372099,  0.9249103 ],
       [-0.45660141, -0.27308743, -0.18754907, ..., -1.34970297,
        -0.31372099,  0.9249103 ],
       [-0.45660141, -0.44355756, -0.18754907, ..., -1.34970297,
        -0.31372099,  0.9249103 ],
       ...,
       [-0.45660141, -0.51084787, -0.18754907, ..., -1.34970297,
        -0.31372099, -1.08118593],
       [-0.45660141, -0.40578955, -0.18754907, ..., -1.34970297,
        -0.31372099,  0.9249103 ],
       [-0.45660141, -0.48050494, -0.18754907, ...,  0.74090375,
        -0.31372099,  0.9249103 ]])
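
A z-score rescales each column as z = (x - mean) / std, so every feature ends up with mean 0 and standard deviation 1. The sketch below uses a small made-up array (`toy` is hypothetical, not taken from the airline data) to check that `StandardScaler` agrees with the hand computation when the population standard deviation (`ddof=0`, NumPy's default) is used:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical toy data: 4 records, 2 features on very different scales
toy = np.array([[1.0, 1000.0],
                [2.0, 3000.0],
                [3.0, 5000.0],
                [4.0, 7000.0]])

# Manual z-score: subtract the column mean, divide by the column std (ddof=0)
manual = (toy - toy.mean(axis=0)) / toy.std(axis=0)

# StandardScaler applies the same formula column by column
scaled = StandardScaler().fit_transform(toy)

print(np.allclose(manual, scaled))
print(scaled.mean(axis=0), scaled.std(axis=0))
```

Putting features such as Balance and Topflight on a common scale matters for neural networks because the weights are learned by gradient-based optimization, which is sensitive to the magnitude of the inputs.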

In [None]:
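# Scale the validation data.
# Note: fit_transform re-estimates the mean and std from the validation set;
# scaler.transform(valid_X) would reuse the training statistics instead.
x_valid_scaled = scaler.fit_transform(valid_X)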
x_valid_scaled

array([[-0.4380307 , -0.59108237, -0.17839992, ..., -1.37319807,
        -0.32358792, -1.06531856],
       [-0.4380307 ,  0.16989776, -0.17839992, ...,  0.72822707,
        -0.32358792,  0.93868636],
       [-0.4380307 , -0.67588168, -0.17839992, ...,  0.72822707,
        -0.32358792, -1.06531856],
       ...,
       [-0.4380307 ,  0.30797335, -0.17839992, ...,  0.72822707,
        -0.32358792, -1.06531856],
       [-0.4380307 ,  0.50752442, -0.17839992, ...,  0.72822707,
        -0.32358792,  0.93868636],
       [-0.4380307 , -0.24863265, -0.17839992, ...,  0.72822707,
        -0.32358792,  0.93868636]])
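
Calling `fit_transform` on the validation partition re-estimates the mean and standard deviation from the validation records, so the two partitions end up on slightly different scales (compare the first column: -0.4566 in training versus -0.4380 in validation). A common alternative, sketched here on synthetic data (the names `toy_train`, `toy_valid`, and `toy_scaler` are hypothetical), is to fit the scaler once on the training partition and call only `transform` on the validation partition:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(26)
# Hypothetical partitions drawn from the same distribution
toy_train = rng.normal(loc=50.0, scale=10.0, size=(300, 2))
toy_valid = rng.normal(loc=50.0, scale=10.0, size=(200, 2))

toy_scaler = StandardScaler()
train_scaled = toy_scaler.fit_transform(toy_train)  # learn mean/std from training only
valid_scaled = toy_scaler.transform(toy_valid)      # reuse the training mean/std

# The stored statistics come from the training partition alone
print(toy_scaler.mean_, toy_scaler.scale_)
# Validation columns are close to, but not exactly, mean 0 and std 1
print(valid_scaled.mean(axis=0), valid_scaled.std(axis=0))
```

Applied to this notebook, the equivalent change would be `scaler.transform(valid_X)`; the validation results shown later would then differ from the ones recorded here.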

## Training
3. Train a neural network with a single hidden layer of 3 nodes and the ReLU activation function, then display training and validation performance.

In [None]:
# Initialize the MLPClassifier with one hidden layer of 3 nodes (hidden_layer_sizes=(3,)), and ReLU activation function (activation = 'relu')
mlp = MLPClassifier(hidden_layer_sizes=(3,), activation='relu', random_state=26)

# Train the neural network on the scaled training data
mlp.fit(x_train_scaled, train_y)

# Predictions for the training set
train_predictions = mlp.predict(x_train_scaled)

# Predictions for the validation set
valid_predictions = mlp.predict(x_valid_scaled)  # Make sure you have already scaled the validation features

# Display training performance
print("Training Performance:")
classificationSummary(train_y, train_predictions)

# Display validation performance
print("\nValidation Performance:")
classificationSummary(valid_y, valid_predictions)

Training Performance:
Confusion Matrix (Accuracy 0.8442)

       Prediction
Actual    0    1
     0 2501  111
     1  355   24

Validation Performance:
Confusion Matrix (Accuracy 0.8345)

       Prediction
Actual    0    1
     0 1646   72
     1  258   18


4. Based on the above results, is overfitting or underfitting a concern? Why?

5. Compare the performance of the validation set with the naive model. Is this a useful model? Why? If not, how would you improve it?
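
One way to approach the last two questions is to derive a few rates directly from the confusion matrices reported above. The sketch below hard-codes those counts (the helper `summarize` is a hypothetical name introduced for illustration) and reports accuracy, the accuracy of always predicting class 0, and recall and precision for class 1:

```python
def summarize(tn, fp, fn, tp):
    """Return basic rates from a 2x2 confusion matrix (class 1 = sale)."""
    total = tn + fp + fn + tp
    return {
        "accuracy": (tn + tp) / total,
        # Naive rule: predict 0 for everyone, so every actual 0 is correct
        "naive_accuracy": (tn + fp) / total,
        # Share of actual buyers the model finds
        "recall_1": tp / (tp + fn),
        # Share of predicted buyers who actually bought
        "precision_1": tp / (tp + fp),
    }

# Counts copied from the classificationSummary output above
train = summarize(tn=2501, fp=111, fn=355, tp=24)
valid = summarize(tn=1646, fp=72, fn=258, tp=18)

for name, rates in (("Training", train), ("Validation", valid)):
    print(name, {k: round(v, 4) for k, v in rates.items()})
```

With these counts, validation accuracy (0.8345) sits below the accuracy of always predicting "no sale" on the same partition (about 0.8616), and the model identifies only 18 of the 276 actual buyers. With classes this imbalanced, recall and precision for class 1 say more about usefulness than overall accuracy does.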