## Basic Neural Networks - and Beyond...

**Data Science for Business - Spring 2025**

**Created by Aditya Deshpande and Chris Volinsky**

Let's see if Neural Nets can improve on our models for the DirectMarketing data set...

In [1]:
#Loading Libraries
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.preprocessing import LabelBinarizer

# Importing data

[You can download the data here](https://drive.google.com/uc?export=download&id=1deEx-Ey37F7qznPlIqmaAjjkmkvBtV28). (Or, you probably already have it: `DirectMarketing.csv`) Each record represents an individual who was targeted with a direct marketing offer.  The offer was a solicitation to make a charitable donation. You'll remember this data set from last chapter!


I've copied all of the data prep code from our last module when we analyzed this data:


In [2]:
df = pd.read_csv("DirectMarketing.csv")
# remove cases where Firstdate == 0 using .loc
df = df.loc[df.Firstdate != 0]


In [3]:
# RUN THIS CELL TO DO ALL OF THE EDA/PROCESSING

# replace gavr and glast with log(x + 1) versions of the same features
df_clean = df.copy()  # work on a copy so re-running this cell does not log-transform twice
df_clean['gavr'] = np.log(df.gavr+1)
df_clean['glast'] = np.log(df.glast+1)
income_cat = pd.Categorical(df['Income'], categories=[0,1,2,3,4,5,6,7])
df_clean['Income'] = income_cat

rfaf2_cat = pd.Categorical(df['rfaf2'], categories=[1,2,3,4])
df_clean['rfaf2'] = rfaf2_cat

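# transform categoricals into dummy (one-hot) columns
df_clean = pd.get_dummies(df_clean, columns=['rfaa2', 'pepstrfl','Income','rfaf2'],drop_first=True)

# manage dates: keep rows with Firstdate > 8000, then parse the YYMM values
# (.copy() avoids pandas' SettingWithCopyWarning on the assignments below)
df_clean = df_clean[df_clean.Firstdate > 8000].copy()
df_clean['Firstdate'] = pd.to_datetime(df_clean['Firstdate'], format='%y%m', errors='coerce')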

df_clean['Lastdate'] = pd.to_datetime(df_clean['Lastdate'], format='%y%m', errors='coerce')

# Create a new feature 'tenure'
df_clean['tenure'] = df_clean['Lastdate'] - df_clean['Firstdate']

# maybe check to see this is always greater than zero?
df_clean['tenure'].min()
today = df_clean['Lastdate'].max()
df_clean['recency'] = today - df_clean['Lastdate']

# remove Firstdate and Lastdate
df_clean = df_clean.drop(['Firstdate', 'Lastdate'], axis=1)


# make sure recency and tenure are numeric so that I can do calculations on them
df_clean['recency'] = pd.to_numeric(df_clean['recency'].dt.days, errors='coerce')
df_clean['tenure'] = pd.to_numeric(df_clean['tenure'].dt.days, errors='coerce')


In [4]:
df_clean.head()


| | Amount | glast | gavr | class | rfaa2_E | rfaa2_F | rfaa2_G | pepstrfl_X | Income_1 | Income_2 | Income_3 | Income_4 | Income_5 | Income_6 | Income_7 | rfaf2_2 | rfaf2_3 | rfaf2_4 | tenure | recency |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.06 | 3.931826 | 3.433987 | 0 | False | False | True | False | False | False | True | False | False | False | False | False | False | False | 365 | 519 |
| 1 | 0.16 | 3.044522 | 3.070376 | 1 | False | False | True | True | False | True | False | False | False | False | False | False | False | True | 1492 | 366 |
| 2 | 0.2 | 1.791759 | 2.277267 | 0 | True | False | False | False | False | False | False | False | False | False | False | False | False | True | 152 | 337 |
| 3 | 0.13 | 3.258097 | 3.157 | 0 | False | False | True | False | False | False | False | False | False | True | False | True | False | False | 547 | 337 |
| 4 | 0.1 | 3.258097 | 2.60269 | 0 | False | False | True | False | False | False | False | False | False | False | False | False | False | False | 761 | 458 |


In [5]:
X = df_clean.drop(['class'], axis=1)
y = df_clean['class']


# ML Modeling


First, we will fit two standard ML classification models, Logistic Regression and Random Forests, to compare with Neural Nets:

In [6]:
# start by initializing a dictionary for all of our AUC scores:

model_auc_scores = {}

In [7]:
#Loading Libraries
from sklearn import metrics
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression

In [8]:
#Split Data into Testing and Training Data
# original random_state = 42 gives results (81, 78, 85)
random_state_value = 99
X_train,X_test, y_train,y_test = train_test_split(X,y,test_size = 0.2,random_state = random_state_value)

## Logistic Regression

In [9]:

lrmodel = LogisticRegression(solver="liblinear")
lrmodel.fit(X_train,y_train)

y_pred_lr = lrmodel.predict(X_test)
y_prob_lr = lrmodel.predict_proba(X_test)[:, 1]

In [10]:
# calculate AUC score and store in our dictionary

auc_lr= metrics.roc_auc_score(y_test,y_prob_lr)
print("AUC Score",round(auc_lr,4))

model_auc_scores['Logistic Regression'] = auc_lr


AUC Score 0.6012
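
To make the AUC number concrete: AUC is the probability that a randomly chosen positive case receives a higher predicted score than a randomly chosen negative case, with ties counted as half. An AUC of 0.5 corresponds to random ranking. The sketch below computes this directly on a tiny made-up example; the labels and scores are illustrative values, not taken from the DirectMarketing data, and `pairwise_auc` is a hypothetical helper name.

```python
from itertools import product

def pairwise_auc(y_true, y_score):
    """AUC as the share of (positive, negative) pairs that are ranked correctly."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    wins = 0.0
    for p, n in product(pos, neg):
        if p > n:
            wins += 1.0   # positive scored above negative
        elif p == n:
            wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))

# Made-up labels and scores, for illustration only
y_demo = [0, 0, 1, 0, 1, 1]
score_demo = [0.10, 0.40, 0.35, 0.20, 0.80, 0.40]

auc_demo = pairwise_auc(y_demo, score_demo)
print("AUC by hand:", round(auc_demo, 4))
```

On these six points, 7.5 of the 9 positive/negative pairs are ordered correctly, so the result is 0.8333. `roc_auc_score` should give the same value on the same inputs.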


## Random Forest

In [11]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score

# no random_state is set here, so the AUC will vary slightly from run to run
rf_model = RandomForestClassifier(max_depth=10, min_samples_split=10)
rf_model.fit(X_train, y_train)

# class predictions and predicted probabilities for the positive class
y_pred_rf = rf_model.predict(X_test)
y_prob_rf = rf_model.predict_proba(X_test)[:, 1]

# calculate AUC score and store in our dictionary
auc_rf = roc_auc_score(y_test, y_prob_rf)
print("AUC Score",round(auc_rf,3))
model_auc_scores['Random Forest'] = auc_rf


AUC Score 0.625


## Neural Networks (using Keras)

Neural Nets in Python can be run via an ever-growing number of libraries and functions.  scikit-learn provides `MLPClassifier` and `MLPRegressor`.  These work fine but are limited in scope.  `Keras`, the high-level API that ships with TensorFlow (developed by Google), is an attempt to create high-level interfaces to these very powerful and complex models.  However, you can find countless similar libraries to implement neural nets.
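
For comparison, here is a minimal sketch of the scikit-learn route using `MLPClassifier`. It runs on synthetic stand-in data from `make_classification` (an assumption, chosen so the block is self-contained); the 19 features and the (12, 8) hidden layers simply mirror the shapes used later in this notebook, and the names `X_syn`, `mlp`, and `auc_mlp` are illustrative.

```python
from sklearn.datasets import make_classification
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data: NOT the DirectMarketing data
X_syn, y_syn = make_classification(
    n_samples=2000, n_features=19, n_informative=8, random_state=0
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_syn, y_syn, test_size=0.2, random_state=0
)

# Scale using statistics learned from the training split only
scaler_syn = StandardScaler()
X_tr_s = scaler_syn.fit_transform(X_tr)
X_te_s = scaler_syn.transform(X_te)

# Two hidden layers with 12 and 8 units
mlp = MLPClassifier(
    hidden_layer_sizes=(12, 8), activation="relu", max_iter=500, random_state=0
)
mlp.fit(X_tr_s, y_tr)

auc_mlp = roc_auc_score(y_te, mlp.predict_proba(X_te_s)[:, 1])
print("MLPClassifier AUC on synthetic data:", round(auc_mlp, 4))
```

The whole network is configured through constructor arguments, which is convenient but leaves less room for custom layers such as dropout.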

In [12]:
#Loading Libraries

import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.callbacks import EarlyStopping
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split

For a NN to run correctly, you should scale your data!

In [13]:
from sklearn.preprocessing import StandardScaler

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)
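
As a rough illustration of what `fit_transform` and `transform` do here, the sketch below standardizes two columns by hand with NumPy. It assumes the default `StandardScaler` behavior (subtract the training mean, divide by the training standard deviation). The numbers are the `Amount` and `tenure` values from the five rows of the `df_clean.head()` output above, and the 4/1 train/test split is made up for the example.

```python
import numpy as np

# Two columns on very different scales: Amount and tenure (in days)
X_small = np.array([
    [0.06,  365.0],
    [0.16, 1492.0],
    [0.20,  152.0],
    [0.13,  547.0],
    [0.10,  761.0],
])

# Pretend the first four rows are "train" and the last one is "test"
train, test = X_small[:4], X_small[4:]

# Learn the mean and standard deviation from the training rows only
mu = train.mean(axis=0)
sigma = train.std(axis=0)           # population std (ddof=0)

train_scaled = (train - mu) / sigma
test_scaled = (test - mu) / sigma   # reuse the training statistics on the test row

print("train means after scaling:", train_scaled.mean(axis=0).round(6))
print("train stds after scaling: ", train_scaled.std(axis=0).round(6))
print("scaled test row:", test_scaled.round(3))
```

After scaling, both columns are on a comparable scale, so the large raw values of `tenure` no longer dominate the small values of `Amount` in the weight updates.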

In [14]:
# Defining SIMPLE Keras Model

# 2 hidden layers:  19 ( => 12 => 8 ) => 1

kmodel = Sequential()
kmodel.add(Dense(12,input_shape =(19,), activation = "relu"))
kmodel.add(Dense(8,activation = "relu"))
kmodel.add(Dense(1,activation = "sigmoid"))
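
To show what these three `Dense` layers compute, here is a hand-written forward pass in NumPy with the same layer shapes (19 => 12 => 8 => 1). The weights are random and untrained, and the input rows are made up, so the outputs are meaningless as predictions; the point is only the sequence of matrix multiplications and activations.

```python
import numpy as np

def relu(z):
    return np.maximum(z, 0.0)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

rng = np.random.default_rng(0)

# Random, untrained weights with the same shapes as the Keras layers above
W1, b1 = rng.normal(scale=0.1, size=(19, 12)), np.zeros(12)
W2, b2 = rng.normal(scale=0.1, size=(12, 8)), np.zeros(8)
W3, b3 = rng.normal(scale=0.1, size=(8, 1)), np.zeros(1)

x_batch = rng.normal(size=(4, 19))   # 4 made-up, already-scaled rows

h1 = relu(x_batch @ W1 + b1)         # Dense(12, relu)
h2 = relu(h1 @ W2 + b2)              # Dense(8, relu)
p = sigmoid(h2 @ W3 + b3)            # Dense(1, sigmoid): one probability per row

print(p.shape)
```

Training with `fit` adjusts `W1`, `b1`, and the other weights so that `p` moves toward the observed labels.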





In [15]:
#Compile Keras Model
kmodel.compile(loss = "binary_crossentropy", optimizer = "adam", metrics =['accuracy'])


In [16]:
#Fitting Keras Model
kmodel.fit(X_train_scaled,y_train,epochs = 10, batch_size = 64)

Epoch 1/10
2397/2397 ━━━━━━━━━━━━━━━━━━━━ 6s 2ms/step - accuracy: 0.9377 - loss: 0.2485
Epoch 2/10
2397/2397 ━━━━━━━━━━━━━━━━━━━━ 5s 2ms/step - accuracy: 0.9477 - loss: 0.2023
Epoch 3/10
2397/2397 ━━━━━━━━━━━━━━━━━━━━ 4s 2ms/step - accuracy: 0.9499 - loss: 0.1955
Epoch 4/10
2397/2397 ━━━━━━━━━━━━━━━━━━━━ 5s 2ms/step - accuracy: 0.9488 - loss: 0.1979
Epoch 5/10
2397/2397 ━━━━━━━━━━━━━━━━━━━━ 5s 2ms/step - accuracy: 0.9488 - loss: 0.1979
Epoch 6/10
2397/2397 ━━━━━━━━━━━━━━━━━━━━ 4s 2ms/step - accuracy: 0.9494 - loss: 0.1963
Epoch 7/10
2397/2397 ━━━━━━━━━━━━━━━━━━━━ 6s 2ms/step - accuracy: 0.9492 - loss: 0.1964
Epoch 8/10
2397/2397 ━━━━━━━━━━━━━━━━━━━━ 5s 2ms/step - accuracy: 0.9497 - loss: 0.1957
Epoch 9/10
2397/2397

<keras.src.callbacks.history.History at 0x7ac66a8c61e0>

In [17]:
# Get predicted probabilities for the positive class (class 1)
y_prob = kmodel.predict(X_test_scaled)

# Calculate AUC

auc = roc_auc_score(y_test, y_prob)
print("AUC Score",round(auc,4))

model_auc_scores['Simple NN'] = auc


1199/1199 ━━━━━━━━━━━━━━━━━━━━ 2s 1ms/step
AUC Score 0.6158


In [18]:
## Now we make it more complex, with an extra layer, and Dropout (for regularization)

kmodel = Sequential()
kmodel.add(Dense(12,input_shape =(19,), activation = "relu")) # 19 features
kmodel.add(Dropout(0.3))  # Add dropout
kmodel.add(Dense(8,activation = "relu"))
kmodel.add(Dropout(0.3))  # Add dropout to the new layer
kmodel.add(Dense(6,activation = "relu"))
kmodel.add(Dense(1,activation = "sigmoid"))
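
As a rough picture of what a `Dropout(0.3)` layer does, the NumPy sketch below implements "inverted" dropout by hand. During training a random 30% of the activations are set to zero and the survivors are scaled up by 1/(1 - 0.3), so the expected activation is unchanged. At prediction time the layer passes values through untouched. The function name `dropout` and the all-ones activations are illustrative assumptions, not Keras internals.

```python
import numpy as np

def dropout(activations, rate, training, rng):
    """Inverted dropout: zero a random share of units and rescale the rest."""
    if not training:
        return activations                       # inference: pass through unchanged
    keep_mask = rng.random(activations.shape) >= rate
    return activations * keep_mask / (1.0 - rate)

rng = np.random.default_rng(0)
h = np.ones((1000, 12))                          # made-up activations of a 12-unit layer

h_train = dropout(h, rate=0.3, training=True, rng=rng)
h_infer = dropout(h, rate=0.3, training=False, rng=rng)

print("share of units zeroed:", round(float((h_train == 0).mean()), 3))
print("mean activation (training):", round(float(h_train.mean()), 3))
print("mean activation (inference):", float(h_infer.mean()))
```

Because a different random subset of units is silenced on every batch, the network cannot rely too heavily on any single unit, which is the regularizing effect.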





In [19]:
#Compile Keras Model
kmodel.compile(loss = "binary_crossentropy", optimizer = "adam", metrics =['accuracy'])


In [20]:
#Fitting Keras Model

# with more complicated model, might need more epochs, larger batch?

kmodel.fit(X_train_scaled,y_train,epochs = 20, batch_size = 256)

Epoch 1/20
600/600 ━━━━━━━━━━━━━━━━━━━━ 4s 3ms/step - accuracy: 0.9320 - loss: 0.3172
Epoch 2/20
600/600 ━━━━━━━━━━━━━━━━━━━━ 2s 3ms/step - accuracy: 0.9487 - loss: 0.2122
Epoch 3/20
600/600 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.9494 - loss: 0.2012
Epoch 4/20
600/600 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.9494 - loss: 0.1992
Epoch 5/20
600/600 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.9486 - loss: 0.2003
Epoch 6/20
600/600 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.9486 - loss: 0.2005
Epoch 7/20
600/600 ━━━━━━━━━━━━━━━━━━━━ 1s 2ms/step - accuracy: 0.9495 - loss: 0.1973
Epoch 8/20
600/600 ━━━━━━━━━━━━━━━━━━━━ 2s 4ms/step - accuracy: 0.9498 - loss: 0.1965
Epoch 9/20
600/600 ━━━━━━━━

<keras.src.callbacks.history.History at 0x7ac66a7ae090>
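
A note on the step counts in the training logs: each step is one weight update on one batch, so the number of steps per epoch is the number of training rows divided by the batch size, rounded up. The logs show 2397 steps at `batch_size = 64` and 600 steps at `batch_size = 256`. The exact number of training rows is not printed in this notebook, so `n_train_assumed` below is an illustrative value chosen only to be consistent with those logs.

```python
import math

# Assumed training-set size (not shown in the notebook; picked to match the logs)
n_train_assumed = 153_400

for batch_size in (64, 256):
    steps = math.ceil(n_train_assumed / batch_size)
    print(f"batch_size={batch_size}: {steps} weight updates per epoch")
```

With the larger batch, each epoch performs roughly a quarter as many updates, which is one reason to pair a larger batch size with more epochs.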

In [21]:
# Get predicted probabilities for the positive class (class 1)
y_prob = kmodel.predict(X_test_scaled)

# Calculate AUC

auc = roc_auc_score(y_test, y_prob)
print("AUC Score",round(auc,4))

model_auc_scores['Extra NN'] = auc

1199/1199 ━━━━━━━━━━━━━━━━━━━━ 1s 1ms/step
AUC Score 0.6232


# Results and Comparison

In [22]:
print("random state =", random_state_value)

for model_name, auc_score in model_auc_scores.items():
    print(f"{model_name}: AUC = {auc_score:.4f}")

random state = 99
Logistic Regression: AUC = 0.6012
Random Forest: AUC = 0.6245
Simple NN: AUC = 0.6158
Extra NN: AUC = 0.6232


In [23]:
kmodel.summary()
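
The output of `kmodel.summary()` is not shown here, but it lists each layer with its number of trainable parameters. As a cross-check, the sketch below counts them by hand for the second model, assuming standard `Dense` layers with a bias term. Dropout layers have no parameters. `dense_params` is a hypothetical helper name.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Dense layer widths of the second model: 19 inputs -> 12 -> 8 -> 6 -> 1
widths = [19, 12, 8, 6, 1]

total = 0
for n_in, n_out in zip(widths[:-1], widths[1:]):
    count = dense_params(n_in, n_out)
    total += count
    print(f"Dense {n_in:>2} -> {n_out:>2}: {count} parameters")

print("Total trainable parameters:", total)
```

The layers contribute 240, 104, 54, and 7 parameters, for a total of 405.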

# Neural Nets - discussion
You may notice that the performance here is underwhelming: the much more complicated neural nets only roughly match Random Forest (AUC of 0.6158 and 0.6232 versus 0.6245).  That is often true.  Neural Nets can underperform on problems that are simple, so try not to "kill the fly with the sledgehammer".

Nonetheless, there are many ways to try to improve Neural Nets, with endless customizations that can be mined for better performance, as long as you are careful about overfitting:

- Hyperparameter Tuning: Experiment with different values for parameters like the number of epochs, batch size, learning rate, and the number of neurons in each layer.
- Adding more layers: Add more dense layers to your model.
- Regularization: Dropout layers are one option that often protects against overfitting.  You can also apply Lasso-like (L1) regularization to the weights.
- Activation Functions: 'ReLU' is a good default, but you can also try alternatives (e.g., tanh, sigmoid).
- Optimizers: You can experiment with different optimizers (e.g., SGD, RMSprop, Adamax, Adagrad) in `model.compile()`, but this is really for advanced users.
- Early Stopping: Implement early stopping during training to prevent overfitting. This involves monitoring a metric (like validation loss) and stopping training when it stops improving.
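
Several of the options in this list can be combined in a few lines of Keras. The sketch below is one possible setup, not a tuned recipe: it uses made-up synthetic data (`X_demo`, `y_demo`) so that it runs on its own, and the specific values (L1 strength `1e-4`, learning rate `1e-3`, `patience=3`, 20% validation split) are arbitrary assumptions for illustration.

```python
import numpy as np
from tensorflow.keras import Input, regularizers
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam

# Made-up data with 19 features, standing in for the scaled training set
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(512, 19)).astype("float32")
y_demo = (X_demo[:, 0] + 0.5 * X_demo[:, 1] > 0).astype("float32")

model = Sequential([
    Input(shape=(19,)),
    # L1 (Lasso-like) penalty on the weights of the first hidden layer
    Dense(12, activation="relu", kernel_regularizer=regularizers.l1(1e-4)),
    Dropout(0.3),
    Dense(8, activation="relu"),
    Dense(1, activation="sigmoid"),
])

# An optimizer object (rather than the string "adam") exposes the learning rate
model.compile(
    loss="binary_crossentropy",
    optimizer=Adam(learning_rate=1e-3),
    metrics=["accuracy"],
)

# Stop when validation loss has not improved for 3 epochs; keep the best weights
early_stop = EarlyStopping(monitor="val_loss", patience=3, restore_best_weights=True)

history = model.fit(
    X_demo, y_demo,
    validation_split=0.2,   # hold out 20% of the rows to monitor val_loss
    epochs=50,              # upper bound; early stopping may end training sooner
    batch_size=64,
    callbacks=[early_stop],
    verbose=0,
)

print("epochs actually run:", len(history.history["val_loss"]))
```

With early stopping in place, `epochs` becomes a ceiling rather than a value that has to be tuned precisely.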

A lot of these customizations will require learning more of libraries like TensorFlow, or writing custom code.  For the level of this class, you can stick to the functions presented here.

This is a field that could take a lifetime to master, but one that is driving the AI development of today.  To go further I recommend the book:

`Python Machine Learning By Example` by Yuxi Liu