# Lab 8 - Neural Networks
- **Author:** Emily Aiken ([emilyaiken@berkeley.edu](mailto:emilyaiken@berkeley.edu))
- **Date:** March 16, 2022
- **Course:** INFO 251: Applied Machine Learning

## Topics:
1. Neural networks (regression)
2. Neural networks (classification)
3. Neural networks (multiclass classification)

## Learning Goals:
At the end of this lab, you will...
- Know how to code up feed-forward neural networks in Keras for regression, binary classification, and multiclass classification problems
- Know the main hyperparameters for neural networks: number of hidden layers, number of hidden nodes, activation functions
- Know the main optimization parameters for neural networks: optimizer, learning rate, batch size, epochs

## Resources:
- [Keras activation functions](https://keras.io/api/layers/activations/)
- [Keras optimizers](https://keras.io/api/optimizers/)
- [Keras loss functions](https://keras.io/api/losses/)
- [Keras performance metrics](https://keras.io/api/metrics/)

In [1]:
import warnings
warnings.filterwarnings("ignore")

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn import datasets
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor, RandomForestClassifier
from sklearn.neural_network import MLPRegressor
from sklearn.metrics import r2_score, roc_auc_score, accuracy_score

from keras.models import Sequential
from keras.layers import Dense
import tensorflow as tf

### I. Regression Data: Loading and Baseline Model

In [2]:
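# Data
# Note: load_boston was deprecated in scikit-learn 1.0 and removed in 1.2, so this cell needs an older version
data = datasets.load_boston()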
df = pd.DataFrame(data.data, columns=data.feature_names)
df['target'] = data.target
df.head()

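| | CRIM | ZN | INDUS | CHAS | NOX | RM | AGE | DIS | RAD | TAX | PTRATIO | B | LSTAT | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.00632 | 18.0 | 2.31 | 0.0 | 0.538 | 6.575 | 65.2 | 4.09 | 1.0 | 296.0 | 15.3 | 396.9 | 4.98 | 24.0 |
| 1 | 0.02731 | 0.0 | 7.07 | 0.0 | 0.469 | 6.421 | 78.9 | 4.9671 | 2.0 | 242.0 | 17.8 | 396.9 | 9.14 | 21.6 |
| 2 | 0.02729 | 0.0 | 7.07 | 0.0 | 0.469 | 7.185 | 61.1 | 4.9671 | 2.0 | 242.0 | 17.8 | 392.83 | 4.03 | 34.7 |
| 3 | 0.03237 | 0.0 | 2.18 | 0.0 | 0.458 | 6.998 | 45.8 | 6.0622 | 3.0 | 222.0 | 18.7 | 394.63 | 2.94 | 33.4 |
| 4 | 0.06905 | 0.0 | 2.18 | 0.0 | 0.458 | 7.147 | 54.2 | 6.0622 | 3.0 | 222.0 | 18.7 | 396.9 | 5.33 | 36.2 |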


In [3]:
# Standardize the data
for col in df.columns:
    if col != 'target':
        mean, std = df[col].mean(), df[col].std()
        df[col] = (df[col] - mean)/std
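
The loop above computes each column's mean and standard deviation over all rows, so rows that later land in the test set also influence the scaling. A common alternative is to estimate the statistics on the training rows only and reuse them for the test rows. The sketch below illustrates that idea on a small made-up array; `standardize_with_train_stats` and the toy arrays are hypothetical and not part of the lab.

```python
import numpy as np

def standardize_with_train_stats(x_train, x_test):
    """Scale both arrays using the mean and std of the training rows only."""
    mean = x_train.mean(axis=0)
    std = x_train.std(axis=0, ddof=1)  # ddof=1 matches the pandas default for .std()
    return (x_train - mean) / std, (x_test - mean) / std

# Toy data (made up for illustration): 4 training rows, 2 test rows, 2 features
toy_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
toy_test = np.array([[2.5, 25.0], [5.0, 50.0]])

scaled_train, scaled_test = standardize_with_train_stats(toy_train, toy_test)
print(scaled_train.mean(axis=0))         # approximately 0 for every column
print(scaled_train.std(axis=0, ddof=1))  # approximately 1 for every column
print(scaled_test)                       # test rows expressed in training-set units
```

With a dataset of this size the difference from whole-dataset scaling is probably small, but the train-only version keeps the test set fully held out.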

In [4]:
# Split data into training and test
train, test = train_test_split(df, shuffle=True, test_size=0.25, random_state=0)
x_train, y_train = train.drop('target', axis=1), train['target']
x_test, y_test = test.drop('target', axis=1), test['target']

In [5]:
# Let's fit a basic random forest model -- just as a baseline
model = RandomForestRegressor(max_depth=8, n_estimators=50, random_state=1)
model.fit(x_train, y_train)
yhat_train = model.predict(x_train)
yhat_test = model.predict(x_test)
print('RF r2 on training set: %.2f' % r2_score(y_train, yhat_train))
print('RF r2 on test set: %.2f' % r2_score(y_test, yhat_test))

RF r2 on training set: 0.98
RF r2 on test set: 0.78
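
For reference, the r2 values printed above follow the usual definition: one minus the ratio of the residual sum of squares to the total sum of squares around the mean of the true values. The sketch below recomputes that definition by hand on made-up numbers; `r2_manual` and `toy_y` are hypothetical names used only for illustration.

```python
def r2_manual(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # squared prediction errors
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # squared deviations from the mean
    return 1 - ss_res / ss_tot

toy_y = [3.0, 5.0, 7.0, 9.0]  # made-up targets

print(r2_manual(toy_y, toy_y))                 # perfect predictions: 1.0
print(r2_manual(toy_y, [6.0, 6.0, 6.0, 6.0]))  # always predicting the mean: 0.0
print(r2_manual(toy_y, [9.0, 7.0, 5.0, 3.0]))  # worse than predicting the mean: -3.0
```

The last case shows that r2 is not bounded below by zero, which is worth keeping in mind when reading test-set scores.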


### II. Neural Network (Regression)

#### A. Scikit-learn implementation

In [6]:
model = MLPRegressor(hidden_layer_sizes=[5, 3], activation='relu', solver='adam', max_iter=500,
                    shuffle=True, random_state=1)
model.fit(x_train, y_train)
yhat_train = model.predict(x_train)
yhat_test = model.predict(x_test)

# Get metrics
print('r2 on training set: %.2f' % r2_score(y_train, yhat_train))
print('r2 on test set: %.2f' % r2_score(y_test, yhat_test))

r2 on training set: 0.77
r2 on test set: 0.58


#### B. Keras implementation

In [7]:
# Random seeds
np.random.seed(1)
tf.random.set_seed(1)

# Define NN
model = Sequential()
model.add(Dense(5, input_dim=len(x_train.columns), activation='relu')) # First layer defines input_dim
model.add(Dense(1, activation='linear')) # For regression/classification, last layer of size 1
model.compile(loss='mse', optimizer='adam', metrics=['mse']) # No r2 metric available in keras

# Fit and predict with NN
model.fit(x_train, y_train, epochs=50, batch_size=10, verbose=0)
yhat_train = model.predict(x_train)
yhat_test = model.predict(x_test)

# Get metrics
print('r2 on training set: %.2f' % r2_score(y_train, yhat_train))
print('r2 on test set: %.2f' % r2_score(y_test, yhat_test))

r2 on training set: 0.65
r2 on test set: 0.42
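
To make the two `Dense` layers above less of a black box, here is a minimal NumPy sketch of the forward pass such a network computes: a ReLU hidden layer followed by a single linear output unit. The weights are random and untrained, the input rows are made up, and the names (`W1`, `b1`, `W2`, `b2`, `forward`) are illustrative rather than anything Keras exposes under those names.

```python
import numpy as np

rng = np.random.default_rng(0)

n_inputs, n_hidden = 13, 5                   # 13 features and 5 hidden units, as in the cell above
W1 = rng.normal(size=(n_inputs, n_hidden))   # hidden-layer weights (random, untrained)
b1 = np.zeros(n_hidden)                      # hidden-layer biases
W2 = rng.normal(size=(n_hidden, 1))          # output-layer weights
b2 = np.zeros(1)                             # output-layer bias

def forward(x):
    """One hidden ReLU layer followed by a linear output unit."""
    hidden = np.maximum(0.0, x @ W1 + b1)    # ReLU activation
    return hidden @ W2 + b2                  # linear activation: no transformation

x_batch = rng.normal(size=(4, n_inputs))     # 4 made-up rows of standardized-looking features
print(forward(x_batch).shape)                # (4, 1): one prediction per row

n_params = W1.size + b1.size + W2.size + b2.size
print(n_params)                              # 13*5 + 5 + 5*1 + 1 = 76
```

Training adjusts these 76 numbers to reduce the mean squared error; adding units or layers increases the count and the model's capacity.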


In [8]:
# TODO: Tune the hyperparameters until the r2 score on the test set exceeds that of the random forest
# Random seeds
np.random.seed(1)
tf.random.set_seed(1)

# Define NN
model = Sequential()
model.add(Dense(10, input_dim=len(x_train.columns), activation='relu')) # First layer defines input_dim
model.add(Dense(10, activation='relu')) # Second hidden layer; input size is inferred from the previous layer
model.add(Dense(1, activation='linear')) # For regression/classification, last layer of size 1
model.compile(loss='mse', optimizer='adam', metrics=['mse']) # No r2 metric available in keras

# Fit and predict with NN
model.fit(x_train, y_train, epochs=200, batch_size=5, verbose=0)
yhat_train = model.predict(x_train)
yhat_test = model.predict(x_test)

# Get metrics
print('r2 on training set: %.2f' % r2_score(y_train, yhat_train))
print('r2 on test set: %.2f' % r2_score(y_test, yhat_test))

r2 on training set: 0.94
r2 on test set: 0.81
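
Part of the difference between the two Keras runs is simply how many weight updates each one performs, since every batch triggers one update. The sketch below counts them, assuming 379 training rows (506 rows with `test_size=0.25`); `gradient_updates` is a hypothetical helper written for this illustration.

```python
import math

def gradient_updates(n_train, batch_size, epochs):
    """Weight updates = batches per epoch (the last batch may be partial) times epochs."""
    return math.ceil(n_train / batch_size) * epochs

n_train = 379  # assumption: 506 rows with test_size=0.25 leaves 379 training rows

print(gradient_updates(n_train, batch_size=10, epochs=50))   # first Keras model: 1900
print(gradient_updates(n_train, batch_size=5, epochs=200))   # tuned model: 15200
```

Under that assumption the tuned model takes eight times as many steps, in addition to having a second hidden layer and wider layers.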


### III. Classification Data: Loading and Baseline Model

In [9]:
# Load data
data = datasets.load_breast_cancer()
df = pd.DataFrame(data.data, columns=data.feature_names)
df['target'] = data.target
# Add zero-mean Gaussian noise to each feature, with standard deviation 4x the feature's own
for col in df.columns:
    if col != 'target':
        df[col] = df[col] + np.random.normal(0, 4*df[col].std(), len(df))
df.head()

| | mean radius | mean texture | mean perimeter | mean area | mean smoothness | mean compactness | mean concavity | mean concave points | mean symmetry | mean fractal dimension | ... | worst texture | worst perimeter | worst area | worst smoothness | worst compactness | worst concavity | worst concave points | worst symmetry | worst fractal dimension | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 40.887089 | 11.786745 | 97.412238 | 1285.976341 | 0.168614 | 0.308313 | 0.026358 | 0.190491 | 0.362086 | 0.06641 | ... | -13.906665 | 362.911363 | -49.050592 | 0.12816 | 0.010495 | 0.905038 | 0.336757 | 0.323227 | 0.231988 | 0 |
| 1 | 11.946562 | 9.857078 | 216.805684 | -2603.394391 | 0.043558 | -0.004412 | -0.244151 | -0.024154 | 0.148703 | 0.107751 | ... | -10.868511 | 241.870562 | -2539.739095 | 0.158484 | 1.422741 | 0.294084 | 0.091141 | -0.038541 | 0.258604 | 0 |
| 2 | 12.244788 | 32.335553 | 40.280585 | 4026.141822 | -0.003135 | 0.378844 | 0.776733 | 0.241175 | 0.231942 | 0.037065 | ... | -36.630216 | 183.694892 | 2538.407099 | 0.105701 | -0.694013 | 1.77317 | 0.197306 | 0.45999 | 0.11664 | 0 |
| 3 | -3.704775 | 26.774269 | 64.925556 | 1023.365281 | 0.135199 | 0.529861 | 0.844492 | -0.032502 | 0.254744 | 0.104587 | ... | 34.714214 | 118.254599 | -4205.038762 | 0.381332 | -0.571568 | 0.001554 | 0.489827 | 0.64866 | 0.222814 | 0 |
| 4 | 32.488955 | 46.21944 | 184.189166 | 3483.886208 | 0.089133 | 0.03296 | 0.292348 | 0.116302 | 0.162262 | -0.000282 | ... | 37.406781 | 120.23716 | 1042.45177 | 0.135641 | 0.051512 | 1.082255 | -0.44446 | -0.125717 | 0.132616 | 0 |


In [10]:
# Standardize the data
for col in df.columns:
    if col != 'target':
        mean, std = df[col].mean(), df[col].std()
        df[col] = (df[col] - mean)/std

In [11]:
# Split data into training and test
train, test = train_test_split(df, shuffle=True, test_size=0.25, random_state=0)
x_train, y_train = train.drop('target', axis=1), train['target']
x_test, y_test = test.drop('target', axis=1), test['target']

In [12]:
# Let's fit a basic random forest model -- just as a baseline
model = RandomForestClassifier(max_depth=4, n_estimators=50, random_state=1)
model.fit(x_train, y_train)
yhat_train = model.predict_proba(x_train)[:, 1]
yhat_test = model.predict_proba(x_test)[:, 1]
print('RF AUC on training set: %.2f' % roc_auc_score(y_train, yhat_train))
print('RF AUC on test set: %.2f' % roc_auc_score(y_test, yhat_test))

RF AUC on training set: 0.97
RF AUC on test set: 0.72
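
The AUC reported above can be read as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. The sketch below computes it that way on four made-up examples; `auc_pairwise` is a hypothetical helper, and for real data `roc_auc_score` is the faster choice.

```python
def auc_pairwise(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

toy_labels = [0, 0, 1, 1]            # made-up labels
toy_scores = [0.1, 0.4, 0.35, 0.8]   # made-up predicted probabilities

print(auc_pairwise(toy_labels, toy_scores))  # 3 of 4 pairs ranked correctly: 0.75
```

Because only the ordering of the scores matters, AUC does not depend on any particular classification threshold, which is why the cell above passes probabilities rather than hard predictions.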


### IV. Neural Network (Classification)

In [13]:
# TODO: Train a neural network to predict malignancy. Tune hyperparameters until the AUC score exceeds that
# of the random forest above.
np.random.seed(1)
tf.random.set_seed(1)

# Define NN
model = Sequential()
model.add(Dense(5, input_dim=len(x_train.columns), activation='relu')) # First layer defines input_dim
model.add(Dense(1, activation='sigmoid')) # For regression/classification, last layer of size 1
model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['AUC']) # Track AUC during training

# Fit and predict with NN
model.fit(x_train, y_train, epochs=50, batch_size=10, verbose=1)
yhat_train = model.predict(x_train)
yhat_test = model.predict(x_test)

# Get metrics
print('AUC on training set: %.2f' % roc_auc_score(y_train, yhat_train))
print('AUC on test set: %.2f' % roc_auc_score(y_test, yhat_test))

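
The classification network differs from the regression one in two places: the output activation is a sigmoid and the loss is binary cross-entropy. The sketch below writes both out in plain Python on made-up values to show what they compute; the function names are illustrative and this is not how Keras implements them internally.

```python
import math

def sigmoid(z):
    """Squash a real-valued score into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

def binary_crossentropy(y_true, p_pred):
    """Mean negative log-likelihood of the true labels under the predicted probabilities."""
    losses = [-(t * math.log(p) + (1 - t) * math.log(1 - p))
              for t, p in zip(y_true, p_pred)]
    return sum(losses) / len(losses)

print(sigmoid(0.0))  # 0.5: a score of zero means the network is undecided

confident_right = binary_crossentropy([1, 0], [0.9, 0.1])
confident_wrong = binary_crossentropy([1, 0], [0.1, 0.9])
print(confident_right < confident_wrong)  # True: confident mistakes are penalized heavily
```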

### V. Multiclass Classification Data: Loading and Baseline Model

In [14]:
# Load data
data = datasets.load_wine()
df = pd.DataFrame(data.data, columns=data.feature_names)
df['target'] = data.target
# Add zero-mean Gaussian noise to each feature, with standard deviation 4x the feature's own
for col in df.columns:
    if col != 'target':
        df[col] = df[col] + np.random.normal(0, 4*df[col].std(), len(df))
df.head()

| | alcohol | malic_acid | ash | alcalinity_of_ash | magnesium | total_phenols | flavanoids | nonflavanoid_phenols | proanthocyanins | color_intensity | hue | od280/od315_of_diluted_wines | proline | target |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 19.504747 | 4.748949 | 2.60148 | 11.7395 | 125.182287 | 3.797853 | 2.098297 | -0.351655 | 1.00362 | -0.830891 | 2.71851 | 4.241905 | 3847.096222 | 0 |
| 1 | 11.21344 | 0.35094 | 1.931256 | 5.966019 | 106.426417 | 2.169337 | -0.658779 | 1.135534 | 2.014843 | -0.982897 | 1.221235 | 2.477744 | 641.461948 | 0 |
| 2 | 11.444865 | -3.326536 | 2.177566 | 27.737064 | 117.457887 | -4.844807 | -4.943451 | 0.186871 | 7.773352 | -2.178549 | 1.797575 | 2.110637 | 2185.05848 | 0 |
| 3 | 10.885742 | 3.351114 | 1.762072 | 12.079059 | 198.586822 | 5.051264 | 5.423326 | -0.122147 | 3.892414 | 15.078533 | 0.260579 | -0.333696 | 2245.056564 | 0 |
| 4 | 16.050244 | 4.838524 | 2.258219 | -2.883396 | 100.805252 | -1.087033 | 8.917637 | -0.119661 | 2.446909 | 19.465605 | 1.20976 | -2.851885 | -630.388596 | 0 |


In [15]:
# Standardize the data
for col in df.columns:
    if col != 'target':
        mean, std = df[col].mean(), df[col].std()
        df[col] = (df[col] - mean)/std

In [16]:
# Split data into training and test
train, test = train_test_split(df, shuffle=True, test_size=0.25, random_state=0)
x_train, y_train = train.drop('target', axis=1), train['target']
x_test, y_test = test.drop('target', axis=1), test['target']

In [17]:
# Let's fit a basic random forest model -- just as a baseline
model = RandomForestClassifier(max_depth=6, n_estimators=50, random_state=1)
model.fit(x_train, y_train)
yhat_train = model.predict(x_train)
yhat_test = model.predict(x_test)
print('RF accuracy on training set: %.2f' % accuracy_score(y_train, yhat_train))
print('RF accuracy on test set: %.2f' % accuracy_score(y_test, yhat_test))

RF accuracy on training set: 1.00
RF accuracy on test set: 0.29


### VI. Neural Network (Multiclass Classification)

In [18]:
# Random seeds
np.random.seed(1)
tf.random.set_seed(1)

# One hot encode the y variable
y_train_dummies = pd.get_dummies(y_train)
y_test_dummies = pd.get_dummies(y_test)

# Define NN
model = Sequential()
model.add(Dense(5, input_dim=len(x_train.columns), activation='relu')) # First layer defines input_dim
model.add(Dense(len(y_train_dummies.columns), activation='softmax')) 
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy']) 

# Fit and predict with NN
model.fit(x_train, y_train_dummies, epochs=50, batch_size=10, verbose=0)
yhat_train = model.predict(x_train)
yhat_test = model.predict(x_test)

# Convert probabilities to categorical predictions
yhat_train = np.argmax(yhat_train, axis=1)
yhat_test = np.argmax(yhat_test, axis=1)

# Get metrics
print('Accuracy on training set: %.2f' % accuracy_score(y_train, yhat_train))
print('Accuracy on test set: %.2f' % accuracy_score(y_test, yhat_test))

Accuracy on training set: 0.53
Accuracy on test set: 0.44
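
The multiclass cell above relies on three pieces: one-hot encoded targets, a softmax output layer, and `np.argmax` to turn probabilities back into class labels. The NumPy sketch below shows each piece on made-up scores for two rows and three classes; `softmax` is written out by hand here purely for illustration.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax; subtracting the row max keeps the exponentials numerically stable."""
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

toy_logits = np.array([[2.0, 1.0, 0.1],
                       [0.5, 0.5, 3.0]])   # made-up raw scores: 2 rows, 3 classes
probs = softmax(toy_logits)
print(probs.sum(axis=1))                   # each row sums to 1
print(np.argmax(probs, axis=1))            # predicted classes: [0 2]

# One-hot targets, the format that categorical_crossentropy expects
toy_classes = np.array([0, 2])
one_hot = np.eye(3)[toy_classes]
print(one_hot)
```

The output layer therefore needs one unit per class, which is why the cell sizes it with `len(y_train_dummies.columns)`.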


In [19]:
# TODO: Tune the hyperparameters until the overall accuracy score exceeds that of the random forest
# Random seeds
np.random.seed(1)
tf.random.set_seed(1)

# One hot encode the y variable
y_train_dummies = pd.get_dummies(y_train)
y_test_dummies = pd.get_dummies(y_test)

# Define NN
model = Sequential()
model.add(Dense(20, input_dim=len(x_train.columns), activation='relu')) # First layer defines input_dim
model.add(Dense(20, activation='relu')) # Second hidden layer; input size is inferred from the previous layer
model.add(Dense(len(y_train_dummies.columns), activation='softmax')) 
model.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['accuracy']) 

# Fit and predict with NN
model.fit(x_train, y_train_dummies, epochs=20, batch_size=2, verbose=0)
yhat_train = model.predict(x_train)
yhat_test = model.predict(x_test)

# Convert probabilities to categorical predictions
yhat_train = np.argmax(yhat_train, axis=1)
yhat_test = np.argmax(yhat_test, axis=1)

# Get metrics
print('Accuracy on training set: %.2f' % accuracy_score(y_train, yhat_train))
print('Accuracy on test set: %.2f' % accuracy_score(y_test, yhat_test))

Accuracy on training set: 0.83
Accuracy on test set: 0.33
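
The tuning cells in this lab vary layer sizes, epochs, and batch size but pass `optimizer='adam'` as a string, which leaves the learning rate at its Keras default. As a rough illustration of why the learning rate matters, the sketch below runs plain gradient descent (not Adam, which adapts its step sizes) on a made-up one-dimensional objective with three different rates; `gradient_descent` is a hypothetical helper.

```python
def gradient_descent(learning_rate, steps=25, w=0.0):
    """Minimize f(w) = (w - 3)^2 with plain gradient descent, starting from w."""
    for _ in range(steps):
        grad = 2 * (w - 3)            # derivative of (w - 3)^2
        w = w - learning_rate * grad  # step against the gradient
    return w

for lr in (0.01, 0.1, 1.1):
    # The minimum is at w = 3: too small a rate is still far away after 25 steps,
    # a moderate rate gets close, and too large a rate overshoots and diverges.
    print(lr, gradient_descent(lr))
```

In Keras the rate can be set by passing an optimizer object such as `tf.keras.optimizers.Adam(learning_rate=0.01)` instead of the string; the value 0.01 is only an example, not a recommendation.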
