# Artificial Neural Network (ANN) Model for Pedestrian Injury Severity

This notebook trains neural networks on the processed dataset to predict pedestrian injury severity. The target variable takes the values Fatal (4), Major (3), Minor (2), Minimal (1), and None (0), so this is a *multiclass classification* exercise. The notebook is outlined as follows:

1. Import Data and Preprocessing
2. Neural Network (NN) Training
3. Performance Metrics

We are hoping that at least one of these models will yield more than 78% accuracy, the level observed with the conventional Machine Learning (ML) models.

# 1. Import Data and Preprocessing

## 1.1. Load Necessary Packages

In [1]:
# PACKAGES
# Neural Networks
import tensorflow as tf
from tensorflow.keras.layers import Dense, Input, Dropout
from tensorflow.keras.models import Sequential
from tensorflow.keras.losses import SparseCategoricalCrossentropy
from tensorflow.keras.regularizers import L2, L1
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.optimizers import Adam
# Data Analysis, Manipulation and Visualization
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
# Transformation, Preprocessing, and Model Metrics
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler, OneHotEncoder
from sklearn.metrics import f1_score, precision_score, recall_score
# Other Packages
import warnings
pd.set_option('display.max_columns', None)
pd.set_option('display.max_rows', None)
warnings.filterwarnings('ignore')
%matplotlib inline

## 1.2. Load Data for Preprocessing

In [2]:
np.random.seed(100)

In [3]:
# Import data
df = pd.read_csv('data_for_ml.csv')
# Drop the 'INDEX_' column, we will not be using it
df = df.drop('INDEX_', axis=1)
# Create a mapping dictionary to label encode the target variable ('INJURY')
injury = {
    'Fatal' : 4,
    'Major' : 3,
    'Minor' : 2,
    'Minimal' : 1,
    np.nan : 0
}
df['INJURY'] = df['INJURY'].replace(injury).astype('int64')
# Review first 5 records
df.head()

| | HOUR | DAYOFWEEK | VISIBILITY | LIGHT | RDSFCOND | TEMP | REL_HUMID | LOCCOORD | TRAFFCTL | ROADCLASS | SPEEDLMT | VEH_ADT | PED_ADT | LAND_USE | POP_2021 | PRIV_DWELL | LAND_AREA | INVAGE | PEDCOND | PEDACT | VEHINV | VIOL | INJURY |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 7 | weekday | Other | Dark | Wet | 1.5 | 0.99 | Midblock | No Control | Major Arterial | 60 | 498.0 | 219.0 | Mixed Use | 504 | 263 | 225 | 45 to 64 | Distracted | Crossing without ROW | automobile | speeding | 3 |
| 1 | 19 | weekday | Rain | Dark | Wet | 4.8 | 1.0 | Midblock | Traffic Signal including Transit | Major Arterial | 60 | 351.0 | 63.0 | Residential | 452 | 205 | 183 | Over 65 | Normal | Crossing without ROW | automobile | speeding | 3 |
| 2 | 22 | weekday | Clear | Dark | Dry | -5.8 | 0.67 | Intersection | Traffic Signal including Transit | Major Arterial | 60 | 650.0 | 50.0 | Residential | 807 | 375 | 322 | 15 to 29 | Distracted | Crossing with ROW | automobile | speeding | 3 |
| 3 | 18 | weekend | Clear | Dark | Dry | 2.4 | 0.76 | Intersection | Pedestrian Crossover | Major Arterial | 60 | 437.0 | 8.0 | Residential | 907 | 315 | 305 | Under 15 | Normal | Crossing with ROW | automobile | aggressive driving | 4 |
| 4 | 14 | weekday | Clear | Daylight | Dry | 3.5 | 0.78 | Intersection | Traffic Signal including Transit | Major Arterial | 60 | 903.0 | 43.0 | Residential | 974 | 371 | 364 | Over 65 | Unknown | Crossing without ROW | automobile | speeding | 4 |


## 1.3. Preprocessing / Transformation / Reduction

*Split into X and y values*

In [4]:
X = df.iloc[:,:-1].values
y = df.iloc[:,-1].values

*One Hot Encode the Categorical Variables for efficient modeling*

In [5]:
categorical_columns = [1, 2, 3, 4, 7, 8, 9, 13, 17, 18, 19, 20, 21]
# Note: scikit-learn >= 1.2 renames `sparse` to `sparse_output`; use that name on newer versions
encoder = OneHotEncoder(categories='auto', sparse=False, handle_unknown='ignore')
X_categorical = encoder.fit_transform(X[:, categorical_columns])
print(X_categorical[5:])

[[1. 0. 1. ... 0. 0. 1.]
 [0. 1. 0. ... 0. 0. 1.]
 [0. 1. 1. ... 0. 0. 1.]
 ...
 [1. 0. 0. ... 0. 0. 0.]
 [0. 1. 1. ... 0. 0. 0.]
 [1. 0. 1. ... 0. 0. 1.]]
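
To make the encoding step concrete, here is a minimal, dependency-free sketch of one-hot encoding on a toy table. The helper `one_hot_encode` and the three toy rows are illustrative assumptions, not part of the notebook's pipeline; the sketch only mirrors the idea behind `OneHotEncoder`, where each categorical column expands into one binary column per distinct category.

```python
def one_hot_encode(rows):
    """One-hot encode a list of equal-length rows of categorical values.

    Returns the encoded rows and, for each column, its sorted category list.
    (Hypothetical helper for illustration only.)
    """
    n_cols = len(rows[0])
    # Collect the distinct categories of each column, sorted for a stable order
    categories = [sorted({row[j] for row in rows}) for j in range(n_cols)]
    encoded = []
    for row in rows:
        out = []
        for j, value in enumerate(row):
            # One binary indicator per category of column j
            out.extend(1.0 if value == cat else 0.0 for cat in categories[j])
        encoded.append(out)
    return encoded, categories


# Toy rows with three categorical columns (values borrowed from the dataset preview)
toy_rows = [
    ["weekday", "Dark", "Wet"],
    ["weekend", "Daylight", "Dry"],
    ["weekday", "Dark", "Dry"],
]
encoded, categories = one_hot_encode(toy_rows)
print(categories)       # distinct categories found in each column
print(encoded[0])       # binary indicators for the first row
print(len(encoded[0]))  # three columns become six binary columns
```

Each encoded row contains exactly one `1.0` per original column, which is why the width of the encoded matrix equals the total number of distinct categories across all categorical columns.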


*Now that we have one-hot encoded the categorical variables, we need to combine `X_categorical` with the remaining (numerical) variables*

In [6]:
numerical_columns = [0, 5, 6, 10, 11, 12, 14, 15, 16]
X_encoded = np.concatenate((X_categorical, X[:, numerical_columns].astype(float)), axis=1)
print(X_encoded[5:])

[[1.000e+00 0.000e+00 1.000e+00 ... 4.740e+02 1.470e+02 1.470e+02]
 [0.000e+00 1.000e+00 0.000e+00 ... 4.020e+02 1.480e+02 1.400e+02]
 [0.000e+00 1.000e+00 1.000e+00 ... 4.890e+02 2.550e+02 2.300e+02]
 ...
 [1.000e+00 0.000e+00 0.000e+00 ... 4.820e+02 2.520e+02 2.310e+02]
 [0.000e+00 1.000e+00 1.000e+00 ... 1.981e+03 1.195e+03 1.112e+03]
 [1.000e+00 0.000e+00 1.000e+00 ... 1.170e+03 4.820e+02 4.670e+02]]


*Split into train and test data. We will be using 80-20 split (i.e., 80% train set and 20% test set)*

In [7]:
X_train, X_test, y_train, y_test = train_test_split(X_encoded, y, test_size=0.20, random_state=100)

*Finally, we need to scale the numerical features. The scaler is fitted on the train set only and then applied to both the train and test sets*

In [8]:
scaler = StandardScaler()
X_train[:, -len(numerical_columns):] = scaler.fit_transform(X_train[:, -len(numerical_columns):])
X_test[:, -len(numerical_columns):] = scaler.transform(X_test[:, -len(numerical_columns):])
# Print the dimensions of both X_train and X_test 
print(f'X_train dimension: {X_train.shape}')
print(f'X_test dimension: {X_test.shape}')

X_train dimension: (2490, 83)
X_test dimension: (623, 83)
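
The scaling step above fits the scaler on the training data and reuses the same statistics for the test data. The following is a minimal sketch of that idea using only the standard library; the speed values and the helper names `fit_scaler` and `transform` are hypothetical. It uses the population standard deviation, which is also what `StandardScaler` uses.

```python
from statistics import fmean, pstdev


def fit_scaler(values):
    """Return (mean, std) computed from the training values only."""
    return fmean(values), pstdev(values)  # population standard deviation


def transform(values, mean, std):
    """Standardize values with previously fitted statistics."""
    return [(v - mean) / std for v in values]


# Hypothetical values for one numerical feature
train_speed = [40.0, 50.0, 60.0, 50.0]
test_speed = [60.0, 80.0]

mean, std = fit_scaler(train_speed)                 # statistics come from the train set
train_scaled = transform(train_speed, mean, std)
test_scaled = transform(test_speed, mean, std)      # test set reuses the train statistics

print(train_scaled)
print(test_scaled)
```

After scaling, the training feature has mean 0 and standard deviation 1, while the test feature generally does not, because it is expressed relative to the training distribution.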


*Although there are only 22 attributes (13 categorical and 9 numerical), one-hot encoding expands the 13 categorical attributes into 74 binary columns. Together with the 9 numerical columns, this gives the 83 input units shown in the train and test dimensions above*

# 2. Neural Network (NN) Modeling

We will construct three (3) NN architectures to experiment with. The following settings are used in constructing the NNs:

1. Every hidden layer uses the `ReLU` activation function. The output layer uses a `linear` activation so that it emits raw logits, which the loss function converts to class probabilities (`from_logits=True`).
2. Every hidden dense layer has 65 nodes and is followed by a dropout layer with a dropout rate of 10% (0.1).
3. L1 regularization (`L1(0.01)` in the code below) is applied to the kernel of the output layer to reduce overfitting.
4. The Adam optimizer is used with a `learning_rate` of 0.01, for up to 100 epochs (early stopping may end training sooner).
5. Sparse Categorical Crossentropy is the loss function
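
The `linear` output layer and the sparse categorical cross-entropy loss work together: the network emits one raw score (logit) per class, and the loss applies a softmax internally when it is told the inputs are logits. The sketch below shows that computation for a single sample using only the standard library. The logit values and function names are hypothetical and only illustrate the arithmetic; this is not the Keras implementation.

```python
import math


def softmax(logits):
    """Convert raw scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(z - shift) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_crossentropy(label, logits):
    """Cross-entropy for one sample with an integer label, from raw logits."""
    shift = max(logits)
    log_sum_exp = shift + math.log(sum(math.exp(z - shift) for z in logits))
    return log_sum_exp - logits[label]  # equals -log(softmax(logits)[label])


# Hypothetical logits for the five injury classes (0 = None ... 4 = Fatal)
logits = [0.2, -1.0, 0.5, 2.0, 0.1]
label = 3  # integer-encoded true class ("Major")

probs = softmax(logits)
loss = sparse_categorical_crossentropy(label, logits)
predicted_class = max(range(len(logits)), key=logits.__getitem__)

print(probs)
print(loss)
print(predicted_class)
```

Because softmax preserves the ordering of the logits, taking the arg max of the raw outputs gives the same predicted class as taking the arg max of the probabilities.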

We also need to set a random seed for reproducibility.

In [9]:
tf.random.set_seed(100)

In [30]:
early_stopping = EarlyStopping(monitor='val_accuracy', patience=20, restore_best_weights=True)
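
As a rough illustration of what the `patience` setting does, the sketch below simulates early stopping on a made-up sequence of validation accuracies. The function `early_stopping_run` and the history values are hypothetical. It is a simplified model of the callback's behaviour (no `min_delta`, strict improvement required), not the Keras source.

```python
def early_stopping_run(val_accuracies, patience):
    """Return (epoch at which training stops, epoch with the best value).

    Epochs are numbered from 1. Training stops once `patience` consecutive
    epochs pass without a strict improvement of the monitored value.
    """
    best = float("-inf")
    best_epoch = 0
    wait = 0
    for epoch, acc in enumerate(val_accuracies, start=1):
        if acc > best:
            # New best value: remember it and reset the waiting counter
            best, best_epoch, wait = acc, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    # Ran through all epochs without triggering the stop
    return len(val_accuracies), best_epoch


# Made-up validation accuracies, one per epoch
history = [0.70, 0.74, 0.78, 0.77, 0.78, 0.76, 0.75]
stopped_at, best_epoch = early_stopping_run(history, patience=3)
print(stopped_at, best_epoch)
```

In this simplified model, training stops `patience` epochs after the best epoch, and restoring the best weights corresponds to rolling the model back to `best_epoch`.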

## 2.1. Design 1 - 2 Hidden Layers (1 Dense + 1 Dropout)

In [31]:
# Construct NN Architecture
design1 = Sequential([
    Dense(units=65, activation='relu', input_shape=(83,)),
    Dropout(rate=0.1),
    Dense(units=5, activation='linear', kernel_regularizer=L1(0.01))
])
# Compile the model
design1.compile(loss=SparseCategoricalCrossentropy(from_logits=True),
               optimizer=Adam(learning_rate=0.01),
               metrics=['accuracy'])
# Train the model
design1.fit(X_train, y_train, epochs=100, validation_data=(X_test, y_test), callbacks=[early_stopping])

Epoch 1/100
...
Epoch 51/100


<keras.src.callbacks.History at 0x1666deff890>

## 2.2. Design 2 - 4 Hidden Layers (2 Dense + 2 Dropout)

In [32]:
# Construct NN Architecture
design2 = Sequential([
    Dense(units=65, activation='relu', input_shape=(83,)),
    Dropout(rate=0.1),
    Dense(units=65, activation='relu'),
    Dropout(rate=0.1),
    Dense(units=5, activation='linear', kernel_regularizer=L1(0.01))
])
# Compile the model
design2.compile(loss=SparseCategoricalCrossentropy(from_logits=True),
               optimizer=Adam(learning_rate=0.01),
               metrics=['accuracy'])
# Train the model
design2.fit(X_train, y_train, epochs=100, validation_data=(X_test, y_test), callbacks=[early_stopping])

Epoch 1/100
...
Epoch 36/100


<keras.src.callbacks.History at 0x1666f19b890>

## 2.3. Design 3 - 6 Hidden Layers (3 Dense + 3 Dropout)

In [33]:
# Construct NN Architecture
design3 = Sequential([
    Dense(units=65, activation='relu', input_shape=(83,)),
    Dropout(rate=0.1),
    Dense(units=65, activation='relu'),
    Dropout(rate=0.1),
    Dense(units=65, activation='relu'),
    Dropout(rate=0.1),
    Dense(units=5, activation='linear', kernel_regularizer=L1(0.01))
])
# Compile the model
design3.compile(loss=SparseCategoricalCrossentropy(from_logits=True),
               optimizer=Adam(learning_rate=0.01),
               metrics=['accuracy'])
# Train the model
design3.fit(X_train, y_train, epochs=100, validation_data=(X_test, y_test), callbacks=[early_stopping])

Epoch 1/100
...
Epoch 39/100


<keras.src.callbacks.History at 0x1666f33c490>
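
To compare the size of the three designs, the sketch below counts their trainable parameters by hand. It assumes only the layer widths shown in the code cells (83 inputs, hidden dense layers of 65 units, 5 outputs). The helper `dense_param_count` is a hypothetical name. Dropout layers are omitted because they have no trainable parameters.

```python
def dense_param_count(layer_sizes):
    """Count weights and biases of a stack of fully connected layers.

    `layer_sizes` lists the input width followed by the width of each
    dense layer, e.g. [83, 65, 5].
    """
    return sum(
        n_in * n_out + n_out  # weight matrix plus one bias per output unit
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Layer widths taken from the three architectures defined above
designs = {
    "design1": [83, 65, 5],
    "design2": [83, 65, 65, 5],
    "design3": [83, 65, 65, 65, 5],
}
param_counts = {name: dense_param_count(sizes) for name, sizes in designs.items()}
for name, count in param_counts.items():
    print(f"{name}: {count} trainable parameters")
```

Each additional hidden layer of 65 units adds 65 × 65 + 65 = 4,290 parameters, so even the smallest design has more parameters than the 2,490 training rows. That is one reason dropout, regularization, and early stopping are relevant here.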

# 3. Performance Metrics

For performance metrics, we measure **Accuracy**, **Precision**, **Recall**, and **F1 Score** for all three architectures and compare them. Precision, recall, and F1 are support-weighted averages across the five classes (`average='weighted'`). We will also compute feature importances and plot the most accurate neural network.

In [34]:
models = [design1, design2, design3]
for model in models:
    y_pred = model.predict(X_test)
    y_pred_classes = np.argmax(y_pred, axis=1)
    train_loss, train_accuracy = model.evaluate(X_train, y_train)
    test_loss, test_accuracy = model.evaluate(X_test, y_test)
    precision = precision_score(y_test, y_pred_classes, average='weighted')
    recall = recall_score(y_test, y_pred_classes, average='weighted')
    f1 = f1_score(y_test, y_pred_classes, average='weighted')
    print(f"Model Performance Metrics:")
    print(f"Training Accuracy: {train_accuracy*100:.2f}%")
    print(f"Test Accuracy: {test_accuracy*100:.2f}%")
    print(f"Precision: {precision*100:.2f}%")
    print(f"Recall: {recall*100:.2f}%")
    print(f"F1 Score: {f1*100:.2f}%\n")

Model Performance Metrics:
Training Accuracy: 80.24%
Test Accuracy: 78.81%
Precision: 71.08%
Recall: 78.81%
F1 Score: 71.80%

Model Performance Metrics:
Training Accuracy: 84.62%
Test Accuracy: 78.65%
Precision: 70.86%
Recall: 78.65%
F1 Score: 73.41%

Model Performance Metrics:
Training Accuracy: 84.94%
Test Accuracy: 78.97%
Precision: 73.15%
Recall: 78.97%
F1 Score: 73.58%
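
In the results above, the weighted recall equals the test accuracy for every model. This is a property of support-weighted averaging, not a coincidence. The sketch below reimplements the weighted metrics on made-up labels to show why. The function `weighted_scores` and the toy labels are hypothetical, and classes that are never predicted are given a precision of 0, which is an assumption of this sketch.

```python
from collections import Counter


def weighted_scores(y_true, y_pred):
    """Return (accuracy, precision, recall, f1) with support-weighted averaging."""
    n = len(y_true)
    support = Counter(y_true)  # number of true samples per class
    precision = recall = f1 = 0.0
    for c in sorted(support):
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        predicted = sum(1 for p in y_pred if p == c)
        p_c = tp / predicted if predicted else 0.0  # 0 when the class is never predicted
        r_c = tp / support[c]
        f_c = 2 * p_c * r_c / (p_c + r_c) if (p_c + r_c) else 0.0
        weight = support[c] / n  # class weight is its share of the true labels
        precision += weight * p_c
        recall += weight * r_c
        f1 += weight * f_c
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / n
    return accuracy, precision, recall, f1


# Made-up, imbalanced labels: the classifier mostly predicts the majority class
y_true = [0, 0, 0, 0, 0, 0, 2, 2, 3, 4]
y_pred = [0, 0, 0, 0, 0, 0, 0, 2, 0, 0]

accuracy, precision, recall, f1 = weighted_scores(y_true, y_pred)
print(f"Accuracy: {accuracy:.4f}")
print(f"Precision: {precision:.4f}")
print(f"Recall: {recall:.4f}")
print(f"F1 Score: {f1:.4f}")
```

Weighted recall sums each class's true positives divided by the total sample count, which is exactly the accuracy. Weighted precision and F1 can be lower when minority classes are rarely or never predicted, so per-class metrics are worth inspecting alongside the weighted averages.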

