# Driver Drowsiness Detection System

Studies indicate that fatigue-related crashes account for about 20% of road accidents, and even more on roads with no driving-hour regulations. Driver monitoring systems, particularly those focused on drowsiness detection, aim to reduce that rate by watching drivers for signs of drowsiness and issuing timely alerts to prevent potential crashes.

For our final project, we chose to develop a driver drowsiness detection system (DDS) using the UTA Real-Life Drowsiness Dataset, which features diverse participants and comprehensive data. We will train a convolutional neural network (CNN) to analyze facial, eye, and mouth movements at different stages of drowsiness. The model is intended to provide warnings and alerts based on the detected level of fatigue, with accuracy tests used to check its reliability.

## Requirements
● TensorFlow: Developed by the Google Brain team for machine learning and artificial intelligence, TensorFlow supports training and inference of deep neural networks.

● Keras: Provides a Python interface for artificial neural networks; it is bundled with TensorFlow as tensorflow.keras.

● NumPy: Used for scientific computing in Python. Provides support for arrays, matrices, and various mathematical functions to operate on them.

● OpenCV: Machine learning and computer vision library; contains more than 2500 algorithms optimized for various CV tasks.

● Scikit-learn: Data mining, data analysis. In this project, used for splitting datasets. 

● Pandas: Data manipulation and analysis library. Used to create dataframes associating frames with their labels.

In [1]:
# Importing required libraries
import numpy as np 
import pandas as pd 
import tensorflow as tf
from tensorflow.keras.applications import VGG16
from tensorflow.keras.models import Model
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.optimizers import Adam
from sklearn.model_selection import train_test_split
from tensorflow.keras.preprocessing.image import ImageDataGenerator
import cv2
import os

## Frame - Class (Label) Association
Frames captured are associated with "not drowsy", "neutral", and "drowsy" classes, based on the 'vid' label within the parsed filename. They're later saved to a pandas dataframe for training, validating, and testing. 

In [2]:
def parse_filename(filename):
    parts = filename.split('_')
    for i, part in enumerate(parts):
        if part.lower() == 'vid':
            label = int(parts[i + 1])
            if label == 0:
                return 'not_drowsy'
            elif label == 5:
                return 'neutral'
            elif label == 10:
                return 'drowsy'
            else:
                return None
    return None
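
The parsing rule above can be exercised without the dataset. The sketch below is a slightly more defensive variant, not the notebook's own function: it strips the extension before splitting and returns None when the token after 'vid' is missing or not a number. The name `parse_label` and the example filenames are hypothetical, since the exact naming scheme of the frames is not shown here.

```python
def parse_label(filename):
    """Map the number that follows a 'vid' token to a class name.

    Assumed (hypothetical) layout: <prefix>_vid_<0|5|10>_<suffix>.jpg
    """
    label_names = {0: 'not_drowsy', 5: 'neutral', 10: 'drowsy'}
    stem = filename.rsplit('.', 1)[0]        # drop the file extension, if any
    parts = stem.split('_')
    for i, part in enumerate(parts[:-1]):    # a trailing 'vid' has no value after it
        if part.lower() == 'vid':
            try:
                # Unknown numeric labels fall through to None via dict.get
                return label_names.get(int(parts[i + 1]))
            except ValueError:               # token after 'vid' is not an integer
                return None
    return None


# Hypothetical filenames used only to illustrate the mapping
examples = [
    's01_vid_0_frame_0001.jpg',
    's01_vid_5_frame_0002.jpg',
    'frame_vid_10.jpg',
    's01_vid_7_frame_0003.jpg',
    'no_marker.jpg',
]
for name in examples:
    print(name, '->', parse_label(name))
```

Stripping the extension first matters for a name such as 'frame_vid_10.jpg', where the label is the last token and would otherwise be read as '10.jpg'.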

In [3]:
# Dataset root on Kaggle; adjust this path when running in another environment
working_directory = '/kaggle/input/drowsy-driver-imagesonly/Drowsey Driver Images'

def create_dataframe(image_dir):
    data = []
    for root, dirs, files in os.walk(image_dir):
        for file in files:
            if file.endswith('.jpg'):
                label = parse_filename(file)
                if label:
                    data.append((os.path.join(root, file), label))
    return pd.DataFrame(data, columns=['filepath', 'label'])

df = create_dataframe(working_directory)
print(df.head(50))

                                             filepath       label
0   /kaggle/input/drowsy-driver-imagesonly/Drowsey...     neutral
1   /kaggle/input/drowsy-driver-imagesonly/Drowsey...  not_drowsy
2   /kaggle/input/drowsy-driver-imagesonly/Drowsey...     neutral
3   /kaggle/input/drowsy-driver-imagesonly/Drowsey...     neutral
4   /kaggle/input/drowsy-driver-imagesonly/Drowsey...      drowsy
5   /kaggle/input/drowsy-driver-imagesonly/Drowsey...      drowsy
6   /kaggle/input/drowsy-driver-imagesonly/Drowsey...     neutral
7   /kaggle/input/drowsy-driver-imagesonly/Drowsey...     neutral
8   /kaggle/input/drowsy-driver-imagesonly/Drowsey...  not_drowsy
9   /kaggle/input/drowsy-driver-imagesonly/Drowsey...     neutral
10  /kaggle/input/drowsy-driver-imagesonly/Drowsey...      drowsy
11  /kaggle/input/drowsy-driver-imagesonly/Drowsey...      drowsy
12  /kaggle/input/drowsy-driver-imagesonly/Drowsey...     neutral
13  /kaggle/input/drowsy-driver-imagesonly/Drowsey...     neutral
14  /kaggl

In [4]:
df.sample(frac = 1)

                                               filepath       label
372   /kaggle/input/drowsy-driver-imagesonly/Drowsey...      drowsy
4491  /kaggle/input/drowsy-driver-imagesonly/Drowsey...  not_drowsy
1398  /kaggle/input/drowsy-driver-imagesonly/Drowsey...      drowsy
4637  /kaggle/input/drowsy-driver-imagesonly/Drowsey...      drowsy
3747  /kaggle/input/drowsy-driver-imagesonly/Drowsey...     neutral
...                                                 ...         ...
5687  /kaggle/input/drowsy-driver-imagesonly/Drowsey...      drowsy
5074  /kaggle/input/drowsy-driver-imagesonly/Drowsey...      drowsy
1035  /kaggle/input/drowsy-driver-imagesonly/Drowsey...     neutral
3036  /kaggle/input/drowsy-driver-imagesonly/Drowsey...      drowsy
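
Note that `DataFrame.sample` returns a new, shuffled DataFrame and leaves the original untouched, so the shuffle only persists if the result is assigned. A minimal sketch on a toy DataFrame (the file names are hypothetical) shows the assignment, an index reset, and a quick class-balance check:

```python
import pandas as pd

# Toy stand-in for the real frame/label table; paths are hypothetical
toy_df = pd.DataFrame({
    'filepath': [f'frame_{i}.jpg' for i in range(6)],
    'label': ['drowsy', 'neutral', 'not_drowsy'] * 2,
})

# Assign the result: sample() does not shuffle in place
shuffled = toy_df.sample(frac=1, random_state=42).reset_index(drop=True)

# Count frames per class to spot imbalance before splitting
counts = shuffled['label'].value_counts()

print(shuffled)
print(counts)
```

In this notebook the later `train_test_split` calls shuffle by default, so the explicit shuffle is optional; the class counts are the more useful part when judging whether stratification is needed.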


## Data Preparation and Augmentation
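The dataset is split into training, validation, and test sets in a 60/20/20 ratio, stratified by label. Each set is then fed through an ImageDataGenerator that rescales the frames; the training generator is also where augmentation can be configured to increase the variety of the data.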
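
To make the rescaling step concrete: the `rescale` argument of ImageDataGenerator multiplies every pixel value by the given factor. The NumPy sketch below applies the factor used in this notebook (0.2) and, for comparison, the commonly used 1/255 to a toy 2x2 "image"; the pixel values are made up for illustration.

```python
import numpy as np

# Toy 2x2 grayscale frame with 8-bit pixel values (illustrative only)
frame = np.array([[0, 64], [128, 255]], dtype=np.float32)

# Factor used by the generators in this notebook: values land in [0, 51]
scaled_as_configured = frame * 0.2

# Common alternative: values land in [0, 1]
scaled_unit_range = frame * (1.0 / 255)

print('max with rescale=0.2:  ', scaled_as_configured.max())
print('max with rescale=1/255:', scaled_unit_range.max())
```

Either factor keeps the relative ordering of pixel intensities; they differ only in the numeric range the network sees, and whichever is used for training must also be used at inference time.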

In [5]:
# Initialization of the train, validation, and test datasets extracted from the UTA RealLife Drowsiness Dataset. 
train_val_df, test_df = train_test_split(df, test_size=0.2, stratify=df['label'], random_state=42)
train_df, val_df = train_test_split(train_val_df, test_size=0.25, stratify=train_val_df['label'], random_state=42)

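# rescale multiplies every pixel value by the given factor
# (1/255 is the usual choice for mapping 8-bit pixels to [0, 1]).
train_datagen = ImageDataGenerator(rescale=0.2)
val_datagen = ImageDataGenerator(rescale=0.2)
test_datagen = ImageDataGenerator(rescale=0.2)

# Batches are read from disk and resized to 224x224. No augmentation arguments
# are set on train_datagen here, so training frames are only rescaled.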
train_generator = train_datagen.flow_from_dataframe(
    train_df,
    x_col='filepath',
    y_col='label',
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical'
)

val_generator = val_datagen.flow_from_dataframe(
    val_df,
    x_col='filepath',
    y_col='label',
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical'
)

test_generator = test_datagen.flow_from_dataframe(
    test_df,
    x_col='filepath',
    y_col='label',
    target_size=(224, 224),
    batch_size=32,
    class_mode='categorical',
    shuffle=False
)

Found 3450 validated image filenames belonging to 3 classes.
Found 1150 validated image filenames belonging to 3 classes.
Found 1150 validated image filenames belonging to 3 classes.
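
The three counts follow from chaining the two splits: 20% of the frames go to the test set, then 25% of the remaining 80% (another 20% of the total) go to validation, leaving 60% for training. The sketch below reproduces the sizes with scikit-learn on synthetic labels; the total of 5750 comes from the generator output above, while the balanced class mix is an assumption made only so that stratification has something to work with.

```python
import numpy as np
from sklearn.model_selection import train_test_split

n_frames = 5750  # 3450 + 1150 + 1150, as reported by the generators
# Synthetic, evenly mixed labels (assumption; the real class balance may differ)
labels = np.array(['not_drowsy', 'neutral', 'drowsy'] * (n_frames // 3 + 1))[:n_frames]
indices = np.arange(n_frames)

# First split: hold out 20% of all frames for testing
train_val_idx, test_idx = train_test_split(
    indices, test_size=0.2, stratify=labels, random_state=42)

# Second split: 25% of the remaining 80% becomes the validation set
train_idx, val_idx = train_test_split(
    train_val_idx, test_size=0.25, stratify=labels[train_val_idx], random_state=42)

print(len(train_idx), len(val_idx), len(test_idx))  # 3450 1150 1150
```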


## Model Definition, Compilation, and Training
The model is built on a VGG16 base pre-trained on ImageNet. The top (fully connected) layers are excluded, and the input shape is set to 224x224x3 to match our frames. A custom head is then added for the 3-class classification: a Flatten layer, a 128-unit ReLU Dense layer, and a 3-unit softmax output. All layers of the base model are frozen so that their pre-trained weights are not updated during training, after which the model is compiled and trained on the training set, with the validation set used for monitoring.
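
As a rough size check, the trainable part of this architecture can be counted by hand. The sketch below assumes the layer sizes from the code cell that follows (a Flatten layer, a 128-unit Dense layer, and a 3-unit output) and uses the fact that VGG16 without its top, given 224x224x3 inputs, ends in a 7x7x512 feature map.

```python
# Final feature map of VGG16 (include_top=False) for 224x224x3 inputs
feature_map = (7, 7, 512)
flat_units = feature_map[0] * feature_map[1] * feature_map[2]

# Dense layer parameters = inputs * units (weights) + units (biases)
dense_128_params = flat_units * 128 + 128
dense_3_params = 128 * 3 + 3
trainable_params = dense_128_params + dense_3_params

# Parameter count of the frozen VGG16 convolutional base
frozen_params = 14_714_688

print('flattened features:', flat_units)        # 25088
print('trainable (head):  ', trainable_params)  # 3211779
print('frozen (base):     ', frozen_params)
```

Almost all trainable weights sit in the first Dense layer because it is connected to every one of the 25,088 flattened features.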

In [6]:
base_model = VGG16(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

x = base_model.output
x = Flatten()(x)
x = Dense(128, activation='relu')(x)
predictions = Dense(3, activation='softmax')(x)  # 3 classes, from the filename labels: 0 - not_drowsy, 5 - neutral, 10 - drowsy

model = Model(inputs=base_model.input, outputs=predictions)

# Freeze the base model so its pre-trained weights are not updated during training
for layer in base_model.layers:
    layer.trainable = False

model.compile(optimizer=Adam(learning_rate=0.001), loss='categorical_crossentropy', metrics=['accuracy'])

# Train the classification head
model.fit(
    train_generator,
    steps_per_epoch=train_generator.samples // train_generator.batch_size,
    validation_data=val_generator,
    validation_steps=val_generator.samples // val_generator.batch_size,
    epochs=5
)

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
58889256/58889256 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
Epoch 1/5
  self._warn_if_super_not_called()
107/107 ━━━━━━━━━━━━━━━━━━━━ 1306s 12s/step - accuracy: 0.8289 - loss: 2.2893 - val_accuracy: 0.9804 - val_loss: 0.0505
Epoch 2/5
  1/107 ━━━━━━━━━━━━━━━━━━━━ 16:43 9s/step - accuracy: 0.9375 - loss: 0.1057
  self.gen.throw(typ, value, traceback)
107/107 ━━━━━━━━━━━━━━━━━━━━ 18s 78ms/step - accuracy: 0.9375 - loss: 0.1057 - val_accuracy: 1.0000 - val_loss: 0.0026
Epoch 3/5
107/107 ━━━━━━━━━━━━━━━━━━━━ 1299s 12s/step - accuracy: 0.9941 - loss: 0.0165 - val_accuracy: 0.9848 - val_loss: 0.0326
Epoch 4/5
107/107 ━━━━━━━━━━━━━━━━━━━━ 17s 78ms/step - accuracy: 1.0000 - loss: 6.4003e-04 - val_accuracy: 1.0000 - val_loss: 0.0027
Epoch 5/5
107/107 ━━━━━━━━━━━━━━━━━━━━ 1296s 12s/step - accuracy: 0.9984 - loss: 0.0048 - val_accuracy: 0.9911 - val_loss: 0.0263


<keras.src.callbacks.history.History at 0x7b6c9c94c4c0>
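
Epochs 2 and 4 finish in under 20 seconds while the others take over 20 minutes. One plausible reading of the log (an interpretation, not something the notebook states) is that the step counts computed with floor division do not match the number of batches each generator yields per pass, so every second epoch starts with only a leftover batch and then runs out of data. The arithmetic can be sketched as follows; `batch_coverage` is a hypothetical helper.

```python
import math

def batch_coverage(samples, batch_size=32):
    """Compare floor-division steps with the batches a generator yields per pass."""
    steps = samples // batch_size                   # value passed as steps_per_epoch
    batches = math.ceil(samples / batch_size)       # batches available in one full pass
    skipped = samples - steps * batch_size          # samples not reached within `steps`
    return {'steps': steps, 'batches_available': batches, 'samples_skipped': skipped}

# Sample counts taken from the generator output earlier in the notebook
for name, n in [('train', 3450), ('val', 1150), ('test', 1150)]:
    print(name, batch_coverage(n))
```

With 107 steps drawn from 108 available training batches, one batch is left over after each full epoch, which matches the single-batch accuracy of 0.9375 reported for epoch 2. Leaving out `steps_per_epoch` and `validation_steps` lets Keras infer the number of batches from the generator's length; the same reasoning applies to the `steps` argument of `model.evaluate`, where 35 steps of 32 cover 1120 of the 1150 test frames.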

In [7]:
model.save_weights('drowsiness_detection_weights.weights.h5')

## Performance Evaluation
Results from the run shown below: test loss of about 0.0259 and test accuracy of about 0.9884.

In [8]:
test_loss, test_accuracy = model.evaluate(test_generator, steps=test_generator.samples // test_generator.batch_size)
print(f'Test loss: {test_loss}')
print(f'Test accuracy: {test_accuracy}')

35/35 ━━━━━━━━━━━━━━━━━━━━ 318s 9s/step - accuracy: 0.9866 - loss: 0.0303
Test loss: 0.0258723646402359
Test accuracy: 0.9883928298950195
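
A single accuracy figure does not show which classes get confused with each other. The sketch below decodes softmax rows into class names and builds a confusion matrix with scikit-learn. The probabilities and true labels are made up for illustration; in the notebook they would come from `model.predict(test_generator)` and `test_generator.classes`. The class order assumes that `flow_from_dataframe` sorted the label names alphanumerically, which is worth confirming against `test_generator.class_indices`.

```python
import numpy as np
from sklearn.metrics import confusion_matrix

# Assumed index order (alphanumeric); verify with test_generator.class_indices
class_names = ['drowsy', 'neutral', 'not_drowsy']

# Hypothetical softmax outputs for five frames, one row per frame
probs = np.array([
    [0.90, 0.07, 0.03],
    [0.10, 0.80, 0.10],
    [0.05, 0.15, 0.80],
    [0.40, 0.35, 0.25],
    [0.20, 0.30, 0.50],
])
# Hypothetical ground-truth class indices for the same frames
true_idx = np.array([0, 1, 2, 1, 2])

# The predicted class is the index of the largest probability in each row
pred_idx = probs.argmax(axis=1)
pred_labels = [class_names[i] for i in pred_idx]

# Rows are true classes, columns are predicted classes
cm = confusion_matrix(true_idx, pred_idx, labels=[0, 1, 2])

print(pred_labels)
print(cm)
```

For a drowsiness alert, the row for truly drowsy frames is the one to watch: entries outside the diagonal in that row are missed warnings.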
