
Detecting Driver Distraction

Approach for detecting driver distraction

Abstract

Distracted driving causes approximately 9 deaths and 1,000 injuries daily in the US alone. Although current automobile safety systems exist, they do not incorporate a measure of distraction itself. The purpose of our project was to develop a machine learning-based model that decodes neural data into a measure of distraction, which could be incorporated into future safety systems. To train our system, we collected electroencephalography (EEG) data using a wireless Muse headband while participants played City Car Driving. In every driving trial, each participant was presented with either no distraction or a randomly chosen distraction. We used data processing techniques to normalize and clean our data, and tested supervised machine learning models to decode relative frequency band powers into distraction. For binary classification of whether an individual was distracted, we tested a logistic regression model (71.3% accuracy) and a multilayer perceptron (81.2% accuracy). Next, we designed a distraction metric on a scale of 0 to 5. We implemented a multivariable linear regression (MSE = 0.93) and a decision tree classifier (71% accuracy) to predict this metric. The multilayer perceptron most effectively detected the presence of distraction, and the decision tree best predicted distraction level. Using recursive feature elimination, we determined that the delta and beta frequency bands drove our classifier performance. With this research, we demonstrated that a commercially available wireless EEG system can decode neural data into distraction. Our EEG-based early driver distraction detection system has the potential to be incorporated into safety systems to improve driver safety.


Project Details

Rationale

On January 6th, 2020, three people were killed and fourteen were injured when a trucker in Indiana crashed into eight cars. This entire accident was caused by a single driver distracted by his coffee mug. The consequences of distracted driving can be disastrous: according to the CDC, distracted driving accidents kill approximately 9 people and injure about 1,000 people daily in the USA. Our project aims to detect distracted driving using EEG data and thereby help prevent potential accidents. In the future, our system could be integrated with current safety systems in vehicles to predict accidents earlier and take appropriate action. Our project has the potential to reduce automobile accidents and save thousands of lives.

Research Goal

Our research question explores how distracted driving manifests in data from EEG brain sensors. We hypothesize that there is a correlation between neural activity and distracted driving, and our goal is to create a machine learning model that learns this correlation from EEG data recorded with a Muse headband. Finally, we propose a method for incorporating EEG-based distraction detection into existing driver safety systems.

Procedure

For our experiment, we recruited adult drivers as participants. We did not target any particular ethnic group, race, or population. Participants were recruited by asking members of our families, our neighbors, and students at the University of Washington. The total time for each experiment was 45 minutes. First, we asked our participants to read and fill out the informed consent form. Participation in the experiment was completely voluntary, and participants were given the right to stop or withdraw at any time. Next, the participants were instructed on how to properly fit the Muse 2 headband, which records the EEG data, to ensure good signal quality. The participants were then given around 5 minutes to practice and familiarize themselves with the driving simulator. After this, we started recording the EEG data using an app called MuseDirect, and we also took a screen recording of the driving simulator. We recorded the time points at which each distraction started and ended. Three driving tasks were assigned to each subject (drive to the shopping mall, drive to the service station, or drive to a parking lot, all on the simulator). For each task, three distractions (randomly chosen from the list below) were presented to the participant at different times during the task. Here is the list of possible distractions:

  • Solve 5 math problems we ask aloud (e.g., 34 + 57)
  • Write a list of fruits on their phone / text someone
  • Take a sip of water
  • Pick up a ball dropped next to the driver’s seat
  • Answer a riddle
  • Have a conversation about their favorite food
  • Play I Spy (e.g., “I spy a red building”)
  • Say a tongue twister

After the driving tasks were completed, we stopped the EEG recording and allowed the participants to remove the Muse headband. We then asked the participants to rate how distracted they felt during each distraction task on a scale from 1 (least distracted) to 5 (most distracted). Finally, we debriefed and thanked the participants, answering any questions they had. All data was kept anonymous and confidential. The flowchart below is a brief summary of the procedure described above, and a small sketch of the trial randomization follows it.

(Figure: flowchart of the experimental procedure)
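For reference, a trial setup in this spirit can be sketched in a few lines of Python. The repository's set_trials.py handles this selection; the code below is only an illustrative assumption, not the actual script.

```python
import random

# Illustrative task and distraction lists taken from the procedure above.
DRIVING_TASKS = [
    "drive to the shopping mall",
    "drive to the service station",
    "drive to a parking lot",
]
DISTRACTIONS = [
    "solve 5 math problems",
    "write a list of fruits / text someone",
    "take a sip of water",
    "pick up a ball dropped next to the driver's seat",
    "answer a riddle",
    "conversation about favorite food",
    "play I Spy",
    "say a tongue twister",
]

def set_trial():
    """Pick one driving task and three distinct distractions at random."""
    task = random.choice(DRIVING_TASKS)
    distractions = random.sample(DISTRACTIONS, k=3)
    return task, distractions

if __name__ == "__main__":
    task, distractions = set_trial()
    print(f"Task: {task}")
    for i, d in enumerate(distractions, 1):
        print(f"  Distraction {i}: {d}")
```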

Data Analysis

We used an app called MuseDirect, which lets us download the EEG data of a person wearing the Muse headband as a CSV file. After each experiment was completed, we analyzed the screen recording and extracted speed and road curvature data from the video. We then created a binary classifier that predicts whether the participant was distracted. This program analyzed the EEG data (for this classifier, we used the alpha, beta, theta, gamma, and delta bands) to determine whether the distraction value was 0 (not distracted) or 1 (distracted). We also tested other machine learning models for optimal accuracy. Python and the scikit-learn library were used for the models in this project.
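As a rough sketch of the preprocessing step, loading a recording and assembling the band-power features might look like the following. The column names and CSV layout here are assumptions about the MuseDirect export, not its exact format.

```python
import pandas as pd

# Five frequency bands and the four Muse electrodes give 20 features per sample.
BANDS = ["delta", "theta", "alpha", "beta", "gamma"]
ELECTRODES = ["TP9", "AF7", "AF8", "TP10"]

def load_features(csv_path):
    """Load one recording and return band-power features plus distraction labels.

    Assumes one column per band/electrode pair (e.g. "alpha_TP9") and a
    "distracted" column with 0/1 labels; the real export format may differ.
    """
    df = pd.read_csv(csv_path)
    df = df.dropna()  # drop empty rows, as done in data_preprocessing
    feature_cols = [f"{band}_{ch}" for band in BANDS for ch in ELECTRODES]
    X = df[feature_cols].to_numpy()
    y = df["distracted"].to_numpy()  # 0 = not distracted, 1 = distracted
    return X, y
```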


Driver Distraction Models

We created two models to determine whether a driver was distracted, both trained on the collected EEG data. Here are the two models (a minimal training sketch follows the list):

  • We created a logistic regression model, which learns a logistic function to determine whether the driver is distracted (1) or not (0) at a given time. The accuracy of this model was 79%. Below is the confusion matrix of this model.

(Figure: confusion matrix for the logistic regression model)

  • We created a multilayer perceptron to predict whether the driver was distracted. The accuracy of the model was 82%, and below is the confusion matrix.

(Figure: confusion matrix for the multilayer perceptron)
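The following is a minimal sketch of how the two binary classifiers could be trained and evaluated with scikit-learn. The train/test split, scaling, hidden-layer sizes, and placeholder data are assumptions for illustration, not the exact contents of logistic_regression.py.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.neural_network import MLPClassifier
from sklearn.metrics import accuracy_score, confusion_matrix

# Placeholder data: in practice X holds the 20 band-power features per sample
# and y the 0/1 distraction labels extracted from the recordings.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 20))
y = rng.integers(0, 2, size=400)

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=0)

models = {
    "logistic regression": make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000)),
    "multilayer perceptron": make_pipeline(
        StandardScaler(),
        MLPClassifier(hidden_layer_sizes=(32, 16), max_iter=2000, random_state=0),
    ),
}

for name, model in models.items():
    model.fit(X_train, y_train)
    predictions = model.predict(X_test)
    print(name, "accuracy:", accuracy_score(y_test, predictions))
    print(confusion_matrix(y_test, predictions))
```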

We created two models to determine how distracted the driver was. We created a metric (shown in the table below) based on a subjective score of how visible the distraction was in the participant's driving. We found this value to be correlated with the number of swerves (R = 0.74).

(Table: distraction level metric)
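For reference, a correlation like the one reported can be computed with scipy.stats.pearsonr; the values below are illustrative placeholders, not our experimental data.

```python
from scipy.stats import pearsonr

# One entry per trial: subjective distraction score and observed swerve count
# (illustrative placeholder values only).
distraction_scores = [1, 2, 2, 3, 4, 5, 3, 1]
swerve_counts      = [0, 1, 2, 2, 3, 5, 2, 0]

r, p_value = pearsonr(distraction_scores, swerve_counts)
print(f"Pearson R = {r:.2f}, p = {p_value:.3f}")
```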

Below is a short description of the two models (a minimal training sketch follows the list):

  • We created a linear regression model, which learns a linear function to predict the distraction level from 1 to 5. The mean squared error was 1.34. The graph below shows predicted (orange) versus actual (blue) distraction values for 25 samples from the testing data; bars highlighted in green have roughly no error.

(Figure: predicted vs. actual distraction values for 25 testing samples)

  • We created a decision tree classifier, which learns a tree used to predict the distraction level from 1 to 5. The accuracy was 72%. The confusion matrix for this model is below.

(Figure: confusion matrix for the decision tree classifier)
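A minimal sketch of training these two models with scikit-learn is shown below; the split, tree depth, and placeholder data are assumptions for illustration, not the exact contents of linear_regression.py or binary_decision_tree.py.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.metrics import mean_squared_error, accuracy_score

# Placeholder data: X holds the band-power features, y_level the 1-5 distraction levels.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 20))
y_level = rng.integers(1, 6, size=400)

X_train, X_test, y_train, y_test = train_test_split(X, y_level, test_size=0.2, random_state=0)

# Multivariable linear regression predicting the distraction level directly
linreg = LinearRegression().fit(X_train, y_train)
print("linear regression MSE:", mean_squared_error(y_test, linreg.predict(X_test)))

# Decision tree classifier treating each level (1-5) as a discrete class
tree = DecisionTreeClassifier(max_depth=8, random_state=0).fit(X_train, y_train)
print("decision tree accuracy:", accuracy_score(y_test, tree.predict(X_test)))
```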

Recursive Feature Elimination

Recursive feature elimination was used to determine the importance of each feature in the logistic regression model. The figure below shows the coefficients of the logistic regression model for binary classification, where the twenty features are the five frequency bands for each of the four electrodes. The values highlighted in yellow are the most important features.

(Figure: logistic regression coefficients for the twenty band-electrode features, with the most important highlighted in yellow)

After performing recursive feature elimination, we concluded that the delta and beta values drive our classifier's performance. Beta waves are associated with cognitive tasks, while delta waves are slower, lower-frequency waves. This is consistent with our results, because these bands may reflect the differing amounts of focus required when driving with or without distraction.
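Scikit-learn provides recursive feature elimination directly. The sketch below shows how a ranking over the twenty band-electrode features could be obtained, using placeholder data and illustrative feature names (not the exact contents of feature_selection_logistic_regression.py).

```python
import numpy as np
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression

BANDS = ["delta", "theta", "alpha", "beta", "gamma"]
ELECTRODES = ["TP9", "AF7", "AF8", "TP10"]
feature_names = [f"{band}_{ch}" for band in BANDS for ch in ELECTRODES]  # 5 x 4 = 20 features

# Placeholder data: X holds the 20 band-power features, y the 0/1 distraction labels.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 20))
y = rng.integers(0, 2, size=400)

# Recursively drop the least important feature until only 5 remain
selector = RFE(LogisticRegression(max_iter=1000), n_features_to_select=5, step=1)
selector.fit(X, y)

for name, rank in sorted(zip(feature_names, selector.ranking_), key=lambda pair: pair[1]):
    print(f"{name}: rank {rank}")  # rank 1 = kept, i.e. most important
```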

Conclusion

For binary classification, the multilayer perceptron has a higher accuracy than the logistic regression model. This may be because our neural network has multiple layers and can be more powerful for complex data, especially when the data from different classes is not linearly separable. For predicting distraction level, the decision tree model performed better than the linear regression model. This suggests that the features are not linearly related to the distraction level; the decision tree, on the other hand, can capture non-linear relationships in the data. The decision tree worked better with a greater depth, because a deeper tree can represent more complex decision boundaries.

Future Work

There are multiple steps we would like to take to improve the accuracy of our model:

  • Collect more experimental data to train our models
  • Revise the distraction level metric so it better captures actual distraction
  • Perform the experiment on a more realistic driving simulator (with a steering wheel, brakes, etc.)
  • Experiment with other algorithms and data processing techniques, including ridge regression and convolutional neural networks
  • Collect more data on car statistics during the experiment

We plan to implement this end-to-end system in cars alongside current safety systems. The diagram below shows how we envision our system integrating with current car systems.

(Figure: envisioned integration of our system with current car safety systems)


Acknowledgments

We are deeply grateful to Courtnie Paschall (MD/PhD student at the University of Washington) for her guidance. Our thanks to Nikolas Ioannou, a UW student, for his input. We are also very thankful to our families, who supported us over the course of this project. Lastly, we thank the participants who kindly volunteered to help us collect our data.


Files

Here is a brief description of the main files you will find in this repository:

  • add_distraction_type creates a dictionary mapping each distraction type to a key
  • crash_swerve_counter.py counts the number of crashes and swerves based on live labeling of the data
  • set_trials.py randomly selects a driving task and distraction for the participant to perform.
  • binary_decision_tree.py trains and tests a decision tree model. This program also creates a confusion matrix.
  • logistic_regression.py creates a logistic regression model.
  • linear_regression.py creates a linear regression model and calculates the mean squared error of the model.
  • feature_selection_logistic_regression.py performs recursive elimination on the logistic regression model to determine the features that contribute the most to the model's accuracy.
  • data_preprocessing removes empty rows and fixes other issues with the CSV files.
  • distraction_index.py sets the distraction level on a scale of 1-5 for each file.
  • stream.py and real_time.py were created in an attempt to predict distraction level in real time; to do this, we would need a stream of data flowing into our models.
  • poster_picture.png is the poster we presented at the 2020 Washington State Science & Engineering Fair (WSSEF). It won first place at this fair.
