The goal of this project is to develop a predictive software tool that assesses the likelihood of a fatal collision, for use by both the police department and the general public. For law enforcement, the tool will help improve security measures and plan road conditions in specific neighborhoods. Members of the public can use it to judge whether extra precautions are needed, based on factors such as weather conditions and time. Using a dataset collected by the Toronto Police Service over five years, the project builds a predictive service that classifies each incident as fatal or non-fatal from the relevant features.
Dataset sourced: Toronto Police Service's official
Data shape: (18194, 57)
A simplified version has been published on Kaggle.
- Confusion Matrices
- Classification Reports
STACK_CLF

| | precision | recall | f1-score | support |
|---|---|---|---|---|
| 0 | 1.0000000 | 0.9928128 | 0.9963934 | 3061 |
| 1 | 0.9927892 | 1.0000000 | 0.9963816 | 3029 |
| accuracy | | | 0.9963875 | 6090 |
| macro avg | 0.9963946 | 0.9964064 | 0.9963875 | 6090 |
| weighted avg | 0.9964136 | 0.9963875 | 0.9963875 | 6090 |

BAGGINGCLASSIFIER_5

| | precision | recall | f1-score | support |
|---|---|---|---|---|
| 0 | 1.0000000 | 0.9993466 | 0.9996732 | 3061 |
| 1 | 0.9993402 | 1.0000000 | 0.9996700 | 3029 |
| accuracy | | | 0.9996716 | 6090 |
| macro avg | 0.9996701 | 0.9996733 | 0.9996716 | 6090 |
| weighted avg | 0.9996718 | 0.9996716 | 0.9996716 | 6090 |

GRADIENTBOOSTINGCLASSIFIER_16

| | precision | recall | f1-score | support |
|---|---|---|---|---|
| 0 | 1.0000000 | 0.9970598 | 0.9985277 | 3061 |
| 1 | 0.9970375 | 1.0000000 | 0.9985166 | 3029 |
| accuracy | | | 0.9985222 | 6090 |
| macro avg | 0.9985188 | 0.9985299 | 0.9985221 | 6090 |
| weighted avg | 0.9985265 | 0.9985222 | 0.9985222 | 6090 |
- ROC Curves
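To make the STACK_CLF numbers easier to read, here is a small sketch that recomputes precision, recall, F1, and accuracy from a 2x2 confusion matrix. The counts are an assumption: they are inferred from the reported support and recall values (22 non-fatal records predicted as fatal, no fatal records missed), not read from the original confusion-matrix figure. The helper names are hypothetical as well.

```python
# Confusion counts inferred from the STACK_CLF report (an assumption,
# not taken from the original confusion-matrix figure).
# Rows are actual classes, columns are predicted classes.
confusion = [
    [3039, 22],   # actual 0: 3039 predicted as 0, 22 predicted as 1
    [0, 3029],    # actual 1: 0 predicted as 0, 3029 predicted as 1
]


def per_class_metrics(matrix, cls):
    """Return (precision, recall, f1, support) for one class."""
    true_pos = matrix[cls][cls]
    predicted = sum(row[cls] for row in matrix)  # column total
    support = sum(matrix[cls])                   # row total
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / support if support else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1, support


def accuracy(matrix):
    """Fraction of records on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


for cls in (0, 1):
    p, r, f1, n = per_class_metrics(confusion, cls)
    print(f"{cls}  {p:.7f}  {r:.7f}  {f1:.7f}  {n}")
print(f"accuracy  {accuracy(confusion):.7f}")
```

Under these assumed counts, every error is a false alarm (class 0 predicted as 1), which is why recall for class 1 and precision for class 0 are both exactly 1.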
Rather than using heavy modern frameworks, this time I tried a different approach: putting everything in a single small Python file.
I had to write the HTML as raw strings, but it was super fun!
- The API is a single `app.py` file, but it is versatile and convenient.
- It has a simple UI and is easy to interact with.
- It can run predictions smoothly, pick a random test record, accept customized test data, and display the result.
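As a rough illustration of the "HTML as strings" idea, the sketch below builds a form page from a dictionary of field values using only the standard library. It is not the code in `app.py`: the template, the `render_page` helper, the page title, and the field names are all hypothetical. Only the Predict and Lucky button labels come from this README.

```python
from html import escape

# Hypothetical page template; the real app.py may be laid out differently.
PAGE_TEMPLATE = """<!doctype html>
<html>
  <body>
    <h1>Fatal Collision Predictor</h1>
    <form method="post">
{fields}
      <button name="action" value="predict">Predict</button>
      <button name="action" value="lucky">Lucky</button>
    </form>
{result}
  </body>
</html>"""


def render_page(values, result=None):
    """Return the page as one HTML string with one input per feature."""
    rows = []
    for name, value in values.items():
        # Escape names and values so user input cannot break the markup.
        rows.append(
            f"      <label>{escape(name)} "
            f'<input name="{escape(name)}" value="{escape(str(value))}">'
            "</label><br>"
        )
    result_html = ""
    if result is not None:
        result_html = f"    <p>Prediction: {escape(str(result))}</p>"
    return PAGE_TEMPLATE.format(fields="\n".join(rows), result=result_html)


# Example with made-up feature names and values.
print(render_page({"LIGHT": "Dark", "HOUR": 23}, result=0))
```

A view function in a single-file app could return such a string directly as the response body, so no separate template files are needed.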
After cloning the repository and navigating to its directory, install the required dependencies using pip:
pip install flask pandas joblib
- Run the `app.py` file: `python app.py`
- Click http://127.0.0.1:5000/ to open the app.
- If you are tired of entering data, click Lucky. The system will:
  - Pick a record from `X_test`.
  - Fill the boxes with the values from that record.
  - Get the corresponding value from `y_test` and display it in the command line, so you can check whether the prediction is correct.

        # In Terminal, VS Code
        127.0.0.1 - - [11/Apr/2024 11:04:24] "POST / HTTP/1.1" 200 -
        Index: 2482, Actual Value: 0 (If never modified.)
- When the input is ready, click the Predict button.
- If you want to try another one, click One More to reset the form, or simply click Lucky to fill in the data in one click.
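The Lucky step described above can be sketched as follows. This is a minimal stand-in, not the code in `app.py`: it uses plain lists in place of the saved `X_test` and `y_test` data, and the `pick_lucky_record` helper, the column names, and the values are hypothetical. The printed line mirrors the `Index: ..., Actual Value: ...` format shown in the terminal output.

```python
import random

# Tiny stand-in data; the real app would load the saved X_test / y_test.
# Column names and values here are hypothetical.
X_test = [
    {"LIGHT": "Dark", "VISIBILITY": "Rain", "HOUR": 23},
    {"LIGHT": "Daylight", "VISIBILITY": "Clear", "HOUR": 9},
    {"LIGHT": "Dusk", "VISIBILITY": "Snow", "HOUR": 18},
]
y_test = [1, 0, 0]


def pick_lucky_record(features, labels, rng=random):
    """Pick one random test record and report its true label."""
    index = rng.randrange(len(features))
    record = dict(features[index])  # copy, so form edits leave the data intact
    actual = labels[index]
    # Shown in the terminal so the prediction can be checked by hand.
    print(f"Index: {index}, Actual Value: {actual}")
    return index, record, actual


index, record, actual = pick_lucky_record(X_test, y_test)
# `record` would be used to pre-fill the form boxes.
```

The logged actual value only matches the form while the filled-in values are left unchanged, which is what the "If never modified" note refers to.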