A machine learning project that assesses the likelihood of fatal collisions. The best model achieves >99.9% accuracy with 100% recall.

Dongli99/ksi-predictor

KSI Predictor - Model, API and UI

Overview

The goal of this project is to develop a predictive software tool that can assess the likelihood of fatal collisions, benefiting both the police department and the general public. For law enforcement, the tool will aid in enhancing security measures and planning road conditions in specific neighborhoods. Meanwhile, individuals will be able to utilize the tool to evaluate the necessity for additional precautions based on factors such as weather conditions and time. Leveraging a dataset collected by the Toronto police department over five years, the project aims to create a predictive service that can classify incidents as either resulting in fatality or not, using relevant features.

Dataset sourced: Toronto Police Service's official
Data shape: (18194, 57)

Notebook

A simplified version has been published on Kaggle.

Process

  1. Data Exploration
  2. Data Modelling
  3. Model Building
  4. Evaluation

Performance

  1. Confusion Matrices

  2. Classification Reports

    STACK_CLF
    ======================
    Accuracy, Precision, Recall, F1:
                  precision    recall  f1-score   support

               0  1.0000000 0.9928128 0.9963934      3061
               1  0.9927892 1.0000000 0.9963816      3029

        accuracy                      0.9963875      6090
       macro avg  0.9963946 0.9964064 0.9963875      6090
    weighted avg  0.9964136 0.9963875 0.9963875      6090

    BAGGINGCLASSIFIER_5
    ======================
    Accuracy, Precision, Recall, F1:
                  precision    recall  f1-score   support

               0  1.0000000 0.9993466 0.9996732      3061
               1  0.9993402 1.0000000 0.9996700      3029

        accuracy                      0.9996716      6090
       macro avg  0.9996701 0.9996733 0.9996716      6090
    weighted avg  0.9996718 0.9996716 0.9996716      6090

    GRADIENTBOOSTINGCLASSIFIER_16
    ======================
    Accuracy, Precision, Recall, F1:
                  precision    recall  f1-score   support

               0  1.0000000 0.9970598 0.9985277      3061
               1  0.9970375 1.0000000 0.9985166      3029

        accuracy                      0.9985222      6090
       macro avg  0.9985188 0.9985299 0.9985221      6090
    weighted avg  0.9985265 0.9985222 0.9985222      6090

  3. ROC Curves
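
To make the report columns concrete, here is a minimal sketch that recomputes precision, recall, F1 and accuracy from a 2x2 confusion matrix. The counts in `bagging_cm` are not taken from the project's code. They are inferred from the BAGGINGCLASSIFIER_5 report above, where class 0 has support 3061 and class 1 has support 3029 with recall 1.0. The helper names `per_class_metrics` and `accuracy` are hypothetical.

```python
def per_class_metrics(cm, label):
    """Return (precision, recall, f1) for one class of a square confusion matrix.

    cm[i][j] counts samples whose true class is i and predicted class is j.
    """
    tp = cm[label][label]
    predicted = sum(row[label] for row in cm)  # column sum: everything predicted as `label`
    actual = sum(cm[label])                    # row sum: the class support
    precision = tp / predicted if predicted else 0.0
    recall = tp / actual if actual else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def accuracy(cm):
    """Fraction of samples on the diagonal of the confusion matrix."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


# Assumption: counts inferred from the BAGGINGCLASSIFIER_5 report,
# not copied from the project's notebook.
bagging_cm = [
    [3059, 2],  # true class 0: 2 non-fatal records predicted as fatal
    [0, 3029],  # true class 1: no fatal record missed
]

for label in (0, 1):
    p, r, f = per_class_metrics(bagging_cm, label)
    print(f"{label}  {p:.7f} {r:.7f} {f:.7f}")
print(f"accuracy {accuracy(bagging_cm):.7f}")
```

Under these inferred counts, a recall of 1.0 for class 1 means no fatal collision in the test set was missed, while the two false positives are what keep class 1 precision just below 1.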

API & UI

Rather than using a heavy modern framework, this time I tried a different approach: putting everything in one small .py file.
I had to write the HTML as plain Python strings, but it was super fun!

Features

  • The API is a single app.py file, yet it is versatile and convenient.
  • It has a simple UI that is easy to interact with.
  • It can run predictions, pick a random test record, accept customized test data, and display the result.
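
The "HTML as strings" approach can be sketched roughly as follows. This is not the project's actual app.py. The template, the `render_page` helper, and the field names `weather` and `hour` are all hypothetical, chosen only to show how a form can be built from a plain Python string without a template engine.

```python
from html import escape

# Hypothetical page template; the real app.py may look quite different.
PAGE = """<!doctype html>
<html>
  <body>
    <h1>KSI Predictor</h1>
    <form method="post" action="/">
{inputs}
      <button name="action" value="predict">Predict</button>
      <button name="action" value="lucky">Lucky</button>
    </form>
    <p>{result}</p>
  </body>
</html>"""


def render_page(values, result=""):
    """Build the full page as a string from a mapping of field name to value."""
    inputs = "\n".join(
        '      <label>{0} <input name="{0}" value="{1}"></label>'.format(
            escape(name), escape(str(value))
        )
        for name, value in values.items()
    )
    # Escape user-visible text so stray "<" or quotes cannot break the markup.
    return PAGE.format(inputs=inputs, result=escape(result))


# Hypothetical field names, for illustration only.
page = render_page({"weather": "Rain", "hour": 23}, result="Prediction: 0")
print(page)
```

In a Flask view, a string like this can be returned directly as the response body, which is what makes a single-file app practical.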

Installation

After cloning the repository and navigating to its directory, install the required dependencies using pip:

pip install flask pandas joblib

Usage

  1. Run the app.py file:

    python app.py
  2. Open http://127.0.0.1:5000/ in your browser to use the app.

Interaction

  1. If you do not want to enter data by hand, click Lucky. The system will:

    • Pick a record from X_test
    • Fill the boxes with the values from the record
    • Look up the corresponding value in y_test and print it to the command line, so you can check whether the prediction is correct.
    # In Terminal, VS Code
    127.0.0.1 - - [11/Apr/2024 11:04:24] "POST / HTTP/1.1" 200 -
    Index: 2482, Actual Value: 0 (If never modified.)
  2. When the input is ready, click the Predict button.

    Note: The actual target value is printed to the command line.

  3. To try another record, click One More to reset the form, or simply click Lucky to fill in the data in one click.
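
The Lucky flow described above can be sketched like this. It is a simplified stand-in, not the project's code: `pick_lucky` is a hypothetical helper, plain lists replace the real X_test and y_test, and the index is a list position, whereas the real app may print a DataFrame index label.

```python
import random


def pick_lucky(X_test, y_test, rng=random):
    """Pick one test record and print its true label to the console."""
    index = rng.randrange(len(X_test))
    record = dict(X_test[index])  # copy, so later form edits do not alter X_test
    actual = y_test[index]
    # Same message shape as the console line shown in the steps above.
    print(f"Index: {index}, Actual Value: {actual}")
    return index, record, actual


# Tiny stand-in data with hypothetical feature names.
X_test = [
    {"weather": "Clear", "hour": 14},
    {"weather": "Rain", "hour": 23},
    {"weather": "Snow", "hour": 7},
]
y_test = [0, 1, 0]

index, record, actual = pick_lucky(X_test, y_test, random.Random(0))
# `record` would be used to pre-fill the form fields.
```

The printed value only matches the record in the form as long as the pre-filled values are left unmodified, which is what the "(If never modified.)" remark in the console output refers to.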
