# **OPEN-ARC**
---

### Project 9: Traffic Accident Prediction Model:
**Challenge:** Create an AI model, capable of predicting traffic accidents based on a set of features.


### Terms and Use:
Learn more about the project's [LICENSE](https://github.com/Infinitode/OPEN-ARC/blob/main/LICENSE) and read our [CODE_OF_CONDUCT](https://github.com/Infinitode/OPEN-ARC/blob/main/CODE_OF_CONDUCT) before contributing to the project. You can contribute to this project from here: [https://github.com/Infinitode/OPEN-ARC/](https://github.com/Infinitode/OPEN-ARC/).

---

Please fill out this performance sheet to help others quickly see your model's performance **(optional)**:

### Performance Sheet:
| Contributor | Architecture Type | Platform | Base Model | Dataset | Accuracy | Link |
|-------------|-------------------|----------|------------|---------|----------|------|
| Infinitode  | XGBClassifier  | Kaggle   | ✔  | Traffic Accident Prediction 💥🚗 | 85.2%    | [Notebook](https://github.com/Infinitode/OPEN-ARC/blob/main/Project-9-TAPM/project-9-tapm.ipynb) |
| Username  | Unknown  | Kaggle   | ✗/✔  | Traffic Accident Prediction 💥🚗 | Score    | [Notebook](https://github.com) |

---

### Model: XGBoostClassifier:
This implementation uses an XGBoost Classifier model. You can learn more about XGBoost classifiers from here: https://apmonitor.com/pds/index.php/Main/XGBoostClassifier

In [3]:
# Import necessary libraries
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report, roc_auc_score
from xgboost import XGBClassifier
from imblearn.over_sampling import SMOTE
import joblib

# Load the dataset
data = pd.read_csv('/kaggle/input/traffic-accident-prediction/dataset_traffic_accident_prediction1.csv')

# Data exploration
print(data.head())
print(data.info())
print(data.describe())

# Define mappings for encoding categorical variables
weather_mapping = {
    "Clear": 0,
    "Rainy": 1,
    "Foggy": 2,
    "Snowy": 3,
    "Stormy": 4
}
road_type_mapping = {
    "Highway": 0,
    "City Road": 1,
    "Rural Road": 2,
    "Mountain Road": 3
}
time_of_day_mapping = {
    "Morning": 0,
    "Afternoon": 1,
    "Evening": 2,
    "Night": 3,
}
accident_severity_mapping = {
    "Low": 0,
    "Moderate": 1,
    "High": 2
}
road_condition_mapping = {
    "Dry": 0,
    "Wet": 1,
    "Icy": 2,
    "Under Construction": 3
}
vehicle_type_mapping = {
    "Car": 0,
    "Truck": 1,
    "Motorcycle": 2,
    "Bus": 3
}
road_light_condition_mapping = {
    "Daylight": 0,
    "Artificial Light": 1,
    "No Light": 2
}

# Apply mappings to the dataset
data['Weather'] = data['Weather'].map(weather_mapping)
data['Road_Type'] = data['Road_Type'].map(road_type_mapping)
data['Time_of_Day'] = data['Time_of_Day'].map(time_of_day_mapping)
data['Accident_Severity'] = data['Accident_Severity'].map(accident_severity_mapping)
data['Road_Condition'] = data['Road_Condition'].map(road_condition_mapping)
data['Vehicle_Type'] = data['Vehicle_Type'].map(vehicle_type_mapping)
data['Road_Light_Condition'] = data['Road_Light_Condition'].map(road_light_condition_mapping)

# Handle missing values by dropping rows with NaN
data = data.dropna()

# Separate features and target
X = data.drop(columns=["Accident"])
y = data["Accident"]

# Apply SMOTE for class imbalance
smote = SMOTE(random_state=42)
X_resampled, y_resampled = smote.fit_resample(X, y)

# Train-test split with stratification
X_train, X_test, y_train, y_test = train_test_split(
    X_resampled, y_resampled, test_size=0.2, random_state=42, stratify=y_resampled
)

# Initialize the XGBClassifier
xgb = XGBClassifier(random_state=42, use_label_encoder=False, eval_metric='logloss', colsample_bytree=0.8, learning_rate=0.01, max_depth=5, n_estimators=100, scale_pos_weight=1, subsample=0.8)

# Fit the model
xgb.fit(X_resampled, y_resampled)

# Evaluate the model on the test set
y_pred = xgb.predict(X_test)
y_pred_proba = best_model.predict_proba(X_test)[:, 1]

accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy: {accuracy:.4f}")
print("Classification Report:")
print(classification_report(y_test, y_pred))

roc_auc = roc_auc_score(y_test, y_pred_proba)
print(f"ROC-AUC Score: {roc_auc:.4f}")

  Weather   Road_Type Time_of_Day  Traffic_Density  Speed_Limit  \
0   Rainy   City Road     Morning              1.0        100.0   
1   Clear  Rural Road       Night              NaN        120.0   
2   Rainy     Highway     Evening              1.0         60.0   
3   Clear   City Road   Afternoon              2.0         60.0   
4   Rainy     Highway     Morning              1.0        195.0   

   Number_of_Vehicles  Driver_Alcohol Accident_Severity      Road_Condition  \
0                 5.0             0.0               NaN                 Wet   
1                 3.0             0.0          Moderate                 Wet   
2                 4.0             0.0               Low                 Icy   
3                 3.0             0.0               Low  Under Construction   
4                11.0             0.0               Low                 Dry   

  Vehicle_Type  Driver_Age  Driver_Experience Road_Light_Condition  Accident  
0          Car        51.0               48

After training, we get an accuracy score of `0.8522`, and an ROC-AUC score of `0.8255`.

In [4]:
# Save the trained model
joblib.dump(xgb, 'xgboost_tapm.pkl')
print("Model saved as 'xgboost_tapm.pkl'.")

# Reload and use the saved model
loaded_model = joblib.load('xgboost_tapm.pkl')
print("Model loaded successfully.")

Model saved as 'xgboost_tapm.pkl'.
Model loaded successfully.


In [6]:
import random

def test_random_samples(model, X_test, y_test, n_samples=5):
    """
    Selects random samples from the test set, makes predictions, and compares with actual values.
    
    Parameters:
    - model: Trained XGBoost classifier.
    - X_test: Feature set for testing.
    - y_test: True labels for testing.
    - n_samples: Number of random samples to test.
    
    Returns:
    None
    """
    # Convert X_test and y_test to DataFrame for easier indexing
    X_test_df = X_test.reset_index(drop=True)
    y_test_df = y_test.reset_index(drop=True)

    # Pick random indices
    random_indices = random.sample(range(len(X_test)), n_samples)
    
    print("Testing on Random Samples:")
    for idx in random_indices:
        sample = X_test_df.iloc[idx]
        true_label = y_test_df.iloc[idx]
        
        # Predict using the model
        prediction = model.predict(sample.values.reshape(1, -1))
        
        # Output results
        print(f"Sample Index: {idx}")
        print(f"Features: {sample.values}")
        print(f"True Label: {true_label}, Predicted Label: {prediction[0]}")
        print("-" * 40)

# Example usage
test_random_samples(xgb, X_test, y_test)

Testing on Random Samples:
Sample Index: 93
Features: [ 1.  2.  1.  0. 80.  2.  0.  0.  0.  0. 45. 39.  1.]
True Label: 0.0, Predicted Label: 0
----------------------------------------
Sample Index: 83
Features: [ 1.70905541  0.          0.3545277   1.6454723  60.          3.3545277
  0.          1.6454723   0.          0.3545277  48.         46.58188918
  1.3545277 ]
True Label: 1.0, Predicted Label: 1
----------------------------------------
Sample Index: 37
Features: [ 0.  2.  1.  1. 50.  3.  0.  0.  0.  2. 35. 34.  1.]
True Label: 0.0, Predicted Label: 0
----------------------------------------
Sample Index: 30
Features: [ 0.7007342   0.6496329   1.3503671   2.         42.99265798  3.0511013
  0.          1.3503671   1.          0.         34.3503671  28.5985316
  0.3503671 ]
True Label: 1.0, Predicted Label: 1
----------------------------------------
Sample Index: 46
Features: [  3.   0.   0.   2. 100.   2.   0.   1.   1.   0.  26.  17.   0.]
True Label: 0.0, Predicted Label: 0
--

### The End:

This is the end of this project notebook, make sure to experiment and contribute to help improve the model and implementation. You can browse more of the open-source free projects on our GitHub repository: https://github.com/Infinitode/OPEN-ARC. If you like this project, make sure to star the repo and contribute your implementation, or help others in the community.

~ Infinitode