# Traffic Analysis Project


This notebook analyzes and predicts traffic patterns in Bangalore using random forest models from scikit-learn.

## Objectives:
1. **Traffic Volume Prediction**: Predict traffic volume on different roads in Bangalore.
2. **Congestion Level Classification**: Classify congestion levels into categories (Low, Medium, High).
3. **Incident Detection**: Predict incident reports based on traffic data.

## Load and Preprocess the Data


In [None]:

import pandas as pd
from sklearn.preprocessing import LabelEncoder
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Load the dataset
file_path = 'Banglore_traffic_Dataset 2.csv'
df = pd.read_csv(file_path)

# Convert 'Date' column to datetime format and extract time-based features
df['Date'] = pd.to_datetime(df['Date'])
df['Year'] = df['Date'].dt.year
df['Month'] = df['Date'].dt.month
df['Day'] = df['Date'].dt.day
df['Weekday'] = df['Date'].dt.weekday

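# Encode categorical columns with one LabelEncoder per column, so each encoder's
# classes_ stay available for decoding later. 'Weather Conditions' and
# 'Roadwork and Construction Activity' are the known categorical columns; any other
# non-numeric column is encoded as well (an assumption about the CSV), because the
# models below require numeric input.
categorical_cols = [c for c in df.columns
                    if c != 'Date' and not pd.api.types.is_numeric_dtype(df[c])]
label_encoders = {}
for col in categorical_cols:
    label_encoders[col] = LabelEncoder()
    df[col] = label_encoders[col].fit_transform(df[col])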

# Drop the original 'Date' column
df.drop(columns=['Date'], inplace=True)

# Split the data for modeling
X = df.drop(columns=['Traffic Volume', 'Congestion Level', 'Incident Reports'])
y_traffic_volume = df['Traffic Volume']
y_congestion_level = df['Congestion Level']
y_incident_reports = df['Incident Reports']
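
The encoding step above can be sketched on a toy frame. Reusing a single `LabelEncoder` for several columns overwrites its fitted `classes_`, so only the last column can be decoded afterwards; keeping one encoder per column avoids that. The values below are made up for illustration, and only the two column names come from the notebook.

```python
import pandas as pd
from sklearn.preprocessing import LabelEncoder

# Toy data with hypothetical values; not taken from the real dataset.
toy = pd.DataFrame({
    "Weather Conditions": ["Clear", "Rain", "Fog", "Clear"],
    "Roadwork and Construction Activity": ["No", "Yes", "No", "No"],
})

# Fit and keep one encoder per column so each mapping can be inverted later.
encoders = {}
for col in list(toy.columns):
    encoders[col] = LabelEncoder()
    toy[col] = encoders[col].fit_transform(toy[col])

# Integer codes follow the sorted order of the original labels.
weather_classes = list(encoders["Weather Conditions"].classes_)
decoded = list(encoders["Weather Conditions"].inverse_transform(toy["Weather Conditions"]))
print(weather_classes)
print(decoded)
```

Note that `LabelEncoder` assigns arbitrary integer codes, which tree-based models tolerate; for linear models a one-hot encoding is usually the safer choice.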


## Traffic Volume Prediction - Random Forest Regressor

In [None]:

from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error, r2_score

# Split the data into train and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y_traffic_volume, test_size=0.2, random_state=42)

# Initialize the Random Forest Regressor
rf_regressor = RandomForestRegressor(random_state=42)

# Train the model
rf_regressor.fit(X_train, y_train)

# Predict on test data
y_pred = rf_regressor.predict(X_test)

# Evaluate the model
mse = mean_squared_error(y_test, y_pred)
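# RMSE is the square root of MSE; the squared=False argument of mean_squared_error
# was deprecated in scikit-learn 1.4 and removed in later releases.
rmse = mse ** 0.5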
r2 = r2_score(y_test, y_pred)

(mse, rmse, r2)
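
To make the three metrics concrete, here is a small worked check on hypothetical numbers (not results from this dataset). It computes MSE, RMSE, and R² by hand with NumPy and compares them against scikit-learn's functions.

```python
import numpy as np
from sklearn.metrics import mean_squared_error, r2_score

# Hypothetical true and predicted traffic volumes; every prediction is off by 10.
y_true = np.array([100.0, 200.0, 300.0, 400.0])
y_hat = np.array([110.0, 190.0, 310.0, 390.0])

# MSE: mean of squared residuals. RMSE: its square root, in the target's units.
mse_manual = np.mean((y_true - y_hat) ** 2)
rmse_manual = np.sqrt(mse_manual)

# R^2: 1 - (residual sum of squares / total sum of squares around the mean).
ss_res = np.sum((y_true - y_hat) ** 2)
ss_tot = np.sum((y_true - y_true.mean()) ** 2)
r2_manual = 1 - ss_res / ss_tot

print(mse_manual, rmse_manual, r2_manual)
print(mean_squared_error(y_true, y_hat), r2_score(y_true, y_hat))
```

RMSE is often the easier number to report because it is in the same units as the target, here vehicles rather than vehicles squared.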


## Congestion Level Classification - Random Forest Classifier

In [None]:

from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report

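# Categorize congestion levels into (0-33], (33-66], (66-100].
# include_lowest=True keeps a congestion level of exactly 0 in 'Low' instead of mapping it to NaN.
congestion_labels = ['Low', 'Medium', 'High']
df['Congestion Level Category'] = pd.cut(df['Congestion Level'], bins=[0, 33, 66, 100], labels=congestion_labels, include_lowest=True)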

# Separate features and target variable for classification
y_congestion = df['Congestion Level Category']
X_classification = df.drop(columns=['Congestion Level', 'Traffic Volume', 'Incident Reports', 'Congestion Level Category'])

# Split the data
X_train_class, X_test_class, y_train_class, y_test_class = train_test_split(X_classification, y_congestion, test_size=0.2, random_state=42)

# Initialize the Random Forest Classifier
rf_classifier = RandomForestClassifier(random_state=42)

# Train the model
rf_classifier.fit(X_train_class, y_train_class)

# Predict on test data
y_pred_class = rf_classifier.predict(X_test_class)

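# Evaluate the model.
# Pass labels= so the report rows follow Low/Medium/High. With target_names alone,
# scikit-learn sorts the string labels alphabetically (High, Low, Medium) and the
# row names would be attached to the wrong classes.
classification_rep = classification_report(y_test_class, y_pred_class, labels=congestion_labels)

print(classification_rep)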
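
The binning used for the congestion categories has an edge case worth checking. `pd.cut` builds right-closed intervals by default, so the lowest edge is excluded unless `include_lowest=True` is passed. A minimal sketch on hypothetical boundary values:

```python
import pandas as pd

# Hypothetical congestion levels sitting on and around the bin edges.
levels = pd.Series([0, 33, 34, 66, 67, 100])
labels = ["Low", "Medium", "High"]

# Default behaviour: intervals are (0, 33], (33, 66], (66, 100]; 0 falls outside.
default_bins = pd.cut(levels, bins=[0, 33, 66, 100], labels=labels)

# include_lowest=True closes the first interval on the left, so 0 becomes 'Low'.
closed_bins = pd.cut(levels, bins=[0, 33, 66, 100], labels=labels, include_lowest=True)

print(default_bins.tolist())
print(closed_bins.tolist())
```

A `NaN` target would make the classifier's `fit` call fail, so it is worth checking `isna().sum()` on the category column before training.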


## Incident Detection - Random Forest Classifier

In [None]:

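# Prepare the data for incident detection.
# 'Congestion Level Category' was added to df in the previous section; drop it too,
# since it is derived from a target and its string values cannot be standardized.
y_incident = df['Incident Reports']
X_incident_classification = df.drop(columns=['Congestion Level', 'Traffic Volume', 'Incident Reports', 'Congestion Level Category'])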

# Split the data for incident detection
X_train_incident, X_test_incident, y_train_incident, y_test_incident = train_test_split(X_incident_classification, y_incident, test_size=0.2, random_state=42)

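# Standardize features: fit the scaler on the training split only, then apply it to the test split.
# (Tree-based models such as random forests are largely insensitive to feature scaling, so this step is optional here.)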
scaler = StandardScaler()
X_train_incident_scaled = scaler.fit_transform(X_train_incident)
X_test_incident_scaled = scaler.transform(X_test_incident)

# Initialize Random Forest Classifier for incident detection
rf_incident_classifier = RandomForestClassifier(random_state=42)

# Train the model
rf_incident_classifier.fit(X_train_incident_scaled, y_train_incident)

# Predict on the test data
y_pred_incident = rf_incident_classifier.predict(X_test_incident_scaled)

# Evaluate the model performance
incident_classification_report = classification_report(y_test_incident, y_pred_incident)

# print() renders the report's line breaks; a bare expression would show the raw string
print(incident_classification_report)
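
One way to keep the scaling and the classifier together is a scikit-learn `Pipeline`, which fits the scaler on the training split only and applies the same transform at prediction time. The sketch below uses synthetic data from `make_classification` rather than the traffic dataset, and the names `X_demo`, `y_demo`, and `incident_pipeline` are illustrative, not part of the notebook.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the incident data (assumption: a binary target, 6 features).
X_demo, y_demo = make_classification(n_samples=200, n_features=6, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X_demo, y_demo, test_size=0.2, random_state=42)

# The pipeline scales, then classifies; fit() learns scaling statistics from X_tr only.
incident_pipeline = make_pipeline(
    StandardScaler(),
    RandomForestClassifier(n_estimators=50, random_state=42),
)
incident_pipeline.fit(X_tr, y_tr)

# predict() and score() reuse the fitted scaler on the held-out split.
demo_predictions = incident_pipeline.predict(X_te)
demo_accuracy = incident_pipeline.score(X_te, y_te)
print(demo_accuracy)
```

Bundling the steps this way also makes the whole workflow usable with `cross_val_score` or `GridSearchCV` without leaking test-fold statistics into the scaler.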


## Conclusion

In [None]:

# Project Summary
# - Traffic Volume Prediction: Achieved high R² with Random Forest Regressor.
# - Congestion Level Classification: 94% accuracy with Random Forest Classifier.
# - Incident Detection: Model trained with Random Forest Classifier but requires further improvements.
