Traffic Jam Analysis and Forecast for Roads A50 and A2

This notebook analyzes the traffic jam data for the roads A50 and A2. It summarizes average jam severity by day of the week, fits a k-Nearest Neighbors model that predicts severity from the day of the month, and evaluates those predictions against held-out data.

Importing Libraries

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import mean_squared_error, r2_score

Data Reading and Cleaning

In [None]:

# Reading the data
data = pd.read_csv('/mnt/data/2022TrafficData (2).csv')

# Keep only the A50 and A2 rows; .copy() avoids a SettingWithCopyWarning
# when the DayOfWeek column is added below
filtered_data = data[data['RouteOms'].isin(['A50', 'A2'])].copy()

# Extracting the day of the week from the date
filtered_data['DayOfWeek'] = pd.to_datetime(filtered_data['Date']).dt.day_name()

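# Grouping the data by day of the week and averaging the jam severity.
# groupby sorts the day names alphabetically, so reindex into calendar order
# (a weekday with no rows would show up as NaN).
day_order = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']
average_severity_by_day = (
    filtered_data.groupby('DayOfWeek')['JamSeverity']
    .mean()
    .reindex(day_order)
    .rename_axis('DayOfWeek')
    .reset_index()
)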

average_severity_by_day
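
The weekday averages collapse the calendar dimension. To look at severity per calendar day and per road instead, the same columns can be grouped by date. The following is a minimal sketch, assuming each row describes one jam and that the `Date`, `RouteOms`, and `JamSeverity` columns behave as in the cell above. The `sample` frame and its values are made up for illustration and stand in for the CSV.

```python
import pandas as pd

# Hypothetical sample rows standing in for the CSV; the column names
# follow the notebook, the values are made up for illustration.
sample = pd.DataFrame({
    'Date': ['2022-01-03', '2022-01-03', '2022-01-03', '2022-01-04', '2022-01-04'],
    'RouteOms': ['A50', 'A50', 'A2', 'A2', 'A50'],
    'JamSeverity': [10.0, 20.0, 5.0, 7.0, 12.0],
})
sample['Date'] = pd.to_datetime(sample['Date'])

# One row per calendar day, one column per road, mean severity in each cell
daily_severity = (
    sample.groupby(['Date', 'RouteOms'])['JamSeverity']
    .mean()
    .unstack('RouteOms')
)
print(daily_severity)
```

Calling `daily_severity.plot()` on a frame of this shape would draw one line per road over time.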


Data Visualization

In [None]:

# Plotting the average jam severity by day of the week
plt.figure(figsize=(10, 6))
plt.bar(average_severity_by_day['DayOfWeek'], average_severity_by_day['JamSeverity'], color='skyblue')
plt.title('Average Traffic Jam Severity by Day of the Week for Roads A50 and A2')
plt.xlabel('Day of the Week')
plt.ylabel('Average Jam Severity')
plt.xticks(rotation=45)
plt.grid(axis='y')
plt.tight_layout()
plt.show()


Machine Learning: k-Nearest Neighbors

In [None]:

# Prepare the data for training
# Day of the month as the single feature; parsing with pd.to_datetime avoids
# depending on the exact string layout of the Date column
X = pd.to_datetime(filtered_data['Date']).dt.day.values.reshape(-1, 1)
y = filtered_data['JamSeverity'].values

# Splitting the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Training a k-Nearest Neighbors regressor
knn_regressor = KNeighborsRegressor(n_neighbors=5)
knn_regressor.fit(X_train, y_train)

# Making predictions
y_pred = knn_regressor.predict(X_test)

# Plotting the actual vs predicted jam severity
plt.figure(figsize=(10, 6))
plt.scatter(X_test, y_test, color='blue', label='Actual')
plt.scatter(X_test, y_pred, color='red', marker='x', label='Predicted')
plt.title('Actual vs Predicted Traffic Jam Severity using k-Nearest Neighbors')
plt.xlabel('Day of the Month')
plt.ylabel('Jam Severity')
plt.legend()
plt.grid(True)
plt.tight_layout()
plt.show()
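
The value `n_neighbors=5` used above is one possible setting rather than a tuned one. One common way to pick it is cross-validation over a few candidates. The sketch below uses scikit-learn's `GridSearchCV` on synthetic data. The names `X_demo`, `y_demo`, and `candidate_k` are hypothetical, and the data is generated only so that the example runs on its own.

```python
import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for the notebook's X (day of month) and y (severity);
# the values are made up for illustration.
rng = np.random.default_rng(0)
X_demo = rng.integers(1, 32, size=200).reshape(-1, 1)
y_demo = 10 + 0.5 * X_demo.ravel() + rng.normal(0, 2, size=200)

# Try several neighbourhood sizes with 5-fold cross-validation
candidate_k = [1, 3, 5, 10, 20]
search = GridSearchCV(
    KNeighborsRegressor(),
    param_grid={'n_neighbors': candidate_k},
    scoring='neg_mean_squared_error',
    cv=5,
)
search.fit(X_demo, y_demo)

print('Best k:', search.best_params_['n_neighbors'])
print('Cross-validated MSE:', -search.best_score_)
```

With the real data, `X_train` and `y_train` would take the place of the synthetic arrays, so that the test set stays untouched until the final evaluation.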


In [None]:

# Evaluating the predictions
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

mse, r2
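
An MSE value is hard to judge in isolation. A simple reference point is a model that always predicts the mean severity of the training set, for which R² on the test set is at most zero. The sketch below compares k-Nearest Neighbors against scikit-learn's `DummyRegressor` on synthetic data. All variable names and values here are illustrative assumptions, not results from the traffic dataset.

```python
import numpy as np
from sklearn.dummy import DummyRegressor
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsRegressor

# Synthetic stand-in for the notebook's features and target (illustrative only)
rng = np.random.default_rng(0)
X_demo = rng.integers(1, 32, size=200).reshape(-1, 1)
y_demo = 10 + 0.5 * X_demo.ravel() + rng.normal(0, 2, size=200)

X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42
)

# Baseline: always predict the training mean
baseline = DummyRegressor(strategy='mean').fit(X_tr, y_tr)
# Model under test: same settings as the notebook
knn = KNeighborsRegressor(n_neighbors=5).fit(X_tr, y_tr)

baseline_mse = mean_squared_error(y_te, baseline.predict(X_te))
baseline_r2 = r2_score(y_te, baseline.predict(X_te))
knn_mse = mean_squared_error(y_te, knn.predict(X_te))
knn_r2 = r2_score(y_te, knn.predict(X_te))

print(f'Baseline  MSE={baseline_mse:.2f}  R2={baseline_r2:.3f}')
print(f'k-NN      MSE={knn_mse:.2f}  R2={knn_r2:.3f}')
```

If the k-Nearest Neighbors scores on the real data are not clearly better than the baseline, the day of the month alone carries little information about jam severity.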


Conclusion


Performance Analysis: Traffic jam severity on roads A50 and A2 varies considerably. Possible contributing factors include the time of day, the day of the week, accidents, roadworks, and other unforeseen incidents.

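Forecast Analysis: The k-Nearest Neighbors model predicts jam severity from the day of the month, so each prediction is the average severity of the five most similar days in the training data. The model was evaluated on a random 20% hold-out set using MSE and R². Forecasts for specific upcoming dates require applying the trained model to those dates.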
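
A forecast for specific upcoming dates could be sketched as follows, assuming the same day-of-month feature and a model fitted on the full history. The `history` frame is synthetic and stands in for `filtered_data`. The seven-day horizon and the `PredictedJamSeverity` column name are illustrative choices.

```python
import numpy as np
import pandas as pd
from sklearn.neighbors import KNeighborsRegressor

# Synthetic history standing in for filtered_data (illustrative only)
rng = np.random.default_rng(0)
history = pd.DataFrame(
    {'Date': pd.date_range('2022-01-01', '2022-06-30', freq='D')}
)
history['JamSeverity'] = (
    10 + 0.5 * history['Date'].dt.day + rng.normal(0, 2, size=len(history))
)

# Same feature as the notebook: day of the month
X_hist = history['Date'].dt.day.to_numpy().reshape(-1, 1)
model = KNeighborsRegressor(n_neighbors=5)
model.fit(X_hist, history['JamSeverity'].to_numpy())

# The seven days following the last date in the history
future_dates = pd.date_range(
    history['Date'].max() + pd.Timedelta(days=1), periods=7, freq='D'
)
X_future = future_dates.day.to_numpy().reshape(-1, 1)

forecast = pd.DataFrame({
    'Date': future_dates,
    'PredictedJamSeverity': model.predict(X_future),
})
print(forecast)
```

Because the only feature is the day of the month, such a forecast repeats the same pattern every month and cannot reflect trends, weekdays, or holidays.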

Recommendation: The causes of the fluctuations in traffic jam severity deserve closer investigation. Data on specific incidents, roadworks schedules, and other external factors could help explain them. On days where severe traffic jams are forecast, commuters should consider alternative routes or modes of transport.

Overall, although traffic on roads A50 and A2 remains difficult to manage, predictive analytics can support better planning and management of these roads.
