# Simulated Traffic Data Generation and Route Suggestion Model
This Jupyter Notebook generates a simulated dataset of user and traffic data: user IDs, timestamps, current locations, destinations, traffic parameters, alternate routes, and estimated times and distances. It then trains and evaluates machine learning models that suggest a route based on traffic conditions, and uses them to produce a route suggestion for a new observation.

### Simulated Dataset Creation

In this section, we create a simulated dataset with user and traffic data. The dataset includes user IDs, timestamps, current locations, destinations, traffic speed, traffic congestion, alternate routes, estimated times, and estimated distances. We generate 1000 entries of random data for demonstration purposes.

##### Importing Libraries

In [41]:
import random
import pandas as pd
import datetime
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import LabelEncoder

##### Simulating a dataset with user and traffic data
###### Initializing empty lists to store data

In [19]:
user_ids = []
timestamps = []
current_locations = []
destinations = []
traffic_speeds = []
traffic_congestions = []
alternate_routes = []
estimated_times = []
estimated_distances = []

###### Simulating 1000 entries

In [23]:
for _ in range(1000):
    user_id = random.randint(1, 100)
    timestamp = datetime.datetime(2023, 10, 25, random.randint(0, 23), random.randint(0, 59))
    current_location = (random.uniform(40.6, 40.8), random.uniform(-74.1, -73.9))
    destination = (random.uniform(40.6, 40.8), random.uniform(-74.1, -73.9))
    traffic_speed = random.randint(10, 60)
    traffic_congestion = random.uniform(0, 1)

    # Simulate a random alternate route with 2 to 10 coordinate points
    num_alternate_route_points = random.randint(2, 10)
    alternate_route = [
        (random.uniform(40.6, 40.8), random.uniform(-74.1, -73.9))
        for _ in range(num_alternate_route_points)
    ]

    estimated_time = random.randint(10, 60)
    estimated_distance = random.uniform(2, 10)

    user_ids.append(user_id)
    timestamps.append(timestamp)
    current_locations.append(current_location)
    destinations.append(destination)
    traffic_speeds.append(traffic_speed)
    traffic_congestions.append(traffic_congestion)
    alternate_routes.append(alternate_route)
    estimated_times.append(estimated_time)
    estimated_distances.append(estimated_distance)
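
The loop above draws from the global `random` state, so each run of the notebook produces a different dataset. A minimal sketch, assuming reproducible data is wanted: the same sampling ranges wrapped in a hypothetical `simulate_entries` helper that uses its own seeded `random.Random` instance. The helper name, the dictionary layout, and the seed values are illustrative and not part of the original notebook.

```python
import datetime
import random


def simulate_entries(n, seed=0):
    """Return n simulated records as dicts, using the same ranges as the loop above."""
    rng = random.Random(seed)  # private generator: leaves the global random state alone

    def random_point():
        # Latitude/longitude box used throughout the notebook
        return (rng.uniform(40.6, 40.8), rng.uniform(-74.1, -73.9))

    records = []
    for _ in range(n):
        records.append({
            "User_ID": rng.randint(1, 100),
            "Timestamp": datetime.datetime(2023, 10, 25, rng.randint(0, 23), rng.randint(0, 59)),
            "Current_Location": random_point(),
            "Destination": random_point(),
            "Traffic_Speed": rng.randint(10, 60),
            "Traffic_Congestion": rng.uniform(0, 1),
            "Alternate_Route": [random_point() for _ in range(rng.randint(2, 10))],
            "Estimated_Time_Alternate": rng.randint(10, 60),
            "Estimated_Distance_Alternate": rng.uniform(2, 10),
        })
    return records


# The same seed always yields the same 1000 records
records = simulate_entries(1000, seed=42)
```

Passing `records` to `pd.DataFrame` would give a frame with the same column names the notebook uses.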

##### Creating a DataFrame with the simulated data

In [24]:
data = pd.DataFrame({
    'User_ID': user_ids,
    'Timestamp': timestamps,
    'Current_Location': current_locations,
    'Destination': destinations,
    'Traffic_Speed': traffic_speeds,
    'Traffic_Congestion': traffic_congestions,
    'Alternate_Route': alternate_routes,
    'Estimated_Time_Alternate': estimated_times,
    'Estimated_Distance_Alternate': estimated_distances
})

##### Displaying the simulated data

In [25]:
data.head()

| | User_ID | Timestamp | Current_Location | Destination | Traffic_Speed | Traffic_Congestion | Alternate_Route | Estimated_Time_Alternate | Estimated_Distance_Alternate |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 40 | 2023-10-25 18:34:00 | (40.731946762364, -73.97689108595226) | (40.65202465495198, -73.99705913358297) | 29 | 0.150436 | [(40.75264623900403, -74.04194228967836), (40.... | 35 | 7.354829 |
| 1 | 46 | 2023-10-25 14:21:00 | (40.787256677981254, -74.04123601177999) | (40.66387859192334, -73.98899876851507) | 21 | 0.403945 | [(40.688941841590676, -74.00870877368276), (40... | 37 | 9.584325 |
| 2 | 67 | 2023-10-25 00:26:00 | (40.658331198652526, -73.99842534605598) | (40.686967622122104, -74.05332562171306) | 27 | 0.735543 | [(40.66159176560929, -73.97624275690423), (40.... | 34 | 5.223922 |
| 3 | 22 | 2023-10-25 19:08:00 | (40.67443162160435, -74.06538855248644) | (40.70683640200687, -73.90830743674526) | 55 | 0.586689 | [(40.783945544999014, -73.97395921062886), (40... | 23 | 5.528615 |
| 4 | 69 | 2023-10-25 17:15:00 | (40.68286579638996, -73.99399431938936) | (40.69859986732159, -74.06742181456444) | 43 | 0.306856 | [(40.74300268649984, -74.09193735215302), (40.... | 12 | 5.335785 |
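
In the sample above, `Estimated_Distance_Alternate` is drawn at random and has no relation to the coordinates in `Current_Location` and `Destination`. As an illustration of how a coordinate-based distance could be derived instead, here is a sketch of the great-circle (haversine) distance. The function name `haversine_km` and the mean Earth radius of 6371 km are assumptions made for this example, and a straight-line distance is only a lower bound on any road distance.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius; an approximation


def haversine_km(origin, destination):
    """Great-circle distance in km between two (latitude, longitude) pairs in degrees."""
    lat1, lon1 = map(math.radians, origin)
    lat2, lon2 = map(math.radians, destination)
    dlat = lat2 - lat1
    dlon = lon2 - lon1
    a = math.sin(dlat / 2) ** 2 + math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


# Opposite corners of the bounding box used in the simulation
corner_to_corner = haversine_km((40.6, -74.1), (40.8, -73.9))
print(f"Bounding-box diagonal: {corner_to_corner:.1f} km")
```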


### Model Training and Prediction

In this section, we define the features (independent variables) and the targets (dependent variables) for our models. We split the dataset into training and testing sets and fit Linear Regression models for the estimated time, the estimated distance, and the label-encoded route suggestion. We evaluate each model with R-squared on the test set and demonstrate how to use the trained models to make a route suggestion for new data.


##### Adding a new field 'Route_Suggestion'

In [58]:
def suggest_alternate_route(row):
    if row['Traffic_Congestion'] > 0.5:  # Adjust the congestion threshold as needed
        return "Use alternate route"
    else:
        return "No need for an alternate route"


data['Route_Suggestion'] = data.apply(suggest_alternate_route, axis=1)

data.head()

| | User_ID | Timestamp | Current_Location | Destination | Traffic_Speed | Traffic_Congestion | Alternate_Route | Estimated_Time_Alternate | Estimated_Distance_Alternate | Route_Suggestion |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 40 | 2023-10-25 18:34:00 | (40.731946762364, -73.97689108595226) | (40.65202465495198, -73.99705913358297) | 29 | 0.150436 | [(40.75264623900403, -74.04194228967836), (40.... | 35 | 7.354829 | No need for an alternate route |
| 1 | 46 | 2023-10-25 14:21:00 | (40.787256677981254, -74.04123601177999) | (40.66387859192334, -73.98899876851507) | 21 | 0.403945 | [(40.688941841590676, -74.00870877368276), (40... | 37 | 9.584325 | No need for an alternate route |
| 2 | 67 | 2023-10-25 00:26:00 | (40.658331198652526, -73.99842534605598) | (40.686967622122104, -74.05332562171306) | 27 | 0.735543 | [(40.66159176560929, -73.97624275690423), (40.... | 34 | 5.223922 | Use alternate route |
| 3 | 22 | 2023-10-25 19:08:00 | (40.67443162160435, -74.06538855248644) | (40.70683640200687, -73.90830743674526) | 55 | 0.586689 | [(40.783945544999014, -73.97395921062886), (40... | 23 | 5.528615 | Use alternate route |
| 4 | 69 | 2023-10-25 17:15:00 | (40.68286579638996, -73.99399431938936) | (40.69859986732159, -74.06742181456444) | 43 | 0.306856 | [(40.74300268649984, -74.09193735215302), (40.... | 12 | 5.335785 | No need for an alternate route |
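
Row-wise `apply` calls the Python function once per row. Assuming the same 0.5 congestion threshold, the same column can be produced in one vectorized step with `numpy.where`. The small frame below uses made-up congestion values purely for illustration.

```python
import numpy as np
import pandas as pd

# Tiny illustrative frame; the congestion values are made up for this example
demo = pd.DataFrame({"Traffic_Congestion": [0.15, 0.40, 0.74, 0.59, 0.50]})

CONGESTION_THRESHOLD = 0.5  # same cut-off as suggest_alternate_route

# Elementwise choice between the two labels, with no Python-level loop over rows
demo["Route_Suggestion"] = np.where(
    demo["Traffic_Congestion"] > CONGESTION_THRESHOLD,
    "Use alternate route",
    "No need for an alternate route",
)
print(demo)
```

A value exactly equal to the threshold falls into the "No need" branch, matching the strict `>` comparison in `suggest_alternate_route`.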


##### Label Encoding

In [47]:
route_suggestion = data['Route_Suggestion']
le = LabelEncoder()
route_target = le.fit_transform(route_suggestion)
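
`LabelEncoder` assigns integer codes to the distinct labels in sorted order, so here "No need for an alternate route" becomes 0 and "Use alternate route" becomes 1. The plain-Python sketch below mirrors that mapping on a short made-up label list; it is not scikit-learn's implementation, and the variable names are hypothetical.

```python
labels = [
    "No need for an alternate route",
    "Use alternate route",
    "Use alternate route",
    "No need for an alternate route",
]

# Sorted unique labels play the role of LabelEncoder.classes_
classes = sorted(set(labels))
code_of = {label: code for code, label in enumerate(classes)}

encoded = [code_of[label] for label in labels]   # label -> integer code
decoded = [classes[code] for code in encoded]    # integer code -> label

print(code_of)
print(encoded)
```

Knowing which label maps to 1 matters later, when the regression output is thresholded back into a suggestion.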

##### Define the features (independent variables) and the target (dependent variable)

In [48]:
features = data[['Traffic_Speed', 'Traffic_Congestion']]
time_target = data['Estimated_Time_Alternate']
distance_target = data['Estimated_Distance_Alternate']
# route_target was already computed in the Label Encoding cell above

##### Split the dataset into training and testing sets

In [49]:
# One call splits the features and all three targets with the same shuffle
(X_train, X_test,
 y_train_route, y_test_route,
 y_train_time, y_test_time,
 y_train_distance, y_test_distance) = train_test_split(
    features, route_target, time_target, distance_target,
    test_size=0.2, random_state=42)
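
A fixed `random_state` makes the shuffle deterministic: for the same number of rows, the same indices land in the test set every time, whichever array is being split. The sketch below illustrates the idea with a hypothetical `seeded_split` helper written in plain Python on made-up rows. It is not how scikit-learn implements `train_test_split`, and it will not reproduce the same indices.

```python
import random


def seeded_split(rows, test_size=0.2, seed=42):
    """Shuffle row indices with a fixed seed, then cut off the test share."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)  # same seed and length -> same order
    n_test = round(len(rows) * test_size)
    test_idx, train_idx = indices[:n_test], indices[n_test:]
    return [rows[i] for i in train_idx], [rows[i] for i in test_idx]


# 50 made-up rows; each target is tied to its feature row so alignment can be checked
features = [(speed, speed / 100) for speed in range(10, 60)]
targets = [speed * 2 for speed in range(10, 60)]

X_train, X_test = seeded_split(features)
y_train, y_test = seeded_split(targets)

# Features and targets stay aligned because both splits used the same seed
aligned = all(y == x[0] * 2 for x, y in zip(X_test, y_test))
print(len(X_train), len(X_test), aligned)
```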

##### Create and train the models using Linear Regression

In [50]:
time_model = LinearRegression()
distance_model = LinearRegression()
route_model = LinearRegression()  # regresses on the 0/1 encoded label; its output is thresholded later

time_model.fit(X_train, y_train_time)
distance_model.fit(X_train, y_train_distance)
route_model.fit(X_train, y_train_route)
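
The route target is a two-class label, so a classifier is a natural alternative to regressing on the encoded 0/1 values. The sketch below uses `DecisionTreeClassifier`, which is imported at the top of the notebook but not used in the cells above. It runs on made-up data drawn from the same ranges rather than on the notebook's DataFrame, so its accuracy is illustrative and is not a result of the original notebook.

```python
import random

from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

rng = random.Random(0)

# Made-up features in the notebook's ranges: [Traffic_Speed, Traffic_Congestion]
X = [[rng.randint(10, 60), rng.uniform(0, 1)] for _ in range(1000)]
# Same rule as suggest_alternate_route, already encoded: 1 = use alternate route
y = [int(congestion > 0.5) for _, congestion in X]

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

clf = DecisionTreeClassifier(random_state=42)
clf.fit(X_train, y_train)

# Accuracy is the share of test rows whose predicted class matches the true class
accuracy = accuracy_score(y_test, clf.predict(X_test))
print(f"Route classifier accuracy: {accuracy:.3f}")
```

A classifier returns the class code directly, so no threshold on a continuous score is needed at prediction time.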

##### Make predictions on the test set

In [51]:
time_predictions = time_model.predict(X_test)
distance_predictions = distance_model.predict(X_test)
route_predictions = route_model.predict(X_test)

##### Display the model performance

In [52]:
time_r2 = time_model.score(X_test, y_test_time)
distance_r2 = distance_model.score(X_test, y_test_distance)
route_r2 = route_model.score(X_test, y_test_route)

print(f"Time Model R-squared: {time_r2}")
print(f"Distance Model R-squared: {distance_r2}")
print(f"Route Model R-squared: {route_r2}")

Time Model R-squared: 0.007041459219363166
Distance Model R-squared: -0.00027737819892381665
Route Model R-squared: 0.7476552542774048
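
For a regressor, `score` returns the coefficient of determination, R² = 1 − SS_res / SS_tot. The time and distance targets were drawn independently of the features, so scores near zero, even slightly negative, are what one would expect. The route label is a threshold on `Traffic_Congestion`, one of the features, so a linear fit captures much of it. The sketch below computes R² by hand on made-up numbers to show how the three regimes arise; the function name `r_squared` is hypothetical.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)             # total sum of squares
    return 1 - ss_res / ss_tot


y_true = [10, 20, 30, 40]

perfect = r_squared(y_true, [10, 20, 30, 40])     # exact predictions
mean_only = r_squared(y_true, [25, 25, 25, 25])   # always predict the mean
worse = r_squared(y_true, [40, 30, 20, 10])       # worse than predicting the mean

print(perfect, mean_only, worse)
```

A model that does no better than predicting the mean scores 0, and one that does worse scores below 0, which is how the distance model can end up slightly negative.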


##### Using the trained models to make predictions for new data

In [70]:
new_data = pd.DataFrame({
    'Traffic_Speed': [45],
    'Traffic_Congestion': [0.6],
})
# Predict time and distance for the new observation
time_prediction = time_model.predict(new_data)
distance_prediction = distance_model.predict(new_data)
# predict returns an array; take the single score and threshold it at 0.5
route_score = route_model.predict(new_data)[0]

if route_score > 0.5:
    route_prediction = 'Use alternate route'
else:
    route_prediction = 'No need for an alternate route'

print(f'Predicted time : {round(time_prediction[0], 2)} minutes')
print(f'Predicted distance : {round(distance_prediction[0], 2)} km')
print(f'Predicted Route : {route_prediction}')

Predicted time : 34.43 minutes
Predicted distance : 6.04 km
Predicted Route : Use alternate route
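
The route model is a regression, so its output is a continuous score rather than a class, and it has to be thresholded before it can be shown as a label. A minimal sketch of that decoding step, assuming the two classes are coded 0 and 1 in sorted order and that 0.5, the midpoint between the codes, is the cut-off. The helper name `decode_route` and the example scores are hypothetical.

```python
# Classes in the sorted order a LabelEncoder would assign: code 0, then code 1
ROUTE_CLASSES = ["No need for an alternate route", "Use alternate route"]


def decode_route(score, threshold=0.5):
    """Map a continuous regression output to one of the two route labels."""
    return ROUTE_CLASSES[int(score > threshold)]


print(decode_route(0.62))
print(decode_route(0.31))
```

Keeping the label strings in one place avoids the decoded text drifting from the labels used to build `Route_Suggestion`.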
