# Fuel-Efficient Route Model Training and Saving

This notebook will:
1. Load and preprocess vehicle and traffic data.
2. Merge the datasets and prepare features/targets.
3. Split the data and scale the features.
4. Build and train two neural network models:
   - Fuel Consumption Model
   - CO₂ Emissions Model
5. Save the trained models as `.h5` files in the `models/` folder.


In [15]:
import os
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import Adam

print("Libraries imported successfully.")


Libraries imported successfully.


## 1. Load Your Data

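We will load the following CSV files (the code cell reads them relative to the notebook's parent directory):
- `data/vehicle data/CO2Emissions_Canada.csv`
- `data/vehicle data/vehicles.csv`
- `data/traffic data/Traffic_Volumes_AADT.csv`
- `data/traffic data/Bottlenecks.csv`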

In [None]:
# Load the data files
vehicle_data1 = pd.read_csv('../data/vehicle data/CO2Emissions_Canada.csv')
vehicle_data2 = pd.read_csv('../data/vehicle data/vehicles.csv')
traffic_volumes = pd.read_csv('../data/traffic data/Traffic_Volumes_AADT.csv')
bottlenecks = pd.read_csv('../data/traffic data/Bottlenecks.csv')

print("Vehicle Data Samples:")
print(vehicle_data1.head())
print(vehicle_data2.head())
print("\nTraffic Volumes Sample:")
print(traffic_volumes.head())
print("\nBottlenecks Sample:")
print(bottlenecks.head())


## 2. Data Cleaning and Preprocessing

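- Drop rows with missing values from every dataset.
- Rename columns in the vehicle and traffic data for clarity and consistent join keys.
- Merge the two vehicle datasets on `Make`.
- Compute a combined fuel efficiency as a weighted average (55% city, 45% highway). The source columns are in L/100 km, so lower values mean a more efficient vehicle.
- Normalize the traffic severity in the bottlenecks data by dividing the total delay by its maximum.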
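
The two derived quantities in the list above can be illustrated on a toy table. This is a minimal sketch: the numbers are made up for illustration and do not come from the CSV files, while the column names, the 55/45 weights, and the divide-by-maximum step mirror the code cell for this section.

```python
import pandas as pd

# Hypothetical values, for illustration only (not from the real CSVs).
vehicles = pd.DataFrame({
    "city_fuel_efficiency": [10.0, 8.0],     # L/100 km
    "highway_fuel_efficiency": [6.0, 6.0],   # L/100 km
})
delays = pd.DataFrame({"Total_Delay__veh_hrs_": [50.0, 200.0, 100.0]})

# Weighted average: 55% city, 45% highway.
vehicles["combined_fuel_efficiency"] = (
    0.55 * vehicles["city_fuel_efficiency"]
    + 0.45 * vehicles["highway_fuel_efficiency"]
)

# Divide by the maximum so that the largest delay maps to 1.0.
delays["normalized_traffic_severity"] = (
    delays["Total_Delay__veh_hrs_"] / delays["Total_Delay__veh_hrs_"].max()
)

print(vehicles["combined_fuel_efficiency"].round(2).tolist())
print(delays["normalized_traffic_severity"].tolist())
```

Dividing by the maximum keeps values in the 0 to 1 range only when all delays are non-negative, and the smallest delay does not map to 0 unless it is itself 0.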


In [None]:
# Drop missing values from all datasets
vehicle_data1.dropna(inplace=True)
vehicle_data2.dropna(inplace=True)
traffic_volumes.dropna(inplace=True)
bottlenecks.dropna(inplace=True)

# Rename columns for clarity
vehicle_data1.rename(columns={
    'Fuel Consumption City (L/100 km)': 'city_fuel_efficiency',
    'Fuel Consumption Hwy (L/100 km)': 'highway_fuel_efficiency',
    'CO2 Emissions(g/km)': 'co2_emissions'
}, inplace=True)

vehicle_data2.rename(columns={
    'make': 'Make'
}, inplace=True)

traffic_volumes.rename(columns={
    'COUNTY': 'County'
}, inplace=True)

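# Join the two vehicle tables on 'Make'.
# Note: if a make appears several times in both tables, this is a
# many-to-many join and the row count can grow substantially.
vehicle_data = pd.merge(
    vehicle_data1,
    vehicle_data2,
    on='Make',
    how='left'
)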

# Compute combined fuel efficiency (55% city, 45% highway)
vehicle_data['combined_fuel_efficiency'] = (
    vehicle_data['city_fuel_efficiency'] * 0.55 +
    vehicle_data['highway_fuel_efficiency'] * 0.45
)

# Normalize traffic severity (0 to 1)
bottlenecks['normalized_traffic_severity'] = (
    bottlenecks['Total_Delay__veh_hrs_'] / bottlenecks['Total_Delay__veh_hrs_'].max()
)

print("Data cleaning and preprocessing complete.")


## 3. Merge Datasets

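- Merge the traffic volumes with the bottlenecks on `County`.
- Summarize traffic severity per county and fuel efficiency per `fuelType`.
- Cross-join the two summaries so that every county is paired with every fuel type.
- Compute a proxy for fuel consumption (placeholder formula, still in progress).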
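
The code cell for this section combines its two summary tables with `how='cross'`. As a minimal sketch of what that does, the toy tables below (hypothetical county names, fuel types, and values) show that a cross merge pairs every row on the left with every row on the right without using any key column.

```python
import pandas as pd

# Hypothetical summaries, for illustration only.
traffic_summary = pd.DataFrame({
    "County": ["A", "B", "C"],
    "normalized_traffic_severity": [0.2, 0.5, 1.0],
})
vehicle_summary = pd.DataFrame({
    "fuelType": ["Regular", "Diesel"],
    "combined_fuel_efficiency": [9.0, 7.5],
})

# how="cross" builds the Cartesian product of the two tables
# (available in pandas 1.2 and later).
merged = pd.merge(traffic_summary, vehicle_summary, how="cross")

# 3 counties x 2 fuel types gives 6 rows; all 4 columns are kept.
print(merged.shape)
print(merged)
```

Because the result has one row per (county, fuel type) pair, its size is the product of the two summary sizes, which is worth checking before training on it.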


In [None]:
# Merge traffic volumes with bottlenecks on 'County'
traffic_data = pd.merge(
    traffic_volumes,
    bottlenecks[['County', 'normalized_traffic_severity']],
    on='County',
    how='left'
)

print(vehicle_data.columns)
print("Traffic Data Sample:")
print(traffic_data.shape)
print(traffic_data.head())

# Average traffic severity per county (used by the cross merge below)
traffic_summary = traffic_data.groupby('County').agg({
    'normalized_traffic_severity': 'mean'
}).reset_index()

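# Average fuel efficiency and CO2 emissions per fuel type
# ('co2_emissions' is kept because it is used later as a training target)
vehicle_summary = vehicle_data.groupby('fuelType').agg({
    'combined_fuel_efficiency': 'mean',
    'co2_emissions': 'mean'
}).reset_index()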

print("Traffic Summary Shape:", traffic_summary.shape)
print("Vehicle Summary Shape:", vehicle_summary.shape)


# Cross join the two summaries (no shared key column is used)
merged_data = pd.merge(traffic_summary, vehicle_summary, how='cross')  # Cartesian merge (all traffic data with all vehicles)

# Placeholder fuel consumption proxy (still in progress).
# Note: the divisor is in L/100 km, so a thirstier vehicle lowers this value.
merged_data['fuel_consumption'] = merged_data['normalized_traffic_severity'] / merged_data['combined_fuel_efficiency']

print("Merged Data Sample:")
print(merged_data.head())


## 4. Prepare Features and Targets

- **Features (X):** distance, normalized_traffic_severity, combined_fuel_efficiency
- **Targets:**
  - y_fuel for fuel consumption prediction.
  - y_co2 for CO₂ emissions prediction.


In [None]:
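# Define features and target variables
feature_cols = ['distance', 'normalized_traffic_severity', 'combined_fuel_efficiency']

# Fail early if a required column is absent; note that 'distance' is not
# created by the cells above and must be added to merged_data beforehand.
required = feature_cols + ['fuel_consumption', 'co2_emissions']
missing = [c for c in required if c not in merged_data.columns]
if missing:
    raise KeyError(f"merged_data is missing required columns: {missing}")

X = merged_data[feature_cols]
y_fuel = merged_data['fuel_consumption']
y_co2 = merged_data['co2_emissions']

print("Features shape:", X.shape)
print("Fuel Consumption target shape:", y_fuel.shape)
print("CO₂ Emissions target shape:", y_co2.shape)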


## 5. Train-Test Split and Scaling

- Split the data into training and testing sets.
- Scale the features using StandardScaler for each model.
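
As a minimal sketch of the two steps above, the example below uses small made-up arrays (the names `y_a` and `y_b` are hypothetical stand-ins for the two targets). It shows that passing several arrays to one `train_test_split` call keeps their rows aligned, and that the scaler is fitted on the training rows only.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Hypothetical data: 10 samples, 3 features, two targets tied to the row index.
X = np.arange(30, dtype=float).reshape(10, 3)
y_a = np.arange(10, dtype=float)       # stands in for fuel consumption
y_b = np.arange(10, dtype=float) * 2   # stands in for CO2 emissions

# One call applies the same shuffle to every array passed in.
X_tr, X_te, ya_tr, ya_te, yb_tr, yb_te = train_test_split(
    X, y_a, y_b, test_size=0.2, random_state=42
)

# Fit the scaler on the training rows only, then reuse it for the test rows.
scaler = StandardScaler()
X_tr_scaled = scaler.fit_transform(X_tr)
X_te_scaled = scaler.transform(X_te)

print(X_tr.shape, X_te.shape)
print(np.allclose(yb_tr, 2 * ya_tr))  # rows of both targets still line up
```

Fitting on the training rows only keeps test-set statistics out of the preprocessing step.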


In [None]:
# Split the features and both targets in one call so that all three share the same rows
X_train, X_test, y_fuel_train, y_fuel_test, y_co2_train, y_co2_test = train_test_split(
    X, y_fuel, y_co2, test_size=0.2, random_state=42
)

# Scale features for the fuel model
scaler_fuel = StandardScaler()
X_train_fuel = scaler_fuel.fit_transform(X_train)
X_test_fuel = scaler_fuel.transform(X_test)

# Scale features for the CO₂ model
scaler_co2 = StandardScaler()
X_train_co2 = scaler_co2.fit_transform(X_train)
X_test_co2 = scaler_co2.transform(X_test)

print("Train-Test split and scaling complete.")


## 6. Define a Model Builder Function

This function creates a simple feedforward neural network for regression.
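
To get a feel for the size of this network, the trainable parameters can be counted by hand. The sketch below assumes the three input features listed in section 4 and the dense layer widths used in the builder cell (64, 32, 1); the helper `dense_params` is a hypothetical name introduced only for this illustration. Dropout layers add no trainable parameters.

```python
def dense_params(n_in: int, n_out: int) -> int:
    """Weights (n_in * n_out) plus one bias per output unit."""
    return n_in * n_out + n_out


input_dim = 3               # assumption: three input features
layer_sizes = [64, 32, 1]   # dense layer widths from the builder cell

total = 0
n_in = input_dim
for n_out in layer_sizes:
    total += dense_params(n_in, n_out)
    n_in = n_out  # the output of one layer feeds the next

print(total)  # 256 + 2080 + 33 = 2369
```

A few thousand parameters is small, so the amount of training data matters more here than the compute cost.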


In [None]:
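def build_model(input_dim):
    """Build a small feedforward regression network with a single linear output."""
    model = Sequential([
        Dense(64, activation='relu', input_shape=(input_dim,)),
        Dropout(0.3),
        Dense(32, activation='relu'),
        Dropout(0.3),
        Dense(1)  # linear output for regression
    ])
    # Track MAE as a metric; reporting MSE again would only duplicate the loss
    model.compile(optimizer=Adam(learning_rate=0.001), loss='mse', metrics=['mae'])
    return model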

print("Model builder function defined.")


## 7. Train the Models

We train two models:
- Fuel Consumption Model
- CO₂ Emissions Model


In [None]:
# Train Fuel Consumption Model
print("Training Fuel Consumption Model...")
model_fuel = build_model(X_train_fuel.shape[1])
model_fuel.fit(X_train_fuel, y_fuel_train, validation_split=0.2, epochs=30, batch_size=32, verbose=1)

# Train CO₂ Emissions Model
print("\nTraining CO₂ Emissions Model...")
model_co2 = build_model(X_train_co2.shape[1])
model_co2.fit(X_train_co2, y_co2_train, validation_split=0.2, epochs=30, batch_size=32, verbose=1)
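
The cell above trains both models but does not score them on the held-out test split. One possible way to do so is sketched below with plain NumPy; `regression_report` is a hypothetical helper, and the arrays are made-up values standing in for `y_fuel_test` and the output of `model_fuel.predict(X_test_fuel)`.

```python
import numpy as np


def regression_report(y_true, y_pred):
    """Return MAE and RMSE for two arrays with the same number of elements."""
    # ravel() flattens column vectors such as the (n, 1) output of Keras predict()
    y_true = np.asarray(y_true, dtype=float).ravel()
    y_pred = np.asarray(y_pred, dtype=float).ravel()
    errors = y_pred - y_true
    return {
        "mae": float(np.mean(np.abs(errors))),
        "rmse": float(np.sqrt(np.mean(errors ** 2))),
    }


# Hypothetical values, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [[1.5], [2.0], [2.0], [4.5]]  # shaped like a Keras prediction

report = regression_report(y_true, y_pred)
print(report)
```

In the notebook, the same helper could be called once per model with the matching test target and scaled test features.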


## 8. Save the Trained Models

The models will be saved as `.h5` files in the `models/` folder.
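
The saved `.h5` files contain the networks but not the `StandardScaler` objects fitted in section 5, so new inputs could not be scaled consistently at prediction time unless the scalers are stored as well. The sketch below shows one way to persist a scaler with `pickle`; the file name `scaler_fuel.pkl` and the toy matrix are assumptions for illustration, and a temporary directory stands in for `models/`.

```python
import os
import pickle
import tempfile

import numpy as np
from sklearn.preprocessing import StandardScaler

# Hypothetical training matrix standing in for X_train.
X_train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
scaler = StandardScaler().fit(X_train)

# A temporary directory is used here; the notebook would write into models/.
with tempfile.TemporaryDirectory() as tmp_dir:
    scaler_path = os.path.join(tmp_dir, "scaler_fuel.pkl")  # hypothetical file name
    with open(scaler_path, "wb") as f:
        pickle.dump(scaler, f)
    with open(scaler_path, "rb") as f:
        restored = pickle.load(f)

# The restored scaler reproduces the original transformation.
sample = np.array([[2.0, 20.0]])
print(np.allclose(scaler.transform(sample), restored.transform(sample)))
```

Pickled scalers should be loaded with the same scikit-learn version that wrote them, and only from trusted sources.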


In [None]:
os.makedirs('models', exist_ok=True)
model_fuel.save('models/fuel_consumption_model.h5')
model_co2.save('models/co2_emissions_model.h5')

print("Models saved in the 'models/' folder.")
