# Fuel-Efficient Route Model Training

This notebook trains **two neural network models** on vehicle and traffic data to estimate the most fuel-efficient route:

1. **Fuel Consumption Model**
2. **CO₂ Emissions Model**

## Instructions

1. Ensure the required CSV files are in the correct locations:
   - `../data/vehicle data/vehicles.csv`
   - `../data/traffic data/Traffic_Volumes.csv`
   - `../data/traffic data/Bottlenecks.csv`
2. Install the necessary dependencies (e.g., `pip install pandas numpy scikit-learn tensorflow`).
3. Run the cells sequentially.
4. After training, the models will be saved as `.h5` files in the `../models/` directory.
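
Before running the cells, it can help to confirm that the three input files are where step 1 expects them. The sketch below uses `pathlib` for this check; the helper name `find_missing_files` is illustrative and not part of the notebook.

```python
from pathlib import Path

# Paths taken from the instructions above, relative to the notebook's folder.
REQUIRED_FILES = [
    "../data/vehicle data/vehicles.csv",
    "../data/traffic data/Traffic_Volumes.csv",
    "../data/traffic data/Bottlenecks.csv",
]


def find_missing_files(paths, base_dir="."):
    """Return the subset of `paths` that do not exist under `base_dir`."""
    base = Path(base_dir)
    return [p for p in paths if not (base / p).is_file()]


missing = find_missing_files(REQUIRED_FILES)
if missing:
    print("Missing input files:")
    for path in missing:
        print("  -", path)
else:
    print("All input files found.")
```

Note that two of the folder names contain a space, so the paths need quoting when used on a command line.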


## 1. Import Libraries


In [None]:
# 1. Import Libraries
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Dropout
from tensorflow.keras.optimizers import Adam

print("Libraries imported successfully.")


## 2. Load and Inspect Datasets

We load the CSV files containing:
- Vehicle data (fuel efficiency & CO₂ emissions)
- Traffic volumes (historical traffic patterns)
- Bottlenecks (areas of high congestion)


In [None]:
# 2. Load and Inspect Datasets
vehicle_data = pd.read_csv('../data/vehicle data/vehicles.csv')
traffic_volumes = pd.read_csv('../data/traffic data/Traffic_Volumes.csv')
bottlenecks = pd.read_csv('../data/traffic data/Bottlenecks.csv')

print("Vehicle Data Sample:")
print(vehicle_data.head())

print("\nTraffic Volumes Sample:")
print(traffic_volumes.head())

print("\nBottlenecks Sample:")
print(bottlenecks.head())
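
Because the next step drops every row that contains a missing value, it is worth checking how many rows that would remove. A minimal sketch on a made-up DataFrame; the column names and the helper `missing_report` are illustrative and not taken from the CSV files.

```python
import numpy as np
import pandas as pd


def missing_report(df):
    """Summarise missing values per column and the rows a full dropna() would remove."""
    per_column = df.isna().sum()
    rows_with_missing = int(df.isna().any(axis=1).sum())
    return per_column, rows_with_missing


# Toy data standing in for one of the loaded tables.
toy = pd.DataFrame({
    "location": ["A", "B", "C", "D"],
    "volume": [120.0, np.nan, 95.0, 80.0],
    "notes": [None, "roadworks", None, "clear"],
})

per_column, rows_with_missing = missing_report(toy)
print(per_column)
print(f"dropna() would remove {rows_with_missing} of {len(toy)} rows")
```

A sparsely filled column such as `notes` can remove most rows on its own. Passing `subset=[...]` to `dropna` limits the check to the columns the models actually use.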


## 3. Data Cleaning and Preprocessing

Steps:
- Drop rows with missing values.
- Rename columns in the vehicle data for clarity.
- Compute a combined fuel efficiency (weighted average: 55% city, 45% highway).
- Normalize the traffic congestion severity by dividing by its maximum (a 0 to 1 scale for non-negative values).


In [None]:
# 3. Data Cleaning and Preprocessing

# Drop rows with missing values
vehicle_data.dropna(inplace=True)
traffic_volumes.dropna(inplace=True)
bottlenecks.dropna(inplace=True)

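# Rename columns in vehicle_data for clarity.
# Note: the source columns are named Fuel_Consumption_*; the later formula
# (distance / combined_fuel_efficiency) assumes they hold distance per unit of fuel.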
vehicle_data.rename(
    columns={
        'Fuel_Consumption_City': 'city_fuel_efficiency',
        'Fuel_Consumption_Hwy': 'highway_fuel_efficiency',
        'CO2_Emissions': 'co2_emissions'
    },
    inplace=True
)

# Compute combined fuel efficiency (weighted: 55% city, 45% highway)
vehicle_data['combined_fuel_efficiency'] = (
    vehicle_data['city_fuel_efficiency'] * 0.55 +
    vehicle_data['highway_fuel_efficiency'] * 0.45
)

# Normalize traffic congestion severity (scale 0 to 1)
bottlenecks['normalized_traffic_severity'] = (
    bottlenecks['traffic_severity'] / bottlenecks['traffic_severity'].max()
)

print("Data cleaning and preprocessing complete.")
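
As a worked example of the two computations above, the sketch below applies the 55/45 weighting and the max-normalization to made-up numbers. The values are illustrative only.

```python
import pandas as pd

vehicles = pd.DataFrame({
    "city_fuel_efficiency": [10.0, 8.0],
    "highway_fuel_efficiency": [14.0, 12.0],
})
vehicles["combined_fuel_efficiency"] = (
    vehicles["city_fuel_efficiency"] * 0.55
    + vehicles["highway_fuel_efficiency"] * 0.45
)
# Row 0: 0.55 * 10 + 0.45 * 14 = 5.5 + 6.3 = 11.8
print(vehicles)

severity = pd.Series([2.0, 5.0, 10.0], name="traffic_severity")
normalized = severity / severity.max()
# -> 0.2, 0.5, 1.0
print(normalized)
```

Dividing by the maximum only yields a 0 to 1 range when the severity values are non-negative and the maximum is not zero.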


## 4. Merge Datasets

Steps:
1. Merge traffic volumes with bottlenecks on `location`.
2. Merge the resulting traffic data with vehicle data on `Fuel_Type`.
3. Compute `fuel_consumption` as: distance / combined_fuel_efficiency.
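
Step 3 divides distance by efficiency, which is correct when efficiency is expressed as distance per unit of fuel (for example km/L or mpg). The original column names (`Fuel_Consumption_City`, `Fuel_Consumption_Hwy`) suggest the values might instead be consumption rates such as L/100 km, in which case the fuel used is distance multiplied by the rate. The dataset's units are not stated in this notebook, so the sketch below only illustrates both conventions; the function names and units are assumptions.

```python
def fuel_from_efficiency(distance_km, km_per_litre):
    """Fuel used (L) when the rating is an efficiency in km per litre."""
    return distance_km / km_per_litre


def fuel_from_consumption_rate(distance_km, litres_per_100km):
    """Fuel used (L) when the rating is a consumption rate in L/100 km."""
    return distance_km * litres_per_100km / 100.0


# The same vehicle described both ways: 12.5 km/L is 8 L/100 km.
print(fuel_from_efficiency(50.0, 12.5))        # 4.0
print(fuel_from_consumption_rate(50.0, 8.0))   # 4.0
```

Checking the data source's documentation for the unit before training avoids a target that is inverted relative to real fuel use.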


In [None]:
# 4. Merge Datasets

# Merge traffic volumes with bottlenecks (on 'location')
traffic_data = pd.merge(
    traffic_volumes,
    bottlenecks[['location', 'normalized_traffic_severity']],
    on='location',
    how='left'
)

# Merge the traffic data with vehicle_data on 'Fuel_Type'
merged_data = traffic_data.merge(
    vehicle_data,
    on='Fuel_Type',
    how='left'
)

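# Left merges leave NaN wherever a location or Fuel_Type has no match.
# Drop those rows so NaN values do not reach model training.
merged_data = merged_data.dropna(
    subset=['distance', 'normalized_traffic_severity', 'combined_fuel_efficiency', 'co2_emissions']
)

# Compute estimated fuel consumption: Fuel Consumption = distance / combined_fuel_efficiency
merged_data['fuel_consumption'] = merged_data['distance'] / merged_data['combined_fuel_efficiency']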

print("Merged Data Sample:")
print(merged_data.head())
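
A left merge keeps every row of the left table and fills the right-hand columns with NaN where no match exists, so it is useful to count how many rows found no partner. The sketch below uses the `indicator=True` option of `pandas.merge` on toy tables; the data is made up.

```python
import pandas as pd

volumes = pd.DataFrame({
    "location": ["A", "B", "C"],
    "distance": [5.0, 12.0, 7.5],
})
bottlenecks = pd.DataFrame({
    "location": ["A", "C"],
    "normalized_traffic_severity": [0.4, 1.0],
})

merged = pd.merge(volumes, bottlenecks, on="location", how="left", indicator=True)

# `_merge` is "both" for matched rows and "left_only" for rows without a match.
unmatched = merged[merged["_merge"] == "left_only"]
print(f"{len(unmatched)} of {len(merged)} rows have no bottleneck record")
print(unmatched[["location", "normalized_traffic_severity"]])
```

Whether unmatched rows should be dropped or filled (for example with 0 for "no known bottleneck") depends on what a missing record means in the data. Also note that if `Fuel_Type` is not unique in the vehicle table, the second merge repeats each traffic row once per matching vehicle; comparing `len(merged_data)` with `len(traffic_data)` shows the effect.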


## 5. Select Features and Targets

- **Features (X):** `distance`, `normalized_traffic_severity`, `combined_fuel_efficiency`
- **Targets:**
  - `y_fuel` for fuel consumption prediction
  - `y_co2` for CO₂ emissions prediction


In [None]:
# 5. Select Features and Targets
X = merged_data[['distance', 'normalized_traffic_severity', 'combined_fuel_efficiency']]
y_fuel = merged_data['fuel_consumption']
y_co2 = merged_data['co2_emissions']

print("Features (X) shape:", X.shape)
print("Fuel Consumption (y_fuel) shape:", y_fuel.shape)
print("CO₂ Emissions (y_co2) shape:", y_co2.shape)


## 6. Train-Test Split

We split the data into training (80%) and testing (20%) sets for both targets.


In [None]:
# 6. Train-Test Split
# Split the features and both targets in one call so every row stays aligned
X_train, X_test, y_fuel_train, y_fuel_test, y_co2_train, y_co2_test = train_test_split(
    X, y_fuel, y_co2,
    test_size=0.2,
    random_state=42
)

print("Fuel data:", X_train.shape, X_test.shape, y_fuel_train.shape, y_fuel_test.shape)
print("CO₂ data:", y_co2_train.shape, y_co2_test.shape)
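
`train_test_split` accepts any number of arrays and applies the same shuffle to each, which keeps the features and both targets row-aligned. A small check on synthetic data (all names here are illustrative):

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
X = pd.DataFrame(rng.normal(size=(10, 3)), columns=["f1", "f2", "f3"])
y_a = pd.Series(np.arange(10), name="a")
y_b = pd.Series(np.arange(10) * 2, name="b")

X_tr, X_te, ya_tr, ya_te, yb_tr, yb_te = train_test_split(
    X, y_a, y_b, test_size=0.2, random_state=42
)

# All three pieces share the same row index after the split.
train_aligned = X_tr.index.equals(ya_tr.index) and X_tr.index.equals(yb_tr.index)
test_aligned = X_te.index.equals(ya_te.index) and X_te.index.equals(yb_te.index)
print("train rows:", len(X_tr), "test rows:", len(X_te))
print("aligned:", train_aligned and test_aligned)
```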


## 7. Scale the Features

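We fit a **StandardScaler** on the training features only and reuse it to transform the test features. Each model gets its own scaler object; because both are fit on the same `X_train`, they hold identical statistics.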


In [None]:
# 7. Scale the Features

# Scale for Fuel Consumption Model
scaler_fuel = StandardScaler()
X_train_fuel = scaler_fuel.fit_transform(X_train)
X_test_fuel = scaler_fuel.transform(X_test)

# Scale for CO₂ Emissions Model
scaler_co2 = StandardScaler()
X_train_co2 = scaler_co2.fit_transform(X_train)
X_test_co2 = scaler_co2.transform(X_test)

print("Feature scaling complete.")
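
The scaler is fit on the training rows only so that test-set statistics do not leak into training. The sketch below, on synthetic data, also shows that two scalers fit on the same array learn identical parameters, so a single shared scaler would behave the same way here.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(42)
X_train = rng.normal(loc=50.0, scale=10.0, size=(200, 3))
X_test = rng.normal(loc=50.0, scale=10.0, size=(50, 3))

scaler_a = StandardScaler().fit(X_train)
scaler_b = StandardScaler().fit(X_train)

# Same input, same learned mean and scale.
same = bool(
    np.allclose(scaler_a.mean_, scaler_b.mean_)
    and np.allclose(scaler_a.scale_, scaler_b.scale_)
)
print("identical parameters:", same)

# Training data ends up with mean ~0 and std ~1; test data only approximately so.
X_train_scaled = scaler_a.transform(X_train)
X_test_scaled = scaler_a.transform(X_test)
print("train mean:", X_train_scaled.mean(axis=0).round(3))
print("test mean: ", X_test_scaled.mean(axis=0).round(3))
```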


## 8. Define the Keras Model

We build a simple feedforward neural network with two hidden layers (64 and 32 ReLU units), each followed by 30% dropout, and a single linear output unit for regression.


In [None]:
# 8. Define a Reusable Keras Model
def build_model(input_dim):
    """
    Creates a simple feedforward neural network for regression tasks.
    
    Args:
        input_dim (int): Number of input features.
    
    Returns:
        model (Sequential): Compiled Keras model.
    """
    model = Sequential([
        Dense(64, activation='relu', input_shape=(input_dim,)),
        Dropout(0.3),
        Dense(32, activation='relu'),
        Dropout(0.3),
        Dense(1)
    ])
    model.compile(
        optimizer=Adam(learning_rate=0.001),
        loss='mse',
        metrics=['mse']
    )
    return model

print("Model builder function defined.")
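
To get a feel for the size of this network, the parameter count of a stack of dense layers can be worked out by hand: each layer has `inputs * units` weights plus `units` biases, and dropout layers add none. A sketch for the three input features used in this notebook (the helper name is illustrative):

```python
def dense_param_count(input_dim, layer_units):
    """Total weights and biases in a stack of fully connected layers."""
    total = 0
    previous = input_dim
    for units in layer_units:
        total += previous * units + units  # weight matrix + bias vector
        previous = units
    return total


# 3 inputs -> Dense(64) -> Dense(32) -> Dense(1)
print(dense_param_count(3, [64, 32, 1]))  # 256 + 2080 + 33 = 2369
```

Calling `model.summary()` on the built model should report the same total.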


## 9. Train the Fuel Consumption Model

We train the model on `X_train_fuel` and `y_fuel_train` for 50 epochs with a batch size of 32. With `validation_split=0.2`, Keras holds out the last 20% of the training rows for validation.


In [None]:
# 9. Train the Fuel Consumption Model
print("Training Fuel Consumption Model...")
model_fuel = build_model(input_dim=X_train_fuel.shape[1])
history_fuel = model_fuel.fit(
    X_train_fuel, y_fuel_train,
    validation_split=0.2,
    epochs=50,
    batch_size=32,
    verbose=1
)
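
Fifty epochs is a fixed budget; one common alternative is to stop once the validation loss stops improving. The sketch below shows the Keras `EarlyStopping` callback on synthetic data. The data, the small network, and the `patience` value are illustrative assumptions, not the notebook's settings.

```python
import numpy as np
from tensorflow.keras import Input
from tensorflow.keras.callbacks import EarlyStopping
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential

# Synthetic regression problem: y is a noisy linear function of 3 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(400, 3)).astype("float32")
y = (X @ np.array([1.5, -2.0, 0.5]) + rng.normal(scale=0.1, size=400)).astype("float32")

model = Sequential([Input(shape=(3,)), Dense(16, activation="relu"), Dense(1)])
model.compile(optimizer="adam", loss="mse")

# Stop when val_loss has not improved for 5 epochs and keep the best weights.
early_stop = EarlyStopping(monitor="val_loss", patience=5, restore_best_weights=True)

history = model.fit(
    X, y,
    validation_split=0.2,
    epochs=50,
    batch_size=32,
    callbacks=[early_stop],
    verbose=0,
)
print("epochs run:", len(history.history["val_loss"]))
```

The same `callbacks=[early_stop]` argument could be passed to either `fit` call in this notebook; a suitable `patience` depends on how noisy the validation loss is.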


## 10. Train the CO₂ Emissions Model

We train a second model with the same architecture and training settings, using `X_train_co2` and `y_co2_train`.


In [None]:
# 10. Train the CO₂ Emissions Model
print("\nTraining CO₂ Emissions Model...")
model_co2 = build_model(input_dim=X_train_co2.shape[1])
history_co2 = model_co2.fit(
    X_train_co2, y_co2_train,
    validation_split=0.2,
    epochs=50,
    batch_size=32,
    verbose=1
)


## 11. Evaluate Model Performance

We evaluate both models on the test set and print the Mean Squared Error (MSE).


In [None]:
# 11. Evaluate Model Performance
fuel_loss, fuel_mse = model_fuel.evaluate(X_test_fuel, y_fuel_test, verbose=0)
co2_loss, co2_mse = model_co2.evaluate(X_test_co2, y_co2_test, verbose=0)

print(f"\nFuel Consumption Model MSE: {fuel_mse:.4f}")
print(f"CO₂ Emissions Model MSE: {co2_mse:.4f}")
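
MSE is expressed in squared units of the target, which makes it hard to judge in isolation. The sketch below computes complementary metrics with scikit-learn and compares them against a baseline that always predicts the mean. The arrays are synthetic stand-ins; in the notebook they would be, for example, `y_fuel_test` and `model_fuel.predict(X_test_fuel).ravel()`.

```python
import numpy as np
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score


def regression_report(y_true, y_pred):
    """Return RMSE, MAE and R^2 for one set of predictions."""
    mse = mean_squared_error(y_true, y_pred)
    return {
        "rmse": float(np.sqrt(mse)),
        "mae": float(mean_absolute_error(y_true, y_pred)),
        "r2": float(r2_score(y_true, y_pred)),
    }


y_true = np.array([2.0, 4.0, 6.0, 8.0])
y_pred = np.array([2.5, 3.5, 6.5, 7.5])           # stand-in for model output
y_baseline = np.full_like(y_true, y_true.mean())  # constant-mean baseline

model_report = regression_report(y_true, y_pred)
baseline_report = regression_report(y_true, y_baseline)
print("model:   ", model_report)
print("baseline:", baseline_report)
```

A model is only useful if it clearly beats the constant baseline, whose R² is 0 by construction.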


## 12. Save Trained Models

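The models are saved as `.h5` (legacy HDF5) files in the `../models/` directory. Recent Keras versions recommend the native `.keras` format, which is selected simply by changing the file extension passed to `model.save`.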


In [None]:
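# 12. Save Trained Models
import os

# Make sure the output directory exists before saving
os.makedirs('../models', exist_ok=True)

model_fuel.save('../models/fuel_consumption_model.h5')
model_co2.save('../models/co2_emissions_model.h5')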

print("\n✅ Models saved successfully!")
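
The saved models expect scaled inputs, so anyone loading them later also needs the fitted scalers. One possible approach is to persist them with `joblib`, which is installed alongside scikit-learn. The sketch below uses a temporary directory and synthetic data; the file name is a placeholder.

```python
import os
import tempfile

import joblib
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(1)
X_train = rng.normal(size=(100, 3))
scaler = StandardScaler().fit(X_train)

with tempfile.TemporaryDirectory() as out_dir:
    path = os.path.join(out_dir, "scaler_example.joblib")  # placeholder file name
    joblib.dump(scaler, path)
    restored = joblib.load(path)

# The restored scaler reproduces the original transformation.
sample = X_train[:5]
round_trip_ok = bool(np.allclose(scaler.transform(sample), restored.transform(sample)))
print("round trip ok:", round_trip_ok)
```

In this notebook the objects to persist would be `scaler_fuel` and `scaler_co2`, written next to the model files.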
