In [2]:
import pandas as pd

### Load Charging Power and Duration Data

We load and merge the following:

- **`ev_charger_power_summary_excl_other.csv`** – average charger power (kW) by charging level,
- **`charging_duration_by_level_clean.csv`** – average charging session duration (hours) by level.

After standardizing charging level names and cleaning whitespace, we compute the average **energy consumption per session**:

\[
\text{Energy (kWh)} = \text{Power (kW)} \times \text{Duration (hours)}
\]

This provides an overview of energy delivered by charging sessions for different charger levels.


In [4]:
# Load power and duration data
power_df = pd.read_csv("C:\\Users\\soohx\\OneDrive\\Documents\\homework\\2025 T1\\Capstone 1\\EVAT-Environmental-Impact\\data\\ev_charger_power_summary_excl_other.csv")
duration_df = pd.read_csv("C:\\Users\\soohx\\OneDrive\\Documents\\homework\\2025 T1\\Capstone 1\\EVAT-Environmental-Impact\\data\\charging_duration_by_level_clean.csv")

# Clean column names to match
duration_df = duration_df.rename(columns={duration_df.columns[0]: 'charging_level'})

# Strip whitespace first so the raw labels match the mapping keys exactly
power_df['charging_level'] = power_df['charging_level'].str.strip()
duration_df['charging_level'] = duration_df['charging_level'].str.strip()

# Standardize names (labels missing from the map become NaN and are dropped by the inner merge)
clean_name_map = {
    'Level 1 (Avg. 2.91 kW)': 'Level 1',
    'Level 2 Fast (22 kW)': 'Level 2 Fast',
    'Level 2 Slow (7 kW)': 'Level 2 Slow',
    'Level 3 (87.09 kW, Port Adelaide)': 'Level 3'
}
duration_df['charging_level'] = duration_df['charging_level'].map(clean_name_map)

# Merge
merged_energy_df = pd.merge(power_df, duration_df, on='charging_level', how='inner')

# Calculate energy consumption
merged_energy_df['energy_kWh'] = merged_energy_df['avg_power_kw'] * merged_energy_df['duration_hours']

display(merged_energy_df[['charging_level', 'avg_power_kw', 'duration_hours', 'energy_kWh']])


Unnamed: 0,charging_level,avg_power_kw,duration_hours,energy_kWh
0,Level 1,2.910714,8.42,24.508214
1,Level 2 Fast,20.481777,2.0,40.963554
2,Level 2 Slow,7.0,3.5,24.5
3,Level 3,87.090873,0.62,53.996341
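
Because `Series.map` returns `NaN` for any label missing from the dictionary, and the inner merge then drops those rows without warning, a quick check before merging can help. The following is a minimal sketch on made-up labels; `raw_levels`, `name_map`, and the unmatched label are illustrative assumptions, not values read from the project files.

```python
import pandas as pd

# Hypothetical raw labels; the last one is deliberately absent from the map
raw_levels = pd.Series([
    "Level 1 (Avg. 2.91 kW)",
    " Level 2 Slow (7 kW) ",
    "Level 4 (example)",
])

name_map = {
    "Level 1 (Avg. 2.91 kW)": "Level 1",
    "Level 2 Slow (7 kW)": "Level 2 Slow",
}

# Strip whitespace before mapping so padded labels still match the keys
mapped = raw_levels.str.strip().map(name_map)

# Labels that failed to map come back as NaN
unmapped = raw_levels[mapped.isna()].tolist()
print(unmapped)
```

If `unmapped` is non-empty, the mapping dictionary probably needs another entry before the merge is run.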


### Define National Grid Emission Factors (2020–2024)

This dictionary maps each year to a national average Scope 2 emission factor (kg CO₂-e/kWh).

- **2020 and 2021 values** were calculated as averages of state grid emission factors, weighted by each state's electricity consumption.
- **2022, 2023, and 2024 values** are taken directly from the National Greenhouse Accounts Factors (NGAF) reports.

These values will be used to calculate EV emissions for each year.
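
The consumption-weighted average described for 2020 and 2021 can be sketched as follows. The state names, consumption figures, and emission factors here are hypothetical placeholders chosen for easy arithmetic; they are not the inputs behind the values in this notebook.

```python
# Hypothetical inputs: state -> (electricity consumption in GWh, grid EF in kg CO2-e/kWh)
state_data = {
    "State A": (60_000, 0.80),
    "State B": (40_000, 0.50),
}

def weighted_national_ef(data):
    """Consumption-weighted mean of state grid emission factors."""
    total_consumption = sum(consumption for consumption, _ in data.values())
    weighted_sum = sum(consumption * ef for consumption, ef in data.values())
    return weighted_sum / total_consumption

national_ef = weighted_national_ef(state_data)
print(round(national_ef, 3))
```

States with higher consumption pull the national figure toward their own factor, which is why a simple unweighted mean of state factors would not be equivalent.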


In [6]:
# National Scope 2 EF by year
national_ef_by_year = {
    2020: 0.779,
    2021: 0.761,
    2022: 0.679,
    2023: 0.650,
    2024: 0.63
}


### National EV Emissions Calculation Using 150 km Standard Trip

To ensure consistency with the state-level dataset, we calculate national EV emissions using a **fixed energy assumption** for a 150 km trip based on real-world efficiency data.

#### Energy Efficiency Assumption (Electric Vehicle Council of Australia)

> "A typical passenger EV, driven 12,000 km per year, will consume about 2,000 kWh annually."

From this, we derive:

\[
\text{EV Efficiency} = \frac{2000 \text{ kWh}}{12000 \text{ km}} \approx 0.1667 \text{ kWh/km}
\]

Using this efficiency, we estimate total energy consumed for a 150 km trip:

\[
\text{Energy for 150 km} = \frac{2000}{12000} \times 150 = 25 \text{ kWh}
\]

#### Why Use a Fixed 150 km Trip?

This standardization:
- Reflects realistic Australian EV driving patterns (based on AGL & Chargefox trials),
- Allows fair comparison between ICE and EV emissions,
- Matches the logic and modeling structure used in the state-level emissions notebook.

#### Grid Emission Factors

The national Scope 2 grid emission factors used (2020–2024) are sourced from:
- **NGAF reports** (2022–2024), and
- Weighted average estimates (2020, 2021) using state grid intensity and electricity consumption data.

These national factors are multiplied by the 25 kWh trip energy to compute `EV_emissions_kg` for each year. Because the trip energy is fixed, the result is identical across charging levels within a given year.

---

This method guarantees that the emissions calculations for national-level training are **directly comparable** to those used in the state-level model.


In [8]:
# Define trip distance and energy efficiency assumption
ev_efficiency_kwh_per_km = 2000 / 12000  # ≈ 0.1667 kWh/km
distance_km = 150
energy_150km = distance_km * ev_efficiency_kwh_per_km  # = 25.0 kWh

# Calculate EV emissions for national level using fixed 150 km energy estimate
ev_emissions_records = []

for year, national_grid_ef in national_ef_by_year.items():
    for _, level_row in merged_energy_df.iterrows():
        charging_level = level_row['charging_level']

        ev_emissions = energy_150km * national_grid_ef

        ev_emissions_records.append({
            'year': year,
            'state': 'National',
            'charging_level': charging_level,
            'distance_km': distance_km,
            'energy_kWh_for_trip': energy_150km,
            'grid_emission_factor': national_grid_ef,
            'EV_emissions_kg': ev_emissions
        })

# Create DataFrame
ev_emissions_national_df = pd.DataFrame(ev_emissions_records)
display(ev_emissions_national_df)

Unnamed: 0,year,state,charging_level,distance_km,energy_kWh_for_trip,grid_emission_factor,EV_emissions_kg
0,2020,National,Level 1,150,25.0,0.779,19.475
1,2020,National,Level 2 Fast,150,25.0,0.779,19.475
2,2020,National,Level 2 Slow,150,25.0,0.779,19.475
3,2020,National,Level 3,150,25.0,0.779,19.475
4,2021,National,Level 1,150,25.0,0.761,19.025
5,2021,National,Level 2 Fast,150,25.0,0.761,19.025
6,2021,National,Level 2 Slow,150,25.0,0.761,19.025
7,2021,National,Level 3,150,25.0,0.761,19.025
8,2022,National,Level 1,150,25.0,0.679,16.975
9,2022,National,Level 2 Fast,150,25.0,0.679,16.975


In [9]:
# Load ICE vehicle emissions data
ice_df = pd.read_csv("C:\\Users\\soohx\\OneDrive\\Documents\\homework\\2025 T1\\Capstone 1\\EVAT-Environmental-Impact\\data\\ICE_vehicle_emissions_by_segment_2023.csv")

# Clean and compute emissions over 150 km
ice_df = ice_df.rename(columns={"Average Emissions Intensity (g/km, 2023)": "emissions_g_per_km"})
ice_df['emissions_kg_per_km'] = ice_df['emissions_g_per_km'] / 1000
ice_df['ICE_emissions_kg'] = ice_df['emissions_kg_per_km'] * distance_km

# Use 'Segment' as the correct column
display(ice_df[['Segment', 'ICE_emissions_kg']])


Unnamed: 0,Segment,ICE_emissions_kg
0,SUV Medium,20.25
1,Pick-up/Chassis 4x4,33.3
2,SUV Small,21.6
3,SUV Large,28.8
4,Small,20.25
5,SUV Light,20.7
6,Medium,11.55
7,Light,20.4
8,Pick-up/Chassis 4x2,32.25
9,SUV Upper Large,39.75


### Merge EV and ICE Emissions, Calculate CO₂ Savings

We perform a cartesian join between:

- EV emission values (by charging level and year), and
- ICE emission values (by vehicle segment).

Then we calculate:

\[
\text{CO₂ Saved (kg)} = \text{ICE Emissions} - \text{EV Emissions}
\]

We add a placeholder EV adoption rate (same as used in state model) and export the result to a CSV for future modeling.
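
As a small illustration of the cartesian join and the savings formula, the sketch below pairs two EV rows with three ICE rows. The frames `ev_toy` and `ice_toy` are toy stand-ins that reuse a few values from the outputs above, and `how="cross"` assumes pandas 1.2 or later.

```python
import pandas as pd

# Toy stand-ins for the EV and ICE tables (a few values copied from the outputs above)
ev_toy = pd.DataFrame({"year": [2020, 2021], "EV_emissions_kg": [19.475, 19.025]})
ice_toy = pd.DataFrame({
    "Segment": ["Small", "SUV Large", "Light"],
    "ICE_emissions_kg": [20.25, 28.8, 20.4],
})

# how="cross" pairs every EV row with every ICE row
pairs = ev_toy.merge(ice_toy, how="cross")

# CO2 saved = ICE emissions minus EV emissions, row by row
pairs["CO2_saved_kg"] = pairs["ICE_emissions_kg"] - pairs["EV_emissions_kg"]

print(len(pairs))  # number of EV rows times number of ICE rows
```

Checking that the row count equals the product of the two input lengths is a cheap way to confirm the join behaved as a full cartesian product.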


In [11]:
# Cross join EV emissions with ICE segments (how='cross' requires pandas >= 1.2)
national_full_df = pd.merge(
    ev_emissions_national_df,
    ice_df[['Segment', 'ICE_emissions_kg']],
    how='cross'
)
national_full_df = national_full_df.rename(columns={'Segment': 'ICE_segment'})

# Calculate CO₂ savings
national_full_df['CO2_saved_kg'] = national_full_df['ICE_emissions_kg'] - national_full_df['EV_emissions_kg']

# Add adoption rate (same as original)
national_full_df['ev_adoption_rate'] = 0.0097  # or your actual rate

# Final columns order
national_final_df = national_full_df[['year', 'state', 'charging_level', 'ICE_segment',
                                      'EV_emissions_kg', 'ICE_emissions_kg', 'CO2_saved_kg',
                                      'grid_emission_factor', 'ev_adoption_rate', 'distance_km']]

# Save
national_final_df.to_csv('co2_savings_model_data_national.csv', index=False)

display(national_final_df.head())
print("National dataset saved as 'co2_savings_model_data_national.csv'")


Unnamed: 0,year,state,charging_level,ICE_segment,EV_emissions_kg,ICE_emissions_kg,CO2_saved_kg,grid_emission_factor,ev_adoption_rate,distance_km
0,2020,National,Level 1,SUV Medium,19.475,20.25,0.775,0.779,0.0097,150
1,2020,National,Level 1,Pick-up/Chassis 4x4,19.475,33.3,13.825,0.779,0.0097,150
2,2020,National,Level 1,SUV Small,19.475,21.6,2.125,0.779,0.0097,150
3,2020,National,Level 1,SUV Large,19.475,28.8,9.325,0.779,0.0097,150
4,2020,National,Level 1,Small,19.475,20.25,0.775,0.779,0.0097,150


National dataset saved as 'co2_savings_model_data_national.csv'


### Correcting EV Adoption Rates in the National Dataset

Originally, the EV adoption rate in the national dataset was hardcoded as `0.0097` for all years. This was incorrect and did not reflect real-world trends.

To address this:

- We compiled actual EV market share data for 2011 to 2024,
- We applied that mapping to each row in the dataset based on its `year`,
- We saved the corrected dataset back to the same CSV file.

This ensures that the regression model will be trained on historically accurate adoption rates.

Data Source:
- EV Council of Australia, State of EVs Report (2024)
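
Since `Series.map` leaves `NaN` for any year missing from the dictionary, it may be worth confirming that every year in the dataset received a rate. The sketch below uses a toy frame and a three-year subset of the mapping; the frame `toy` and the out-of-range year are assumptions for illustration only.

```python
import pandas as pd

# Illustrative subset of the year -> adoption-rate mapping used in this notebook
ev_adoption_by_year = {2020: 0.0078, 2021: 0.0195, 2022: 0.0381}

# Toy frame; 2025 is deliberately outside the mapping
toy = pd.DataFrame({"year": [2020, 2021, 2022, 2025]})
toy["ev_adoption_rate"] = toy["year"].map(ev_adoption_by_year)

# Years with no mapped rate show up as NaN
missing_mask = toy["ev_adoption_rate"].isna()
missing_years = sorted(toy.loc[missing_mask, "year"].unique().tolist())
print(missing_years)
```

An empty `missing_years` list would indicate that the mapping covers every year present in the data.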


In [22]:
# Load the original national dataset that used static 0.0097 value
df_fix = pd.read_csv("C:\\Users\\soohx\\OneDrive\\Documents\\homework\\2025 T1\\Capstone 1\\EVAT-Environmental-Impact\\notebooks\\co2_savings_model_data_national.csv")

# Define corrected EV adoption rate mapping from 2011 to 2024
ev_adoption_by_year = {
    2011: 0.0000,
    2012: 0.0002,
    2013: 0.0002,
    2014: 0.0012,
    2015: 0.0015,
    2016: 0.0012,
    2017: 0.0019,
    2018: 0.0021,
    2019: 0.0065,
    2020: 0.0078,
    2021: 0.0195,
    2022: 0.0381,
    2023: 0.0845,
    2024: 0.0953
}

# Apply the correct adoption rate based on year
df_fix['ev_adoption_rate'] = df_fix['year'].map(ev_adoption_by_year)

# Save corrected dataset
corrected_path = "C:\\Users\\soohx\\OneDrive\\Documents\\homework\\2025 T1\\Capstone 1\\EVAT-Environmental-Impact\\notebooks\\co2_savings_model_data_national.csv"
df_fix.to_csv(corrected_path, index=False)

corrected_path

'C:\\Users\\soohx\\OneDrive\\Documents\\homework\\2025 T1\\Capstone 1\\EVAT-Environmental-Impact\\notebooks\\co2_savings_model_data_national.csv'

### Creating Future Trend Data for National-Level Predictions (2025–2040)

To forecast CO₂ savings in future years, we prepare a dataset of projected trends from 2025 to 2040, including:

- **EV adoption rate** under a "Moderate Intervention" scenario, ranging from 12% in 2025 to 56% by 2040.
- **Grid emission intensity**, derived from national electricity emission forecasts.

#### Sources:
- **EV adoption rates (2025–2040):** Derived from industry projections in the *State of EVs 2024* report by the Electric Vehicle Council.
- **Grid emissions (2025–2040):** Estimated from NGAF 2023 data and Figure 20 in the emissions forecast chart, which projects national electricity emissions in Mt CO₂.

To align with our training data (2020–2024), we convert the forecasted **total grid emissions (Mt CO₂)** into **grid emission factors (kg CO₂/kWh)** using:

\[
\text{Grid EF}_t = \frac{\text{Grid Mt CO₂}_t}{\text{Grid Mt CO₂}_{2023}} \times \text{Grid EF}_{2023}
\]

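This results in year-specific grid emission intensities used for national model predictions. Scaling the 2023 factor by the ratio of total emissions implicitly assumes that national electricity generation stays at its 2023 level; if generation grows, the true factor would be lower than this estimate.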
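
As a worked instance of the scaling formula, the sketch below applies it to the 2025 projection (135.67 Mt) against the 2023 baseline (144.28 Mt, 0.65 kg CO₂/kWh) used in this notebook. The helper name `scale_grid_ef` is introduced here for illustration and is not part of the notebook's code.

```python
def scale_grid_ef(mt_year, mt_baseline, ef_baseline):
    """Scale the baseline emission factor by the ratio of total grid emissions."""
    return (mt_year / mt_baseline) * ef_baseline

# 2025 projection relative to the 2023 baseline
ef_2025 = scale_grid_ef(135.67, 144.28, 0.65)
print(round(ef_2025, 3))

# Sanity check: a year with baseline-level emissions keeps the baseline factor
ef_same = scale_grid_ef(144.28, 144.28, 0.65)
```

The 2025 result comes out at roughly 0.61 kg CO₂/kWh, slightly below the 2024 training value of 0.63, so the projected series continues the downward trend of the historical factors.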


In [26]:
# Define future years
future_years = list(range(2025, 2041))

# EV adoption rate (Moderate Intervention scenario)
future_ev_adoption = [
    0.120, 0.145, 0.170, 0.200, 0.230, 0.260,
    0.290, 0.320, 0.350, 0.380, 0.410, 0.440,
    0.470, 0.500, 0.530, 0.560
]

# Projected national electricity emissions in Mt CO2
future_grid_mt = [
    135.67, 129.71, 116.19, 110.98, 91.00,
    60.71, 54.49, 51.26, 53.92, 38.93,
    37.12, 35.43, 34.95, 35.39, 31.20, 29.23
]

# 2023 baseline values
baseline_mt_2023 = 144.28
baseline_grid_ef_2023 = 0.65  # kg CO2/kWh

# Compute grid EF projections
future_grid_ef = [
    (mt / baseline_mt_2023) * baseline_grid_ef_2023
    for mt in future_grid_mt
]

# Create future trend DataFrame
df_future_trends = pd.DataFrame({
    'year': future_years,
    'ev_adoption_rate': future_ev_adoption,
    'grid_emission_factor': future_grid_ef
})

# Save future trend data to CSV
future_csv_path = "C:\\Users\\soohx\\OneDrive\\Documents\\homework\\2025 T1\\Capstone 1\\EVAT-Environmental-Impact\\notebooks\\future_ev_trends_2025_2040.csv"
df_future_trends.to_csv(future_csv_path, index=False)