In [1]:
import requests
import json
import time
import pandas as pd

API_KEY = "7d31b043-1ef4-47a0-980d-0364ec4fc207"

## Data from Open Charge Map

In [2]:
all_data = []

offset = 0
maxresults = 1000
total_expected = 2644  # Based on Open Charge Map count

while True:
    url = f"https://api.openchargemap.io/v3/poi/?output=json&countrycode=AU&maxresults={maxresults}&offset={offset}&key={API_KEY}"
    print(f"Fetching data with offset {offset}...")
    
    response = requests.get(url, timeout=30)  # fail instead of hanging if the API stalls
    
    if response.status_code == 200:
        batch = response.json()
        all_data.extend(batch)
        print(f"Retrieved {len(batch)} records.")
        
        if len(batch) < maxresults or len(all_data) >= total_expected:
            print("Reached end of available data.")
            break
    else:
        print(f"Error {response.status_code} at offset {offset}")
        break
    
    offset += maxresults
    time.sleep(1)

Fetching data with offset 0...
Retrieved 1000 records.
Fetching data with offset 1000...
Retrieved 1000 records.
Fetching data with offset 2000...
Retrieved 1000 records.
Reached end of available data.


In [3]:
# Save the full data to a JSON file
output_path = "../data/ev_charger_AU_full.json"
with open(output_path, "w") as f:
    json.dump(all_data, f, indent=4)

print(f"Done! Total records saved: {len(all_data)}")
print(f"File saved to: {output_path}")

Done! Total records saved: 3000
File saved to: ../data/ev_charger_AU_full.json
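
The saved total (3000) is higher than the 2644 records expected, which may mean the batches overlap, for example if the API does not apply the `offset` parameter (this is an assumption and is not confirmed here). A minimal sketch of a duplicate check, using a hypothetical helper `dedupe_by_id` and a small made-up list standing in for `all_data`:

```python
def dedupe_by_id(records):
    """Keep the first record seen for each "ID" value (hypothetical helper)."""
    seen = set()
    unique = []
    for rec in records:
        rec_id = rec.get("ID")
        if rec_id in seen:
            continue  # skip records whose ID was already kept
        seen.add(rec_id)
        unique.append(rec)
    return unique

# Made-up sample standing in for all_data
sample = [{"ID": 1}, {"ID": 2}, {"ID": 1}, {"ID": 3}, {"ID": 2}]
unique_sample = dedupe_by_id(sample)
print(f"{len(sample)} records, {len(unique_sample)} unique IDs")
```

If the number of unique IDs in `all_data` turns out to be well below 3000, the file probably contains repeated batches and should be deduplicated before further use.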


### Grid Emission Factors
To estimate the CO₂ emissions associated with EV charging, I used the *National Greenhouse Accounts Factors (2024)* report published by the Australian Government. This report provides **Scope 2 emission factors**, which represent indirect emissions from the generation of purchased electricity, that is, the emissions released by power stations when generating the electricity consumed. I extracted **state-level grid emission factors (kg CO₂-e/kWh)** to enable region-specific comparisons. Scope 2 factors suit this analysis because they capture the emissions attributable to the electricity an EV draws from the grid. Upstream effects such as fuel transport or grid losses (Scope 3) are outside the scope of this study.


In [9]:
# Data from NGAF 2024 Table 1
data = {
    "State": [
        "NSW & ACT", "VIC", "QLD", "SA",
        "WA - SWIS", "WA - NWIS", "TAS", "NT - DKIS", "National Avg"
    ],
    "Scope 2 Emission Factor (kg CO₂/kWh)": [
        0.66, 0.77, 0.71, 0.23,
        0.51, 0.61, 0.15, 0.56, 0.63
    ]
}

# Create DataFrame
df_emissions = pd.DataFrame(data)

# Save as CSV 
df_emissions.to_csv("NGAF2024_emission_factors.csv", index=False)

# Display the table
df_emissions

Unnamed: 0,State,Scope 2 Emission Factor (kg CO₂/kWh)
0,NSW & ACT,0.66
1,VIC,0.77
2,QLD,0.71
3,SA,0.23
4,WA - SWIS,0.51
5,WA - NWIS,0.61
6,TAS,0.15
7,NT - DKIS,0.56
8,National Avg,0.63
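
As a rough illustration of how these factors are applied, emissions for a charging session are the energy drawn multiplied by the regional factor. The sketch below copies the factors from the table above. The function name `charging_emissions_kg` and the 10 kWh session size are illustrative assumptions, not values from the report.

```python
# Scope 2 factors (kg CO2-e/kWh) copied from the table above
factors = {
    "NSW & ACT": 0.66, "VIC": 0.77, "QLD": 0.71, "SA": 0.23,
    "WA - SWIS": 0.51, "WA - NWIS": 0.61, "TAS": 0.15, "NT - DKIS": 0.56,
}

def charging_emissions_kg(energy_kwh, region):
    """Return kg CO2-e for energy_kwh drawn from the grid in the given region."""
    return energy_kwh * factors[region]

energy_kwh = 10.0  # ASSUMPTION: illustrative session size, not from the report
for region in ("VIC", "TAS"):
    print(f"{region}: {charging_emissions_kg(energy_kwh, region):.2f} kg CO2-e")
```

The same session is roughly five times more emissions-intensive in VIC than in TAS, which is why state-level factors are kept separate from the national average.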


### Average Emissions Intensity
To establish a realistic benchmark for CO₂ emissions from internal combustion engine (ICE) vehicles, I extracted segment-specific emissions data from the *Light Vehicle Emissions Intensity in Australia – Trends Over Time* report. Rather than using a single national average, I used detailed 2023 values that break down average CO₂ emissions (g/km) by vehicle type, such as small cars, SUVs, and pick-up trucks. This allows for more accurate comparisons between different vehicle classes and their electric vehicle (EV) alternatives, improving the relevance of emissions savings calculations in real-world scenarios.


In [16]:
# Emissions data by car segment from Table 10
data = {
    "Segment": [
        "SUV Medium", "Pick-up/Chassis 4x4", "SUV Small", "SUV Large", "Small",
        "SUV Light", "Medium", "Light", "Pick-up/Chassis 4x2", "SUV Upper Large",
        "Vans/Cab Chassis", "People Movers", "Sports", "Micro", "Large", "Upper Large"
    ],
    "Average Emissions Intensity (g/km, 2023)": [
        135, 222, 144, 192, 135,
        138, 77, 136, 215, 265,
        199, 181, 219, 115, 154, 154  # Note: Large and Upper Large both listed as 154
    ]
}

# Create DataFrame
df_ice_segments = pd.DataFrame(data)

# Save as CSV
save_path = r"C:\Users\soohx\OneDrive\Documents\homework\2025 T1\Capstone 1\EVAT-Environmental-Impact\data\ICE_vehicle_emissions_by_segment_2023.csv"
df_ice_segments.to_csv(save_path, index=False)

# Display the table
df_ice_segments

Unnamed: 0,Segment,"Average Emissions Intensity (g/km, 2023)"
0,SUV Medium,135
1,Pick-up/Chassis 4x4,222
2,SUV Small,144
3,SUV Large,192
4,Small,135
5,SUV Light,138
6,Medium,77
7,Light,136
8,Pick-up/Chassis 4x2,215
9,SUV Upper Large,265
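
To show how segment values might feed a savings comparison, the sketch below converts an EV's electricity use into g CO₂-e/km and subtracts it from the ICE figure. The EV consumption of 0.18 kWh/km is a placeholder assumption and does not come from any source used in this notebook. The grid factor is the national average from the NGAF table, and the names are hypothetical.

```python
# ICE tailpipe intensity (g/km, 2023) for a few segments from the table above
ice_g_per_km = {"SUV Medium": 135, "Pick-up/Chassis 4x4": 222, "Small": 135}

EV_KWH_PER_KM = 0.18    # ASSUMPTION: placeholder EV consumption, not from the source
GRID_KG_PER_KWH = 0.63  # national average Scope 2 factor (NGAF 2024)

def ev_g_per_km(kwh_per_km, grid_kg_per_kwh):
    """Convert EV energy use into grid emissions per km, in grams."""
    return kwh_per_km * grid_kg_per_kwh * 1000

ev_intensity = ev_g_per_km(EV_KWH_PER_KM, GRID_KG_PER_KWH)
savings = {seg: ice - ev_intensity for seg, ice in ice_g_per_km.items()}

for seg, saved in savings.items():
    print(f"{seg}: saves about {saved:.1f} g/km under these assumptions")
```

The result is sensitive to both the assumed EV consumption and the grid factor, so a per-state factor would change the comparison noticeably.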


### Charging Level Classification and Data Cleaning

To estimate average charger power by level, we used data from the `ev_charger_AU_full.json` dataset. Each charger was categorized into charging levels based on **Transport for NSW's EV Charging Levels and Range Chart**:

- **Level 1**: 1.4 – 3.7 kW
- **Level 2 Slow**: exactly 7 kW
- **Level 2 Fast**: 11 – 22 kW
- **Level 3 (DC fast charging)**: 25 – 350 kW

Some chargers fell outside these ranges, specifically:
- Between **3.8 – 6.9 kW**
- Between **23 – 24.9 kW**
- Other values not conforming to published standards

These were grouped into an **“Other”** category. However, to maintain clarity and consistency with recognized charging level definitions, we chose to **exclude these “Other” entries** from the analysis. This accounted for only a small fraction of the dataset (2.8%), and the exclusion helps ensure that all included values reflect industry-standard charger types.

> Final analysis only includes well-classified chargers from Level 1, Level 2 Slow, Level 2 Fast, and Level 3.

In [9]:
# Load the JSON data
json_path = "C:\\Users\\soohx\\OneDrive\\Documents\\homework\\2025 T1\\Capstone 1\\EVAT-Environmental-Impact\\data\\ev_charger_AU_full.json"
with open(json_path, "r") as file:
    data = json.load(file)

# Extract PowerKW from Connections
power_data = []
for station in data:
    station_id = station.get("ID")
    connections = station.get("Connections", [])
    for conn in connections:
        power = conn.get("PowerKW")
        if power is not None:
            power_data.append({
                "station_id": station_id,
                "power_kw": power
            })

# Create DataFrame
df_power = pd.DataFrame(power_data)
df_power.dropna(subset=["power_kw"], inplace=True)

# Updated classification based on Transport for NSW
def classify_power_level(power):
    if 1.4 <= power <= 3.7:
        return "Level 1"
    elif power == 7:
        return "Level 2 Slow"
    elif 11 <= power <= 22:
        return "Level 2 Fast"
    elif 25 <= power <= 350:
        return "Level 3"
    else:
        return "Other"

df_power["charging_level"] = df_power["power_kw"].apply(classify_power_level)

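# Group and summarize stats
# Note: each row of df_power is one connection, so station_count counts
# connections rather than unique stations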
summary_by_level = df_power.groupby("charging_level").agg(
    station_count=("power_kw", "count"),
    min_power_kw=("power_kw", "min"),
    max_power_kw=("power_kw", "max"),
    avg_power_kw=("power_kw", "mean")
).reset_index()

# Export the full summary including 'Other'
summary_with_other_path = "ev_charger_power_summary_incl_other.csv"
summary_by_level.to_csv(summary_with_other_path, index=False)

# Create a second summary excluding 'Other'
df_power_filtered = df_power[df_power["charging_level"] != "Other"]

summary_filtered = df_power_filtered.groupby("charging_level").agg(
    station_count=("power_kw", "count"),
    min_power_kw=("power_kw", "min"),
    max_power_kw=("power_kw", "max"),
    avg_power_kw=("power_kw", "mean")
).reset_index()

# Export the filtered summary (no "Other")
summary_filtered_path = "ev_charger_power_summary_excl_other.csv"
summary_filtered.to_csv(summary_filtered_path, index=False)

# Display both tables to compare
print("🔹 Summary Including 'Other':")
display(summary_by_level)

print("🔹 Summary Excluding 'Other':")
display(summary_filtered)

🔹 Summary Including 'Other':


Unnamed: 0,charging_level,station_count,min_power_kw,max_power_kw,avg_power_kw
0,Level 1,84,1.5,3.6,2.910714
1,Level 2 Fast,1317,11.0,22.0,20.481777
2,Level 2 Slow,255,7.0,7.0,7.0
3,Level 3,3780,25.0,350.0,87.090873
4,Other,156,4.0,24.0,15.961538


🔹 Summary Excluding 'Other':


Unnamed: 0,charging_level,station_count,min_power_kw,max_power_kw,avg_power_kw
0,Level 1,84,1.5,3.6,2.910714
1,Level 2 Fast,1317,11.0,22.0,20.481777
2,Level 2 Slow,255,7.0,7.0,7.0
3,Level 3,3780,25.0,350.0,87.090873
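
The share of entries classified as "Other" can be checked directly from the counts in the first summary table. A minimal sketch, with the counts copied by hand from that output:

```python
# Counts copied from the summary that includes "Other"
counts = {
    "Level 1": 84,
    "Level 2 Fast": 1317,
    "Level 2 Slow": 255,
    "Level 3": 3780,
    "Other": 156,
}

total = sum(counts.values())
other_share = counts["Other"] / total

print(f"Total connections: {total}")
print(f"Share classified as Other: {other_share:.1%}")
```

This agrees with the 2.8% figure quoted in the cleaning notes.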


### Charging Frequency and Energy Use (from UQ Teslascope Study)

Charging frequency and average energy use data was collected from the report:
**"Charging behaviour by UoQ.pdf"**, based on telemetry data gathered via **Teslascope**.

- **Page**: 8
- **Sample size**: 239 EVs in Australia
- **Sessions observed**: 19,575 charging events
- **Time period**: November 2021 to May 2022

#### Data Collected:
- **Average sessions per day**: 0.463
- **Average sessions per week**: ~3.24
- **Average energy per session**: 12.7 kWh
- **Average daily energy use per EV**: 6.0 kWh

This data is used to model **real-world EV charging frequency** and energy consumption for home and light public charging in Australia.

In [16]:
# UQ Charging Frequency Data (from Teslascope API dataset)
uq_charging_frequency = {
    "charging_sessions_per_day": 0.463,
    "charging_sessions_per_week": round(0.463 * 7, 2),  # 3.24
    "avg_energy_per_session_kwh": 12.7,
    "avg_daily_energy_kwh": 6.0,
    "source": "Charging behaviour by UoQ (Page 8)"
}

import pandas as pd
df_uq_frequency = pd.DataFrame([uq_charging_frequency])
df_uq_frequency

Unnamed: 0,charging_sessions_per_day,charging_sessions_per_week,avg_energy_per_session_kwh,avg_daily_energy_kwh,source
0,0.463,3.24,12.7,6.0,Charging behaviour by UoQ (Page 8)
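
As a quick consistency check on these figures, sessions per day multiplied by energy per session should land close to the reported daily energy use. A minimal sketch using only the values listed above:

```python
sessions_per_day = 0.463
energy_per_session_kwh = 12.7
reported_daily_kwh = 6.0

# Daily energy implied by the two session-level figures
implied_daily_kwh = sessions_per_day * energy_per_session_kwh
sessions_per_week = round(sessions_per_day * 7, 2)

print(f"Implied daily energy: {implied_daily_kwh:.2f} kWh")
print(f"Reported daily energy: {reported_daily_kwh:.1f} kWh")
print(f"Sessions per week: {sessions_per_week}")
```

The implied value is slightly below the reported 6.0 kWh. The gap may come from rounding in the report, though that is an assumption rather than something the report states.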


### Level 3 Charging Duration Analysis (Port Adelaide Dataset)

To represent real-world Level 3 (DC fast charging) session durations, we used the raw session data collected from the **Port Adelaide Plaza Chargefox site** (2022–2023).

This dataset includes detailed records of over 2,200 EV charging sessions conducted at a **120 kW DC fast charging station**, aligning well with our definition of **Level 3 charging (≥25 kW)**.

#### Cleaning Method:
- Converted `Session duration (mins)` to numeric values
- Excluded sessions under 5 minutes to remove potential errors or test connections

#### Why This Dataset?
- Collected over a full year (longitudinal coverage)
- Reflects public, real-world charging behavior
- Uses high-power infrastructure (Level 3 category)

This cleaned dataset gives us a strong basis for estimating average Level 3 session duration and supports our energy consumption calculations later on.

In [22]:
# Load the Port Adelaide Charge Session data
file_path = "C:\\Users\\soohx\\OneDrive\\Documents\\homework\\2025 T1\\Capstone 1\\EVAT-Environmental-Impact\\data\\Port Adelaide Plaza Charge Session Data - 6_12_22 to 6_12_23 - Port Adelaide Plaza Charge Session Data - 6_12_22 to 6_12_23.csv"
df_port_adelaide = pd.read_csv(file_path)

# Convert session duration to numeric and handle any non-numeric values
df_port_adelaide["Session duration (mins)"] = pd.to_numeric(df_port_adelaide["Session duration (mins)"], errors='coerce')

# Filter out invalid or very short sessions (less than 5 minutes)
df_clean = df_port_adelaide[df_port_adelaide["Session duration (mins)"] >= 5].copy()

# Calculate key statistics
duration_stats = {
    "session_count": len(df_clean),
    "mean_duration_mins": df_clean["Session duration (mins)"].mean(),
    "median_duration_mins": df_clean["Session duration (mins)"].median(),
    "min_duration_mins": df_clean["Session duration (mins)"].min(),
    "max_duration_mins": df_clean["Session duration (mins)"].max(),
    "std_dev_duration_mins": df_clean["Session duration (mins)"].std()
}

# Convert to DataFrame for display or export
df_duration_stats = pd.DataFrame([duration_stats])
df_duration_stats

Unnamed: 0,session_count,mean_duration_mins,median_duration_mins,min_duration_mins,max_duration_mins,std_dev_duration_mins
0,2246,37.370436,37.0,5,101,16.281349


### Final Charging Duration Estimates by Level (Level 1, Level 2 Slow, Level 2 Fast)

This section stores the average **charging session durations (in hours)** for three levels of charging. The values are used for estimating energy consumption based on power ratings.

#### Data Sources & Methodology:

- **Level 2 Slow (7 kW)**:  
  Duration = **3.5 hours**  
  Source: Reported in the University of Melbourne's EV uptake and charging review.

- **Level 2 Fast (22 kW)**:  
  Duration = **2.0 hours**  
  Source: Also reported in the same University of Melbourne report.

- **Level 1 (2.91 kW average)**:  
  Duration = **8.42 hours**  
  Estimated using the same energy delivered during a Level 2 Slow session:  
  \[
  \text{Energy} = 7 \times 3.5 = 24.5 \text{ kWh}
  \]  
  This energy is assumed to be the same in a Level 1 session, so:  
  \[
  \text{Duration (Level 1)} = \frac{24.5}{2.91} \approx 8.42 \text{ hours}
  \]

#### Summary:
These durations will be used to estimate energy consumption and CO₂ emissions per charging session across Levels 1 and 2. Using reported values where available, and estimating Level 1 from the same energy transfer as a Level 2 Slow session, keeps the modeling consistent and realistic across levels.


In [32]:
# Recalculate the Level 1 estimated duration
energy_consumed_reference_kwh = 7 * 3.5  # from Level 2 Slow reference
estimated_level1_duration = round(energy_consumed_reference_kwh / 2.91, 2)

# Store the final charging durations
charging_duration_data_updated = {
    "Level 1 (Avg. 2.91 kW)": {
        "method": "estimated from 7kW reference session",
        "power_kw": 2.91,
        "duration_hours": estimated_level1_duration
    },
    "Level 2 Slow (7 kW)": {
        "method": "reported (Unimelb)",
        "power_kw": 7.0,
        "duration_hours": 3.5
    },
    "Level 2 Fast (22 kW)": {
        "method": "reported (Unimelb)",
        "power_kw": 22.0,
        "duration_hours": 2.0
    }
}

# Convert to DataFrame and display
df_charging_durations_final = pd.DataFrame.from_dict(charging_duration_data_updated, orient='index')
df_charging_durations_final  # This will automatically render the table in a Jupyter notebook

Unnamed: 0,method,power_kw,duration_hours
Level 1 (Avg. 2.91 kW),estimated from 7kW reference session,2.91,8.42
Level 2 Slow (7 kW),reported (Unimelb),7.0,3.5
Level 2 Fast (22 kW),reported (Unimelb),22.0,2.0


### Combined Charging Duration Table

I combined the average charging session durations for all EV charging levels (Level 1 to Level 3) into a single table. The data includes:

- Reported values from the University of Melbourne for Level 2
- Calculated value from Port Adelaide data for Level 3
- Estimated value for Level 1 based on energy equivalence with Level 2 Slow

This unified format will be used for energy consumption and emissions calculations.

The data is saved as `charging_duration_by_level_clean.csv`.


In [39]:
# Add Level 3 (Port Adelaide) entry in the same format for a clean, unified table
level3_entry = {
    "method": "calculated from Port Adelaide dataset",
    "power_kw": 87.09,  # avg Level 3 power from the charger classification summary
    "duration_hours": round(37.37 / 60, 2)  # mean session duration, converted from minutes to hours
}

# Convert previous Level 1, 2 dataset into a DataFrame
df_level1_2_clean = pd.DataFrame({
    "method": [
        "estimated from 7kW reference session",
        "reported (Unimelb)",
        "reported (Unimelb)"
    ],
    "power_kw": [2.91, 7.0, 22.0],
    "duration_hours": [8.42, 3.5, 2.0]
}, index=["Level 1 (Avg. 2.91 kW)", "Level 2 Slow (7 kW)", "Level 2 Fast (22 kW)"])

# Append Level 3
df_combined_clean = df_level1_2_clean.copy()
df_combined_clean.loc["Level 3 (87.09 kW, Port Adelaide)"] = level3_entry

# Save the final duration table to CSV
csv_path = "charging_duration_by_level_clean.csv"
df_combined_clean.to_csv(csv_path)

df_combined_clean

Unnamed: 0,method,power_kw,duration_hours
Level 1 (Avg. 2.91 kW),estimated from 7kW reference session,2.91,8.42
Level 2 Slow (7 kW),reported (Unimelb),7.0,3.5
Level 2 Fast (22 kW),reported (Unimelb),22.0,2.0
"Level 3 (87.09 kW, Port Adelaide)",calculated from Port Adelaide dataset,87.09,0.62

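One way the combined table could feed the energy and emissions step is sketched below: energy per session is power multiplied by duration, and emissions are energy multiplied by a grid factor. This assumes the charger delivers its listed power for the whole session, which likely overstates energy for DC fast charging, so the results are better read as upper bounds. The national average factor (0.63 kg CO₂-e/kWh) is used purely for illustration, and the variable names are hypothetical.

```python
# (power_kw, duration_hours) copied from the combined table above
levels = {
    "Level 1 (Avg. 2.91 kW)": (2.91, 8.42),
    "Level 2 Slow (7 kW)": (7.0, 3.5),
    "Level 2 Fast (22 kW)": (22.0, 2.0),
    "Level 3 (87.09 kW, Port Adelaide)": (87.09, 0.62),
}

GRID_KG_PER_KWH = 0.63  # national average Scope 2 factor, for illustration only

# ASSUMPTION: constant power over the full session (upper-bound estimate)
energy_kwh = {name: power * hours for name, (power, hours) in levels.items()}
emissions_kg = {name: kwh * GRID_KG_PER_KWH for name, kwh in energy_kwh.items()}

for name in levels:
    print(f"{name}: {energy_kwh[name]:.1f} kWh, {emissions_kg[name]:.1f} kg CO2-e")
```

Level 1 and Level 2 Slow come out at nearly the same energy by construction, since the Level 1 duration was derived from the Level 2 Slow session energy.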