In [8]:
# Load datasets directly from data/raw folder
import pandas as pd

# Load Global EV Sales Dataset (IEA Global EV Data 2024)
df_sales = pd.read_csv("data/raw/IEA Global EV Data 2024.csv")
print(f"Shape: {df_sales.shape}")
print("\nSales dataset — first 5 records:")
print(df_sales.head())

# Load Global EV Charging Stations Dataset
df_stations = pd.read_csv("data/raw/detailed_ev_charging_stations.csv")
print(f"Shape: {df_stations.shape}")
print("\nCharging stations dataset — first 5 records:")
print(df_stations.head())
df_stations.info()

Shape: (12654, 8)

Sales dataset — first 5 records:
      region    category       parameter  mode powertrain  year      unit  \
0  Australia  Historical  EV stock share  Cars         EV  2011   percent   
1  Australia  Historical  EV sales share  Cars         EV  2011   percent   
2  Australia  Historical        EV sales  Cars        BEV  2011  Vehicles   
3  Australia  Historical        EV stock  Cars        BEV  2011  Vehicles   
4  Australia  Historical        EV stock  Cars        BEV  2012  Vehicles   

       value  
0    0.00039  
1    0.00650  
2   49.00000  
3   49.00000  
4  220.00000  
Shape: (5000, 17)

Charging stations dataset — first 5 records:
  Station ID   Latitude   Longitude                                Address  \
0   EVS00001 -33.400998   77.974972       4826 Random Rd, City 98, Country   
1   EVS00002  37.861857 -122.490299  8970 San Francisco Ave, San Francisco   
2   EVS00003  13.776092  100.412776              5974 Bangkok Ave, Bangkok   
3   EVS00004  43.62

## Visualization Objectives

This notebook explores the relationship between EV adoption and charging infrastructure from 2010 to 2024 using six planned visualizations:

### Planned Visualizations

1. **Time-Series Line Chart** (Exploratory)
   - Track growth in EV sales and EV population by region, with trendlines
   - *Goal:* Identify regional growth patterns and post-incentive adoption spikes

2. **Geographic Heatmap** (Explanatory)
   - Choropleth map showing sales density with station count overlay
   - *Goal:* Reveal geographic gaps between high sales regions and station availability

3. **Scatter Plot with Regression** (Analysis)
   - Correlate station density vs. sales growth with regression line
   - *Goal:* Quantify how strongly station density is associated with adoption (target R² > 0.7); a high R² indicates correlation, not causation

4. **Stacked Bar Chart** (Exploratory)
   - Compare BEV vs. PHEV adoption share by year and region
   - *Goal:* Understand vehicle type preferences across markets

5. **Interactive Dashboard** (Artifact)
   - Dash/Streamlit app combining all views with year/region filters
   - *Goal:* Enable stakeholder drill-down analysis (e.g., rural station gaps)

6. **Timeline Chart** (Explanatory)
   - Horizontal bars showing growth periods with incentive annotations
   - *Goal:* Link policy changes to adoption surges
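
The BEV vs. PHEV comparison (item 4) and the regional time series (item 1) both need the long-format sales table reshaped first. The following is a minimal sketch of one way to do that, not the notebook's actual pipeline. It uses only the column names visible in the `df_sales.head()` output; the `sample` frame and its values are made up for illustration, the helper name `powertrain_share` is hypothetical, and restricting to `category == "Historical"` and `mode == "Cars"` is an assumption about which rows are wanted.

```python
import pandas as pd

# Stand-in for df_sales with the columns shown in head(); values are invented.
sample = pd.DataFrame({
    "region": ["Australia"] * 4 + ["Norway"] * 4,
    "category": ["Historical"] * 8,
    "parameter": ["EV sales"] * 8,
    "mode": ["Cars"] * 8,
    "powertrain": ["BEV", "PHEV"] * 4,
    "year": [2020, 2020, 2021, 2021] * 2,
    "unit": ["Vehicles"] * 8,
    "value": [300.0, 100.0, 600.0, 200.0, 800.0, 200.0, 900.0, 100.0],
})


def powertrain_share(df):
    """Return BEV and PHEV sales plus the BEV share per region and year."""
    # Assumption: only historical car sales split by BEV/PHEV are of interest.
    mask = (
        (df["category"] == "Historical")
        & (df["parameter"] == "EV sales")
        & (df["mode"] == "Cars")
        & (df["powertrain"].isin(["BEV", "PHEV"]))
    )
    # One row per (region, year), one column per powertrain.
    wide = df[mask].pivot_table(
        index=["region", "year"],
        columns="powertrain",
        values="value",
        aggfunc="sum",
        fill_value=0,
    )
    wide["total"] = wide["BEV"] + wide["PHEV"]
    wide["bev_share"] = wide["BEV"] / wide["total"]
    return wide.reset_index()


shares = powertrain_share(sample)
print(shares)
```

Applied to the real data, the call would be `powertrain_share(df_sales)`. The resulting `BEV` and `PHEV` columns can feed a stacked bar chart directly, and `total` by `year` gives one line per region for the time series.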

### Key Questions to Answer
- How do regional incentives correlate with sales growth?
- What's the relationship between station density and EV adoption?
- Where are the infrastructure gaps in high-demand markets?
- How have BEV vs. PHEV preferences evolved over time?
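
For the station density question, the R² target from the scatter plot can be checked with a simple least-squares line. The sketch below assumes one station density value and one sales growth value per region have already been computed; how those are derived from the two datasets is not specified here. The helper name `linear_fit_r2` and the input numbers are hypothetical.

```python
import numpy as np


def linear_fit_r2(x, y):
    """Fit y = slope * x + intercept by least squares and return (slope, intercept, R^2)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope, intercept = np.polyfit(x, y, deg=1)
    predicted = slope * x + intercept
    ss_res = np.sum((y - predicted) ** 2)   # residual sum of squares
    ss_tot = np.sum((y - y.mean()) ** 2)    # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return slope, intercept, r2


# Invented per-region values, for illustration only.
station_density = [1.0, 2.0, 3.0, 4.0, 5.0]
sales_growth = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept, r2 = linear_fit_r2(station_density, sales_growth)
print(f"slope={slope:.3f}, intercept={intercept:.3f}, R^2={r2:.3f}")
```

An R² above 0.7 on real data would show that a straight line describes the relationship well. It would not by itself show that adding stations drives adoption, since both can respond to the same incentives.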