# EV Charging Analysis - Feature Engineering & SA Market Translation

**Author:** Luqmaan
**Date:** December 2025
**Purpose:** Calculate efficiency metrics and translate US charging data to SA market context

---

## Project Context
This notebook takes cleaned EV charging data from US cities and:
1. Calculates efficiency and cost metrics
2. Translates costs to the South African (SA) context using GridCars and Eskom rates
3. Creates comparison scenarios for the BYD Dolphin Surf
4. Compares EV running costs to the petrol equivalent

**Key SA Market Data (Dec 2025):**
- DC Fast Charging (GridCars): R7.35/kWh
- AC Charging (GridCars): R5.88/kWh
- Home Charging (Eskom): R3.00/kWh
- Petrol: R21/liter at ~7 L/100 km = R1.47/km
- BYD Dolphin Surf: 30.08 kWh or 38.88 kWh battery, R339,900-R389,900

## 1. Setup & Imports

In [1]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from datetime import datetime

pd.set_option('display.max_columns', None)
pd.set_option('display.precision', 2)

sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (12, 6)

## 2. Define SA Market Constants

In [2]:
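# Charging tariffs (ZAR per kWh)
GRIDCARS_DC_FAST = 7.35   # GridCars DC fast charging
GRIDCARS_AC = 5.88        # GridCars AC charging
HOME_CHARGING = 3.00      # Eskom home tariff
USD_TO_ZAR = 17.00        # exchange rate, ZAR per USD

# BYD Dolphin Surf specifications (battery in kWh, range in km, price in ZAR)
BYD_BATTERY_STANDARD = 30.08
BYD_RANGE_STANDARD = 232
BYD_BATTERY_PREMIUM = 38.88
BYD_RANGE_PREMIUM = 295
BYD_PRICE_MIN = 339900
BYD_PRICE_MAX = 389900
# km per kWh implied by the rated (WLTP) range; on-road efficiency may be lower
REAL_WORLD_EFFICIENCY = BYD_RANGE_STANDARD / BYD_BATTERY_STANDARD

# Petrol baseline
PETROL_PRICE_PER_LITER = 21.00   # ZAR per liter
PETROL_CONSUMPTION = 7.0         # liters per 100 km
PETROL_COST_PER_KM = (PETROL_CONSUMPTION / 100) * PETROL_PRICE_PER_LITER   # ZAR per km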
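
As a quick sanity check on these constants, the sketch below derives the efficiency implied by range and battery size for both battery options, along with the petrol cost per kilometre. The helper `rated_efficiency` is a hypothetical name introduced only for this illustration, and the constants are repeated so the snippet runs on its own.

```python
# Constants repeated from the cell above so this check is self-contained.
BYD_BATTERY_STANDARD, BYD_RANGE_STANDARD = 30.08, 232   # kWh, km
BYD_BATTERY_PREMIUM, BYD_RANGE_PREMIUM = 38.88, 295     # kWh, km
PETROL_PRICE_PER_LITER = 21.00                          # ZAR per liter
PETROL_CONSUMPTION = 7.0                                # liters per 100 km


def rated_efficiency(range_km, battery_kwh):
    """Hypothetical helper: km per kWh implied by range and battery size."""
    return range_km / battery_kwh


eff_standard = rated_efficiency(BYD_RANGE_STANDARD, BYD_BATTERY_STANDARD)
eff_premium = rated_efficiency(BYD_RANGE_PREMIUM, BYD_BATTERY_PREMIUM)
petrol_cost_per_km = (PETROL_CONSUMPTION / 100) * PETROL_PRICE_PER_LITER

# Report both km/kWh and the more familiar kWh/100 km form.
print(f"Standard: {eff_standard:.2f} km/kWh ({100 / eff_standard:.1f} kWh/100 km)")
print(f"Premium:  {eff_premium:.2f} km/kWh ({100 / eff_premium:.1f} kWh/100 km)")
print(f"Petrol:   R{petrol_cost_per_km:.2f}/km")
```

Both battery options land within about 2% of each other, which suggests that using the standard model's figure for the cost scenarios is a reasonable simplification.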

## 3. Load Data & Basic Metrics

In [3]:
df = pd.read_csv('ev_charging_patterns_CLEAN.csv')

# Keep only sessions that have both distance and energy readings
df_complete = df[df['has_distance_data'] & df['has_energy_data']].copy()

# Efficiency and cost ratios per session
df_complete['km_per_kwh'] = df_complete['distance_km'] / df_complete['energy_consumed_kwh']
df_complete['cost_per_km_usd'] = df_complete['cost_usd'] / df_complete['distance_km']
df_complete['cost_per_kwh_usd'] = df_complete['cost_usd'] / df_complete['energy_consumed_kwh']
# Average charging power: kWh delivered per hour is kW
df_complete['kwh_per_hour (kW)'] = df_complete['energy_consumed_kwh'] / df_complete['duration_hours']
# State-of-charge change over the session, in percentage points (end minus start)
df_complete['battery_used_pct'] = df_complete['soc_end_pct'] - df_complete['soc_start_pct']
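
The ratios above divide by distance, energy, and duration, so any session with a zero in one of those columns would produce `inf` and distort later averages. A minimal sketch of one way to guard against that is shown below; `safe_ratio` and the toy rows are hypothetical and are not taken from the dataset.

```python
import numpy as np
import pandas as pd

# Toy sessions (made-up values) including zero distance and zero energy.
toy = pd.DataFrame({
    'distance_km': [120.0, 0.0, 80.0],
    'energy_consumed_kwh': [15.0, 10.0, 0.0],
    'cost_usd': [6.0, 4.0, 0.0],
})


def safe_ratio(numerator, denominator):
    """Divide two Series, yielding NaN where the denominator is not positive."""
    # Series.where keeps values where the condition holds and inserts NaN elsewhere.
    return numerator / denominator.where(denominator > 0)


toy['km_per_kwh'] = safe_ratio(toy['distance_km'], toy['energy_consumed_kwh'])
toy['cost_per_km_usd'] = safe_ratio(toy['cost_usd'], toy['distance_km'])

print(toy)
```

Rows with a non-positive denominator end up as `NaN`, which pandas skips by default in aggregations such as `mean()`.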

## 4. SA Charging Costs & Comparison

In [4]:
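def calculate_sa_cost(row):
    """Return the session's cost in ZAR at SA tariffs, chosen by charger type."""
    energy = row['energy_consumed_kwh']
    if row['charger_type'] == 'DC Fast Charger':
        return energy * GRIDCARS_DC_FAST
    elif row['charger_type'] == 'Level 2':
        return energy * GRIDCARS_AC
    else:
        # Any other charger type is priced at the home tariff
        return energy * HOME_CHARGING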

df_complete['cost_sa_zar'] = df_complete.apply(calculate_sa_cost, axis=1)
df_complete['cost_per_km_sa_zar'] = df_complete['cost_sa_zar'] / df_complete['distance_km']
df_complete['petrol_cost_equivalent_zar'] = df_complete['distance_km'] * PETROL_COST_PER_KM
df_complete['savings_vs_petrol_zar'] = df_complete['petrol_cost_equivalent_zar'] - df_complete['cost_sa_zar']
df_complete['savings_pct'] = (df_complete['savings_vs_petrol_zar'] / df_complete['petrol_cost_equivalent_zar']) * 100
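
The row-wise `apply` above is easy to read but can be slow on large frames. As a possible alternative, the same tariff lookup can be sketched with `Series.map`, falling back to the home tariff for any unlisted charger type. `RATE_BY_CHARGER`, the toy rows, and the `'Level 1'` label are assumptions made for this illustration.

```python
import pandas as pd

# Tariffs in ZAR per kWh, repeated so the sketch is self-contained.
GRIDCARS_DC_FAST = 7.35
GRIDCARS_AC = 5.88
HOME_CHARGING = 3.00

# Hypothetical lookup table mirroring the if/elif branches of calculate_sa_cost.
RATE_BY_CHARGER = {
    'DC Fast Charger': GRIDCARS_DC_FAST,
    'Level 2': GRIDCARS_AC,
}

# Toy sessions (made-up values); 'Level 1' stands in for "any other type".
toy = pd.DataFrame({
    'charger_type': ['DC Fast Charger', 'Level 2', 'Level 1'],
    'energy_consumed_kwh': [20.0, 10.0, 5.0],
})

# Unmapped charger types become NaN, which fillna turns into the home tariff.
rate_zar_per_kwh = toy['charger_type'].map(RATE_BY_CHARGER).fillna(HOME_CHARGING)
toy['cost_sa_zar'] = toy['energy_consumed_kwh'] * rate_zar_per_kwh

print(toy)
```

Keeping the tariffs in a dictionary also means a new charger category only needs one extra entry rather than another branch.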

## 5. BYD Dolphin Surf - Annual Cost Scenarios

In [5]:
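# Reuse the standard-model constants from section 2 instead of repeating the literals
BATTERY_CAPACITY = BYD_BATTERY_STANDARD            # kWh
RANGE_WLTP = BYD_RANGE_STANDARD                    # km
REAL_EFFICIENCY = RANGE_WLTP / BATTERY_CAPACITY    # km per kWh, based on the WLTP range

# ZAR per km = (ZAR per kWh) / (km per kWh)
cost_per_km_home = HOME_CHARGING / REAL_EFFICIENCY
cost_per_km_ac = GRIDCARS_AC / REAL_EFFICIENCY
cost_per_km_dc = GRIDCARS_DC_FAST / REAL_EFFICIENCY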

scenarios = {
    'Commuter (My Colleague)': {
        'description': '60 km daily commute, charges at home + occasional AC top-ups',
        'annual_km': 15600  # consistent with 60 km x 260 working days
    }
}

for name, details in scenarios.items():
    print(f"Scenario: {name}")
    print(f"Description: {details['description']}")
    print(f"Annual Kilometers: {details['annual_km']} km")
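
To turn the commuter scenario into an annual figure, one option is to blend the home and AC tariffs by the share of energy drawn from each. The sketch below does this under a loudly stated assumption: the 80% home / 20% AC split is purely illustrative and does not come from the data or the scenario description. `annual_costs` is likewise a hypothetical helper.

```python
# Constants repeated from earlier cells so the sketch is self-contained.
HOME_CHARGING = 3.00                       # ZAR per kWh
GRIDCARS_AC = 5.88                         # ZAR per kWh
EFFICIENCY_KM_PER_KWH = 232 / 30.08        # WLTP range / battery, standard model
PETROL_COST_PER_KM = (7.0 / 100) * 21.00   # ZAR per km

# ASSUMPTION: illustrative charging mix, not a measured value.
HOME_SHARE = 0.80
AC_SHARE = 0.20


def annual_costs(annual_km, home_share=HOME_SHARE, ac_share=AC_SHARE):
    """Return (ev_cost, petrol_cost, savings) in ZAR for one year of driving."""
    energy_kwh = annual_km / EFFICIENCY_KM_PER_KWH
    # Weighted average tariff across the two charging locations.
    blended_rate = home_share * HOME_CHARGING + ac_share * GRIDCARS_AC
    ev_cost = energy_kwh * blended_rate
    petrol_cost = annual_km * PETROL_COST_PER_KM
    return ev_cost, petrol_cost, petrol_cost - ev_cost


ev_cost, petrol_cost, savings = annual_costs(15600)
print(f"EV charging: R{ev_cost:,.0f} per year")
print(f"Petrol:      R{petrol_cost:,.0f} per year")
print(f"Savings:     R{savings:,.0f} per year")
```

Because the split is a parameter, the same function can be called with other mixes, for example `annual_costs(15600, home_share=1.0, ac_share=0.0)` for home-only charging.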