<a href="https://colab.research.google.com/github/Naeima/ev/blob/main/Comparative_Visualisation_of_South_Wales_EV_Charging_Infrastructure__Open_Charge_Map_vs_ONS_2024_Dataset.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Comparative Visualisation of South Wales EV Charging Infrastructure: Open Charge Map vs. ONS 2024 Dataset (Provided by Paul Hagger)**


In [None]:
# Install required libraries

!pip install folium geopandas ipywidgets geopy

Collecting jedi>=0.16 (from ipython>=4.0.0->ipywidgets)
  Downloading jedi-0.19.2-py2.py3-none-any.whl.metadata (22 kB)
Downloading jedi-0.19.2-py2.py3-none-any.whl (1.6 MB)
Installing collected packages: jedi
Successfully installed jedi-0.19.2


🔽 **Loading and Filtering Community-Curated EV Charging Data for South Wales**

This code loads a dataset of electric vehicle (EV) charging stations sourced from the [Open Charge Map](https://map.openchargemap.io) API — a community-curated and openly licensed global database of EV charge points. It normalises the data and filters it to include only selected towns and urban areas across South Wales.

In [None]:
import pandas as pd

# Load South Wales dataset from Google Drive
south_file_id = "1UAwZ6NoOXUtSA2Y-LEH6vCb-akPqdxz0"
south_url = f"https://drive.google.com/uc?id={south_file_id}"
south = pd.read_csv(south_url)

# Normalize column names and filter towns
south.columns = south.columns.str.lower()
south['town'] = south['town'].str.lower()
target_towns = ['cardiff', 'swansea', 'newport', 'bridgend', 'merthyr tydfil', 'llanelli',
                'carmarthen', 'abergavenny', 'pontypridd', 'neath', 'port talbot', 'barry', 'tenby']
south = south[south['town'].isin(target_towns)].copy()
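
The same normalise-and-filter step is applied to both datasets, so it can be factored into one helper. This is a minimal sketch, assuming a `town` column exists once the headers are lower-cased; the helper name `filter_towns` and the toy rows are hypothetical and not part of the original notebook. It also strips stray whitespace, which a plain `isin` check would otherwise treat as a different town.

```python
import pandas as pd

def filter_towns(df, towns):
    """Lower-case column names and town values, then keep only the listed towns."""
    out = df.copy()
    out.columns = out.columns.str.lower()
    # Strip whitespace so ' swansea ' still matches 'swansea'
    out['town'] = out['town'].str.strip().str.lower()
    return out[out['town'].isin(towns)].copy()

# Toy data (hypothetical rows, for illustration only)
toy = pd.DataFrame({'Town': ['Cardiff', ' swansea ', 'Bristol', None],
                    'Latitude': [51.48, 51.62, 51.45, 51.50]})
filtered = filter_towns(toy, ['cardiff', 'swansea'])
print(filtered['town'].tolist())
```

Rows with a missing town are dropped by the `isin` check, matching the behaviour of the cell above.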

🔽 **Loading and Filtering ONS-Sourced EV Charging Data for South Wales**

This code loads a dataset of electric vehicle (EV) charging points sourced from the UK Office for National Statistics (ONS) and provided by Paul Hagger on Monday 28 July. It normalises the column names and filters the data to include only designated towns and urban areas across South Wales.


In [None]:
# Load and filter ONS dataset
ons_file_id = "16xhVfgn4T4MEET_8ziEdBhs3nhpc_0PL"
ons_url = f"https://drive.google.com/uc?id={ons_file_id}"
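# low_memory=False parses the file in one pass so mixed-type columns get a single inferred dtype
ons = pd.read_csv(ons_url, low_memory=False)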

# Normalize columns and filter
ons.columns = ons.columns.str.lower()
ons['town'] = ons['town'].str.lower()
target_towns = ['cardiff', 'swansea', 'newport', 'bridgend', 'merthyr tydfil', 'llanelli',
                'carmarthen', 'abergavenny', 'pontypridd', 'neath', 'port talbot', 'barry', 'tenby']
ons = ons[ons['town'].isin(target_towns)].copy()




🔽 **Exploring EV Charging Point Coverage Across South Wales**

This code summarises the two EV charging datasets—**the first from Open Charge Map (a community-curated global EV infrastructure database)** and **the second from the Office for National Statistics (ONS) provided on 28 July by Paul Hagger**. It shows their size, missing data, and a few sample entries, and compares the number of charge points listed per town to understand how coverage varies across the region.



In [None]:
# 📊 Dataset Exploration and Descriptive Summary

# 1. South Wales Dataset Overview
print("🔹 South Wales Charging Points")
print("Rows:", len(south))
print("Columns:", len(south.columns))
print("Missing values (South):")
print(south.isnull().sum().sort_values(ascending=False).head(10))
print("\nSample rows:")
display(south.head(3))

# 2. ONS Oct24 Dataset Overview
print("\n🔸 ONS Oct24 Charging Points (Filtered for South Wales)")
print("Rows:", len(ons))
print("Columns:", len(ons.columns))
print("Missing values (ONS Oct24):")
print(ons.isnull().sum().sort_values(ascending=False).head(10))  # the ten columns with the most missing values
print("\nSample rows:")
display(ons.head(3))

# 3. Common Columns
common_cols = set(south.columns).intersection(set(ons.columns))
print("\n🧩 Common Columns between both datasets:")
print(sorted(common_cols))

# 4. Unique counts by town
print("\n Distribution by Town (South Wales):")
print(south['town'].value_counts())

print("\n Distribution by Town (ONS Oct24):")
print(ons['town'].value_counts())


🔹 South Wales Charging Points
Rows: 129
Columns: 100
Missing values (South):
mediaitems                            129
addressinfo.accesscomments            129
addressinfo.contactemail              129
usercomments                          129
operatorinfo.phonesecondarycontact    129
addressinfo.contacttelephone2         129
comments                              129
operatorsreference                    129
generalcomments                       128
addressinfo.contacttelephone1         128
dtype: int64

Sample rows:


Unnamed: 0,isrecentlyverified,datelastverified,id,uuid,dataproviderid,operatorid,usagetypeid,usagecost,numberofpoints,statustypeid,...,level.isfastchargecapable,level.id,powerlevel,currenttype.description,currenttype.id,currenttype,amperage,voltage,reference,comments
0,False,2023-12-07T08:12:00Z,286787,42A00D66-EA94-417F-A317-F029E5F5C133,18,3509.0,1,,,0.0,...,False,2.0,Level 2 : Medium (Over 2kW),Alternating Current - Single Phase,10.0,AC (Single-Phase),32.0,230.0,5205,
1,False,2023-12-07T08:12:00Z,286787,42A00D66-EA94-417F-A317-F029E5F5C133,18,3509.0,1,,,0.0,...,False,2.0,Level 2 : Medium (Over 2kW),Alternating Current - Single Phase,10.0,AC (Single-Phase),32.0,230.0,5206,
2,False,2023-12-07T08:12:00Z,286785,1419A724-FF8C-461B-8DBE-3B6EE5F3AE8F,18,3509.0,1,,,0.0,...,False,2.0,Level 2 : Medium (Over 2kW),Alternating Current - Three Phase,20.0,AC (Three-Phase),32.0,400.0,3063,



🔸 ONS Oct24 Charging Points (Filtered for South Wales)
Rows: 388
Columns: 158
Missing values (ONS Oct24):
subbuildingname                388
dependantlocality              388
doubledependantlocality        388
devicecontrollercontactname    388
uprn                           388
devicedescription              388
deviceownercontactname         388
accesswednesdayto              388
accesstuesdayto                388
accesswednesdayfrom            388
dtype: int64

Sample rows:


Unnamed: 0,chargedeviceid,reference,name,latitude,longitude,subbuildingname,buildingname,buildingnumber,thoroughfare,street,...,connector8type,connector8ratedoutputkw,connector8outputcurrent,connector8ratedvoltage,connector8chargemethod,connector8chargemode,connector8tetheredcable,connector8status,connector8description,connector8validated
3212,a6ade5aa93b826f8de63c663e1159bf7,PP-12398,G24 Innovations,51.506864,-3.101256,,,,Wentloog Avenue,CF3 2GH,...,,,,,,,,,,
3253,8dcf2420e78a64333a59674678fb283b,PP-12311,Wessex Garages Cardiff,51.468083,-3.206283,,,24.0,Hadfield Road,,...,,,,,,,,,,
3343,cab73666e96e6d796b7d69fbe67d87a4,PP-5112119,Bassetts Nissan,51.657556,-3.925563,,,,,"Neath Road, Morriston, Swansea",...,,,,,,,,,,



🧩 Common Columns between both datasets:
['datecreated', 'latitude', 'longitude', 'postcode', 'reference', 'town']

 Distribution by Town (South Wales):
town
cardiff    90
newport    31
swansea     8
Name: count, dtype: int64

 Distribution by Town (ONS Oct24):
town
cardiff           130
newport           108
swansea            34
barry              30
bridgend           19
tenby              17
carmarthen         14
llanelli           12
merthyr tydfil      8
pontypridd          6
neath               4
abergavenny         4
port talbot         2
Name: count, dtype: int64
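
The two town distributions above can be lined up to show where Open Charge Map has no entries at all. The sketch below uses the counts printed above; the variable names are illustrative.

```python
# Counts copied from the value_counts() output above
ocm_counts = {'cardiff': 90, 'newport': 31, 'swansea': 8}
ons_counts = {'cardiff': 130, 'newport': 108, 'swansea': 34, 'barry': 30,
              'bridgend': 19, 'tenby': 17, 'carmarthen': 14, 'llanelli': 12,
              'merthyr tydfil': 8, 'pontypridd': 6, 'neath': 4,
              'abergavenny': 4, 'port talbot': 2}

# Towns present in ONS but absent from Open Charge Map
missing_in_ocm = sorted(set(ons_counts) - set(ocm_counts))

# Per-town difference (ONS minus Open Charge Map); absent towns count as 0
gap = {town: n - ocm_counts.get(town, 0) for town, n in ons_counts.items()}
largest = max(gap, key=gap.get)

print(len(missing_in_ocm), "towns have no Open Charge Map entries:", missing_in_ocm)
print("Largest gap:", largest, gap[largest])
```

Ten of the thirteen target towns appear only in the ONS data, and even in the three shared towns the ONS counts are higher.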


🔽 **Fuzzy Spatial Matching of EV Charging Points (Within 200 Metres)**

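This code compares charging point locations from the two datasets by checking whether each point in the Open Charge Map data (`south`) lies within 200 metres (geodesic distance) of any point in the ONS dataset (`ons`). Each Open Charge Map point is paired with the first ONS point found inside the tolerance, which is not necessarily the nearest one, so several Open Charge Map rows can map to the same ONS row. The function returns the matched pairs of row indices and a list of unmatched Open Charge Map row indices, which highlights spatial overlaps and discrepancies between the two sources.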


In [None]:
#  Approximate spatial match using geopy (within 200m) # to be revised

from geopy.distance import geodesic

def match_by_location(df1, df2, tolerance_km=0.2):
    matches = []
    unmatched = []
    for i, row1 in df1.iterrows():
        lat1, lon1 = row1['latitude'], row1['longitude']
        found = False
        for j, row2 in df2.iterrows():
            lat2, lon2 = row2['latitude'], row2['longitude']
            if geodesic((lat1, lon1), (lat2, lon2)).km < tolerance_km:
                matches.append((i, j))
                found = True
                break
        if not found:
            unmatched.append(i)
    return matches, unmatched

# Run fuzzy match
matches, unmatched = match_by_location(south, ons)

# Enhanced output
print("🔎 Approximate Spatial Match Summary (within 200 metres):\n")
print(f"✅ Matched charging points: {len(matches)}")
print(f"❌ Unmatched charging points from Open Charge Map (south): {len(unmatched)}")
print(f"📊 Total points in Open Charge Map dataset: {len(south)}")
print(f"📊 Total points in ONS dataset: {len(ons)}\n")

# Preview sample matches and unmatched
if matches:
    print("📌 Example match:")
    i, j = matches[0]
    print("- OCM entry:", south.loc[i, ['town', 'latitude', 'longitude']].to_dict())
    print("- ONS entry:", ons.loc[j, ['town', 'latitude', 'longitude']].to_dict())

if unmatched:
    print("\n⚠️ Example unmatched entry from Open Charge Map:")
    print(south.loc[unmatched[0], ['town', 'latitude', 'longitude']].to_dict())



🔎 Approximate Spatial Match Summary (within 200 metres):

✅ Matched charging points: 128
❌ Unmatched charging points from Open Charge Map (south): 1
📊 Total points in Open Charge Map dataset: 129
📊 Total points in ONS dataset: 388

📌 Example match:
- OCM entry: {'town': 'cardiff', 'latitude': 51.82588, 'longitude': -3.017314}
- ONS entry: {'town': 'abergavenny', 'latitude': 51.824154, 'longitude': -3.017032}

⚠️ Example unmatched entry from Open Charge Map:
{'town': 'swansea', 'latitude': 51.61982557450335, 'longitude': -3.875323821305528}
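
The loop above stops at the first ONS point inside the tolerance, so the reported partner is not necessarily the closest one. As an alternative sketch (not the notebook's method), the function below picks the nearest candidate. It uses the haversine great-circle formula from the standard library in place of geopy's ellipsoidal `geodesic`, so distances can differ by a fraction of a percent. The names `haversine_km` and `nearest_within` are hypothetical.

```python
from math import radians, sin, cos, asin, sqrt

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two (lat, lon) points given in degrees."""
    p1, p2 = radians(lat1), radians(lat2)
    dphi = p2 - p1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(p1) * cos(p2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * asin(sqrt(a))

def nearest_within(point, candidates, tolerance_km=0.2):
    """Return (key, distance_km) of the closest candidate within tolerance, else None."""
    best = None
    for key, (lat, lon) in candidates.items():
        d = haversine_km(point[0], point[1], lat, lon)
        if d < tolerance_km and (best is None or d < best[1]):
            best = (key, d)
    return best

# Coordinates of the example match printed above, plus one distant ONS point
ocm_point = (51.82588, -3.017314)
ons_points = {7415: (51.824154, -3.017032), 3212: (51.506864, -3.101256)}
result = nearest_within(ocm_point, ons_points)
print(result)
```

The example pair sits just inside the 200 m tolerance. Its town labels also differ (`cardiff` in Open Charge Map, `abergavenny` in ONS), so a spatial match does not guarantee that the two records agree on town.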


🔽 **Interactive Map of ONS EV Charging Points Provided by Paul Hagger**

Creates an interactive Folium map of the ONS charging sites in South Wales, colour-coded by operational status (green = operational, red = not operational, grey = unknown). Hovering over a marker shows its town; clicking it opens a popup listing every non-empty field for that site, including the coastal-proximity flag (`nearsea`).


In [None]:
import folium
import geopandas as gpd

# Normalize column names
ons.columns = ons.columns.str.lower()

# Standardize operational-related fields
column_map = {
    'isoperational': 'operational',
    'status': 'operational',
    'level.isfastchargecapable': 'fastcharge',
    'datelaststatusupdate': 'lastupdate'
}
for old, new in column_map.items():
    if old in ons.columns:
        ons.rename(columns={old: new}, inplace=True)

# Drop rows missing coordinates
map_data = ons.dropna(subset=['latitude', 'longitude']).copy()

# GeoDataFrame
geometry = gpd.points_from_xy(map_data['longitude'], map_data['latitude'])
gdf = gpd.GeoDataFrame(map_data, geometry=geometry, crs="EPSG:4326")

# Add coastal flag
def is_near_sea(lat, lon):
    return (51.3 <= lat <= 51.7) and (lon <= -3.5)

gdf['nearsea'] = gdf.apply(lambda row: is_near_sea(row['latitude'], row['longitude']), axis=1)

# Build map
m = folium.Map(location=[51.5, -3.1], zoom_start=9, tiles='CartoDB Positron')

# Auto-display all columns (except lat/lon/geometry)
excluded = {'latitude', 'longitude', 'geometry'}
hover_cols = [col for col in gdf.columns if col not in excluded]

for _, row in gdf.iterrows():
    status = str(row.get('operational', '')).lower()
    color = 'green' if status == 'yes' else 'red' if status == 'no' else 'gray'

    # Auto-format tooltip
    tooltip_html = ""
    for col in hover_cols:
        val = row.get(col)
        if pd.notnull(val):
            tooltip_html += f"<b>{col.title()}:</b> {val}<br>"

    folium.Marker(
        location=[row['latitude'], row['longitude']],
        popup=folium.Popup(tooltip_html, max_width=300),
        tooltip=row.get('town', row.get('name', 'Point')),
        icon=folium.Icon(color=color, icon='charging-station', prefix='fa')
    ).add_to(m)
m
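
The marker colour depends on how the status field is encoded. The ONS column list printed later in this notebook includes `chargedevicestatus` rather than `status` or `isoperational`, so the rename above may leave no `operational` column, in which case every marker falls back to grey. A hedged sketch of a more tolerant mapping follows. The function name `status_to_color` and the accepted strings (including 'in service' and 'out of service') are assumptions, not values confirmed from the data.

```python
import math

# Assumed encodings; extend these sets after inspecting the actual column values
OPERATIONAL = {'yes', 'true', '1', '1.0', 'operational', 'in service'}
NOT_OPERATIONAL = {'no', 'false', '0', '0.0', 'not operational', 'out of service'}

def status_to_color(value):
    """Map a raw status value to a folium marker colour name."""
    # Treat None and NaN as unknown
    if value is None or (isinstance(value, float) and math.isnan(value)):
        return 'gray'
    text = str(value).strip().lower()
    if text in OPERATIONAL:
        return 'green'
    if text in NOT_OPERATIONAL:
        return 'red'
    return 'gray'

for raw in [True, 'No', 'In service', float('nan'), 'planned']:
    print(repr(raw), '->', status_to_color(raw))
```

In the map loop this could replace the inline conditional, for example `color = status_to_color(row.get('chargedevicestatus'))`, assuming that column holds the device status.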

🔽 **Sankey Diagram**

This code visualises the flow of EV charging points in South Wales (Cardiff, Swansea, Newport) using a pastel-coloured Sankey diagram. It traces each point from town, through operational status, precise location, power level, current type, and finally to its operator.

In [None]:
#  Install dependencies
!pip install plotly pandas

# Load CSV from Google Drive
import pandas as pd
import plotly.graph_objects as go

# Electric vehicle (EV) charging stations sourced from the Open Charge Map API
file_id = '1UAwZ6NoOXUtSA2Y-LEH6vCb-akPqdxz0'
url = f'https://drive.google.com/uc?id={file_id}'

df = pd.read_csv(url)

#  Clean & filter data
df = df.dropna(subset=['Town', 'IsOperational', 'PowerLevel', 'CurrentType', 'Operator', 'Latitude', 'Longitude', 'Postcode'])
df = df[df['Town'].isin(['Cardiff', 'Swansea', 'Newport'])]

#  Convert 1/0 to readable labels
df['IsOperational'] = df['IsOperational'].map({1: 'Operational', 0: 'Not Operational'})

#  Add location string
df['Location'] = df.apply(lambda row: f"{row['Postcode']} ({row['Latitude']:.2f}, {row['Longitude']:.2f})", axis=1)

# Define Sankey flow stages
stages = ['Town', 'IsOperational', 'Location', 'PowerLevel', 'CurrentType', 'Operator']

#  Generate unique labels
all_labels = pd.Series(dtype="str")
for i in range(len(stages) - 1):
    grouped = df.groupby([stages[i], stages[i+1]]).size().reset_index(name='count')
    all_labels = pd.concat([all_labels, grouped[stages[i]], grouped[stages[i+1]]])
all_labels = pd.Series(all_labels.unique())
label_idx = {label: i for i, label in enumerate(all_labels)}
labels = all_labels.tolist()

#  Create links
sources, targets, values, link_towns = [], [], [], []
for i in range(len(stages) - 1):
    stage_from = stages[i]
    stage_to = stages[i + 1]
    group_keys = ['Town'] if stage_from != 'Town' else []
    group_keys += [stage_from, stage_to]
    grouped = df.groupby(group_keys).size().reset_index(name='count')

    for _, row in grouped.iterrows():
        town = row['Town']
        src = label_idx[row[stage_from]]
        tgt = label_idx[row[stage_to]]
        sources.append(src)
        targets.append(tgt)
        values.append(row['count'])
        link_towns.append(town)

#  Pastel town colours for flow
town_colors = {
    "Cardiff": "#FFD3E0",  # light pink
    "Swansea": "#D0F0FD",  # light sky blue
    "Newport": "#E8E8E8"   # light grey
}
link_colors = [town_colors.get(town, "#cccccc") for town in link_towns]

#  All nodes in black
node_colors = ["#000000" for _ in labels]

#  Draw Sankey
fig = go.Figure(data=[go.Sankey(
    arrangement="snap",
    node=dict(
        pad=25,
        thickness=20,
        line=dict(color="white", width=0.5),
        label=labels,
        color=node_colors
    ),
    link=dict(
        source=sources,
        target=targets,
        value=values,
        color=link_colors
    )
)])

fig.update_layout(
    title_text="🔌 South Wales EV Charging Flow: Town → Status → Location → Specs",
    font=dict(size=10, color='black'),
    height=900,
    paper_bgcolor="white",
    plot_bgcolor="white"
)

fig.show()
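
The link-building logic is easier to follow on a toy example. The sketch below (hypothetical records, standard library only) shows how consecutive stage pairs become the `source`, `target` and `value` lists that `go.Sankey` expects: each distinct label gets an integer index, and each (from, to) pair is counted. Unlike the cell above, it does not split links by town for colouring.

```python
from collections import Counter

# Hypothetical records: one dict per charging point
records = [
    {'Town': 'Cardiff', 'IsOperational': 'Operational', 'CurrentType': 'AC'},
    {'Town': 'Cardiff', 'IsOperational': 'Operational', 'CurrentType': 'DC'},
    {'Town': 'Swansea', 'IsOperational': 'Not Operational', 'CurrentType': 'AC'},
]
stages = ['Town', 'IsOperational', 'CurrentType']

# Assign each distinct label an index in first-seen order
labels = []
for stage in stages:
    for rec in records:
        if rec[stage] not in labels:
            labels.append(rec[stage])
label_idx = {label: i for i, label in enumerate(labels)}

# Count flows between each pair of consecutive stages
flows = Counter()
for a, b in zip(stages, stages[1:]):
    for rec in records:
        flows[(label_idx[rec[a]], label_idx[rec[b]])] += 1

sources = [s for s, _ in flows]
targets = [t for _, t in flows]
values = list(flows.values())
print(labels)
print(list(zip(sources, targets, values)))
```

Because labels are indexed by their text alone, a value that appears in two different stages would collapse into one node; prefixing labels with the stage name is one way to avoid that.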




🔽 **Detailed Interactive Map of Open Charge Map EV Data with Operational and Location Attributes**

This code visualises the South Wales subset of the Open Charge Map dataset on an interactive map, enriched with operational status and technical details. It:

* Normalises and standardises key column names for consistency,
* Filters out entries without valid geographic coordinates,
* Builds a GeoDataFrame including only relevant attributes (e.g. power, operator, fast-charging capability),
* Flags whether each point lies near the South Wales coast,
* Uses colour-coded markers to indicate operational status (`green` = operational, `red` = not operational, `gray` = unknown),
* Displays detailed popups with power rating, current type, operator, last update, and proximity to the sea.

This provides a clear and information-rich spatial overview of community-curated EV charging infrastructure in the region.


In [None]:
import folium
import geopandas as gpd
import pandas as pd

# 1. Normalize columns
south.columns = south.columns.str.lower()

# 2. Standardize key column names if present
column_map = {
    'isoperational': 'operational',
    'status': 'operational',
    'level.isfastchargecapable': 'fastcharge',
    'datelaststatusupdate': 'lastupdate'
}
for old, new in column_map.items():
    if old in south.columns:
        south.rename(columns={old: new}, inplace=True)

# 3. Check that the lat/lon columns exist (rows with missing coordinates are dropped in step 5)
if 'latitude' not in south.columns or 'longitude' not in south.columns:
    raise ValueError("Missing 'latitude' or 'longitude' columns in the dataset.")

# 4. Select available columns for the map
required = ['latitude', 'longitude', 'town', 'powerkw', 'powerlevel', 'currenttype',
            'operational', 'fastcharge', 'operator', 'lastupdate']
available = [col for col in required if col in south.columns]

# 5. Drop rows with null coordinates
map_data = south.dropna(subset=['latitude', 'longitude']).copy()

# 6. Create geometry
geometry = gpd.points_from_xy(map_data['longitude'], map_data['latitude'])

# 7. Convert to GeoDataFrame
gdf = gpd.GeoDataFrame(map_data[available], geometry=geometry, crs="EPSG:4326")

# 8. Add near sea filter
def is_near_sea(lat, lon):
    return (51.3 <= lat <= 51.7) and (lon <= -3.5)

gdf['nearsea'] = gdf.apply(lambda row: is_near_sea(row['latitude'], row['longitude']), axis=1)

# 9. Create map
m = folium.Map(location=[51.5, -3.1], zoom_start=9, tiles='CartoDB Positron')

for _, row in gdf.iterrows():
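    # Color by operational status; the flag may be stored as yes/no, True/False or 1/0
    status = str(row.get('operational', '')).lower()
    color = ('green' if status in {'yes', 'true', '1', '1.0'}
             else 'red' if status in {'no', 'false', '0', '0.0'} else 'gray')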

    tooltip = (
        f"Town: {row.get('town', 'N/A')}<br>"
        f"Power: {row.get('powerkw', 'N/A')} kW ({row.get('powerlevel', '-')})<br>"
        f"Current: {row.get('currenttype', '-')}<br>"
        f"Fast Charge: {row.get('fastcharge', '-')}<br>"
        f"Operator: {row.get('operator', '-')}<br>"
        f"Updated: {row.get('lastupdate', '-')}<br>"
        f"Near Sea: {'Yes' if row.get('nearsea') else 'No'}"
    )

    folium.Marker(
        location=[row['latitude'], row['longitude']],
        popup=folium.Popup(tooltip, max_width=300),
        tooltip=row.get('town', 'Point'),
        icon=folium.Icon(color=color, icon='charging-station', prefix='fa')
    ).add_to(m)

m

# 📊 Comparative Analysis of South Wales and ONS EV Charging Datasets
This notebook compares two datasets:
- **South Wales dataset from Open Charge Map** (`south`)
- **ONS (Oct24) dataset filtered for South Wales towns**

It includes:
1. Descriptive statistics and data quality summary
2. Comparative tables of structure and coverage
3. Spatial matching of locations (within 200m)
4. Identification and visualisation of unmatched charging points
5. Recommendations on dataset selection based on completeness and coverage

In [None]:
# Descriptive Summary of Both Datasets
def describe_dataset(df, name):
    print(f'📌 {name}')
    print('Total Rows:', len(df))
    print('Columns:', list(df.columns))
    print('Missing Values:')
    print(df.isnull().sum())
    print('\nSample Rows:')
    display(df.head(3))
    print('\nTowns Covered:', df['town'].nunique())
    print('Town Distribution:')
    print(df['town'].value_counts())
    print('\n' + '-'*60 + '\n')

describe_dataset(south, 'South Wales Dataset')
describe_dataset(ons, 'ONS Oct24 South Wales Dataset')

📌 South Wales Dataset
Total Rows: 129
Columns: ['isrecentlyverified', 'datelastverified', 'id', 'uuid', 'dataproviderid', 'operatorid', 'usagetypeid', 'usagecost', 'numberofpoints', 'statustypeid', 'lastupdate', 'dataqualitylevel', 'datecreated', 'submissionstatustypeid', 'dataprovider.websiteurl', 'dataprovider.dataproviderstatustype.isproviderenabled', 'dataprovider.dataproviderstatustype.id', 'dataprovider.dataproviderstatustype.title', 'dataprovider.isrestrictededit', 'dataprovider.isopendatalicensed', 'dataprovider.isapprovedimport', 'dataprovider.license', 'dataprovider.id', 'dataprovider.title', 'operatorinfo.websiteurl', 'operatorinfo.phoneprimarycontact', 'operatorinfo.isprivateindividual', 'operatorinfo.contactemail', 'operatorinfo.isrestrictededit', 'operatorinfo.id', 'operator', 'usagetype.ispayatlocation', 'usagetype.ismembershiprequired', 'usagetype.isaccesskeyrequired', 'usagetype.id', 'usagetype', 'operational', 'statustype.isuserselectable', 'statustype.id', 'statustyp

Unnamed: 0,isrecentlyverified,datelastverified,id,uuid,dataproviderid,operatorid,usagetypeid,usagecost,numberofpoints,statustypeid,...,fastcharge,level.id,powerlevel,currenttype.description,currenttype.id,currenttype,amperage,voltage,reference,comments
0,False,2023-12-07T08:12:00Z,286787,42A00D66-EA94-417F-A317-F029E5F5C133,18,3509.0,1,,,0.0,...,False,2.0,Level 2 : Medium (Over 2kW),Alternating Current - Single Phase,10.0,AC (Single-Phase),32.0,230.0,5205,
1,False,2023-12-07T08:12:00Z,286787,42A00D66-EA94-417F-A317-F029E5F5C133,18,3509.0,1,,,0.0,...,False,2.0,Level 2 : Medium (Over 2kW),Alternating Current - Single Phase,10.0,AC (Single-Phase),32.0,230.0,5206,
2,False,2023-12-07T08:12:00Z,286785,1419A724-FF8C-461B-8DBE-3B6EE5F3AE8F,18,3509.0,1,,,0.0,...,False,2.0,Level 2 : Medium (Over 2kW),Alternating Current - Three Phase,20.0,AC (Three-Phase),32.0,400.0,3063,



Towns Covered: 3
Town Distribution:
town
cardiff    90
newport    31
swansea     8
Name: count, dtype: int64

------------------------------------------------------------

📌 ONS Oct24 South Wales Dataset
Total Rows: 388
Columns: ['chargedeviceid', 'reference', 'name', 'latitude', 'longitude', 'subbuildingname', 'buildingname', 'buildingnumber', 'thoroughfare', 'street', 'doubledependantlocality', 'dependantlocality', 'town', 'county', 'postcode', 'countrycode', 'uprn', 'devicedescription', 'locationshortdescription', 'locationlongdescription', 'devicemanufacturer', 'devicemodel', 'deviceownername', 'deviceownerwebsite', 'deviceownertelephoneno', 'deviceownercontactname', 'devicecontrollername', 'devicecontrollerwebsite', 'devicecontrollertelephoneno', 'devicecontrollercontactname', 'devicenetworks', 'chargedevicestatus', 'publishstatus', 'devicevalidated', 'datecreated', 'dateupdated', 'moderated', 'lastupdated', 'lastupdatedby', 'attribution', 'datedeleted', 'paymentrequired', 'payme

Unnamed: 0,chargedeviceid,reference,name,latitude,longitude,subbuildingname,buildingname,buildingnumber,thoroughfare,street,...,connector8type,connector8ratedoutputkw,connector8outputcurrent,connector8ratedvoltage,connector8chargemethod,connector8chargemode,connector8tetheredcable,connector8status,connector8description,connector8validated
3212,a6ade5aa93b826f8de63c663e1159bf7,PP-12398,G24 Innovations,51.506864,-3.101256,,,,Wentloog Avenue,CF3 2GH,...,,,,,,,,,,
3253,8dcf2420e78a64333a59674678fb283b,PP-12311,Wessex Garages Cardiff,51.468083,-3.206283,,,24.0,Hadfield Road,,...,,,,,,,,,,
3343,cab73666e96e6d796b7d69fbe67d87a4,PP-5112119,Bassetts Nissan,51.657556,-3.925563,,,,,"Neath Road, Morriston, Swansea",...,,,,,,,,,,



Towns Covered: 13
Town Distribution:
town
cardiff           130
newport           108
swansea            34
barry              30
bridgend           19
tenby              17
carmarthen         14
llanelli           12
merthyr tydfil      8
pontypridd          6
neath               4
abergavenny         4
port talbot         2
Name: count, dtype: int64

------------------------------------------------------------



In [None]:
# Comparative Summary Table
summary = pd.DataFrame({
    'Dataset': ['South Wales', 'ONS Oct24 (South Wales)'],
    'Source': ['Open Charge Map (community-curated)', 'ONS National'],
    'Total Charging Points': [len(south), len(ons)],
    'Missing Values': [south.isnull().sum().sum(), ons.isnull().sum().sum()],
    'Number of Columns': [len(south.columns), len(ons.columns)],
    'Towns Covered': [south['town'].nunique(), ons['town'].nunique()]
})
summary

| | Dataset | Source | Total Charging Points | Missing Values | Number of Columns | Towns Covered |
|---|---|---|---|---|---|---|
| 0 | South Wales | Open Charge Map (community-curated) | 129 | 2821 | 100 | 3 |
| 1 | ONS Oct24 (South Wales) | ONS National | 388 | 40578 | 158 | 13 |
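
Raw missing-value totals are hard to compare because the two tables have different shapes. Dividing by the number of cells (rows × columns) gives a comparable share. The sketch below uses the figures from the summary table above; the helper name `missing_share` is illustrative.

```python
def missing_share(missing, rows, cols):
    """Fraction of cells that are empty in a rows x cols table."""
    return missing / (rows * cols)

# Figures taken from the summary table above
south_share = missing_share(2821, 129, 100)
ons_share = missing_share(40578, 388, 158)

print(f"South Wales (Open Charge Map): {south_share:.1%} of cells missing")
print(f"ONS Oct24 (South Wales): {ons_share:.1%} of cells missing")
```

On these figures roughly a fifth of the Open Charge Map cells and about two thirds of the ONS cells are empty. This is consistent with the ONS schema reserving columns for up to eight connectors per device, most of which are unused.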


In [None]:
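# Spatial Matching within 200m
matches, unmatched = match_by_location(south, ons)
matched_ons_indices = {j for i, j in matches}
# ONS rows that were paired with at least one Open Charge Map point
matched_ons = ons[ons.index.isin(matched_ons_indices)]
# ONS rows not selected as a match for any Open Charge Map point (the "extra" points)
unmatched_ons = ons[~ons.index.isin(matched_ons_indices)]
print(f'Total Matches: {len(matches)}')
print(f'Distinct ONS points matched: {len(matched_ons)}')
print(f'Total Unmatched in ONS (extra points): {len(unmatched_ons)}')
matched_ons

Total Matches: 128
Distinct ONS points matched: 61
Total Unmatched in ONS (extra points): 327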


Unnamed: 0,chargedeviceid,reference,name,latitude,longitude,subbuildingname,buildingname,buildingnumber,thoroughfare,street,...,connector8type,connector8ratedoutputkw,connector8outputcurrent,connector8ratedvoltage,connector8chargemethod,connector8chargemode,connector8tetheredcable,connector8status,connector8description,connector8validated
4066,4a6d0b5bc39ed0a26b04afec1026b984,PG-81030,Tesco Extra - Llansamlet,51.658701,-3.901185,,,,,Nantyffin Road,...,,,,,,,,,,
5098,910db7dbc1ab3938b6d0662c93d96938,PG-83820,Tesco Superstore - St Mellons,51.524775,-3.103756,,,,,Crickhowell Road,...,,,,,,,,,,
6728,e3a9682e949423ecdcbe7e0a0b2ff990,ENG00456,King Brychan Pub & Restaurant,51.737890,-3.378350,,King Brychan Pub & Restaurant,,,Rhydycar Leisure Park,...,,,,,,,,,,
7415,95c6345b5fd08f41600f910e97b50b4c,GP11432,Morrisons Abergavenny,51.824154,-3.017032,,Wm Morrison Supermarkets PLC,,Park Road,,...,,,,,,,,,,
11688,454cba7bd267c3f60d982416d06516f6,SEC20002,Anglesey Street,51.483130,-3.207070,,,,Anglesey Street,,...,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
27188,9205ff0c6662aff0bce23adfc2d803ac,001662,The Strand Car Park,51.619952,-3.939262,,,,,The Strand,...,,,,,,,,,,
27191,31d0c3e205bd5ddb71c43e67e7015870,001749,Blackpill Car Park,51.599391,-3.994256,,,,,Derwen Fawr Road,...,,,,,,,,,,
33427,9a6a35909bc1777ae285a67541020893,DRAX00165,DRAX00165,51.473010,-3.166070,,Novotel Cardiff,,,Schooner Way,...,,,,,,,,,,
38032,121d5ac2ef4dae330b0635c52f8025d6,GBCPIE28657651,BEMIS-SWAN STATION 1,51.648968,-3.919454,,,,,Siemens Way Enterprise Park,...,,,,,,,,,,
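
Because `match_by_location` stops at the first ONS point within tolerance, several Open Charge Map rows can share the same ONS index, so the number of pairs and the number of distinct ONS points differ. A toy sketch of separating matched from unmatched ONS indices follows; the pairs and row labels are hypothetical.

```python
# Hypothetical (ocm_index, ons_index) pairs, as returned by a first-match search
pairs = [(0, 7415), (1, 7415), (2, 4066), (3, 4066), (4, 5098)]
ons_index = [3212, 4066, 5098, 7415, 27188]  # toy ONS row labels

# Distinct ONS rows that were used in at least one pair
matched_ons = {j for _, j in pairs}
# ONS rows that never appear in a pair
unmatched_ons = [j for j in ons_index if j not in matched_ons]

print("pairs:", len(pairs))
print("distinct ONS matched:", len(matched_ons))
print("ONS unmatched:", unmatched_ons)
```

With a pandas DataFrame the complement is selected with the `~` operator, for example `ons[~ons.index.isin(matched_ons)]`.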


🔽 **Map of Unmatched ONS EV Charging Points**

Plots the ONS EV charging points that have no counterpart in the Open Charge Map dataset, using light blue markers and detailed popups on a South Wales map.


In [None]:
import folium
import pandas as pd

# Ensure column names are lowercase
unmatched_ons.columns = unmatched_ons.columns.str.lower()

# Columns to exclude from popup rendering
excluded = {'latitude', 'longitude', 'geometry'}

# Use all other columns dynamically
hover_cols = [col for col in unmatched_ons.columns if col not in excluded]

# Create map
m = folium.Map(location=[51.5, -3.1], zoom_start=9, tiles='CartoDB Positron')

# Add each unmatched charging point with detailed popup
for _, row in unmatched_ons.iterrows():
    # Build tooltip from available fields
    tooltip_html = ""
    for col in hover_cols:
        val = row.get(col)
        if pd.notnull(val):
            tooltip_html += f"<b>{col.title()}:</b> {val}<br>"

    folium.Marker(
        location=[row['latitude'], row['longitude']],
        popup=folium.Popup(tooltip_html, max_width=300),
        tooltip=row.get('town', row.get('name', 'Point')),
        icon=folium.Icon(color='lightblue', icon='charging-station', prefix='fa')
    ).add_to(m)

m
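
Field values are interpolated into the popup HTML as-is, so a value containing `<` or `&` could break the markup. Below is a minimal sketch of a popup builder that escapes values with the standard library's `html.escape`; the function name `build_popup_html` and the sample row are hypothetical.

```python
import html
import math

def build_popup_html(row, exclude=('latitude', 'longitude', 'geometry')):
    """Build '<b>Field:</b> value<br>' lines for every non-empty field in a dict-like row."""
    parts = []
    for col, val in row.items():
        if col in exclude:
            continue
        # Skip None and NaN, mirroring the pd.notnull check in the cell above
        if val is None or (isinstance(val, float) and math.isnan(val)):
            continue
        parts.append(f"<b>{html.escape(str(col).title())}:</b> {html.escape(str(val))}<br>")
    return "".join(parts)

sample = {'name': 'Tesco <Extra> & Co', 'town': 'swansea',
          'latitude': 51.66, 'uprn': float('nan')}
popup = build_popup_html(sample)
print(popup)
```

A pandas row from `iterrows()` also exposes `.items()`, so the function could be passed `row` directly in place of the inline loop.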

**Recommendation**
- The **ONS dataset includes more charging points** (388 vs. 129) and broader spatial coverage (13 towns vs. 3).
- The **Open Charge Map data** is community-curated but sparser: it covers only Cardiff, Newport and Swansea, and only one of its 129 points has no ONS point within 200 m.
- For operational planning or monitoring availability, the **ONS dataset is recommended** due to its broader coverage and national standardisation.