## Data Processing 

- This notebook processes the two datasets required to visualise coal tip locations in Wales:
  1. Local Authority boundaries (`LA`)
  2. Welsh coal tip locations (`tips`)
- It applies basic processing: filtering the boundary file to include only Welsh Local Authorities, and converting the coordinate reference system (CRS) of the coal tip data to EPSG:4326.

### Import dependencies and read files

In [None]:
import geopandas as gpd
# boundaries data
LA = gpd.read_file("./LA_boundaries_initial.geojson")
# tip data
tips = gpd.read_file("./coal_tip_centroids_initial.gpkg")

### Local Authorities Boundaries Data 

In [None]:
LA.head()

# filtering to only Wales - "LAD25NMW" is the column for Welsh names
WLA = LA[LA["LAD25NMW"].notnull() & (LA["LAD25NMW"].str.strip() != "")]

# checking there are 22 LAs (the number in Wales)
WLA.shape

(22, 10)
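
The filter above keeps rows whose Welsh-name column is neither null nor blank. A minimal sketch of that behaviour on a toy table follows. The rows and the `name_en` column are illustrative assumptions, not taken from the boundary file; only the `LAD25NMW` column name comes from the notebook.

```python
import pandas as pd

# Toy stand-in for the boundary table; rows and the "name_en" column are illustrative only.
toy = pd.DataFrame(
    {
        "name_en": ["Cardiff", "Powys", "Bristol", "Leeds"],
        "LAD25NMW": ["Caerdydd", "Powys", None, "  "],
    }
)

# Same condition as the notebook: Welsh name present and not just whitespace.
has_welsh_name = toy["LAD25NMW"].notnull() & (toy["LAD25NMW"].str.strip() != "")
welsh_only = toy[has_welsh_name]

# Fail loudly if the row count is not what we expect (22 for the real file, 2 here).
assert len(welsh_only) == 2, f"expected 2 rows, got {len(welsh_only)}"
print(welsh_only["name_en"].tolist())
```

Turning the row-count check into an `assert` means a future boundary release with a renamed column or a changed number of authorities stops the notebook instead of silently producing a wrong file.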

### Coal Tip Data

In [7]:
# converting coordinate reference system to EPSG:4326
print("Initial CRS:", tips.crs) # initial
tips = tips.to_crs(epsg=4326)
print("New CRS:", tips.crs) # updated

Initial CRS: EPSG:27700
New CRS: EPSG:4326
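
As a sanity check on the reprojection, the sketch below rebuilds one point from the `tipX` and `tipY` values of the first row in the tips table further down in this notebook and converts it the same way. It assumes those two columns hold British National Grid eastings and northings, which is consistent with the initial CRS printed above. The result should land close to the `POINT (-3.76981 51.75755)` geometry listed for that row, though small differences are possible depending on which transformation grids are installed.

```python
import geopandas as gpd
from shapely.geometry import Point

# Easting/northing of the first tip in this notebook (tipX, tipY), in British National Grid.
one_tip = gpd.GeoDataFrame(
    {"UID": ["T92814"]},
    geometry=[Point(277943.654971, 208023.135150)],
    crs="EPSG:27700",
)

# Reproject to WGS 84; geopandas stores x as longitude and y as latitude.
one_tip_wgs84 = one_tip.to_crs(epsg=4326)
lon = one_tip_wgs84.geometry.x.iloc[0]
lat = one_tip_wgs84.geometry.y.iloc[0]
print(round(lon, 3), round(lat, 3))
```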


In [8]:
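# checking all tip categorisations (returns None as one of the values)
tips["cat"].unique()
# finding the row with a null category
tips[tips["cat"].isnull()]
# removing the empty row; dropna returns a new frame, so the result is assigned back
# (how="all" only drops rows where every column is null)
tips = tips.dropna(how="all")
tips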

| | UID | cat | awdurdod_cymreig | authority_english | last_inspected | arolygiad_diwethaf | tipX | tipY | geometry |
|---|---|---|---|---|---|---|---|---|---|
| 0 | T92814 | B | Powys | Powys | 09/05/2024 | 09/05/2024 | 277943.654971 | 208023.135150 | POINT (-3.76981 51.75755) |
| 1 | T38748 | B | Pen-y-bont ar Ogwr | Bridgend | 15/08/2023 | 15/08/2023 | 286554.488192 | 194019.095634 | POINT (-3.64056 51.63351) |
| 2 | T71213 | A | Rhondda Cynon Taf | Rhondda Cynon Taf | Pending Inspection | Disgwyl Arolygiad | 300741.027220 | 199293.553911 | POINT (-3.43715 51.6836) |
| 3 | T47637 | B | Caerdydd | Cardiff | 17/08/2023 | 17/08/2023 | 308438.219001 | 183016.239662 | POINT (-3.32159 51.53859) |
| 4 | T25623 | A | Merthyr Tudful | Merthyr Tydfil | Pending Inspection | Disgwyl Arolygiad | 305500.960704 | 203229.704540 | POINT (-3.36939 51.7198) |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2569 | T93419 | C | Castell-nedd Port Talbot | Neath Port Talbot | 06/11/2024 | 06/11/2024 | 281399.265369 | 202704.768464 | POINT (-3.71796 51.7105) |
| 2570 | T32475 | C | Rhondda Cynon Taf | Rhondda Cynon Taf | 19/02/2025 | 19/02/2025 | 292553.469084 | 201140.879480 | POINT (-3.5561 51.69869) |
| 2571 | T66850 | C | Rhondda Cynon Taf | Rhondda Cynon Taf | 19/02/2025 | 19/02/2025 | 292630.036481 | 200317.246145 | POINT (-3.55474 51.69131) |
| 2572 | T43455 | C | Rhondda Cynon Taf | Rhondda Cynon Taf | 19/02/2025 | 19/02/2025 | 292507.847789 | 200586.271646 | POINT (-3.55659 51.6937) |
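
Two details of `dropna` matter for this step: it returns a new frame rather than modifying `tips`, and `how="all"` removes a row only when every column in it is null. The toy table below (illustrative values, not from the tips data) is a small sketch of both points, with `subset=["cat"]` shown as one possible alternative if the aim is to drop any row that lacks a category.

```python
import pandas as pd

# Toy table with one fully empty row and one row missing only "cat"; values are illustrative.
toy = pd.DataFrame(
    {
        "UID": ["T1", None, "T3"],
        "cat": ["A", None, None],
    }
)

# dropna returns a new DataFrame; without assignment, `toy` itself is unchanged.
toy.dropna(how="all")
unchanged_len = len(toy)

# how="all" drops only rows where every column is null.
no_empty_rows = toy.dropna(how="all")

# subset=["cat"] drops any row with a missing category, even if other columns are filled.
with_category = toy.dropna(subset=["cat"])

print(unchanged_len, len(no_empty_rows), len(with_category))  # prints: 3 2 1
```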


### Save files

In [None]:
# further processing is required to turn this file into topodata;
# the final file ready for visualisation will be put in the processed_data folder
WLA.to_file("WLAs.geojson", driver="GeoJSON")
tips.to_file("processed_data/tips_processed.geojson", driver="GeoJSON")