# Reduce size of geojson files

When plotting geojson polygons in folium, all of the information in the geojson is stored in the map even though only some of it is needed to produce an accurate image. To reduce the file size and so have faster-loading maps, this notebook does two things to the features in a geojson file:

+ Reduce the precision of the coordinates to 4 decimal places (from up to 15, e.g. [-0.099722888517677, 51.5167693121822]).
+ Remove any field except coordinates and an identifier, in this case LSOA11CD.

## Setup

In [26]:
# For importing and saving geojson:
import json

# For copying the geojson before editing it:
import copy

## Import the geojson

In [17]:
with open('./LSOA_(Dec_2011)_Boundaries_Super_Generalised_Clipped_(BSC)_EW_V3.geojson') as f:
    geojson_ew = json.load(f)

Check which fields are stored in the geojson:

In [18]:
geojson_ew.keys()

dict_keys(['type', 'name', 'crs', 'features'])

In [19]:
geojson_ew['type']

'FeatureCollection'

In [20]:
geojson_ew['name']

'LSOA_(Dec_2011)_Boundaries_Super_Generalised_Clipped_(BSC)_EW_V3'

In [21]:
geojson_ew['crs']

{'type': 'name', 'properties': {'name': 'urn:ogc:def:crs:OGC:1.3:CRS84'}}

The fields above can stay unchanged because they are necessary and/or take up little space. The data we'll cut is anything extraneous in the features. Have a look at the first feature:

In [22]:
geojson_ew['features'][0]

{'type': 'Feature',
 'properties': {'OBJECTID': 1,
  'LSOA11CD': 'E01000001',
  'LSOA11NM': 'City of London 001A',
  'LSOA11NMW': 'City of London 001A',
  'BNG_E': 532129,
  'BNG_N': 181625,
  'LONG': -0.09706,
  'LAT': 51.5181,
  'Shape__Area': 157794.481079102,
  'Shape__Length': 1685.39177789522,
  'GlobalID': 'b12173a3-5423-4672-a5eb-f152d2345f96'},
 'geometry': {'type': 'Polygon',
  'coordinates': [[[-0.094744468765127, 51.5205961026855],
    [-0.095455174414778, 51.5154416842748],
    [-0.099722888517677, 51.5167693121822],
    [-0.098498304750799, 51.5205398973512],
    [-0.097265555652221, 51.5215848107683],
    [-0.094744468765127, 51.5205961026855]]]}}

We must retain the `type`, `properties` and `geometry` fields but can reduce the content within them.

+ From `properties`, we only need one identifier, so we'll keep `LSOA11CD` and drop the rest.
+ From `coordinates`, we don't need this much precision, so we'll round all values to four decimal places.
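
For a sense of what rounding gives up, the sketch below estimates the ground distance represented by one unit in the last retained decimal place. It assumes a spherical Earth with a mean radius of 6,371 km, and the function name `ground_resolution_m` is hypothetical rather than part of this notebook.

```python
import math

EARTH_RADIUS_M = 6_371_000  # assumed mean Earth radius (spherical approximation)


def ground_resolution_m(decimal_places, latitude_deg=0.0):
    """Approximate size in metres of one unit in the last decimal place.

    Returns (north_south, east_west). East-west distances shrink with
    the cosine of the latitude.
    """
    step_deg = 10 ** -decimal_places
    metres_per_degree = 2 * math.pi * EARTH_RADIUS_M / 360
    north_south = step_deg * metres_per_degree
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west


# Latitude 51.5 is roughly that of the first feature shown above.
for places in (4, 6):
    ns, ew = ground_resolution_m(places, latitude_deg=51.5)
    print(f'{places} decimal places: ~{ns:.2f} m north-south, ~{ew:.2f} m east-west')
```

Under these assumptions one step in the fourth decimal place is about 11 m north-south, so rounding moves a point by at most about half of that in each direction.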

In [40]:
geojson_new = copy.deepcopy(geojson_ew)

In [41]:
for feature in geojson_new['features']:
    # Keep only the identifier.
    new_properties = {'LSOA11CD': feature['properties']['LSOA11CD']}

    geometry = feature['geometry']
    # A Polygon is a list of rings: the outer boundary, then any holes.
    # A MultiPolygon is a list of Polygons, so it is nested one level
    # deeper. (Have MultiPolygon e.g. when an LSOA on the mainland
    # coastline also contains an island, so there's a gap between areas.)
    is_multi = geometry['type'] == 'MultiPolygon'
    polygons = geometry['coordinates'] if is_multi else [geometry['coordinates']]

    # Round every coordinate pair in every ring, so holes are kept too.
    new_polygons = [
        [
            [[round(coords[0], 4), round(coords[1], 4)] for coords in ring]
            for ring in polygon
        ]
        for polygon in polygons
    ]

    # Overwrite the old data with the new, restoring the original nesting:
    feature['properties'] = new_properties
    geometry['coordinates'] = new_polygons if is_multi else new_polygons[0]

In [42]:
geojson_new['features'][0]

{'type': 'Feature',
 'properties': {'LSOA11CD': 'E01000001'},
 'geometry': {'type': 'Polygon',
  'coordinates': [[[-0.0947, 51.5206],
    [-0.0955, 51.5154],
    [-0.0997, 51.5168],
    [-0.0985, 51.5205],
    [-0.0973, 51.5216],
    [-0.0947, 51.5206]]]}}
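
A quick consistency check can confirm that the reduction changed only what was intended. The sketch below is a suggestion rather than part of the original notebook, and the helper names `count_points` and `check_reduction` are hypothetical. It compares feature counts, identifiers, geometry types and the number of coordinate pairs, using a tiny made-up example rather than real LSOA data.

```python
def count_points(coordinates):
    """Count coordinate pairs in an arbitrarily nested coordinates list."""
    # A position is a list whose first element is a number.
    if coordinates and isinstance(coordinates[0], (int, float)):
        return 1
    return sum(count_points(part) for part in coordinates)


def check_reduction(original, reduced, id_field='LSOA11CD'):
    """Raise AssertionError if the reduced GeoJSON lost features or points."""
    assert len(original['features']) == len(reduced['features'])
    for old, new in zip(original['features'], reduced['features']):
        assert old['properties'][id_field] == new['properties'][id_field]
        assert old['geometry']['type'] == new['geometry']['type']
        assert (count_points(old['geometry']['coordinates'])
                == count_points(new['geometry']['coordinates']))


# Made-up single-feature collections to demonstrate the call:
example_old = {'type': 'FeatureCollection', 'features': [{
    'type': 'Feature',
    'properties': {'OBJECTID': 1, 'LSOA11CD': 'X00000001'},
    'geometry': {'type': 'Polygon', 'coordinates': [[
        [0.123456789, 50.123456789], [0.223456789, 50.123456789],
        [0.223456789, 50.223456789], [0.123456789, 50.123456789]]]},
}]}
example_new = {'type': 'FeatureCollection', 'features': [{
    'type': 'Feature',
    'properties': {'LSOA11CD': 'X00000001'},
    'geometry': {'type': 'Polygon', 'coordinates': [[
        [0.1235, 50.1235], [0.2235, 50.1235],
        [0.2235, 50.2235], [0.1235, 50.1235]]]},
}]}

check_reduction(example_old, example_new)  # passes silently
```

In this notebook the equivalent call would be `check_reduction(geojson_ew, geojson_new)`.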

Write new geojson to file:

In [43]:
# Save as a .geojson:
save_name = 'LSOA_(Dec_2011)_Boundaries_Super_Generalised_Clipped_(BSC)_EW_V3_reduced2.geojson'
with open('./'+save_name, 'w', encoding='utf-8') as f:
    json.dump(geojson_new, f, ensure_ascii=False)
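
To gauge the saving without comparing files on disk, the serialised size of each dictionary can be measured in memory. The sketch below is illustrative: `serialised_size` is a hypothetical helper, and the two example features are cut-down copies of the first feature shown earlier. It also shows `json.dumps` with `separators=(',', ':')`, which omits the spaces that the default separators put after commas and colons; the notebook itself does not use this option.

```python
import json


def serialised_size(obj, compact=False):
    """Return the size in bytes of obj serialised as UTF-8 JSON."""
    # separators=None keeps the json module's defaults (', ' and ': ').
    separators = (',', ':') if compact else None
    text = json.dumps(obj, ensure_ascii=False, separators=separators)
    return len(text.encode('utf-8'))


feature_before = {
    'type': 'Feature',
    'properties': {'OBJECTID': 1, 'LSOA11CD': 'E01000001',
                   'LSOA11NM': 'City of London 001A'},
    'geometry': {'type': 'Polygon', 'coordinates': [[
        [-0.094744468765127, 51.5205961026855],
        [-0.095455174414778, 51.5154416842748],
        [-0.094744468765127, 51.5205961026855]]]},
}
feature_after = {
    'type': 'Feature',
    'properties': {'LSOA11CD': 'E01000001'},
    'geometry': {'type': 'Polygon', 'coordinates': [[
        [-0.0947, 51.5206], [-0.0955, 51.5154], [-0.0947, 51.5206]]]},
}

print('before:          ', serialised_size(feature_before))
print('after:           ', serialised_size(feature_after))
print('after, compact:  ', serialised_size(feature_after, compact=True))
```

Applied to this notebook, `serialised_size(geojson_ew)` and `serialised_size(geojson_new)` would give the overall comparison, and passing the same `separators` argument to `json.dump` would apply the compaction to the saved file.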

# Reduce size of hospital coordinates

In [44]:
import pandas as pd

In [47]:
df_hospitals = pd.read_csv("./stroke_hospitals_2022.csv")

df_hospitals.head()

| Unnamed: 0 | Postcode | Hospital_name | Use_IVT | Use_MT | Use_MSU | Country | Strategic Clinical Network | Health Board / Trust | Stroke Team | SSNAP name | ... | Thrombolysis | ivt_rate | Easting | Northing | long | lat | Neuroscience | 30 England Thrombectomy Example | hospital_city | Notes |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | RM70AG | RM70AG | 1 | 1 | 1 | England | London SCN | Barking | Havering and Redbridge University Hospitals N... | Queens Hospital Romford HASU | ... | 117 | 11.9 | 551118 | 187780 | 0.179031 | 51.568647 | 1 | 0 | Romford | |
| 1 | E11BB | E11BB | 1 | 1 | 1 | England | London SCN | Barts Health NHS Trust | The Royal London Hospital | Royal London Hospital HASU | ... | 115 | 13.4 | 534829 | 181798 | -0.058133 | 51.519018 | 1 | 1 | Royal London | |
| 2 | SW66SX | SW66SX | 1 | 1 | 1 | England | London SCN | Imperial College Healthcare NHS Trust | Charing Cross Hospital, London | Charing Cross Hospital HASU | ... | 113 | 9.9 | 524226 | 176487 | -0.212736 | 51.473717 | 1 | 1 | Charing Cross | |
| 3 | SE59RW | SE59RW | 1 | 1 | 1 | England | London SCN | King's College Hospital NHS Foundation Trust | King's College Hospital, London | King's College Hospital HASU | ... | 124 | 15.0 | 532536 | 176228 | -0.093251 | 51.469505 | 1 | 0 | Kings College | |
| 4 | BR68ND | BR68ND | 1 | 0 | 0 | England | London SCN | King's College Hospital NHS Foundation Trust | Princess Royal University Hospital | Princess Royal University Hospital HASU | ... | 113 | 13.3 | 543443 | 165032 | 0.059146 | 51.366243 | 0 | 0 | Princess Royal | |


In [50]:
df_hospitals['long'] = df_hospitals['long'].round(4)
df_hospitals['lat'] = df_hospitals['lat'].round(4)

In [51]:
df_hospitals.to_csv('stroke_hospitals_22_reduced.csv', index=False)