This notebook converts the sample World Bank compound risk assessment data into GeoJSON for use in a mockup.


In [None]:
from io import StringIO

import pandas as pd
import geopandas as gpd

Sample data exported from a Google Sheets import of the Excel file sent over by L.Jones3@lse.ac.uk

In [None]:
# The export has two extra lines before the header row; drop them before parsing.

with open('/opt/src/data/world_bank_compound_risk_sample_data.csv') as f:
    csv_text = '\n'.join(f.read().split('\n')[2:])

wb_df = pd.read_csv(StringIO(csv_text))
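
As an alternative to trimming the text by hand, `pd.read_csv` accepts a `skiprows` argument. The sketch below uses a small inline string as a hypothetical stand-in for the export (the `RISK` column and the two leading lines are made up for illustration), so it only shows the idea rather than the real file layout.

```python
from io import StringIO

import pandas as pd

# Hypothetical stand-in for the exported sheet: two extra lines, then the header.
sample_export = (
    "Sheet export\n"
    "generated for mockup\n"
    "COUNTRY,RISK\n"
    "Bahamas,1\n"
    "Viet Nam,2\n"
)

# skiprows=2 skips the first two lines, so the third line becomes the header.
sample_df = pd.read_csv(StringIO(sample_export), skiprows=2)
print(sample_df)
```

With the real file, the same argument could be passed alongside the file path, which would make the `StringIO` step unnecessary.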

In [None]:
wb_df

Read in the countries as a GeoPandas GeoDataFrame, and check whether any country names in the World Bank data are missing from the countries dataset.

In [None]:
countries_gdf = gpd.read_file('/opt/src/data/countries.geojson')

In [None]:
c_names = set(countries_gdf['ADMIN'].values)
wb_names = set(wb_df['COUNTRY'].values)

wb_names - c_names


For each unmatched name, find the corresponding country in the countries dataset and rename it to force a match, or choose to skip it.

One way to check is to load the countries file in geojson.io and find the regions referred to above.
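
A rough first pass at finding candidates can also be done with the standard library's `difflib`. The sketch below is an assumption about how one might do this, not part of the original workflow. The two lists are hypothetical examples, and `suggest_matches` is a helper made up for illustration. In the notebook the inputs would be `wb_names - c_names` and `c_names`.

```python
import difflib

# Hypothetical example values; in the notebook these would come from
# wb_names - c_names and c_names respectively.
unmatched_wb_names = ['Viet Nam', 'Russian Federation', 'Czech Republic']
boundary_names = ['Vietnam', 'Russia', 'Czechia', 'France', 'Laos']


def suggest_matches(names, candidates, cutoff=0.6):
    """Map each name to a list of similar candidate names, best match first."""
    return {
        name: difflib.get_close_matches(name, candidates, n=3, cutoff=cutoff)
        for name in names
    }


suggestions = suggest_matches(unmatched_wb_names, boundary_names)
for name, matches in suggestions.items():
    print(name, '->', matches)
```

String similarity only catches near-identical spellings. Pairs such as 'Ivory Coast' and "Côte d'Ivoire" still need a manual lookup, and an empty suggestion list does not mean the country is absent.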

In [None]:
countries_renamed_gdf = countries_gdf.replace({
    'The Bahamas': 'Bahamas',
    'Brunei': 'Brunei Darussalam',
    'Republic of the Congo': 'Congo',
    'Democratic Republic of the Congo': 'Congo DR',
    'Czechia': 'Czech Republic',
    "Ivory Coast": "Côte d'Ivoire",
    'Swaziland': 'Eswatini',
    'North Korea': 'Korea DPR',
    'South Korea': 'Korea Republic of',
    'Laos': 'Lao PDR',
    'Federated States of Micronesia': 'Micronesia',
    'Moldova': 'Moldova Republic of',
    'Macedonia': 'North Macedonia',
    'Russia': 'Russian Federation',
    'Republic of Serbia': 'Serbia',
    'United Republic of Tanzania': 'Tanzania',
    'East Timor': 'Timor-Leste',
    'Vietnam': 'Viet Nam',
    })

Recheck to make sure we have everything accounted for.

In [None]:
set(wb_df['COUNTRY'].values) - set(countries_renamed_gdf['ADMIN'].values)

Merge the country geometry into the World Bank data and save the result as a new GeoJSON file.

In [None]:
merged_df = wb_df.merge(countries_renamed_gdf[['ADMIN', 'geometry']], 
                        left_on='COUNTRY', 
                        right_on='ADMIN')
merged_gdf = gpd.GeoDataFrame(merged_df, crs='epsg:4326')

merged_gdf.to_file('/opt/src/data/output/wb-compound-risk-sample.geojson', 
                   encoding='utf-8',
                   driver='GeoJSON')
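
Because `merge` defaults to an inner join, any World Bank row without a matching `ADMIN` value is dropped without warning. One possible check, sketched here on hypothetical toy frames rather than the real data, is a left join with `indicator=True`, which labels the rows that found no match.

```python
import pandas as pd

# Hypothetical toy frames standing in for wb_df and the country boundaries.
risk_df = pd.DataFrame({'COUNTRY': ['Bahamas', 'Viet Nam', 'Atlantis']})
bounds_df = pd.DataFrame({'ADMIN': ['Bahamas', 'Viet Nam', 'France']})

# A left join with indicator=True adds a '_merge' column recording whether
# each left-hand row was found in both frames or only in the left one.
checked_df = risk_df.merge(bounds_df, how='left',
                           left_on='COUNTRY', right_on='ADMIN',
                           indicator=True)

# Rows marked 'left_only' are the ones an inner join would have dropped.
dropped = checked_df.loc[checked_df['_merge'] == 'left_only', 'COUNTRY'].tolist()
print(dropped)
```

An empty list from the same check on the real frames would suggest that every country in the World Bank data made it into the output.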