LBTH spatial datasets
==

This notebook creates several variables that other notebooks can load with the `%run` Jupyter magic:

1. ward boundaries, cached as GeoJSON
2. OA, LSOA and MSOA boundaries, cached as GeoJSON
3. London borough boundaries, cached as GeoJSON
4. a lookup of OAs to higher geographies (OA, LSOA, MSOA, LA)
5. a lookup of OAs to wards
6. population-weighted OA centroids

In your other notebook, use the `%run` magic to load the following variables created in this notebook: `wards`, `oa`, `lsoa`, `msoa`, `boroughs`, `lbth_oa21_lsoa21_msoa21_lad21`, `oa_ward_lookup`, `oa_centroids`.

```python
# get the spatial dataset variables from the first notebook
%run "./0.0-lbth-spatial-datasets.ipynb"
```
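
After the `%run` cell has executed, it can help to confirm that the expected variables actually arrived before relying on them. The following is a minimal sketch and not part of the original notebook: `missing_variables` is a hypothetical helper, and `EXPECTED` simply mirrors the variables this notebook creates.

```python
# Hypothetical helper: report which expected variables are absent from a namespace.
EXPECTED = [
    "wards", "oa", "lsoa", "msoa", "boroughs",
    "lbth_oa21_lsoa21_msoa21_lad21", "oa_ward_lookup", "oa_centroids",
]

def missing_variables(namespace, expected=EXPECTED):
    """Return the names in `expected` that are not defined in `namespace`."""
    return [name for name in expected if name not in namespace]

# In a notebook, pass globals() after the %run cell has executed, e.g.:
# missing = missing_variables(globals())
# assert not missing, f"%run did not define: {missing}"
```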

In [5]:
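import os
import pandas as pd
import geopandas as gpd

# make sure the cache directory exists before any cell below writes to it
os.makedirs('../data/external', exist_ok=True)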

### Boundaries


For details, see https://data-hamlets.github.io/open-data-tower-hamlets/datasets/ons-2021-census-boundaries/

In [7]:
# WARDS
try:
  wards = gpd.read_file('../data/external/tower-hamlets-wards.geojson')
except Exception:
  wards = gpd.read_file('https://gist.github.com/joel-lbth/6d2c78c52163b7da1d91089c9bd849cf/raw/1ece7857aff95a3729d76313d63bd9cbb495491a/lbth-wards.geojson')
  wards.to_file('../data/external/tower-hamlets-wards.geojson', driver='GeoJSON')

In [6]:
# OA
try:
  oa = gpd.read_file('../data/external/lbth_oa21.geojson')
except Exception:
  oa = gpd.read_file('https://gist.githubusercontent.com/joel-lbth/8afdafefe431f6e1508cf59993a5b0d8/raw/bc7ce874ab02e6ae8b4b418cf16eff6386cf34c3/lbth_oa21.geojson')
  oa.to_file('../data/external/lbth_oa21.geojson', driver='GeoJSON')

In [None]:
# LSOA
try:
  lsoa = gpd.read_file('../data/external/lbth_lsoa21.geojson')
except Exception:
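  # NOTE: the remote file is named lbth_lsoa11.geojson but is cached as lbth_lsoa21.geojson;
  # check that it really holds 2021 boundaries
  lsoa = gpd.read_file('https://gist.github.com/joel-lbth/f2d748b99ee7bfe43384d1a80694038a/raw/5ea02c312cc712ca7e74c818148e7ed47e3e4c90/lbth_lsoa11.geojson')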
  lsoa.to_file('../data/external/lbth_lsoa21.geojson', driver='GeoJSON')

In [None]:
# MSOA
try:
  msoa = gpd.read_file('../data/external/lbth_msoa21.geojson')
except Exception:
  msoa = gpd.read_file('https://gist.githubusercontent.com/joel-lbth/8afdafefe431f6e1508cf59993a5b0d8/raw/bc7ce874ab02e6ae8b4b418cf16eff6386cf34c3/lbth_msoa21.geojson')
  msoa.to_file('../data/external/lbth_msoa21.geojson', driver='GeoJSON')

In [None]:
# borough boundaries
try:
  boroughs = gpd.read_file("../data/external/boroughs.geojson")
except Exception:
  boroughs = gpd.read_file('https://skgrange.github.io/www/data/london_boroughs.json')
  boroughs.to_file('../data/external/boroughs.geojson', driver='GeoJSON')
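
The boundary cells above all follow the same pattern: read a local copy if one exists, otherwise download the file and cache it. As a rough sketch only (the notebook itself does not define this), the pattern could be factored into one helper. `load_or_fetch` and its `read`/`write` parameters are hypothetical names; with geopandas, `read` would be `gpd.read_file` and `write` a small wrapper around `GeoDataFrame.to_file`.

```python
from pathlib import Path

def load_or_fetch(local_path, url, read, write):
    """Read `local_path` if it exists; otherwise read `url` and cache the result.

    `read(source)` loads a dataset from a path or URL.
    `write(data, path)` saves it to disk.
    """
    path = Path(local_path)
    if path.exists():
        return read(str(path))
    data = read(url)
    # make sure the cache directory exists before writing
    path.parent.mkdir(parents=True, exist_ok=True)
    write(data, str(path))
    return data

# Possible use with geopandas (not executed here):
# wards = load_or_fetch(
#     '../data/external/tower-hamlets-wards.geojson',
#     wards_url,  # placeholder for the ward gist URL used above
#     read=gpd.read_file,
#     write=lambda gdf, p: gdf.to_file(p, driver='GeoJSON'),
# )
```

Unlike the `try`/`except` cells, this sketch checks for the file explicitly, so an unreadable local file raises an error instead of silently triggering a fresh download.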


### Lookups

In [None]:
# 2011 Admin lookup, from Dec 2020
# Source:
# https://geoportal.statistics.gov.uk/datasets/ons::output-area-to-lower-layer-super-output-area-to-middle-layer-super-output-area-to-local-authority-district-december-2020-lookup-in-england-and-wales/explore
# !curl "https://opendata.arcgis.com/api/v3/datasets/65664b00231444edb3f6f83c9d40591f_0/downloads/data?format=csv&spatialRefId=4326&where=1%3D1" --output "{lookup_dir}/oa11_lsoa11_msoa11_lad20_rgn20.csv"

In [None]:
# 2021 Admin lookup
# Source:
# https://geoportal.statistics.gov.uk/datasets/output-area-to-lower-layer-super-output-area-to-middle-layer-super-output-area-to-local-authority-district-december-2021-lookup-in-england-and-wales-v2-1/about
# !curl "https://www.arcgis.com/sharing/rest/content/items/792f7ab3a99d403ca02cc9ca1cf8af02/data" --output "{lookup_dir}/oa21_lsoa21_msoa21_lad21.csv"

try:
  df = pd.read_csv('../data/external/lbth_oa21_lsoa21_msoa21_lad21.csv')
except Exception:
  df = pd.read_csv("https://www.arcgis.com/sharing/rest/content/items/792f7ab3a99d403ca02cc9ca1cf8af02/data", encoding='latin')
  # keep Tower Hamlets rows only, then drop the columns we don't need
  df = df.query('lad22nm == "Tower Hamlets"')
  df = df.drop(columns=['lad22cd', 'lad22nm', 'lad22nmw', 'msoa21nm', 'lsoa21nm'])
  df.to_csv('../data/external/lbth_oa21_lsoa21_msoa21_lad21.csv', index=False)

lbth_oa21_lsoa21_msoa21_lad21 = df
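
One use of this lookup is to roll OA-level figures up to LSOA or MSOA level. The sketch below is illustrative only: the area codes and the `population` column are made up, and `aggregate_by` is a hypothetical helper. Only the `oa21cd`, `lsoa21cd` and `msoa21cd` column names come from the lookup above.

```python
import pandas as pd

def aggregate_by(values, lookup, level, value_col):
    """Sum `value_col` from an OA-level table up to `level` (e.g. 'msoa21cd')."""
    merged = values.merge(lookup[['oa21cd', level]], on='oa21cd', how='left')
    return merged.groupby(level, as_index=False)[value_col].sum()

# toy data with made-up codes, standing in for the real lookup and a census table
toy_lookup = pd.DataFrame({
    'oa21cd': ['OA1', 'OA2', 'OA3'],
    'lsoa21cd': ['LSOA1', 'LSOA1', 'LSOA2'],
    'msoa21cd': ['MSOA1', 'MSOA1', 'MSOA1'],
})
toy_values = pd.DataFrame({
    'oa21cd': ['OA1', 'OA2', 'OA3'],
    'population': [100, 150, 200],
})

by_lsoa = aggregate_by(toy_values, toy_lookup, 'lsoa21cd', 'population')
```

With the real data, `toy_lookup` would be replaced by `lbth_oa21_lsoa21_msoa21_lad21` and `toy_values` by any table keyed on `oa21cd`.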


In [None]:
# https://www.arcgis.com/sharing/rest/content/items/5fc4ff82228846c2b893e3beba0e3751/data

# 2011-21 LSOA Best-Fit Lookup
# A best-fit lookup file between Lower layer Super Output Areas (LSOA) as at December 2011 and LSOAs as at December 2021 in England and Wales.
# The lookup contains all the 2011 LSOAs (34,753) and these are point-in-polygon to the 2021 LSOA full extent boundaries (which contains 34,628 records, so 1,044 LSOAs are missing from the 2021 LSOAs)

# !curl "https://www.arcgis.com/sharing/rest/content/items/5fc4ff82228846c2b893e3beba0e3751/data" --output "{lookup_dir}/LSOA2011-to-LSOA-2021-to-Local-Authority-District-2022.csv"

In [8]:
# join OAs to wards using a pre-prepared lookup,
# based on population-weighted OA centroids intersecting ward boundaries
# (in case we want to aggregate to ward level for area plotting)
try:
  oa_ward_lookup = pd.read_csv('../data/external/lbth_oa21_ward_lookup.csv')
except Exception:
  oa_ward_lookup = pd.read_csv('https://gist.githubusercontent.com/joel-lbth/da1b6f54f1cd076bf7336be7336de04b/raw/2e25cc5c14ae4a7914dcfc0c0a64dc131bab84bd/lbth_oa21_ward_lookup.csv')
  oa_ward_lookup.to_csv('../data/external/lbth_oa21_ward_lookup.csv', index=False)

### Centroid downloads

These are used to spatially join census areas to wards in the absence of an official lookup.

In [9]:
try:
  oa_centroids = gpd.read_file('../data/external/lbth_oa21_pop_centroids.geojson')
except Exception:
  # OA population-weighted centroids, 2021 census boundaries
  # Source: 
  # https://geoportal.statistics.gov.uk/search?collection=Dataset&sort=name&tags=all(CTD_OA)
  oa_centroids = gpd.read_file("https://opendata.arcgis.com/api/v3/datasets/d966943bc42b4efb8bd9233016163379_0/downloads/data?format=geojson&spatialRefId=4326&where=1%3D1")
  oa_centroids = oa_centroids.drop(columns=['FID','GlobalID'])
  oa_centroids = oa_centroids.merge(lbth_oa21_lsoa21_msoa21_lad21, left_on='OA21CD', right_on='oa21cd')
  oa_centroids = oa_centroids.drop(columns=['OA21CD', 'lsoa21cd', 'msoa21cd'])
  oa_centroids.to_file('../data/external/lbth_oa21_pop_centroids.geojson', driver='GeoJSON')
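
The OA-to-ward lookup described earlier rests on a point-in-polygon test: each OA's population-weighted centroid is assigned to the ward whose boundary contains it. With geopandas this would normally be done with a spatial join such as `gpd.sjoin`; the dependency-free sketch below only illustrates the underlying idea, using ray casting on simple polygons. The function names and the toy square "wards" are hypothetical, and real ward geometries (multi-part polygons, holes, points exactly on a boundary) need a proper geometry library.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: True if (x, y) lies inside `polygon`, a list of (x, y) vertices."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # does this edge straddle the horizontal line through the point?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside

def assign_to_area(points, areas):
    """Map each point id to the first area whose polygon contains it (None if no area does)."""
    result = {}
    for point_id, (x, y) in points.items():
        result[point_id] = next(
            (name for name, poly in areas.items() if point_in_polygon(x, y, poly)),
            None,
        )
    return result

# toy example: two unit-square "wards" side by side and three "centroids"
toy_wards = {
    'Ward A': [(0, 0), (1, 0), (1, 1), (0, 1)],
    'Ward B': [(1, 0), (2, 0), (2, 1), (1, 1)],
}
toy_centroids = {'OA1': (0.5, 0.5), 'OA2': (1.5, 0.25), 'OA3': (3.0, 0.5)}
toy_assignment = assign_to_area(toy_centroids, toy_wards)
```

A centroid that falls outside every polygon maps to `None`, which makes unmatched OAs easy to spot before the lookup is saved.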