# Green space data cleaning

Data: Ordnance Survey greenspace sites

Available layers: sites (areas, polygons), access points



**Workflow**

1 . filter out unwanted categories from the OS layer (`allotments`, `golf course`)<br/>
**NOTE:** it is better to remove these categories after the integration with OSM; otherwise the golf course areas would be re-introduced from OSM, where they are sometimes classified as "forest"

2 . clean OS data:
 - filter the entrances (only the ones on the edges??)
 - overlay / dissolve / intersect to get rid of overlapping polygons (e.g. a sports pitch inside a park)

3 . add “missing” parks: prepare OSM data (entrances?)

4 . combine the OSM data from step 3 with the OS data

5 . compute the size (area) of each site

6 . run accessibility analysis (ready)
 - filter entrances (one per park)
 - assign metric as sum[park size]
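
Step 6 can be sketched on toy data as follows. This is only an illustration under stated assumptions, not the `tracc` or `r5py` workflow itself: the table layouts, the column names (`entrance_id`, `park_id`, `park_area_m2`, `from_id`, `to_id`, `travel_time`) and the 15-minute threshold are all hypothetical.

```python
import pandas as pd

# Hypothetical entrances table: several entrances may belong to the same park.
entrances = pd.DataFrame(
    {
        "entrance_id": ["e1", "e2", "e3"],
        "park_id": ["A", "A", "B"],
        "park_area_m2": [10000.0, 10000.0, 2500.0],
    }
)

# Keep one entrance per park (here simply the first one listed).
one_per_park = entrances.drop_duplicates(subset="park_id", keep="first")

# Hypothetical travel times (minutes) from each origin to each kept entrance.
travel_times = pd.DataFrame(
    {
        "from_id": ["o1", "o1", "o2", "o2"],
        "to_id": ["e1", "e3", "e1", "e3"],
        "travel_time": [5, 12, 20, 8],
    }
)

max_minutes = 15  # assumed threshold, not taken from the notebook

# Keep reachable entrances, attach the park size, and sum it per origin.
reachable = travel_times[travel_times["travel_time"] <= max_minutes].merge(
    one_per_park, left_on="to_id", right_on="entrance_id"
)
metric = reachable.groupby("from_id")["park_area_m2"].sum()
print(metric)
```

Keeping a single entrance per park before the merge prevents a park with many entrances from being counted more than once in the sum.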

## Variables definition and data import

1. Variables definition

In [4]:
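import sys

# r5py options are passed via sys.argv; extend() adds the flag and its value as
# two separate strings (append() would add a single nested list instead).
# Set before importing r5py, on the assumption that r5py reads them at import time.
sys.argv.extend(["--max-memory", "8G"])

import datetime as dt
from datetime import timedelta

import geopandas as gpd
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import tracc
from r5py import TransportNetwork, TravelTimeMatrixComputer, TransitMode, LegMode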


data_folder = "/Users/azanchetta/OneDrive - The Alan Turing Institute/demoland_data"


# Ordnance Survey (OS) Greenspace data
# (using Tyne and Wear data for now, generated previously in QGis)
# greenspace_file = f"{data_folder}/raw/accessibility/OS Open Greenspace (GPKG) GB/data/opgrsp_gb.gpkg"
greenspace_sites_file = f"{data_folder}/processed/accessibility/greenspace-sites_tynewear.gpkg"
accesspoints_file = f"{data_folder}/processed/accessibility/accessTOgs_tynewear.gpkg"

# OSM landuse data (Tyne and Wear data)
osm_landuse_file = f"{data_folder}/raw/OSM_tynewear/tyne-and-wear-latest-free.shp/gis_osm_landuse_a_free_1.shp"

# if needed for mapping purposes (?)
region_lads_file = f"{data_folder}/processed/authorities/LADs_tynewear.shp" # needed in order to filter greenspace data within the regional boundaries

2. Data import


In [11]:

# OS greenspace sites (polygons)
greenspace_sites = gpd.read_file(
    greenspace_sites_file, layer="grenspace-sites_tynewear"
)
greenspace_sites.head()

# OS access points (entrances)
accesspoints = gpd.read_file(
    accesspoints_file, layer="pointsaccessTOgs_tynewear"
)
accesspoints.head()

# for mapping:
region_lads = gpd.read_file(region_lads_file)
region_lads.head()

| | OBJECTID | LAD20CD | LAD20NM | LAD20NMW | BNG_E | BNG_N | LONG | LAT | Shape__Are | Shape__Len | label | geometry |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 265 | E08000021 | Newcastle upon Tyne | | 422287 | 569662 | -1.65297 | 55.02101 | 113461900.0 | 65202.925674 | Newcastle upon Tyne\nE08000021 | POLYGON ((422592.399 576160.095, 422618.297 57... |
| 1 | 266 | E08000022 | North Tyneside | | 431471 | 570602 | -1.50923 | 55.02896 | 82313730.0 | 65337.781081 | North Tyneside\nE08000022 | MULTIPOLYGON (((435203.599 575441.701, 435209.... |
| 2 | 267 | E08000023 | South Tyneside | | 435514 | 564057 | -1.44679 | 54.96988 | 64428420.0 | 51370.230506 | South Tyneside\nE08000023 | POLYGON ((438030.200 568413.300, 438021.350 56... |
| 3 | 268 | E08000024 | Sunderland | | 436470 | 551524 | -1.43344 | 54.85719 | 137441200.0 | 99737.411804 | Sunderland\nE08000024 | MULTIPOLYGON (((441259.800 557854.000, 441252.... |
| 4 | 281 | E08000037 | Gateshead | | 420168 | 559658 | -1.6868 | 54.9312 | 142369100.0 | 90476.826397 | Gateshead\nE08000037 | POLYGON ((415042.801 565083.296, 415104.202 56... |


In [13]:
greenspace_sites.explore(column="function")

In [14]:
greenspace_sites["function"].unique()

array(['Allotments Or Community Growing Spaces', 'Playing Field',
       'Play Space', 'Other Sports Facility', 'Public Park Or Garden',
       'Religious Grounds', 'Bowling Green', 'Cemetery', 'Tennis Court',
       'Golf Course'], dtype=object)
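
Using the `function` values listed above, removing the unwanted categories (workflow step 1) could look like the sketch below. It runs on a small toy table rather than the real layer; the names `sites`, `unwanted` and `sites_filtered` are hypothetical, and the same boolean indexing should apply unchanged to a GeoDataFrame such as `greenspace_sites`.

```python
import pandas as pd

# Toy stand-in for greenspace_sites; only the "function" column matters here.
sites = pd.DataFrame(
    {
        "id": [1, 2, 3, 4],
        "function": [
            "Public Park Or Garden",
            "Golf Course",
            "Allotments Or Community Growing Spaces",
            "Playing Field",
        ],
    }
)

# Categories to exclude, spelled exactly as in the unique() output.
unwanted = ["Allotments Or Community Growing Spaces", "Golf Course"]

# Keep every row whose function is not in the unwanted list.
sites_filtered = sites[~sites["function"].isin(unwanted)].reset_index(drop=True)
print(sites_filtered)
```

As the workflow note says, this filter may be better applied after the OSM integration, so that the removed categories are not re-introduced.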

The next steps work on the OS data only, before deciding how to integrate the OSM data,<br/>
i.e. points 1, 2, 5 and 6 of the **Workflow** above
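
Points 2 and 5 (removing overlapping polygons and measuring site size) can be illustrated on toy geometries. This is a minimal sketch under assumptions: the rectangles, the `toy_sites` and `dissolved` names and the `area_m2` column are made up, and merging everything with `unary_union` is one possible approach, not necessarily the one used later in this notebook. Note that it also merges adjacent sites and drops their attributes.

```python
import geopandas as gpd
from shapely.geometry import box
from shapely.ops import unary_union

# Toy data in British National Grid (EPSG:27700, units are metres):
# a park with a pitch inside it, plus a separate cemetery.
toy_sites = gpd.GeoDataFrame(
    {"function": ["Public Park Or Garden", "Playing Field", "Cemetery"]},
    geometry=[box(0, 0, 100, 100), box(10, 10, 30, 30), box(200, 0, 250, 50)],
    crs="EPSG:27700",
)

# Merge all polygons into one geometry; overlapping shapes are fused.
merged = unary_union(list(toy_sites.geometry))

# Split the merged geometry back into its separate, non-overlapping parts.
parts = list(merged.geoms) if merged.geom_type == "MultiPolygon" else [merged]
dissolved = gpd.GeoDataFrame(geometry=parts, crs=toy_sites.crs)

# Area in square metres (valid because the CRS is projected, in metres).
dissolved["area_m2"] = dissolved.geometry.area
print(dissolved)
```

The pitch disappears into the park polygon, so its area is no longer counted twice when park sizes are summed.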