# Sorting Greenspace Data

Data is loaded at LSOA level. Greenspace data was downloaded from OS as individual polygons and divided into 4 types: natural, parks, sports and other. Each type was intersected with the LSOA boundaries to identify which greenspaces fall within each LSOA. The intersected data now needs to be aggregated to give the total area of greenspace within each LSOA.

1. Separate out into 4 types
2. Dissolve, to remove overlapping polygons
3. Intersect with LSOAs
4. Calculate areas
5. Export as csv
6. Load into Python
7. Aggregate to LSOA
8. Join back up with LSOA geometry
9. Calculate proportions
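
Steps 7 to 9 can be sketched on toy data as follows. This is a minimal illustration rather than the notebook's own code: the LSOA codes and areas are made up, `prop` is a hypothetical column name, and it assumes `Shape__Are` holds each LSOA's total area in the same units as `Area`. A plain DataFrame stands in for the LSOA geometry table here.

```python
import pandas as pd

# Toy stand-in for one intersected table (e.g. LSOA_Parks.csv); values are made up.
pieces = pd.DataFrame({
    "LSOA11CD": ["A", "A", "C"],
    "Area": [100.0, 50.0, 30.0],   # area of each greenspace piece
})

# Toy stand-in for the LSOA table; Shape__Are is assumed to be the LSOA's own area.
lsoas_demo = pd.DataFrame({
    "LSOA11CD": ["A", "B", "C"],
    "Shape__Are": [1000.0, 500.0, 300.0],
})

# Step 7: total greenspace area per LSOA.
agg = pieces.groupby("LSOA11CD", as_index=False)["Area"].sum()

# Step 8: left join so LSOAs with no greenspace are kept, then fill the gaps with 0.
joined = lsoas_demo.merge(agg, on="LSOA11CD", how="left")
joined["Area"] = joined["Area"].fillna(0.0)

# Step 9: proportion of each LSOA covered by this greenspace type.
joined["prop"] = joined["Area"] / joined["Shape__Are"]
print(joined)
```

The left join matters because an LSOA with no greenspace of a given type has no rows in that type's table and would otherwise drop out of the result.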

## Reading in Data

In [1]:
import pandas as pd 
import numba
import seaborn as sns 
import matplotlib.pyplot as plt
import geopandas as gpd
import palettable as pltt
import descartes
from pysal.viz import mapclassify 
import numpy as np
import statsmodels.api as sm
import scipy.stats as stats



In [2]:
import shapely
import rtree
from shapely.geometry import Polygon

In [3]:
parks = gpd.read_file('GS_Parks.shp')

In [4]:
parks.head()

Unnamed: 0,priFunc,Type,OBJECTID,Area,geometry
0,Public Park Or Garden,Park,,16.76488,"POLYGON Z ((399649.300 653037.950 0.000, 39964..."
1,Public Park Or Garden,Park,,169.0226,"POLYGON Z ((399664.550 653075.840 0.000, 39967..."
2,Public Park Or Garden,Park,,202.42683,"POLYGON Z ((399686.900 653074.660 0.000, 39968..."
3,Public Park Or Garden,Park,,10.07841,"POLYGON Z ((399665.160 653076.510 0.000, 39967..."
4,Public Park Or Garden,Park,,89.11763,"POLYGON Z ((399665.160 653076.510 0.000, 39967..."


In [5]:
lsoas = gpd.read_file('LSOAs_fixed.shp')

In [6]:
lsoas.head()

Unnamed: 0,OBJECTID,LSOA11CD,LSOA11NM,LSOA11NMW,Shape__Are,Shape__Len,geometry
0,1,E01000001,City of London 001A,City of London 001A,129865.337669,2635.781429,"POLYGON ((532095.563 181577.351, 532095.125 18..."
1,2,E01000002,City of London 001B,City of London 001B,228419.333099,2708.05204,"POLYGON ((532267.728 181643.781, 532262.875 18..."
2,3,E01000003,City of London 001C,City of London 001C,59054.013168,1224.770897,"POLYGON ((532105.312 182010.574, 532104.872 18..."
3,4,E01000005,City of London 001E,City of London 001E,189577.165154,2275.832056,"POLYGON ((533610.974 181410.968, 533615.622 18..."
4,5,E01000006,Barking and Dagenham 016A,Barking and Dagenham 016A,146536.52047,1966.162225,"POLYGON ((544817.826 184346.261, 544815.791 18..."


## Dissolve the greenspace polygons

In [7]:
# Dissolve all park polygons into a single geometry so overlapping areas are not double counted
parks_outline = parks.dissolve()

In [None]:
#check it worked
parks_outline.head()
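
The reason for dissolving before measuring area can be shown with two overlapping shapes. The sketch below uses `shapely` directly instead of GeoPandas, and the two squares are hypothetical; it only illustrates that summing polygon areas counts any overlap twice, while the merged outline counts it once.

```python
from shapely.geometry import box
from shapely.ops import unary_union

# Two hypothetical 10 x 10 squares that overlap in a 5 x 10 strip.
a = box(0, 0, 10, 10)
b = box(5, 0, 15, 10)

# Adding the areas counts the overlapping strip twice.
summed = a.area + b.area

# Merging the shapes first gives a single outline with the overlap counted once.
dissolved = unary_union([a, b])

print(summed, dissolved.area)
```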

In [None]:
#save the dissolved polygons
parks_outline.to_file('Parks_outline.shp')

In [None]:
# Read the dissolved polygons back in
parks_outline = gpd.read_file('Parks_outline.shp')

## Intersect LSOAs with parks

In [None]:
lsoas_parks_int = gpd.overlay(lsoas, parks_outline, how='intersection')
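
After an intersection, any area attribute carried over from the greenspace layer still describes the original polygon, not the clipped piece, so the area is best recalculated from the new geometry. The sketch below shows the idea with `shapely` and two hypothetical rectangles; in GeoPandas the equivalent would likely be reading `lsoas_parks_int.geometry.area` after the overlay, assuming a projected CRS in metres.

```python
from shapely.geometry import box

# Hypothetical LSOA boundary and a park that straddles its eastern edge.
lsoa = box(0, 0, 100, 100)
park = box(80, 20, 120, 60)   # 40 x 40 = 1600 square units in total

# Only the part of the park inside the LSOA should count towards that LSOA.
piece = park.intersection(lsoa)

print(park.area, piece.area)
```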




In [3]:
sports = pd.read_csv('LSOA_Sports.csv')

In [4]:
sports.head()

Unnamed: 0,OBJECTID,LSOA11CD,LSOA11NM,LSOA11NMW,Shape__Are,Shape__Len,priFunc,Type,OBJECTID_2,Area
0,1,E01000001,City of London 001A,City of London 001A,129865.337669,2635.781429,Other Sports Facility,Sports,244444.0,1452.36345
1,1,E01000001,City of London 001A,City of London 001A,129865.337669,2635.781429,Tennis Court,Sports,244600.0,1083.82321
2,3,E01000003,City of London 001C,City of London 001C,59054.013168,1224.770897,Other Sports Facility,Sports,243763.0,404.91844
3,3,E01000003,City of London 001C,City of London 001C,59054.013168,1224.770897,Other Sports Facility,Sports,243764.0,11.38765
4,3,E01000003,City of London 001C,City of London 001C,59054.013168,1224.770897,Other Sports Facility,Sports,243765.0,60.95829


In [6]:
natural = pd.read_csv('LSOA_Natural.csv')

In [7]:
natural.head()

Unnamed: 0,OBJECTID,LSOA11CD,LSOA11NM,LSOA11NMW,Shape__Are,Shape__Len,priFunc,Type,OBJECTID_2,Area
0,1,E01000001,City of London 001A,City of London 001A,129865.337669,2635.781429,Natural,Natural,244452.0,345.61772
1,1,E01000001,City of London 001A,City of London 001A,129865.337669,2635.781429,Natural,Natural,244454.0,1431.4181
2,2,E01000002,City of London 001B,City of London 001B,228419.333099,2708.05204,Natural,Natural,244452.0,345.61772
3,2,E01000002,City of London 001B,City of London 001B,228419.333099,2708.05204,Natural,Natural,244454.0,1431.4181
4,2,E01000002,City of London 001B,City of London 001B,228419.333099,2708.05204,Natural,Natural,244456.0,13.4562
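
In the rows above, greenspace 244452 appears under both E01000001 and E01000002 with the same `Area`. That may mean `Area` was measured on the whole polygon and not on the piece inside each LSOA, although it could also be a coincidence. A quick check, sketched here on those five rows only (`natural_demo`, `n_lsoas` and `n_areas` are illustrative names), lists greenspaces that span several LSOAs but carry a single area value:

```python
import pandas as pd

# First rows of the natural table shown above (only the columns needed here).
natural_demo = pd.DataFrame({
    "LSOA11CD": ["E01000001", "E01000001", "E01000002", "E01000002", "E01000002"],
    "OBJECTID_2": [244452.0, 244454.0, 244452.0, 244454.0, 244456.0],
    "Area": [345.61772, 1431.4181, 345.61772, 1431.4181, 13.4562],
})

# For each greenspace: how many LSOAs it appears in and how many distinct areas it has.
check = natural_demo.groupby("OBJECTID_2").agg(
    n_lsoas=("LSOA11CD", "nunique"),
    n_areas=("Area", "nunique"),
)

# Greenspaces split across LSOAs whose pieces all report one identical area.
suspect = check[(check["n_lsoas"] > 1) & (check["n_areas"] == 1)]
print(suspect)
```

If many rows come back on the full table, summing `Area` per LSOA would count the same greenspace in full for every LSOA it touches.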


In [8]:
parks = pd.read_csv('LSOA_Parks.csv')

In [9]:
parks.head()

Unnamed: 0,OBJECTID,LSOA11CD,LSOA11NM,LSOA11NMW,Shape__Are,Shape__Len,priFunc,Type,OBJECTID_2,Area
0,1,E01000001,City of London 001A,City of London 001A,129865.337669,2635.781429,Public Park Or Garden,Park,244885.0,163.33355
1,1,E01000001,City of London 001A,City of London 001A,129865.337669,2635.781429,Public Park Or Garden,Park,244886.0,1176.13865
2,1,E01000001,City of London 001A,City of London 001A,129865.337669,2635.781429,Public Park Or Garden,Park,244887.0,404.31132
3,1,E01000001,City of London 001A,City of London 001A,129865.337669,2635.781429,Public Park Or Garden,Park,244888.0,123.60648
4,1,E01000001,City of London 001A,City of London 001A,129865.337669,2635.781429,Public Park Or Garden,Park,244889.0,308.30816


In [10]:
other = pd.read_csv('LSOA_Other.csv')

In [11]:
other.head()

Unnamed: 0,OBJECTID,LSOA11CD,LSOA11NM,LSOA11NMW,Shape__Are,Shape__Len,priFunc,Type,OBJECTID_2,Area
0,1,E01000001,City of London 001A,City of London 001A,129865.337669,2635.781429,Amenity - Residential Or Business,Other,244310.0,49.21836
1,1,E01000001,City of London 001A,City of London 001A,129865.337669,2635.781429,Amenity - Residential Or Business,Other,244332.0,81.64385
2,1,E01000001,City of London 001A,City of London 001A,129865.337669,2635.781429,Amenity - Residential Or Business,Other,244340.0,288.20022
3,1,E01000001,City of London 001A,City of London 001A,129865.337669,2635.781429,Amenity - Residential Or Business,Other,244443.0,966.65698
4,1,E01000001,City of London 001A,City of London 001A,129865.337669,2635.781429,Amenity - Residential Or Business,Other,244445.0,4112.00553


## Keeping only Required Columns

In [13]:
sports.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 263539 entries, 0 to 263538
Data columns (total 10 columns):
 #   Column      Non-Null Count   Dtype  
---  ------      --------------   -----  
 0   OBJECTID    263539 non-null  int64  
 1   LSOA11CD    263539 non-null  object 
 2   LSOA11NM    263539 non-null  object 
 3   LSOA11NMW   263539 non-null  object 
 4   Shape__Are  263539 non-null  float64
 5   Shape__Len  263539 non-null  float64
 6   priFunc     263539 non-null  object 
 7   Type        263539 non-null  object 
 8   OBJECTID_2  250115 non-null  float64
 9   Area        263539 non-null  float64
dtypes: float64(4), int64(1), object(5)
memory usage: 20.1+ MB


In [21]:
sports_simp = sports[['LSOA11CD', 'Area']]

In [22]:
natural_simp = natural[['LSOA11CD','Area']]

In [23]:
parks_simp = parks[['LSOA11CD', 'Area']]

In [24]:
other_simp = other[['LSOA11CD', 'Area']]

## Aggregating by LSOA

In [34]:
sports_agg = sports_simp.groupby(['LSOA11CD'], as_index = False).sum()

In [35]:
sports_agg.head()

Unnamed: 0,LSOA11CD,Area
0,E01000001,2536.18666
1,E01000003,3877.40433
2,E01000005,247.36394
3,E01000007,197.71641
4,E01000008,464.64908


In [36]:
natural_agg = natural_simp.groupby(['LSOA11CD'], as_index = False).sum()

In [37]:
natural_agg.head()

Unnamed: 0,LSOA11CD,Area
0,E01000001,1777.03582
1,E01000002,8895.51245
2,E01000003,366.13136
3,E01000008,7356.64962
4,E01000011,12734.10944


In [38]:
parks_agg = parks_simp.groupby(['LSOA11CD'], as_index = False).sum()

In [39]:
parks_agg.head()

Unnamed: 0,LSOA11CD,Area
0,E01000001,2773.00031
1,E01000005,2356.83701
2,E01000009,3804.20931
3,E01000010,38468.78043
4,E01000011,277167.28424


In [40]:
other_agg = other_simp.groupby(['LSOA11CD'], as_index = False).sum()

In [41]:
other_agg.head()

Unnamed: 0,LSOA11CD,Area
0,E01000001,23491.60793
1,E01000002,29354.94223
2,E01000003,9251.91892
3,E01000005,23929.46463
4,E01000006,12688.95516


In [42]:
other_agg.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 27503 entries, 0 to 27502
Data columns (total 2 columns):
 #   Column    Non-Null Count  Dtype  
---  ------    --------------  -----  
 0   LSOA11CD  27503 non-null  object 
 1   Area      27503 non-null  float64
dtypes: float64(1), object(1)
memory usage: 644.6+ KB
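
The four aggregated tables do not cover the same LSOAs; for example, `sports_agg.head()` skips E01000002. One way to bring them together is to stack them with a type label and pivot to one row per LSOA, filling missing types with 0. The sketch below assumes toy tables with made-up values, and `sports_demo`, `parks_demo`, `long` and `wide` are illustrative names.

```python
import pandas as pd

# Toy aggregated tables standing in for sports_agg and parks_agg; values are made up.
sports_demo = pd.DataFrame({"LSOA11CD": ["A", "C"], "Area": [10.0, 30.0]})
parks_demo = pd.DataFrame({"LSOA11CD": ["A", "B"], "Area": [5.0, 20.0]})

# Stack the tables, labelling each row with its greenspace type.
long = pd.concat(
    [sports_demo.assign(Type="Sports"), parks_demo.assign(Type="Parks")],
    ignore_index=True,
)

# One row per LSOA, one column per type; a type absent from an LSOA becomes 0.
wide = long.pivot_table(
    index="LSOA11CD", columns="Type", values="Area",
    aggfunc="sum", fill_value=0.0,
)
print(wide)
```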


## Save Outputs

In [43]:
sports_agg.to_csv('Sports_agg.csv')

In [44]:
natural_agg.to_csv('Natural_agg.csv')

In [45]:
parks_agg.to_csv('Parks_agg.csv')

In [46]:
other_agg.to_csv('Other_agg.csv')
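
By default `to_csv` also writes the row index, which comes back as an `Unnamed: 0` column when the file is read again. Whether that matters depends on how the files are used later; the sketch below, on a made-up table written to an in-memory buffer, shows the difference `index=False` makes.

```python
import io
import pandas as pd

# Toy aggregated table; values are made up.
agg_demo = pd.DataFrame({"LSOA11CD": ["A", "B"], "Area": [10.0, 20.0]})

# Default: the row index is written as an unnamed first column.
with_index = io.StringIO()
agg_demo.to_csv(with_index)
with_index.seek(0)
cols_default = list(pd.read_csv(with_index).columns)

# index=False: only the real columns are written.
without_index = io.StringIO()
agg_demo.to_csv(without_index, index=False)
without_index.seek(0)
cols_clean = list(pd.read_csv(without_index).columns)

print(cols_default)
print(cols_clean)
```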