
# 01 Read OS Open Greenspace
---

A general-purpose approach to exploratory data analysis of a new geospatial dataset.

### GeoPackage
---

The [GeoPackage](https://www.geopackage.org/) (GPKG) is an open geospatial data format that stores geospatial data within a [SQLite database](https://www.sqlite.org/index.html). SQLite provides a lightweight and portable database storage solution. Each dataset is stored as a feature table within the database and can be queried directly using SQL or via spatial applications, including the Python geospatial vector package [GeoPandas](https://geopandas.org/index.html).
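Because a GeoPackage is an ordinary SQLite file, its contents can be inspected with nothing more than Python's built-in `sqlite3` module. The sketch below queries the `gpkg_contents` table, which the GeoPackage specification uses to register each dataset. To stay self-contained it builds a tiny in-memory stand-in rather than opening the real file; the helper name `list_gpkg_tables` and the trimmed-down table definition are assumptions made for illustration.

```python
import sqlite3


def list_gpkg_tables(conn):
    """Return (table_name, data_type) rows registered in gpkg_contents."""
    query = """
        SELECT table_name, data_type
        FROM gpkg_contents
        ORDER BY table_name
    """
    return conn.execute(query).fetchall()


# Illustrative stand-in for a real GeoPackage: only the columns needed here.
# For a real file, use sqlite3.connect("path/to/file.gpkg") instead.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE gpkg_contents (
        table_name TEXT PRIMARY KEY,
        data_type TEXT NOT NULL
    )
    """
)
conn.executemany(
    "INSERT INTO gpkg_contents VALUES (?, ?)",
    [
        ("access_point", "features"),
        ("greenspace_site", "features"),
    ],
)

tables = list_gpkg_tables(conn)
for table_name, data_type in tables:
    print(table_name, data_type)

conn.close()
```

The same query should work unchanged against a real GPKG connection, since `gpkg_contents` is a mandatory table in the format.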

### GeoPandas
---

[GeoPandas](https://geopandas.org/index.html) extends the datatypes used by [pandas](https://pandas.pydata.org/) to allow spatial operations on geometric types. Geometric operations are performed by [shapely](https://shapely.readthedocs.io/en/stable/manual.html). GeoPandas further depends on [fiona](https://github.com/Toblerity/Fiona) for file access and [matplotlib](https://matplotlib.org/) for plotting.
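As a small illustration of this layering, the sketch below builds a GeoDataFrame by hand rather than from the Greenspace data. The `name` column and the toy coordinates are made up for the example; the point is only that the geometry column holds shapely objects and exposes spatial properties alongside the usual pandas columns.

```python
import geopandas as gpd
from shapely.geometry import Point, Polygon

# Two hand-made shapely geometries: a 10 x 10 square and a point inside it
square = Polygon([(0, 0), (10, 0), (10, 10), (0, 10)])
centre = Point(5, 5)

# A GeoDataFrame is a pandas DataFrame with a dedicated geometry column
gdf = gpd.GeoDataFrame({"name": ["square", "centre"]}, geometry=[square, centre])

# Vectorised spatial properties provided by GeoPandas
geom_types = list(gdf.geometry.geom_type)
areas = list(gdf.geometry.area)

# Element-wise predicate evaluated by shapely
square_contains_centre = gdf.geometry.iloc[0].contains(gdf.geometry.iloc[1])

print(geom_types)
print(areas)
print(square_contains_centre)
```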

### OS Open Greenspace
---

The OS OpenData product [OS Open Greenspace](https://www.ordnancesurvey.co.uk/business-government/products/open-map-greenspace) depicts the location and extent of spaces such as parks and sports facilities that are likely to be accessible to the public. Where appropriate, it also includes Access Points to show how people get into these sites. Its primary purpose is to enable members of the public to find and access greenspaces near them for exercise and recreation.

<img width="500"
     src="https://beta.ordnancesurvey.co.uk/img-assets/products/greenspace-open-london.x5201e7a5.jpg?w=1242&h=828&crop=828%2C828%2C207%2C0&f=webp?q=100&crop=2270,1422,0,0&w=1000"
     alt="OS Open Greenspace London"
     align="centre" />

In [1]:
import fiona
import geopandas as gpd

ERROR 1: PROJ: proj_create_from_database: Open of /cloud/lib/envs/training/share/proj failed


### List layers in GeoPackage (GPKG)

In [2]:
path = "../../data/ordnance-survey/os-open-greenspace-gb.gpkg"

In [3]:
fiona.listlayers(fp=path)

['access_point', 'greenspace_site']

### Create GeoDataFrame from GPKG

In [4]:
# Create a GeoPandas GeoDataFrame from a GeoPackage (GPKG)
osogs = gpd.read_file(
    filename=path,
    # GPKG layer
    layer="greenspace_site",
)

### Return top n rows

In [5]:
osogs.head(n=5)

| | id | function | distinctive_name_1 | distinctive_name_2 | distinctive_name_3 | distinctive_name_4 | geometry |
|---|---|---|---|---|---|---|---|
| 0 | 0295ED12-FCD6-5C37-E063-AAEFA00A445E | Play Space | | | | | MULTIPOLYGON (((296898.000 668572.930, 296898.... |
| 1 | 0295ED00-FA59-5C37-E063-AAEFA00A445E | Religious Grounds | Grangemouth Gospel Trust Church | | | | MULTIPOLYGON (((293715.730 679185.850, 293712.... |
| 2 | 0295ED69-4CAF-5C37-E063-AAEFA00A445E | Golf Course | Renfrew Golf Course | | | | MULTIPOLYGON (((249732.430 668113.570, 249743.... |
| 3 | 0295ECD4-44A8-5C37-E063-AAEFA00A445E | Playing Field | | | | | MULTIPOLYGON (((337165.470 951341.430, 337169.... |
| 4 | 0295ECF7-B076-5C37-E063-AAEFA00A445E | Play Space | | | | | MULTIPOLYGON (((260972.210 666461.830, 260965.... |


### Return columns

In [6]:
osogs.columns

Index(['id', 'function', 'distinctive_name_1', 'distinctive_name_2',
       'distinctive_name_3', 'distinctive_name_4', 'geometry'],
      dtype='object')

### Count rows

In [7]:
osogs.shape[0]

150415

### Check Coordinate Reference System (CRS) assignment

In [8]:
osogs.crs

<Projected CRS: EPSG:27700>
Name: OSGB36 / British National Grid
Axis Info [cartesian]:
- E[east]: Easting (metre)
- N[north]: Northing (metre)
Area of Use:
- name: United Kingdom (UK) - offshore to boundary of UKCS within 49°45'N to 61°N and 9°W to 2°E; onshore Great Britain (England, Wales and Scotland). Isle of Man onshore.
- bounds: (-9.01, 49.75, 2.01, 61.01)
Coordinate Operation:
- name: British National Grid
- method: Transverse Mercator
Datum: Ordnance Survey of Great Britain 1936
- Ellipsoid: Airy 1830
- Prime Meridian: Greenwich
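Reading the CRS printout by eye works for a single dataset, but the same check can be expressed as a guard in code. The sketch below is one possible approach, not part of the original workflow: the helper `check_crs` is hypothetical, and it is exercised on a one-row stand-in GeoDataFrame (using the first coordinate from the table above) rather than on `osogs` itself.

```python
import geopandas as gpd
from shapely.geometry import Point


def check_crs(gdf, expected_epsg=27700):
    """Return the EPSG code of gdf, raising ValueError if it is missing or unexpected."""
    if gdf.crs is None:
        raise ValueError("GeoDataFrame has no CRS assigned")
    epsg = gdf.crs.to_epsg()
    if epsg != expected_epsg:
        raise ValueError(f"Expected EPSG:{expected_epsg}, found EPSG:{epsg}")
    return epsg


# Stand-in for the real data: a single point in British National Grid
sample = gpd.GeoDataFrame(geometry=[Point(296898.0, 668572.93)], crs="EPSG:27700")

epsg = check_crs(sample)
is_projected = sample.crs.is_projected
axis_unit = sample.crs.axis_info[0].unit_name

print(epsg, is_projected, axis_unit)
```

With the real data, `check_crs(osogs)` would be the equivalent call. Confirming a projected CRS with metre units matters because later area or distance calculations inherit those units.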

### Count rows by geometry type

In [9]:
osogs["geometry"].geom_type.value_counts()

MultiPolygon    150415
Name: count, dtype: int64

### Check geometry validity

In [10]:
osogs["geometry"].is_valid.value_counts()

True    150415
Name: count, dtype: int64
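Every geometry in this dataset passes the validity check, so no repair is needed here. For context on what a failure would look like, the sketch below constructs a deliberately invalid "bow-tie" polygon (toy coordinates, not taken from the dataset) and repairs it with shapely's `make_valid`, which assumes shapely 1.8 or later.

```python
from shapely.geometry import Polygon
from shapely.validation import explain_validity, make_valid

# A self-intersecting ring: the edges cross at (1, 1), forming a bow-tie
bowtie = Polygon([(0, 0), (2, 2), (2, 0), (0, 2)])

print(bowtie.is_valid)
# explain_validity returns a short text description of the problem
print(explain_validity(bowtie))

# make_valid returns a new, valid geometry covering the same area
repaired = make_valid(bowtie)
print(repaired.is_valid, repaired.geom_type)
```

On a GeoDataFrame, the invalid rows could be selected with a boolean mask such as `osogs[~osogs["geometry"].is_valid]` before deciding whether to repair or drop them.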

### Geospatial Data Abstraction Library (GDAL)

A wide range of geospatial tooling is built on top of the [Geospatial Data Abstraction Library (GDAL)](https://gdal.org/index.html), a translator library for raster and vector geospatial data formats.

Support for raster and vector processing is compartmentalised into the GDAL and OGR library components respectively.

In [11]:
# Equivalent dataset inspection using OGR component of GDAL
# https://gdal.org/programs/ogrinfo.html
!/cloud/lib/envs/training/bin/ogrinfo --help

Usage: ogrinfo [--help-general] [-ro] [-q] [-where restricted_where|@filename]
               [-spat xmin ymin xmax ymax] [-geomfield field] [-fid fid]
               [-sql statement|@filename] [-dialect sql_dialect] [-al] [-rl] [-so] [-fields={YES/NO}]
               [-geom={YES/NO/SUMMARY}] [[-oo NAME=VALUE] ...]
               [-nomd] [-listmdd] [-mdd domain|`all`]*
               [-nocount] [-noextent] [-nogeomtype] [-wkt_format WKT1|WKT2|...]
               [-fielddomain name]
               datasource_name [layer [layer ...]]


In [12]:
!/cloud/lib/envs/training/bin/ogrinfo ../../data/ordnance-survey/os-open-greenspace-gb.gpkg

INFO: Open of `../../data/ordnance-survey/os-open-greenspace-gb.gpkg'
      using driver `GPKG' successful.
1: access_point (Point)
2: greenspace_site (Multi Polygon)
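The shell calls in this section hard-code the path to one environment's `ogrinfo` binary. A more portable option, sketched below under the assumption that `ogrinfo` is on the `PATH`, is to locate it with `shutil.which` and run it through `subprocess`. The helper names `build_ogrinfo_command` and `run_ogrinfo` are hypothetical; only the `-so` (summary only) flag from the usage text is used.

```python
import shutil
import subprocess


def build_ogrinfo_command(datasource, layer=None, summary_only=False, executable="ogrinfo"):
    """Assemble the ogrinfo argument list without running anything."""
    command = [executable, datasource]
    if layer is not None:
        command.append(layer)
    if summary_only:
        command.append("-so")
    return command


def run_ogrinfo(datasource, layer=None, summary_only=False):
    """Run ogrinfo if it can be found on PATH; return its stdout, else None."""
    executable = shutil.which("ogrinfo")
    if executable is None:
        return None
    command = build_ogrinfo_command(datasource, layer, summary_only, executable)
    result = subprocess.run(command, capture_output=True, text=True, check=False)
    return result.stdout


path = "../../data/ordnance-survey/os-open-greenspace-gb.gpkg"
command = build_ogrinfo_command(path, layer="greenspace_site", summary_only=True)
print(" ".join(command))

output = run_ogrinfo(path, layer="greenspace_site", summary_only=True)
if output is None:
    print("ogrinfo not found on PATH")
```

Passing the arguments as a list avoids shell quoting issues with paths that contain spaces.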


In [13]:
# -so provides a 'summary only' output, suppressing feature-level inspection
!/cloud/lib/envs/training/bin/ogrinfo ../../data/ordnance-survey/os-open-greenspace-gb.gpkg greenspace_site  -so

INFO: Open of `../../data/ordnance-survey/os-open-greenspace-gb.gpkg'
      using driver `GPKG' successful.

Layer name: greenspace_site
Geometry: Multi Polygon
Feature Count: 150415
Extent: (9819.840000, 8274.570000) - (655229.500000, 1214133.190000)
Layer SRS WKT:
PROJCRS["OSGB 1936 / British National Grid",
    BASEGEOGCRS["OSGB 1936",
        DATUM["OSGB_1936",
            ELLIPSOID["Airy 1830",6377563.396,299.3249646,
                LENGTHUNIT["metre",1]]],
        PRIMEM["Greenwich",0,
            ANGLEUNIT["degree",0.0174532925199433]],
        ID["EPSG",4277]],
    CONVERSION["unnamed",
        METHOD["Transverse Mercator",
            ID["EPSG",9807]],
        PARAMETER["Latitude of natural origin",49,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8801]],
        PARAMETER["Longitude of natural origin",-2,
            ANGLEUNIT["degree",0.0174532925199433],
            ID["EPSG",8802]],
        PARAMETER["Scale factor at na