## Reducing geometry to a single point

To "step out" of the geometrical calculations, we add X and Y columns that store the center point (centroid) of each area, and we use that point to calculate the area within the bounding box (viewport). This is an acceptable approximation, especially since the alternative is a computationally expensive spatial intersection for a user interaction that is repeated very frequently (moving, panning and zooming the map).
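
A minimal sketch of the viewport calculation on centroid columns is shown here. The sample rows, the `area_in_viewport` helper and the viewport bounds are hypothetical and only illustrate the idea; the column names follow the `*_points` files produced further down.

```python
import pandas as pd

# Hypothetical sample of centroid rows; the values are made up for illustration.
points = pd.DataFrame({
    'type_combined': ['woodland', 'woodland', 'woodland'],
    'area_ha': [12.5, 3.2, 7.8],
    'x': [-1.50, -0.10, -3.20],  # longitude of the centroid
    'y': [52.00, 51.50, 55.90],  # latitude of the centroid
})


def area_in_viewport(df, west, south, east, north):
    """Sum area_ha for rows whose centroid lies inside the viewport."""
    inside = df['x'].between(west, east) & df['y'].between(south, north)
    return df.loc[inside, 'area_ha'].sum()


# Two of the three centroids fall inside this viewport.
viewport_area = area_in_viewport(points, west=-2.0, south=51.0, east=0.0, north=53.0)
```

The trade-off is that an area straddling the viewport edge is counted either in full or not at all, depending on which side its centroid falls.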

This also lets us forgo geospatial databases entirely and use DuckDB, a highly efficient in-process analytical database (often described as "SQLite for analytics") that can run from a single file and query Parquet files directly.

We need to convert the datasets from EPSG:27700 (British National Grid) to EPSG:4326 (WGS 84), the CRS used in the projection. Because the centroids are calculated after this conversion, GeoPandas raises the `Geometry is in a geographic CRS. Results from 'centroid' are likely incorrect. Use 'GeoSeries.to_crs()' to re-project geometries to a projected CRS before this operation.` UserWarning. We can safely disregard it, as the approximate centroid is accurate enough for our purposes.
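
If the warning is unwanted, one possible alternative (not what the cells in this notebook do) is to calculate the centroids while the data is still in the projected CRS and re-project only the resulting points. The sketch below assumes a hypothetical one-row GeoDataFrame in EPSG:27700.

```python
import geopandas as gpd
from shapely.geometry import box

# Hypothetical 1 km square in British National Grid coordinates (EPSG:27700).
squares = gpd.GeoDataFrame(
    {'area_ha': [100.0]},
    geometry=[box(400000, 300000, 401000, 301000)],
    crs='EPSG:27700',
)

# Centroids are calculated in the projected CRS, where planar geometry is valid ...
centroids_bng = squares.geometry.centroid
# ... and only the resulting points are re-projected to WGS 84.
centroids_wgs84 = centroids_bng.to_crs(epsg=4326)

squares['x'] = centroids_wgs84.x  # longitude
squares['y'] = centroids_wgs84.y  # latitude
```

Re-projecting points instead of full polygons also means less geometry has to be transformed, although that effect has not been measured here.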

### Notes

All runtimes were measured on an Apple M1 Max with 64GB of RAM.

In [None]:
import pandas as pd
import geopandas as gpd
gpd.options.io_engine = 'pyogrio'

In [None]:
# Runtime: 2s, RAM: 1.5GB max
awi_dataset = gpd.read_parquet('../data/processed/gb_awi_dataset.parquet')
awi_dataset = awi_dataset.to_crs(epsg=4326)

awi_dataset_points = pd.DataFrame(awi_dataset)

# Calculate the centroids once and reuse them for both coordinates
awi_centroids = awi_dataset.geometry.centroid
awi_dataset_points['x'] = awi_centroids.x
awi_dataset_points['y'] = awi_centroids.y
awi_dataset_points = awi_dataset_points[['type_combined', 'type_aggregate', 'type_source', 'area_ha', 'x', 'y']]
awi_dataset_points.to_parquet('../data/area/gb_awi_dataset_points.parquet')

del awi_dataset, awi_dataset_points

In [None]:
# Runtime: 1m20s, RAM: 8GB max
for year in range(2012, 2023):
    nfi_dataset = gpd.read_parquet(f'../data/processed/gb_nfi_dataset_{year}.parquet')
    nfi_dataset = nfi_dataset.to_crs(epsg=4326)

    nfi_dataset_points = pd.DataFrame(nfi_dataset)
    # Calculate the centroids once and reuse them for both coordinates
    nfi_centroids = nfi_dataset.geometry.centroid
    nfi_dataset_points['x'] = nfi_centroids.x
    nfi_dataset_points['y'] = nfi_centroids.y
    nfi_dataset_points = nfi_dataset_points[['type_combined', 'type_aggregate', 'type_source', 'area_ha', 'x', 'y']]
    nfi_dataset_points.to_parquet(f'../data/area/gb_nfi_dataset_{year}_points.parquet')

del nfi_dataset, nfi_dataset_points

In [None]:
# Runtime: 2m, RAM: 12GB max
for year in range(2012, 2023):
    # Path aligned with the other cells, which read from ../data/processed/
    nfi_awi_overlay = gpd.read_parquet(f'../data/processed/gb_nfi_awi_overlay_{year}.parquet')
    nfi_awi_overlay = nfi_awi_overlay.to_crs(epsg=4326)
    
    nfi_awi_overlay_points = pd.DataFrame(nfi_awi_overlay)
    # Calculate the centroids once and reuse them for both coordinates
    overlay_centroids = nfi_awi_overlay.geometry.centroid
    nfi_awi_overlay_points['x'] = overlay_centroids.x
    nfi_awi_overlay_points['y'] = overlay_centroids.y
    nfi_awi_overlay_points = nfi_awi_overlay_points[['type_overlay', 'type_combined', 'type_aggregate', 'type_source', 'area_ha', 'x', 'y']]
    nfi_awi_overlay_points.to_parquet(f'../data/area/gb_nfi_awi_overlay_{year}_points.parquet')

#del nfi_awi_overlay, nfi_awi_overlay_points