## Example notebook
Reading in Spatio-Temporal Asset Catalogs (STAC) and performing zonal statistics on target areas through time.

First, we import the modules this notebook needs. If they are not installed yet, run `pip install -r requirements.txt` from the root directory.

In [1]:
1+1

2

In [2]:
import geopandas as gpd
import sys
sys.path.append("..")  # only needed when the imported modules live one level above this notebook
import utilities
import zonalStatistics



Have a quick look at the GeoPackage that holds our polygon layer.

In [3]:
display(utilities.list_all_layers_in_geopackage('./hexgrids_example.gpkg'))

['tesselated_10ha_hexagons_on_bribie_island']
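For readers curious about what a layer listing involves: a GeoPackage is an SQLite database, and its `gpkg_contents` table holds one row per layer. The sketch below reads that table with the standard library only. The function name `list_feature_layers` is hypothetical and this is not necessarily how `utilities.list_all_layers_in_geopackage` works.

```python
import sqlite3
from contextlib import closing


def list_feature_layers(gpkg_path):
    """Return the names of the vector (feature) layers in a GeoPackage.

    Hypothetical helper for illustration; it relies only on the
    gpkg_contents table that the GeoPackage format defines.
    """
    with closing(sqlite3.connect(gpkg_path)) as conn:
        rows = conn.execute(
            "SELECT table_name FROM gpkg_contents "
            "WHERE data_type = 'features' ORDER BY table_name"
        ).fetchall()
    return [name for (name,) in rows]
```

Calling `list_feature_layers('./hexgrids_example.gpkg')` should be comparable to the cell above, although tile or attribute-only layers are deliberately left out here.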

Then we identify our target polygons. These will form the underlying rows of our analysis.

In [4]:
gdf = gpd.read_file('./hexgrids_example.gpkg', layer='tesselated_10ha_hexagons_on_bribie_island')
gdf.head()

Unnamed: 0,GRID_ID,intersects_plantation,geometry
0,DIN-3658,1,"MULTIPOLYGON (((17047280.13 -3126517.541, 1704..."
1,DIO-3658,1,"MULTIPOLYGON (((17047574.413 -3126687.446, 170..."
2,DIL-3657,1,"MULTIPOLYGON (((17046691.564 -3126177.733, 170..."
3,DIM-3657,1,"MULTIPOLYGON (((17046985.847 -3126347.637, 170..."
4,DIN-3657,1,"MULTIPOLYGON (((17047280.13 -3126177.733, 1704..."


Excellent! We can see our features, and we are now ready to analyse satellite data in these areas. First, let's define the target STAC dataset we are looking for. [../resources.json](../resources.json) is a file made by Ben Ross that defines a few resources. You can modify these attributes as you please.

In [5]:
# Load the resource metadata once and reuse it below.
resources = utilities.fetch_resource_metadata("../resources.json")

# The STAC API URL.
url = resources['url']

# The sensor at index 1, i.e. the second entry in the 'sensors' list.
sensor = resources['sensors'][1]
sensor_name = sensor['name']

# The bands to fetch from the STAC API, taken from that sensor's first band definition.
bands = list(sensor['bands'][0].values())

# Bounds must be in EPSG 4326 for the STAC API search.
bounds = gdf.to_crs('EPSG:4326').total_bounds.tolist()
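To make the reprojection step less opaque, here is a minimal standard-library sketch of what converting a bounding box to longitude and latitude involves. It assumes the layer is stored in Web Mercator (EPSG:3857), which the magnitude of the coordinates above suggests but the notebook does not state. The function names are hypothetical; in practice `gdf.to_crs('EPSG:4326')` does this work.

```python
import math

# Assumption: the source CRS is Web Mercator (EPSG:3857), which uses this sphere radius.
EARTH_RADIUS_M = 6378137.0


def web_mercator_to_lonlat(x, y):
    """Convert EPSG:3857 metres to (longitude, latitude) in degrees."""
    lon = math.degrees(x / EARTH_RADIUS_M)
    lat = math.degrees(math.atan(math.sinh(y / EARTH_RADIUS_M)))
    return lon, lat


def bounds_to_lonlat(minx, miny, maxx, maxy):
    """Reproject a box given in total_bounds order: (minx, miny, maxx, maxy)."""
    west, south = web_mercator_to_lonlat(minx, miny)
    east, north = web_mercator_to_lonlat(maxx, maxy)
    # STAC bbox searches expect [west, south, east, north].
    return [west, south, east, north]


# Sample point: the first vertex shown in the gdf.head() output.
lon, lat = web_mercator_to_lonlat(17047280.13, -3126517.541)
```

Under that assumption the sample vertex lands at roughly 153.1 degrees east and 27.0 degrees south, which is consistent with Bribie Island.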

Now that we know what we are searching for, let's conduct a search to generate the virtual array that contains the data.

In [6]:
# This searches for data within the bounding box of the gdf and within the specified time range.
data = utilities.get_data_from_stac(url, bounds, sensor_name, bands, time_range="2025-01-01/2025-12-31")

Unless we want to download data for each and every day, let's resample the data to a monthly median so there are fewer rows to download and calculate.

In [7]:
# This resamples the fetched data to monthly frequency.
data_monthly = utilities.resample_stac_data_to_data_monthly(data)
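As a rough illustration of what a monthly median does, the sketch below groups dated values by calendar month and keeps the median of each group. It is plain Python on made-up numbers, not the internals of `utilities.resample_stac_data_to_data_monthly`; on an xarray object the usual equivalent would be something like `data.resample(time="1MS").median()`.

```python
from collections import defaultdict
from datetime import date
from statistics import median


def monthly_median(observations):
    """Collapse (date, value) pairs into one median value per calendar month."""
    by_month = defaultdict(list)
    for day, value in observations:
        by_month[(day.year, day.month)].append(value)
    return {month: median(values) for month, values in sorted(by_month.items())}


# Hypothetical reflectance values for a single pixel.
observations = [
    (date(2025, 1, 3), 410),
    (date(2025, 1, 13), 2950),  # an outlier, e.g. a cloudy scene
    (date(2025, 1, 23), 430),
    (date(2025, 2, 2), 450),
    (date(2025, 2, 12), 470),
]
result = monthly_median(observations)
```

The median is a reasonable choice here because a single bright outlier, such as the January value above, does not shift the monthly result the way it would shift a mean.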

Now all we have to do is run the primary `zonalStatistics.compute_zonal_stats_bands_vectorized()` function, which writes our results as a set of CSV files in a folder.

In [8]:
zonalStatistics.compute_zonal_stats_bands_vectorized(
    data_monthly=data_monthly,
    gdf=gdf,
    key_column_name='GRID_ID',
    bands=bands,
    output_dir="./example_outputs",
    overwrite=True)



Processing 12 time steps for 505 features
Bands: ['nbart_blue', 'nbart_green', 'nbart_red', 'nbart_nir_1', 'nbart_swir_2', 'nbart_swir_3']


100%|██████████| 12/12 [13:15<00:00, 66.31s/it]


Complete. Processed: 6060, Errors: 0, No data: 0, Skipped: 0
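For intuition, a zonal statistic is a summary of the raster cells that fall inside one polygon. The toy sketch below computes a zonal mean on a tiny grid using a ray-casting point-in-polygon test on cell centres. It is a simplified illustration with hypothetical names and data, not the vectorized implementation in `zonalStatistics`.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test: is (x, y) inside the polygon given as [(x, y), ...]?"""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def zonal_mean(grid, polygon):
    """Mean of the cells whose centres fall inside the polygon.

    grid[row][col] covers the unit square whose lower-left corner is (col, row).
    """
    values = [
        value
        for row, cells in enumerate(grid)
        for col, value in enumerate(cells)
        if point_in_polygon(col + 0.5, row + 0.5, polygon)
    ]
    return sum(values) / len(values) if values else None


# Hypothetical 3 x 4 raster band and a square zone covering columns 1-2, rows 0-1.
grid = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
]
zone = [(1, 0), (3, 0), (3, 2), (1, 2)]
mean_value = zonal_mean(grid, zone)  # cells 2, 3, 6 and 7 are inside
```

Repeating this for every polygon, band and time step gives one row per feature per time step, which matches the 12 x 505 = 6060 results reported above.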


Now that all of our data is downloaded and calculated, let's combine the files into a single large file, which is much easier to work with.

In [9]:
import combineCSV

In [10]:
combineCSV.compile_csvs(
    output_dir="./example_outputs",
    pattern="BANDS*.csv",
    combined_filename="combined.csv",
    key_column_name='GRID_ID',
    recursive=False,
    verbose=False)
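If you want to see the idea behind the combine step without the helper module, the sketch below concatenates every CSV matching a pattern into one file using only the standard library. It assumes all input files share the same header; the function name `combine_csvs` is hypothetical and this is not the code in `combineCSV.compile_csvs`.

```python
import csv
import glob
import os


def combine_csvs(output_dir, pattern, combined_filename):
    """Concatenate CSVs that share a header into one file; return the row count."""
    combined_path = os.path.join(output_dir, combined_filename)
    # Never read a previously written combined file back in as an input.
    paths = [
        path
        for path in sorted(glob.glob(os.path.join(output_dir, pattern)))
        if os.path.abspath(path) != os.path.abspath(combined_path)
    ]
    rows_written = 0
    with open(combined_path, "w", newline="") as out_file:
        writer = None
        for path in paths:
            with open(path, newline="") as in_file:
                reader = csv.DictReader(in_file)
                if reader.fieldnames is None:
                    continue  # skip empty files
                if writer is None:
                    # The first non-empty file defines the output columns.
                    writer = csv.DictWriter(out_file, fieldnames=reader.fieldnames)
                    writer.writeheader()
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

For example, `combine_csvs("./example_outputs", "BANDS*.csv", "combined.csv")` would mirror the call above, minus the `key_column_name`, `recursive` and `verbose` options.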