# Demo: Spatial tools

This notebook demonstrates how to vectorize raster datasets and rasterize vector datasets.
It has been adapted from a [Digital Earth Australia notebook](https://knowledge.dea.ga.gov.au/notebooks/How_to_guides/Rasterize_vectorize/) that demonstrates the same functionality.

The notebook covers the following steps:

1. How to load water observations data
1. How to apply a threshold to identify regularly occurring water bodies
1. How to convert the raster water bodies into a vector dataset
1. How to convert the vector water bodies back to a raster dataset

## Set up
The following cell should be uncommented and run if you installed the package in editable mode and are actively developing and testing modules.
Otherwise, it can be left commented.

In [None]:
# %load_ext autoreload
# %autoreload 2

### Enable logging

This will allow you to see info and warning messages from the package.

In [None]:
import logging
import sys

logging.basicConfig(
    format="%(asctime)s | %(levelname)s : %(message)s",
    level=logging.INFO,
    stream=sys.stdout,
)
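
As a quick check of what this format string produces, the sketch below attaches the same format to a separate, hypothetical logger named `format_demo` that writes to an in-memory buffer. The logger name and the messages are illustrative assumptions and are not part of the package.

```python
import io
import logging

# Write to an in-memory buffer so the formatted output can be inspected
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setFormatter(
    logging.Formatter("%(asctime)s | %(levelname)s : %(message)s")
)

# "format_demo" is a hypothetical logger name used only for this illustration
demo_logger = logging.getLogger("format_demo")
demo_logger.setLevel(logging.INFO)
demo_logger.addHandler(handler)
demo_logger.propagate = False  # keep the demo output out of the root logger

demo_logger.debug("hidden: below the INFO threshold")
demo_logger.info("shown: at the INFO threshold")

print(stream.getvalue(), end="")
```

Only the second message appears, because `level=logging.INFO` filters out anything less severe than `INFO`.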

### Import the relevant packages

The `xr_rasterize` and `xr_vectorize` functions are imported from the `eo_insights.spatial` module.

This demo uses the Digital Earth Australia Water Observations product, so it also imports `de_australia_stac_config`.
For more information on available configurations, see [configuration_demo.ipynb](configuration_demo.ipynb).

In [None]:
import matplotlib.pyplot as plt

from eo_insights.raster_base import RasterBase, QueryParams, LoadParams
from eo_insights.spatial import xr_rasterize, xr_vectorize
from eo_insights.stac_configuration import de_australia_stac_config

You can check the available collections using the `.list_collections()` method.

In [None]:
de_australia_stac_config.list_collections()

## Load data
### Set up query and load parameters

Date range and bounding box are set as part of the `QueryParams` class.
CRS, resolution, and desired bands are set as part of the `LoadParams` class.

Note that this is a summary product and has no associated pixel quality masks.
To learn more about masking, see [masking_demo.ipynb](masking_demo.ipynb).

In [None]:
query_params = QueryParams(
    bbox=(142.1, -32.6, 142.80, -32.1),
    start_date="2000",
    end_date="2000",
)

load_params = LoadParams(
    crs="EPSG:3577",
    resolution=10,
    bands=("frequency",),  # trailing comma makes this a one-element tuple
)

stac_raster = RasterBase.from_stac_query(
    config=de_australia_stac_config,
    collections=["ga_ls_wo_fq_cyear_3"],
    query_params=query_params,
    load_params=load_params,
)
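
Before loading, it can help to estimate how large the requested grid will be. The sketch below is a rough check using a spherical approximation of about 111.32 km per degree, which is an assumption for illustration only. The actual grid is defined in EPSG:3577 and will differ somewhat.

```python
import math

# Values copied from the query and load parameters above
min_lon, min_lat, max_lon, max_lat = 142.1, -32.6, 142.80, -32.1
resolution_m = 10

# Approximate length of one degree of latitude on a sphere (assumption)
KM_PER_DEGREE = 111.32

# Degrees of longitude shrink with the cosine of the latitude
mid_lat = (min_lat + max_lat) / 2
height_km = (max_lat - min_lat) * KM_PER_DEGREE
width_km = (max_lon - min_lon) * KM_PER_DEGREE * math.cos(math.radians(mid_lat))

# Convert the extent to a pixel count at the requested resolution
n_rows = round(height_km * 1000 / resolution_m)
n_cols = round(width_km * 1000 / resolution_m)

print(f"Approximate extent: {width_km:.1f} km x {height_km:.1f} km")
print(f"Approximate grid: {n_cols} x {n_rows} pixels")
print(f"Approximate pixel count: {n_rows * n_cols / 1e6:.1f} million")
```

If the estimate is larger than expected, a smaller bounding box or a coarser `resolution` reduces the amount of data to load.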

## Apply a frequency threshold to get water bodies

The next step selects all pixels that were recorded as wet more than 25% of the time.

In [None]:
# Select pixels that are classified as water > 25 % of the time
water_bodies = stac_raster.data.frequency > 0.25

# Plot the data
water_bodies.plot(size=5)
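
The comparison above is applied to each pixel and produces a boolean array. The sketch below shows the same operation on a small, made-up NumPy array. The values are hypothetical, and it assumes missing data is represented as NaN, which may not match the product's actual nodata handling.

```python
import numpy as np

# Hypothetical 3x3 grid of water frequencies (fraction of observations that were wet)
frequency = np.array(
    [
        [0.00, 0.10, 0.25],
        [0.30, 0.80, np.nan],
        [0.26, 0.05, 1.00],
    ]
)

threshold = 0.25

# Strict comparison: a pixel at exactly 0.25 is excluded,
# and NaN compares as False, so missing pixels are treated as not water
water_mask = frequency > threshold

print(water_mask)
print("Water pixels:", int(water_mask.sum()))
```

Because the comparison is strict, using `>=` instead would also include pixels that sit exactly on the threshold.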

## Convert raster water bodies to a vector dataset

The `xr_vectorize` function takes the array of water bodies, as well as an argument called `mask`, which specifies which pixels should be vectorized.
It returns a geopandas GeoDataFrame.

In [None]:
gdf = xr_vectorize(da=water_bodies, mask=water_bodies.data == 1)

gdf.explore()
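
To build intuition for what vectorizing means, the sketch below converts each `True` pixel of a tiny mask into the bounding box of that pixel in map coordinates. This is a conceptual illustration only, not how `xr_vectorize` is implemented. The grid origin and pixel size are hypothetical, and a real vectorization would typically merge adjacent pixels into larger polygons.

```python
import numpy as np

# Tiny hypothetical mask: True marks a water pixel
mask = np.array(
    [
        [False, True, True],
        [False, False, True],
    ],
    dtype=bool,
)

# Hypothetical grid definition: top-left corner and square pixel size in metres
x_origin, y_origin = 1000.0, 2000.0
pixel_size = 10.0


def pixel_to_box(row, col):
    """Return (min_x, min_y, max_x, max_y) for the pixel at (row, col)."""
    min_x = x_origin + col * pixel_size
    max_y = y_origin - row * pixel_size  # rows count downwards from the top edge
    return (min_x, max_y - pixel_size, min_x + pixel_size, max_y)


# Only pixels selected by the mask become geometries
boxes = [pixel_to_box(int(r), int(c)) for r, c in zip(*np.nonzero(mask))]

for box in boxes:
    print(box)
```

Pixels where the mask is `False` produce no geometry, which mirrors the role the `mask` argument plays above.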

## Convert vector water bodies to a raster dataset

The `xr_rasterize` function takes the GeoDataFrame of water bodies, as well as an argument called `da`, an Xarray DataArray that acts as a template for the raster output by specifying the expected dimensions, coordinates and attributes.
It returns an Xarray DataArray.

In [None]:
water_bodies_again = xr_rasterize(gdf=gdf, da=water_bodies)

fig, axes = plt.subplots(1, 2, figsize=(12, 4))
water_bodies.plot(ax=axes[0])
water_bodies_again.plot(ax=axes[1])
axes[0].set_title("Original water bodies")
axes[1].set_title("Rasterized water bodies")
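
Besides comparing the two plots by eye, the round trip can be checked numerically. The sketch below defines a hypothetical helper, `agreement_fraction`, that reports the share of pixels with the same value in two arrays. It is demonstrated on small made-up arrays rather than on the notebook's data.

```python
import numpy as np


def agreement_fraction(original, roundtrip):
    """Return the fraction of pixels that match between two same-shaped arrays."""
    original = np.asarray(original, dtype=bool)
    roundtrip = np.asarray(roundtrip, dtype=bool)
    if original.shape != roundtrip.shape:
        raise ValueError("Arrays must share the same shape")
    return float((original == roundtrip).mean())


# Made-up example: the arrays disagree in one of four pixels
original = np.array([[True, False], [True, True]])
roundtrip = np.array([[1, 0], [1, 0]])

print(agreement_fraction(original, roundtrip))
```

In this notebook, the helper could be called with `water_bodies.values` and `water_bodies_again.values`. A result of 1.0 would indicate that rasterizing the vectors reproduced the original mask exactly.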