# Land Accounts for Vanuatu

In this notebook, we will download pre-processed satellite imagery
and use it to produce annual land cover maps of the provinces of Vanuatu.

In [None]:
import geopandas as gpd
import odc.stac
import pystac_client
import rioxarray

Download "Province boundaries - Vanuatu 2016 Population and Housing Census" from https://pacificdata.org/data/dataset/9dba1377-740c-429e-92ce-6a484657b4d9/resource/3d490d87-99c0-47fd-98bd-211adaf44f71/download/2016_phc_vut_pid_4326.geojson

In [None]:
!wget -c https://pacificdata.org/data/dataset/9dba1377-740c-429e-92ce-6a484657b4d9/resource/3d490d87-99c0-47fd-98bd-211adaf44f71/download/2016_phc_vut_pid_4326.geojson

Open the administrative boundaries GeoJSON file using
[`geopandas.read_file`](https://geopandas.org/en/v1.1.0/docs/reference/api/geopandas.read_file.html).

In [None]:
gdf = gpd.read_file(filename="2016_phc_vut_pid_4326.geojson")
gdf = gdf.set_index(keys="pname")  # set province name as the index

In [None]:
gdf

In [None]:
gdf.crs

## Step 1: Load GeoMedian from STAC

We will search for some annual composite satellite imagery
produced by Digital Earth Pacific for the years 2017-2024.
The imagery is processed using the GeoMedian algorithm,
making it relatively cloud-free and suitable for our
land cover classification task later.

References:
- Digital Earth Pacific web map - https://maps.digitalearthpacific.org/#share=s-gsmLTJGWiIev8i8UBl0R
- Open Data on AWS page - https://registry.opendata.aws/dep-s2-geomads
- GeoMedian (GeoMAD) algorithm details - https://docs.digitalearthafrica.org/en/latest/sandbox/notebooks/Frequently_used_code/Generating_geomedian_composites.html
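
To build some intuition for why a geometric median composite is relatively cloud-free, here is a minimal sketch using Weiszfeld iterations in NumPy. It is an illustration only, not the Digital Earth Pacific production code, and the function name `geometric_median` and the reflectance values are hypothetical.

```python
import numpy as np


def geometric_median(points, n_iter=100, tol=1e-7):
    """Approximate the geometric median of `points` (n_obs, n_bands).

    Uses Weiszfeld iterations: a distance-weighted mean, repeated until stable.
    """
    points = np.asarray(points, dtype=float)
    estimate = points.mean(axis=0)  # start from the per-band mean
    for _ in range(n_iter):
        distances = np.linalg.norm(points - estimate, axis=1)
        distances = np.maximum(distances, 1e-12)  # avoid division by zero
        weights = 1.0 / distances
        new_estimate = (weights[:, None] * points).sum(axis=0) / weights.sum()
        if np.linalg.norm(new_estimate - estimate) < tol:
            return new_estimate
        estimate = new_estimate
    return estimate


# Hypothetical values (red, green, blue, nir08) for one pixel over a year:
# four clear observations and one bright, cloud-contaminated outlier.
observations = np.array(
    [
        [400, 600, 300, 3000],
        [420, 610, 310, 3100],
        [410, 590, 305, 2950],
        [405, 605, 295, 3050],
        [9000, 9000, 9000, 9000],
    ],
    dtype=float,
)

median_pixel = geometric_median(observations)
mean_pixel = observations.mean(axis=0)

print("geometric median:", median_pixel.round(1))
print("plain mean:      ", mean_pixel.round(1))
```

The plain mean is dragged towards the cloudy observation, while the geometric median stays close to the cluster of clear observations. Because all bands are treated together as one vector, the result keeps a plausible relationship between bands.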

In [None]:
YEAR = 2024
PROVINCE = "SHEFA"

In [None]:
# Get geometry of province
GEOM = gdf.loc[PROVINCE].geometry
GEOM

Make a SpatioTemporal Asset Catalog (STAC) query for the `dep_s2_geomad` collection, which can be browsed at
<https://radiantearth.github.io/stac-browser/#/external/https://stac.digitalearthpacific.org/collections/dep_s2_geomad?.language=en&.itemFilterOpen=1>.
Refer to Lesson 1A and 1B for how to use `pystac_client` and `odc.stac`.

In [None]:
# STAC variables
STAC_URL = "https://stac.digitalearthpacific.org/"
stac_client = pystac_client.Client.open(url=STAC_URL)

In [None]:
# STAC search for GeoMedian composite images
s2_search = stac_client.search(
    # Sentinel-2 Geometric Median and Absolute Deviations (GeoMAD) over the Pacific
    collections=["dep_s2_geomad"],
    intersects=GEOM,
    datetime=str(YEAR),
)
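
`pystac_client` builds the HTTP request for us. As a rough sketch of what the three search parameters amount to, the snippet below assembles a STAC API style search body using only the standard library. The helper `year_to_interval` and the rectangle geometry are hypothetical stand-ins, and the exact request that `pystac_client` sends may differ in detail.

```python
import json


def year_to_interval(year):
    """Expand a year into a closed datetime interval covering the calendar year."""
    return f"{year}-01-01T00:00:00Z/{year}-12-31T23:59:59Z"


# Hypothetical rectangle (longitude, latitude) standing in for the province geometry.
example_geometry = {
    "type": "Polygon",
    "coordinates": [
        [
            [168.0, -18.0],
            [168.6, -18.0],
            [168.6, -17.4],
            [168.0, -17.4],
            [168.0, -18.0],  # the ring closes on its first vertex
        ]
    ],
}

search_body = {
    "collections": ["dep_s2_geomad"],  # which collection to search
    "intersects": example_geometry,  # spatial filter as a GeoJSON geometry
    "datetime": year_to_interval(2024),  # temporal filter
}

print(json.dumps(search_body, indent=2))
```

Only items whose footprint intersects the geometry and whose timestamp falls inside the interval are returned, which is why a province-sized query for one year yields a small number of tiles.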

In [None]:
# Retrieve all items (still just metadata) from search results
s2_items = s2_search.item_collection()
s2_items

In [None]:
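# Lazily load the selected bands as Dask-backed arrays
s2_data = odc.stac.load(
    items=s2_items,
    bands=["red", "green", "blue", "nir08"],  # TODO more bands
    # bbox=aoi,
    # Bands become separate data variables, so only x, y and time are chunked
    chunks={"x": 1024, "y": 1024, "time": -1},
    resolution=10,  # TODO higher resolution
)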

In [None]:
s2_data

In [None]:
s2_data.odc.geobox

In [None]:
# Take the single annual time step and stack the bands into one DataArray
s2_array = s2_data.isel(time=0).to_array("band")
s2_array

In [None]:
s2_array.plot.imshow(
    col="band",
    size=4,
    vmin=0,
    vmax=4000,
)

## Step 2: Clip raster using buffered province polygon

Most of the raster covers open ocean.
We will clip the raster data to land areas plus a 5 km buffer around the coastline.
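
The idea behind buffering and clipping can be sketched with plain NumPy: express the buffer distance in the same unit as the raster coordinates (metres here), build a mask of pixels within that distance, and blank out everything else. This is a simplified illustration and not how `rioxarray` implements clipping. The tile size and the single "land" point are hypothetical.

```python
import numpy as np

resolution = 10.0  # pixel size in metres
buffer_distance = 5000.0  # 5 km, expressed in metres

# Hypothetical 1200 x 1200 pixel tile (12 km x 12 km) with one "land" point at its centre.
size = 1200
coords = (np.arange(size) + 0.5) * resolution  # pixel-centre coordinates in metres
xx, yy = np.meshgrid(coords, coords)
land_x, land_y = 6000.0, 6000.0

# Keep pixels whose centre lies within the buffer distance of the land point.
inside = np.hypot(xx - land_x, yy - land_y) <= buffer_distance

# Fake single-band raster; "clipping" here means masking everything outside the buffer.
raster = np.full((size, size), 1000.0)
clipped = np.where(inside, raster, np.nan)

kept_fraction = inside.mean()
print(f"fraction of pixels kept: {kept_fraction:.3f}")
```

The same reasoning explains why the province polygon has to be in a projected, metre-based coordinate reference system before buffering: a distance of 5000 only means 5 km when the coordinates are in metres.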

In [None]:
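# Reproject the province polygon from EPSG:4326 to the raster CRS (EPSG:3832)
raster_crs = s2_array.rio.crs
gdf_reprojected = gdf.loc[[PROVINCE]].to_crs(crs=raster_crs)
# Buffer by 5000 m (5 km); the projected raster CRS uses metres
gdf_buffered = gdf_reprojected.buffer(distance=5000)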

In [None]:
# Show buffered area around province
gdf_buffered

In [None]:
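# Pass the buffered geometries directly (not a GeoSeries wrapped in a list)
s2_geomad = s2_array.rio.clip(geometries=gdf_buffered.values, crs=gdf_buffered.crs)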

In [None]:
s2_geomad.plot.imshow(
    col="band",
    size=4,
    vmin=0,
    vmax=4000,
)

## Step 3: Save to Cloud-optimized GeoTIFF (optional)

The clipped raster is larger than 200 MB, which may exceed the limited
disk space of your DEP Analytics Hub home folder.
You can save it temporarily to the `/tmp` directory instead.
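
A rough feel for why the raster is this large comes from multiplying bands, rows, columns and bytes per value. The sketch below does that arithmetic. The grid dimensions and the 2-byte data type are hypothetical placeholders, not values read from this dataset.

```python
def uncompressed_size_bytes(n_bands, height, width, bytes_per_value):
    """Return the in-memory size of a dense raster, ignoring compression."""
    return n_bands * height * width * bytes_per_value


# Hypothetical example: 4 bands on a 10,000 x 10,000 pixel grid, 2 bytes per value.
size_bytes = uncompressed_size_bytes(
    n_bands=4, height=10_000, width=10_000, bytes_per_value=2
)
size_mb = size_bytes / 1e6
print(f"about {size_mb:.0f} MB before compression")
```

In the notebook itself, `s2_geomad.nbytes` reports the actual in-memory size of the clipped array. The file on disk is usually smaller once compression is applied.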

In [None]:
# Write a Cloud-Optimized GeoTIFF with Zstandard compression
s2_geomad.rio.to_raster(
    raster_path=f"/tmp/{YEAR}_{PROVINCE}_S2_GeoMAD.tif",
    driver="COG",
    compress="zstd",
)