# Indonesian ensemble rankings creation

This notebook creates a rankings file that ranks the supplied tide models at specified coastal locations.
At each location, the NDWI inundation index derived from Landsat and Sentinel-2 satellite imagery is correlated against each tide
model, and the resulting correlations are used to rank the models by accuracy. A high correlation (close to 1)
ranks above a correlation that is close to zero or negative.
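
The ranking idea can be illustrated with a minimal sketch. All values and the names `model_a` and `model_b` are made up for illustration, and the real `model_rankings_ndwi` function may compute its statistics differently; this only shows how correlations could be turned into ranks.

```python
import pandas as pd

# Hypothetical NDWI observations at one location (higher = wetter)
ndwi = pd.Series([-0.4, -0.1, 0.2, 0.5, 0.1, -0.3])

# Hypothetical modelled tide heights (metres) at the same timesteps
tides = pd.DataFrame(
    {
        "model_a": [-1.0, -0.3, 0.4, 1.1, 0.2, -0.8],  # tracks NDWI closely
        "model_b": [0.5, -0.9, 0.1, 0.3, -0.6, 0.8],  # tracks NDWI poorly
    }
)

# Pearson correlation of each model's tide heights with NDWI
correlations = tides.corrwith(ndwi)

# Rank 1 = highest correlation = assumed most accurate model
ranks = correlations.rank(ascending=False).astype(int)

print(pd.DataFrame({"correlation": correlations, "rank": ranks}))
```

Here `model_a` receives rank 1 because its tide heights rise and fall with the observed inundation, while `model_b` is negatively correlated and ranks last.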

### Load packages

In [None]:
import os

# Set before importing geopandas, which reads this variable at import time
os.environ["USE_PYGEOS"] = "0"

import pandas as pd
import geopandas as gpd
from datacube.utils.dask import start_local_dask
from eo_tides.validation import model_rankings_ndwi

### Dask client
Create a local Dask client for parallelisation.

In [None]:
# Create local dask client for parallelisation
dask_client = start_local_dask(
    n_workers=16, threads_per_worker=8, mem_safety_margin="2GB"
)

print(
    dask_client.dashboard_link.replace(
        "/user", "https://hub.asia.easi-eo.solutions/user"
    )
)

### Tide Models

Set the tide models to rank and the directory containing the model data.

In [None]:
# Params
model_list = [
    'FES2014', 'FES2022', 'EOT20', 'TPXO9-atlas-v5-nc', 'TPXO10-atlas-v2-nc', 'GOT5.6'
]
model_directory = "../../tide_models_indo"

### Points of interest

Create a point every 10 km along the Indonesian coastline:

1. Download https://public.opendatasoft.com/api/explore/v2.1/catalog/datasets/world-administrative-boundaries/exports/geojson
2. Extract the Indonesian boundary.
3. Reproject the Indonesian boundary to EPSG:32651 - WGS 84 / UTM zone 51N, so that distances are in metres.
4. Convert the polygon to a linestring.
5. Create points along the linestring every 10 km.
6. Clip to the AOI (a 4 km buffer of the 2021 baseline coastline).
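
Step 5 can be sketched in plain Python as below. The notebook does not show how the points were actually produced, so this is an illustrative sketch only: the function name `points_along_line` and the sample coordinates are hypothetical, and in practice a GIS library would usually do this (for example shapely's `LineString.interpolate`). Coordinates are assumed to be in a projected CRS with metre units, which is why the boundary is reprojected first.

```python
import math


def points_along_line(vertices, spacing):
    """Return points every `spacing` units along a polyline.

    `vertices` is a list of (x, y) tuples in a projected CRS (metres).
    The first vertex is always included.
    """
    if spacing <= 0:
        raise ValueError("spacing must be positive")

    points = [vertices[0]]
    next_dist = spacing  # distance along the line of the next point
    travelled = 0.0  # length of all segments processed so far

    for (x0, y0), (x1, y1) in zip(vertices, vertices[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        # Place every point that falls on this segment
        while seg_len > 0 and next_dist <= travelled + seg_len:
            t = (next_dist - travelled) / seg_len
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            next_dist += spacing
        travelled += seg_len

    return points


# Hypothetical 40 km coastline segment with one corner, sampled every 10 km
coast = [(0.0, 0.0), (25_000.0, 0.0), (25_000.0, 15_000.0)]
pts = points_along_line(coast, spacing=10_000.0)
print(len(pts), "points")
```

Measuring distance along the line, rather than straight-line distance between points, keeps the spacing consistent around headlands and bays.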

In [None]:
poi_file = "s3://files.auspatious.com/coastlines/indonesia_coastline_10km_points_aoi.geojson"

### Tide rankings
For each point, correlate NDWI inundation with each tide model and rank the models.

In [None]:
# Input tide ranking locations
poi = gpd.read_file(poi_file).to_crs('EPSG:4326')
coords = poi.geometry.get_coordinates()

out_list = []

# Loop through tide ranking locations and determine tide ranking
for index, row in coords.iterrows():
    print(f"Processing {row['x']}, {row['y']}")
    corr_df, _ = model_rankings_ndwi(
        x=row['x'],
        y=row['y'],
        time_range=("2020", "2022"),
        model=model_list,
        directory=model_directory,
    )

    out_list.append(corr_df)

### Data wrangling

Reshape the data to suit the tide modelling functions in eo-tides.

In [None]:
# Concatenate outputs and move "x", "y", "statistic" to columns
df_reset = pd.concat(out_list).reset_index()
# Pivot to get one row per (x, y), with columns for each model/statistic,
# and flatten the multi-index columns into a single string
df_wide = df_reset.pivot(index=["x", "y"], columns="statistic")
print(df_wide)
df_wide.columns = [f"{stat}_{col}" for col, stat in df_wide.columns]

# Create GeoDataFrame with geometry from x and y columns
model_rankings_gdf = gpd.GeoDataFrame(
    df_wide,
    geometry=gpd.points_from_xy(
        df_wide.index.get_level_values("x"), df_wide.index.get_level_values("y")
    ),
    crs="EPSG:4326",
)
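
To make the reshaping easier to follow, here is a toy version of the pivot and flatten steps. The input shape (an `x`, `y`, `statistic` index with one column per model) is an assumption inferred from the code above, and the coordinates and values are invented.

```python
import pandas as pd

# Hypothetical per-location results: one row per (x, y, statistic)
index = pd.MultiIndex.from_tuples(
    [
        (106.8, -6.1, "correlation"),
        (106.8, -6.1, "rank"),
        (115.2, -8.7, "correlation"),
        (115.2, -8.7, "rank"),
    ],
    names=["x", "y", "statistic"],
)
toy = pd.DataFrame(
    {"EOT20": [0.61, 2.0, 0.48, 1.0], "FES2022": [0.72, 1.0, 0.35, 2.0]},
    index=index,
)

# One row per (x, y); columns become a (model, statistic) MultiIndex
wide = toy.reset_index().pivot(index=["x", "y"], columns="statistic")

# Flatten (model, statistic) pairs into names like "rank_FES2022"
wide.columns = [f"{stat}_{model}" for model, stat in wide.columns]

print(wide)
```

Each location ends up as a single row with one `correlation_<model>` and one `rank_<model>` column per tide model.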

### Output
Save the result as a FlatGeobuf file.

In [None]:
# Drop the "rank_" prefix that the flattening step added to valid_perc
model_rankings_gdf_export = model_rankings_gdf.rename(columns={'rank_valid_perc': 'valid_perc'})
model_rankings_gdf_export.to_file('indo_model_ranking.fgb', driver='FlatGeobuf')

### Close Dask client

In [None]:
dask_client.close()