# DE Africa Coastlines continental hotspots <img align="right" src="https://github.com/digitalearthafrica/deafrica-sandbox-notebooks/raw/main/Supplementary_data/DE_Africa_Logo_Stacked_RGB_small.jpg">

This notebook combines individual datasets into continental DE Africa Coastlines layers:
* Combines the output shorelines and the rates of change statistics points into single continental datasets
* Aggregates these data to produce moving window hotspot datasets that summarise coastal change at regional and continental scales

This is an interactive version of the code intended for prototyping; to run this analysis at scale, use the [command line tools](DEAfricaCoastlines_generation_CLI.ipynb).


---

## Getting started
Set the working directory to the top level of the repository so that relative paths and links resolve correctly:

In [None]:
cd ..

### Load packages

First we install the required Python packages, then import the modules used in this notebook.

In [None]:
pip install -r requirements.in --quiet

In [None]:
%matplotlib inline
%load_ext line_profiler
%load_ext autoreload
%autoreload 2

import os

import geopandas as gpd
from pathlib import Path

from coastlines.vector import points_on_line
from coastlines.utils import STYLES_FILE

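## Set analysis parameters

`vector_version` and `continental_version` name the input and output data directories, `baseline_year` selects the annual shoreline along which hotspot points are generated, and `hotspots_radius` lists the moving window radii in metres.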

In [None]:
vector_version = "cli_update"
continental_version = "cli_update"
baseline_year = 2020
hotspots_radius = [10000, 2000, 500]


## Make output directory and identify files to load

In [None]:
# Make output directory 
output_dir = Path(f"data/processed/{continental_version}")
output_dir.mkdir(exist_ok=True, parents=True)

# Setup input and output file paths
shoreline_paths = (
    f"data/interim/vector/{vector_version}/*/" f"annualshorelines*.shp"
)
ratesofchange_paths = (
    f"data/interim/vector/{vector_version}/*/" f"ratesofchange*.shp"
)

# Output path for geopackage
OUTPUT_FILE = output_dir / f"coastlines_{continental_version}.gpkg"
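
Before merging, it can help to confirm that the wildcard patterns above actually match some files. The following is a minimal sketch using only the standard library; `find_inputs` is a hypothetical helper, not part of the `coastlines` package, and the patterns simply repeat the ones defined above for `vector_version = "cli_update"`.

```python
from glob import glob


def find_inputs(pattern):
    """Return the sorted list of files matching a wildcard pattern."""
    return sorted(glob(pattern))


# Same layout as the notebook paths; "cli_update" is the vector_version set above
vector_version = "cli_update"
patterns = {
    "shorelines": f"data/interim/vector/{vector_version}/*/annualshorelines*.shp",
    "rates of change": f"data/interim/vector/{vector_version}/*/ratesofchange*.shp",
}

for name, pattern in patterns.items():
    matches = find_inputs(pattern)
    print(f"{name}: {len(matches)} file(s) match {pattern}")
```

If either count is zero, the merge step that follows has nothing to combine.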

## Combine data
### Shorelines

In [None]:
os.system(
    f"ogrmerge.py -o "
    f"{OUTPUT_FILE} {shoreline_paths} "
    f"-single -overwrite_ds -t_srs epsg:6933 "
    f"-nln shorelines_annual"
)

### Rate of change points

In [None]:
os.system(
    f"ogrmerge.py "
    f"-o {OUTPUT_FILE} {ratesofchange_paths} "
    f"-single -update -t_srs epsg:6933 "
    f"-nln rates_of_change"
)
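
`os.system` returns the exit status of the command rather than raising an error, so a failed merge can go unnoticed. As a hedged alternative, the sketch below builds the same two commands as strings so they can be inspected before running. `build_ogrmerge_command` is a hypothetical helper; the flags are exactly those used in the cells above, and the paths assume `vector_version` and `continental_version` are both `"cli_update"`.

```python
import shlex


def build_ogrmerge_command(output_file, input_pattern, layer_name, first_layer):
    """Build an ogrmerge.py shell command (hypothetical helper).

    The input pattern is left unquoted so the shell can expand its wildcards.
    """
    # -overwrite_ds recreates the output file; -update adds a layer to it
    mode = "-overwrite_ds" if first_layer else "-update"
    return (
        f"ogrmerge.py -o {shlex.quote(str(output_file))} {input_pattern} "
        f"-single {mode} -t_srs epsg:6933 -nln {layer_name}"
    )


output_file = "data/processed/cli_update/coastlines_cli_update.gpkg"
shorelines_cmd = build_ogrmerge_command(
    output_file,
    "data/interim/vector/cli_update/*/annualshorelines*.shp",
    "shorelines_annual",
    first_layer=True,
)
rates_cmd = build_ogrmerge_command(
    output_file,
    "data/interim/vector/cli_update/*/ratesofchange*.shp",
    "rates_of_change",
    first_layer=False,
)
print(shorelines_cmd)
print(rates_cmd)
```

Each string could then be passed to `subprocess.run(cmd, shell=True, check=True)`, which raises `CalledProcessError` on a non-zero exit status. `shell=True` keeps the wildcard expansion that `os.system` relied on.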

## Continental hotspots
### Prepare data

In [None]:
# Load continental shoreline and rates of change data
ratesofchange_gdf = gpd.read_file(OUTPUT_FILE, layer="rates_of_change")
shorelines_gdf = gpd.read_file(OUTPUT_FILE, layer="shorelines_annual")

# Keep only valid shoreline geometries and index them by year
shorelines_gdf = shorelines_gdf.loc[shorelines_gdf.geometry.is_valid].set_index("year")

In [None]:
# Drop uncertain points from calculation
ratesofchange_gdf = ratesofchange_gdf.loc[
    ratesofchange_gdf.certainty == "good"
].reset_index(drop=True)

# Clip rates of change to the range -250 to 250 to remove extreme values,
# as these are likely due to modelling errors, not true coastal change
ratesofchange_gdf["rate_time"] = ratesofchange_gdf.rate_time.clip(-250, 250)
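
To make the effect of these two steps concrete, here is a small pure-Python sketch of the same logic on invented values. `filter_and_clip` is a hypothetical stand-in for the GeoPandas operations above, and the sample points are made up for illustration.

```python
def filter_and_clip(points, lower=-250, upper=250):
    """Keep points flagged "good", then limit rate_time to [lower, upper]."""
    kept = [p for p in points if p["certainty"] == "good"]
    return [
        {**p, "rate_time": max(lower, min(upper, p["rate_time"]))} for p in kept
    ]


# Invented example values, not real DE Africa Coastlines data
sample_points = [
    {"certainty": "good", "rate_time": 300.0},
    {"certainty": "good", "rate_time": -1000.0},
    {"certainty": "good", "rate_time": 1.5},
    {"certainty": "uncertain", "rate_time": 5.0},
]

for point in filter_and_clip(sample_points):
    print(point)
```

Values outside the range are replaced by the nearest limit rather than removed, so clipped points still count towards the number of observations in each hotspot window.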

### Calculate hotspots


In [None]:
# Convert radius to list if not already
hotspots_radius = (
    [hotspots_radius] if not isinstance(hotspots_radius, list) else hotspots_radius
)

for i, radius in enumerate(hotspots_radius):

    # Extract hotspot points
    print(f"Calculating hotspots at {radius} m")
    hotspots_gdf = points_on_line(
        shorelines_gdf, index=str(baseline_year), distance=radius
    )

    # Create polygon windows
    buffered_gdf = hotspots_gdf[["geometry"]].copy()
    buffered_gdf["geometry"] = buffered_gdf.buffer(radius)

    # Spatial join rate of change points to each polygon, then
    # aggregate/summarise values within each polygon
    hotspot_values = (
        ratesofchange_gdf.sjoin(buffered_gdf, predicate="within")
        .groupby("index_right")["rate_time"]
        .agg([("rate_time", "median"), ("n", "count")])
    )

    # Join aggregated values back to hotspot points
    hotspots_gdf = hotspots_gdf.join(hotspot_values)

    # Add hotspots radius attribute column
    hotspots_gdf["radius_m"] = radius

    # Drop any points with insufficient observations.
    # We can obtain a sensible threshold by dividing the hotspots
    # radius by the 30 m along-shore spacing of the rates of change points
    hotspots_gdf = hotspots_gdf.loc[hotspots_gdf.n > (radius / 30)]

    # Export hotspots to file, numbering layers from 1 (hotspots_zoom_1, ...)
    layer_name = f"hotspots_zoom_{i + 1}"
    try:
        hotspots_gdf.to_file(OUTPUT_FILE, layer=layer_name)
    except ValueError as e:
        print(f"Failed to generate hotspots with error: {e}")

In [None]:
# Insert styles table into GeoPackage
styles = gpd.read_file(STYLES_FILE)
styles.to_file(OUTPUT_FILE, layer="layer_styles")
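
A GeoPackage is an SQLite database whose `gpkg_contents` table lists the layers it holds, so the result can be checked with the standard library alone. The sketch below is one possible check; `list_gpkg_layers` is a hypothetical helper, and the path assumes `continental_version = "cli_update"` as set above.

```python
import sqlite3
from contextlib import closing
from pathlib import Path


def list_gpkg_layers(path):
    """Return (table_name, data_type) rows from the GeoPackage contents table."""
    with closing(sqlite3.connect(path)) as con:
        query = (
            "SELECT table_name, data_type FROM gpkg_contents ORDER BY table_name"
        )
        return con.execute(query).fetchall()


# Path assumed from the notebook settings; skipped if the file is not present
gpkg_path = Path("data/processed/cli_update/coastlines_cli_update.gpkg")
if gpkg_path.exists():
    for table_name, data_type in list_gpkg_layers(gpkg_path):
        print(f"{table_name}: {data_type}")
```

After a successful run, the listing would be expected to include `shorelines_annual`, `rates_of_change`, one `hotspots_zoom_*` layer per radius, and `layer_styles`.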

***

## Additional information

**License:** The code in this notebook is licensed under the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0). 
Digital Earth Africa data is licensed under the [Creative Commons by Attribution 4.0](https://creativecommons.org/licenses/by/4.0/) license.

**Contact:** For assistance with any of the Python code or Jupyter Notebooks in this repository, please post a [GitHub issue](https://github.com/GeoscienceAustralia/DEACoastLines/issues/new).

**Last modified:** May 2022