# DEA Coastlines continental hotspots <img align="right" src="https://github.com/GeoscienceAustralia/dea-notebooks/raw/develop/Supplementary_data/dea_logo.jpg">

This code combines individual datasets into continental DEA Coastlines layers:
* Combines the individual annual shoreline and rates of change point vector outputs into single continental datasets
* Aggregates this data to produce moving window hotspot datasets that summarise coastal change at regional and continental scale

This is an interactive version of the code intended for prototyping; to run this analysis at scale, use the [command line tools](DEACoastlines_generation_CLI.ipynb).


---

## Getting started
Set working directory to top level of repo to ensure links work correctly:

In [1]:
cd ..

/home/jovyan/Robbi/dea-coastlines


### Load packages

First we install and import the required Python packages.

In [None]:
pip install -r requirements.in --quiet

In [2]:
%matplotlib inline
%load_ext line_profiler
%load_ext autoreload
%autoreload 2

import os

import numpy as np
import geopandas as gpd
from pathlib import Path

from coastlines.vector import points_on_line, change_regress
from coastlines.utils import STYLES_FILE

## Load in data

In [3]:
vector_version = "testing"
continental_version = "testing"
baseline_year = 2020
hotspots_radius = [2000, 500, 100]
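
Each radius in `hotspots_radius` drives several settings in the hotspot loop further down: the spacing of hotspot points along the baseline shoreline, the buffer size of each window, the minimum number of rates of change points required, and the output layer name. The sketch below only restates that arithmetic; `hotspot_settings` and `point_spacing_m` are hypothetical names introduced for illustration and are not part of the `coastlines` package.

```python
# Radii copied from the parameters cell above
hotspots_radius = [2000, 500, 100]


def hotspot_settings(radius, zoom_index, point_spacing_m=30):
    """Summarise the per-radius settings used by the hotspot loop.

    Hypothetical helper for illustration only.
    """
    return {
        # Layers are numbered from 1 in the order the radii are listed
        "layer": f"hotspots_zoom_{zoom_index + 1}",
        # Hotspot points are placed every half-radius along the shoreline
        "spacing_m": int(radius / 2),
        # Each point is buffered by the full radius to form a window
        "buffer_m": radius,
        # A window is kept only if it holds strictly more points than this
        "min_n": radius / point_spacing_m,
    }


settings = [hotspot_settings(r, i) for i, r in enumerate(hotspots_radius)]
for s in settings:
    print(s)
```

Because the spacing is half the radius while the buffer is the full radius, neighbouring windows overlap, which is what makes the result a moving window summary.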


## Make output directory and identify files to load

In [4]:
# Make output directory 
output_dir = Path(f"data/processed/{continental_version}")
output_dir.mkdir(exist_ok=True, parents=True)

# Setup input and output file paths
shoreline_paths = f"data/interim/vector/{vector_version}/*/annualshorelines*.shp"
ratesofchange_paths = f"data/interim/vector/{vector_version}/*/ratesofchange*.shp"

# Output path for geopackage
OUTPUT_FILE = output_dir / f"coastlines_{continental_version}.gpkg"
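
The input paths above are wildcard patterns rather than single files: the `*/` component matches every subdirectory of the vector version folder. In the notebook the patterns are passed to the shell through `os.system`, which expands them. As a minimal sketch of how such a pattern resolves, the example below builds a temporary directory tree and expands the shoreline pattern with Python's `glob`. The subdirectory names (`area_1`, `area_2`) and file suffixes are hypothetical stand-ins, not real DEA Coastlines outputs.

```python
import glob
import tempfile
from pathlib import Path

vector_version = "testing"

with tempfile.TemporaryDirectory() as tmp:
    # Build a small mock of data/interim/vector/<version>/<area>/
    base = Path(tmp) / "data" / "interim" / "vector" / vector_version
    for area in ["area_1", "area_2"]:  # hypothetical subdirectory names
        (base / area).mkdir(parents=True)
        (base / area / f"annualshorelines_{area}.shp").touch()
        (base / area / f"ratesofchange_{area}.shp").touch()

    # Same wildcard structure as `shoreline_paths` in the notebook
    pattern = str(base / "*" / "annualshorelines*.shp")
    matched = sorted(Path(p).name for p in glob.glob(pattern))

print(matched)
```

Running a check like this before merging is a cheap way to confirm that the pattern picks up the expected number of input files.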

## Combine data
### Shorelines

In [5]:
os.system(
    f"ogrmerge.py -o "
    f"{OUTPUT_FILE} {shoreline_paths} "
    f"-single -overwrite_ds -t_srs epsg:6933 "
    f"-nln shorelines_annual"
)



0

### Rate of change points

In [6]:
os.system(
    f"ogrmerge.py "
    f"-o {OUTPUT_FILE} {ratesofchange_paths} "
    f"-single -update -t_srs epsg:6933 "
    f"-nln rates_of_change"
)

0
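
The `0` shown after each merge cell is the value returned by `os.system`, where zero conventionally indicates that the command succeeded. A non-zero status is easy to miss in a notebook, so one option is to run commands through `subprocess.run` with `check=True`, which raises an exception on failure. The sketch below demonstrates the pattern with a stand-in command run via the current Python interpreter instead of `ogrmerge.py`; `run_checked` is a hypothetical helper name.

```python
import subprocess
import sys


def run_checked(args):
    """Run a command and raise CalledProcessError on a non-zero exit status.

    Hypothetical helper for illustration only.
    """
    result = subprocess.run(args, capture_output=True, text=True, check=True)
    return result.returncode


# Stand-in for a successful merge command
ok_status = run_checked([sys.executable, "-c", "print('merged')"])

# Stand-in for a failing command: the error surfaces as an exception
try:
    run_checked([sys.executable, "-c", "import sys; sys.exit(1)"])
    failed = False
except subprocess.CalledProcessError as err:
    failed = True
    fail_status = err.returncode

print(ok_status, failed, fail_status)
```

Note that when arguments are passed as a list no shell is involved, so shell wildcards such as `*/annualshorelines*.shp` would need to be expanded explicitly (for example with `glob.glob`) unless the called tool expands them itself.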

## Continental hotspots
### Prepare data

In [7]:
# Load continental shoreline and rates of change data
ratesofchange_gdf = gpd.read_file(OUTPUT_FILE, layer="rates_of_change")
shorelines_gdf = gpd.read_file(OUTPUT_FILE, layer="shorelines_annual")

# Keep only valid geometries and set year index on shorelines
shorelines_gdf = shorelines_gdf.loc[shorelines_gdf.geometry.is_valid].set_index("year")

In [8]:
# Drop uncertain points from calculation
ratesofchange_gdf = ratesofchange_gdf.loc[
    ratesofchange_gdf.certainty == "good"
].reset_index(drop=True)

### Calculate hotspots


In [9]:
# Convert radius to list if not already
hotspots_radius = (
    [hotspots_radius] if not isinstance(hotspots_radius, list) else hotspots_radius
)

for i, radius in enumerate(hotspots_radius):

    # Extract hotspot points
    print(f"Calculating hotspots at {radius} m")
    hotspots_gdf = points_on_line(
        shorelines_gdf, index=str(baseline_year), distance=int(radius / 2)
    )

    # Create polygon windows by buffering points
    buffered_gdf = hotspots_gdf[["geometry"]].copy()
    buffered_gdf["geometry"] = buffered_gdf.buffer(radius)

    # Spatial join rate of change points to each polygon
    hotspot_grouped = (
        ratesofchange_gdf.loc[
            :, ratesofchange_gdf.columns.str.contains("dist_|geometry")
        ]
        .sjoin(buffered_gdf, predicate="within")
        .groupby("index_right")
    )

    # Aggregate/summarise values by taking median of all points
    # within each buffered polygon
    hotspot_values = hotspot_grouped.median().round(2)

    # Extract year from distance columns (remove "dist_")
    x_years = hotspot_values.columns.str.replace("dist_", "").astype(int)

    # Compute coastal change rates by linearly regressing annual
    # movements vs. time
    rate_out = hotspot_values.apply(
        lambda row: change_regress(
            y_vals=row.values.astype(float), x_vals=x_years, x_labels=x_years
        ),
        axis=1,
    )

    # Add rates of change back into dataframe
    hotspot_values[
        ["rate_time", "incpt_time", "sig_time", "se_time", "outl_time"]
    ] = rate_out

    # Join aggregated values back to hotspot points after
    # dropping unused columns (regression intercept)
    hotspots_gdf = hotspots_gdf.join(hotspot_values.drop("incpt_time", axis=1))

    # Add hotspots radius attribute column
    hotspots_gdf["radius_m"] = radius

    # Drop any points with insufficient observations.
    # We can obtain a sensible threshold by dividing the hotspots
    # radius by the 30 m along-shore rates of change point distance
    hotspots_gdf["n"] = hotspot_grouped.size()
    hotspots_gdf = hotspots_gdf.loc[hotspots_gdf.n > (radius / 30)]

    # Export hotspots to file, incrementing name for each layer
    try:

        # Set up schema to optimise file size
        schema_dict = {
            key: "float:8.2" for key in hotspots_gdf.columns if key != "geometry"
        }
        schema_dict.update(
            {
                "sig_time": "float:8.3",
                "outl_time": "str:80",
                "radius_m": "int:5",
                "n": "int:4",
            }
        )
        col_schema = schema_dict.items()

        # Export file
        layer_name = f"hotspots_zoom_{i + 1}"
        hotspots_gdf.to_file(
            OUTPUT_FILE,
            layer=layer_name,
            schema={"properties": col_schema, "geometry": "Point"},
        )

    except ValueError as e:

        print(f"Failed to generate hotspots with error: {e}")

Calculating hotspots at 2000 m
Calculating hotspots at 500 m
Calculating hotspots at 100 m
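
To make the aggregation step in the loop above more concrete, here is a simplified, one-dimensional sketch of the same idea: select the points that fall inside a window, take the median distance for each year, then regress those medians against time to get a rate of change. It is not the notebook's implementation. It replaces the spatial join on buffered polygons with a simple along-shore distance test, and `ols_slope` is a plain least-squares slope standing in for `change_regress`, whose internals are not shown here. All data values are made up.

```python
from statistics import median

# Hypothetical rates of change points: along-shore position "x" (m) and
# annual shoreline distances (m), mimicking the notebook's dist_* columns
points = [
    {"x": 0, "dist_2018": -2.0, "dist_2019": -1.0, "dist_2020": 0.0},
    {"x": 30, "dist_2018": -4.0, "dist_2019": -2.0, "dist_2020": 0.0},
    {"x": 60, "dist_2018": -6.0, "dist_2019": -3.0, "dist_2020": 0.0},
    {"x": 500, "dist_2018": 1.0, "dist_2019": 0.5, "dist_2020": 0.0},
]
years = [2018, 2019, 2020]


def ols_slope(x_vals, y_vals):
    """Ordinary least-squares slope of y against x."""
    n = len(x_vals)
    x_mean = sum(x_vals) / n
    y_mean = sum(y_vals) / n
    num = sum((x - x_mean) * (y - y_mean) for x, y in zip(x_vals, y_vals))
    den = sum((x - x_mean) ** 2 for x in x_vals)
    return num / den


def summarise_window(centre, radius):
    """Median distance per year and rate of change for one window."""
    # 1D stand-in for the buffer + spatial join in the notebook
    inside = [p for p in points if abs(p["x"] - centre) <= radius]
    medians = [median(p[f"dist_{y}"] for p in inside) for y in years]
    return {
        "n": len(inside),
        "medians": medians,
        "rate_time": ols_slope(years, medians),
    }


hotspot = summarise_window(centre=30, radius=100)
print(hotspot)
```

Using the median instead of the mean keeps a single anomalous point within a window from dominating the summarised value.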


In [10]:
# Insert styles table into GeoPackage
styles = gpd.read_file(STYLES_FILE)
styles.to_file(OUTPUT_FILE, layer="layer_styles")

***

## Additional information

**License:** The code in this notebook is licensed under the [Apache License, Version 2.0](https://www.apache.org/licenses/LICENSE-2.0). 
Digital Earth Australia data is licensed under the [Creative Commons by Attribution 4.0](https://creativecommons.org/licenses/by/4.0/) license.

**Contact:** For assistance with any of the Python code or Jupyter Notebooks in this repository, please post a [GitHub issue](https://github.com/GeoscienceAustralia/dea-coastlines/issues/new).

**Last modified:** July 2022