# What is the Primary Shoreline-Change Signal for a Transect?

This process aims to determine the primary shoreline-change signal for a transect by following these steps:

1. Primary Observation Heuristic: Decide whether to use the observations furthest offshore or those closest to the transect origin (typically set at 1000 meters). The decision is based on how often a transect has multiple shoreline intersections in a single year. For transects with many multiple intersections, the points furthest offshore are selected.

2. Outlier Detection and Removal: Recursively detect and remove outliers while identifying the primary observations using Median Absolute Deviation (MAD). For instance, following step 1, all shoreline observations furthest offshore are considered primary observations. If a primary observation is an outlier, it is removed. The process continues by identifying a new primary observation for the removed year and repeating the outlier detection.

3. Address Multiple Shorelines: Handle areas where the transect spans multiple shorelines. If certain observations are missing (e.g., due to cloud cover), other shorelines are included in the time series. This can result in time series with large step changes, leading to inaccurate trends. This step splits the observations of each transect into separate series, assigning observations to the same group only if they fall within a specified maximum step change. The maximum step change used is 150 m, which may still cause inaccuracies for narrower coastal landforms.

4. Group Selection Based on Relative Importance: Identify the most representative group from multiple shoreline series per transect by evaluating the relative importance of each group. Only groups with a significant number of observations are considered representative. For example, a transect may have several groups with varying importance.

5. Selection of Closest Shoreline Series: From the relatively important groups, select the shoreline series closest to the reference shoreline.

6. Computation of Statistics: Compute statistical measures on the primary observations.

Together, these steps identify the primary shoreline-change signal for each transect, so that the most relevant and representative observations are used for analysis. The process also provides transect-level statistics for the primary observations.
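Steps 1 and 2 can be illustrated with a minimal sketch. It is not the `coastmonitor` implementation: the column names `year` and `chainage`, the cutoff of 3.0, and the 1.4826 scaling constant are assumptions made for this example only. The sketch picks the furthest-offshore observation per year, flags outliers among those candidates with the Median Absolute Deviation, drops them, and repeats until no candidate is flagged.

```python
import pandas as pd


def mad_outliers(values: pd.Series, threshold: float = 3.0) -> pd.Series:
    """Flag values whose scaled distance from the median exceeds `threshold`."""
    median = values.median()
    mad = (values - median).abs().median()
    if mad == 0:
        # no spread at all: nothing can be flagged
        return pd.Series(False, index=values.index)
    # 1.4826 makes the MAD comparable to a standard deviation for normal data
    return (values - median).abs() / (1.4826 * mad) > threshold


def select_primary(obs: pd.DataFrame, threshold: float = 3.0) -> pd.DataFrame:
    """Pick the furthest-offshore observation per year, dropping outliers iteratively."""
    obs = obs.copy()
    while True:
        # primary candidate = largest chainage (furthest offshore) in each year
        primary = obs.loc[obs.groupby("year")["chainage"].idxmax()]
        flagged = mad_outliers(primary["chainage"], threshold)
        if not flagged.any():
            return primary.sort_values("year")
        # drop flagged rows so the next pass picks a new candidate for those years
        obs = obs.drop(index=primary.index[flagged.to_numpy()])


# hypothetical observations: 2003 has two intersections, one of them far offshore
obs = pd.DataFrame(
    {
        "year": [2000, 2001, 2002, 2003, 2003, 2004, 2005],
        "chainage": [100.0, 101.0, 99.0, 100.5, 400.0, 102.0, 98.0],
    }
)
primary = select_primary(obs)
```

In this example the 400 m observation is first chosen as the primary candidate for 2003, then flagged and removed, after which the 100.5 m observation takes its place.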

## Instructions

1. Use the DynamicMap: By default, the map is set to Namibia. Select your area of interest on the DynamicMap. After selecting the area, proceed to the next step to store the spatial extent in variables and retrieve the data from cloud storage. 

2. Create Visualization Panels: Run the subsequent cells to generate panels that visualize different filtering techniques (no filtering; steps 1-2; step 3; steps 1-6). By default, these panels open in a new browser tab for stability. If the plot does not display correctly, refresh the tab several times (up to 10 times may be needed).

3. Final Panel for Processing Steps: The final panel integrates all processing steps. Use the point draw tool in the upper toolbar to click on the map and display the time series of interest. The red series shows the observations that were used to derive the ambient change. 

By following these steps, you will effectively visualize and analyze the shoreline-change signals within your area of interest using the provided tools and panels.

In practice, once you are familiar with the different visualization tools, you will probably want to explore several areas using the final dashboard (step 3) and check whether the rates and series are representative.

In [1]:
import sys

import dask

dask.config.set({"dataframe.query-planning": False})

import logging
import os
import pathlib

import coastpy
import colorcet as cc
import dask_geopandas
import duckdb
import geopandas as gpd
import geoviews as gv
import holoviews as hv
import hvplot.pandas
import numpy as np
import pandas as pd
import panel as pn
import pystac
import shapely
from dotenv import load_dotenv

from coastmonitor.shorelines.intersection import (
    add_transect_statistics,
    find_primary_signal_per_transect_group,
)

load_dotenv(override=True)

# NOTE: access tokens to the data are available upon request.
sas_token = os.getenv("AZURE_STORAGE_SAS_TOKEN")
account_name = os.getenv("AZURE_STORAGE_ACCOUNT_NAME")
storage_options = {"account_name": account_name, "credential": sas_token}

# This is the URL of the STAC catalog that we use to efficiently index the data
COCLICO_STAC_URL = "https://coclico.blob.core.windows.net/stac/v1/catalog.json"

# Global Coastal Transect System (publicly available and in review)
GCTS_COLLECTION_NAME = "gcts"

# Global Coastal Transect Repository (unreleased; access keys provided upon request). This dataset consists
# of GCTS + several other characteristics, such as intersection distance to nearest coastline.
GCTR_COLLECTION_NAME = "gctr"

# ShorelineMonitor Raw Series (unreleased; access keys provided upon request). This dataset consists of
# ShorelineMonitor shorelines mapped onto the Global Coastal Transect System (Raw Series), together with
# a wide range of additional statistics used to filter out the primary, high-quality observations.
SM_COLLECTION_NAME = "shorelinemonitor-raw-series"

# These are the transect columns required for the analysis
TRANSECT_COLUMNS = [
    "tr_name",
    "lon",
    "lat",
    "bearing",
    "geometry",
    "coastline_is_closed",
    "coastline_length",
    "utm_crs",
    "bbox",
    "quadkey",
    "country",
    "common_country_name",
    "dist_b0",
    "dist_b30",
    "dist_b330",
]

hv.extension("bokeh")
pn.extension()

## Read the STAC collections

In [2]:
coclico_catalog = pystac.Catalog.from_file(COCLICO_STAC_URL)
sm_collection = coclico_catalog.get_child(SM_COLLECTION_NAME)
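# NOTE: despite its name, this variable holds the GCTR collection (GCTS plus additional characteristics)
gcts_collection = coclico_catalog.get_child(GCTR_COLLECTION_NAME)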

## Show the spatial extents of both collections

In [3]:
sm_extents = coastpy.io.utils.read_items_extent(sm_collection)
gcts_extents = coastpy.io.utils.read_items_extent(gcts_collection)
sm_extents[["geometry"]].explore()

## Create an interactive map that we use to define our region of interest

In [7]:
from ipyleaflet import Map, basemaps

m = Map(basemap=basemaps.Esri.WorldImagery, scroll_wheel_zoom=True)
m.center = -22.946301, 14.410124
m.zoom = 12
m.layout.height = "800px"
m


In [8]:
# NOTE: these coordinates are extracted from the interactive map above
minx, miny, maxx, maxy = m.west, m.south, m.east, m.north
roi = gpd.GeoDataFrame(geometry=[shapely.box(minx, miny, maxx, maxy)], crs=4326)

## Create a DuckDB query engine to retrieve data from cloud storage

In [9]:
from coastmonitor.shorelines.intersection import (
    clean_raw_series,
    compute_diffs,
    compute_ols_trend,
)

sds_ts_engine = coastpy.io.STACQueryEngine(
    stac_collection=sm_collection,
    storage_backend="azure",
)
sds_ts = sds_ts_engine.get_data_within_bbox(minx, miny, maxx, maxy)
transects_engine = coastpy.io.STACQueryEngine(
    stac_collection=gcts_collection, storage_backend="azure", columns=TRANSECT_COLUMNS
)
transects = transects_engine.get_data_within_bbox(minx, miny, maxx, maxy)
sds_ts_clean = clean_raw_series(
    sds_ts,
    transects,
    method="offshore",
    multi_obs_threshold=17.5,
    max_step_change=150,
    relative_importance_threshold=0.6,
)
# keep only the primary observations to get cleaner data
sds_ts_clean[sds_ts_clean["obs_is_primary"]].head()


Unnamed: 0,time,tr_name,obs_group,source_file,utm_crs,geometry,lon,lat,shoreline_chainage,shoreline_position,...,min_distance,max_offshore_distance,series_group,obs_count,obs_baseline,obs_primary_count,tr_stdev,tr_range,qa_pct,obs_primary_mdn
0,1984-01-01,cl32202s00tr00110533,0,BOX_077_063,32733,POINT (14.54058 -22.82041),14.54058,-22.820408,1122.946655,38.198975,...,,,0,35,1084.747681,35.0,18.621044,77.489502,1.0,21.768311
1,1985-01-01,cl32202s00tr00110533,0,BOX_077_063,32733,POINT (14.54122 -22.82045),14.541222,-22.820446,1056.937012,-27.810669,...,,,0,35,1084.747681,35.0,18.621044,77.489502,1.0,21.768311
2,1986-01-01,cl32202s00tr00110533,0,BOX_077_063,32733,POINT (14.54122 -22.82045),14.541217,-22.820446,1057.440063,-27.307617,...,,,0,35,1084.747681,35.0,18.621044,77.489502,1.0,21.768311
3,1987-01-01,cl32202s00tr00110533,0,BOX_077_063,32733,POINT (14.54095 -22.82043),14.540953,-22.820431,1084.598999,-0.148682,...,,,0,35,1084.747681,35.0,18.621044,77.489502,1.0,21.768311
4,1989-01-01,cl32202s00tr00110533,0,BOX_077_063,32733,POINT (14.54095 -22.82043),14.540946,-22.820429,1085.326782,0.579102,...,,,0,35,1084.747681,35.0,18.621044,77.489502,1.0,21.768311
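
The `max_step_change=150` and `relative_importance_threshold=0.6` arguments passed to `clean_raw_series` above correspond to steps 3 to 5 of the process. The sketch below is one possible reading of those steps, not the library's code. In particular, measuring relative importance as a group's observation count divided by the count of the largest group, and comparing group medians to a reference position, are assumptions made for illustration. All function names are hypothetical.

```python
import pandas as pd


def assign_step_groups(positions: pd.Series, max_step_change: float = 150.0) -> pd.Series:
    """Label observations so sorted neighbours in a group differ by at most the limit."""
    ordered = positions.sort_values()
    # a new group starts wherever the gap to the previous sorted value exceeds the limit
    new_group = ordered.diff() > max_step_change
    return new_group.cumsum().reindex(positions.index)


def important_groups(groups: pd.Series, relative_importance_threshold: float = 0.6) -> list:
    """Groups whose size is at least the given fraction of the largest group (assumed definition)."""
    counts = groups.value_counts()
    relative = counts / counts.max()
    return sorted(relative[relative >= relative_importance_threshold].index)


def closest_group(positions: pd.Series, groups: pd.Series, candidates: list, reference: float):
    """Among the candidate groups, pick the one whose median lies closest to the reference."""
    medians = positions.groupby(groups).median().loc[candidates]
    return (medians - reference).abs().idxmin()


# hypothetical shoreline positions (m) along one transect
positions = pd.Series([1000.0, 1010.0, 990.0, 1400.0, 1420.0, 1005.0, 2000.0])
groups = assign_step_groups(positions)
strict = important_groups(groups, relative_importance_threshold=0.6)
relaxed = important_groups(groups, relative_importance_threshold=0.5)
chosen = closest_group(positions, groups, relaxed, reference=1390.0)
```

With these values the observations split into three groups of sizes 4, 2 and 1. Only the largest group passes the 0.6 threshold, while a threshold of 0.5 also admits the second group, which is then chosen because its median is closest to the reference position.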


### Visualize in a small app

In [None]:
from coastmonitor.visualization.apptools import SpatialDataFrameApp

sds_ts_clean_ac = compute_ols_trend(
    sds_ts_clean[sds_ts_clean["obs_is_primary"]],
    transects,
    x="time",
    y="shoreline_position",
)

app = SpatialDataFrameApp(sds_ts_clean_ac, transects, sds_ts_clean)
app.create_view()
app.view.show()
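
The internals of `compute_ols_trend` are not shown in this notebook. As a rough illustration of what an ordinary least-squares change rate involves, the sketch below fits a straight line through `shoreline_position` versus `time` with `numpy.polyfit`. The helper name `ols_change_rate`, the conversion of timestamps to fractional years, and the centering of the time axis are assumptions of this example and may differ from what the library does.

```python
import numpy as np
import pandas as pd


def ols_change_rate(series: pd.DataFrame) -> float:
    """Slope (position units per year) of a least-squares line through the series."""
    t = pd.to_datetime(series["time"])
    # convert timestamps to fractional years
    years = t.dt.year + (t.dt.dayofyear - 1) / 365.25
    # center the time axis to keep the fit well conditioned
    years = years - years.mean()
    slope, _intercept = np.polyfit(years, series["shoreline_position"], deg=1)
    return float(slope)


# hypothetical series that advances 2 m per year
example = pd.DataFrame(
    {
        "time": pd.date_range("1984-01-01", periods=10, freq="YS"),
        "shoreline_position": 5.0 + 2.0 * np.arange(10),
    }
)
rate = ols_change_rate(example)
```

To apply the same idea per transect, one option is a dictionary comprehension over groups, for example `{name: ols_change_rate(g) for name, g in df.groupby("tr_name")}`, where `df` holds only the primary observations.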