# coastline-predictor

This Python library maps the coastline of a user-defined area and predicts its future evolution from satellite images retrieved through Google Earth Engine (GEE).

This project builds heavily on the work of **Kilian Vos**, who developed *CoastSat*, a Google Earth Engine-enabled Python toolkit to extract shorelines from publicly available satellite imagery. Have a look at the library's [GitHub page](https://github.com/kvos/CoastSat) and its README, which is very detailed.

If you want to be notified when a new release of this package is made, you can tick the Releases box in the “Watch / Unwatch => Custom” menu at the top right of this page.

### Description

Satellite remote sensing can provide low-cost long-term shoreline data capable of resolving the temporal scales of interest to coastal scientists and engineers at sites where no in-situ field measurements are available. CoastSat enables the non-expert user to extract shorelines from Landsat 5, Landsat 7, Landsat 8 and Sentinel-2 images. The shoreline detection algorithm implemented in CoastSat is optimised for sandy beach coastlines. It combines a sub-pixel border segmentation and an image classification component, which refines the segmentation into four distinct categories such that the shoreline detection is specific to the sand/water interface. 

**coastline-predictor** uses this tool to extract shorelines from GEE images and to derive time series of shoreline position from them. These time series are at the heart of the model, since the prediction is made on them. Their evolution is predicted with **Holt's linear trend** model, a popular exponential smoothing method for forecasting data with a trend.

### Table of Contents

* [1. Installation](#chapter1)
    * [1.0 Clone the repository from GitHub](#section_1_0)
    * [1.1 Create an environment with Anaconda](#section_1_1)
    * [1.2 Activate Google Earth Engine Python API](#section_1_2)
* [2. Usage](#chapter2)
    * [2.1 Retrieval of the satellite images](#section_2_1)
    * [2.2 Shoreline detection](#section_2_2)
    * [2.3 Shoreline change analysis](#section_2_3)
    * [2.4 Tidal correction](#section_2_4)
    * [2.5 Time series forecasting](#section_2_5)
    * [2.6 New shoreline reconstruction](#section_2_6)
    * [2.7 Computation of the prediction's generalization error](#section_2_7)
    * [2.8 Threatened population estimation](#section_2_8)

## 1. Installation <a class="anchor" id="chapter1"></a>

### 1.0 Clone the repository from GitHub <a class="anchor" id="section_1_0"></a>

Start by installing **GitHub desktop** on your machine.

Then open GitHub desktop and follow this [tutorial](https://docs.github.com/en/desktop/contributing-and-collaborating-using-github-desktop/adding-and-cloning-repositories/cloning-and-forking-repositories-from-github-desktop) to clone the repository on your machine.

### 1.1 Create an environment with Anaconda <a class="anchor" id="section_1_1"></a>

You first need to install the required Python libraries in an environment to run this package. To do this, we will use Anaconda, which can be downloaded freely [here](https://www.anaconda.com/products/individual).

Once you have it installed, open the **Anaconda PowerShell** prompt (a terminal window in MacOS and Linux) and use `cd` (change directory) and `ls` (list files and directories) commands to go to the folder where you have downloaded the **coastline-predictor** repository.

Create a new environment named `coastpred` that will contain all the required packages:

`conda env create -f environment.yml -n coastpred`

Now, activate the environment:

`conda activate coastpred`

To confirm that you have successfully activated coastpred, your command line prompt should begin with *coastpred*.
 
### 1.2 Activate Google Earth Engine Python API <a class="anchor" id="section_1_2"></a>

You first need to sign up to Google Earth Engine at https://signup.earthengine.google.com/. 

Once your request has been approved, with the `coastpred` environment activated, run the following command on the Anaconda Prompt to link your environment to the GEE server.

`earthengine authenticate`

A web browser will open. Log in with a Gmail account and accept the terms and conditions, then copy the authorization code into the Anaconda terminal.

Now you are ready to start using the `coastline-predictor` toolbox!

*Note 1: remember to always activate the environment with `conda activate coastpred` each time you are preparing to use the package.*

*Note 2: a script is being developed to automatically run these commands and make the installation easier.*

## 2. Usage <a class="anchor" id="chapter2"></a> 

An example of how to run the software in a Jupyter Notebook is provided in the repository (`example_notebook.ipynb`). To run this, first activate your `coastpred` environment with `conda activate coastpred` (if not already active), and then type:

`jupyter notebook`

You can also double click on the batch script `run_jupyter` which executes automatically the above commands if you prefer not to use the terminal.

A web browser window will open. Point to the directory where you downloaded this repository and click on `example_notebook.ipynb`.

The following sections guide the reader through the different functionalities of `coastline-predictor` with an example at West-Point, Liberia. If you prefer to use Spyder, PyCharm or another integrated development environment (IDE), a Python script named `example.py` is also included in the repository.

A Jupyter Notebook combines formatted text and code. To run the code, place your cursor inside one of the code sections and click on the `run cell` button (or press `Shift` + `Enter`) and progress forward.

### 2.1 Retrieval of the satellite images <a class="anchor" id="section_2_1"></a>

To retrieve from the GEE server the available satellite images cropped around the user-defined region of coastline for the particular time period of interest, the following variables are required:

* `polygon`: the coordinates of the region of interest (longitude/latitude pairs in WGS84)
* `dates`: dates over which the images will be retrieved (e.g., dates = ['2017-12-01', '2018-01-01'])
* `sat_list`: satellite missions to consider (e.g., sat_list = ['L5', 'L7', 'L8', 'S2'] for Landsat 5, 7, 8 and Sentinel-2 collections)
* `sitename`: name of the site (this is the name of the subfolder where the images and other accompanying files will be stored)
* `filepath`: filepath to the directory where the data will be stored

The call `metadata = SDS_download.retrieve_images(inputs)` will launch the retrieval of the images and store them as .TIF files (under /filepath/sitename). The metadata contains the exact time of acquisition (in UTC time) of each image, its projection and its geometric accuracy. If the images have already been downloaded previously and the user only wants to run the shoreline detection, the metadata can be loaded directly by running `metadata = SDS_download.get_metadata(inputs)`.

The cell below shows an example of inputs that will retrieve all the images of West-Point, Liberia acquired by Sentinel-2 in December 2015.

```python
import os

# region of interest (longitude, latitude)
polygon = [[[-10.812718, 6.334993],
            [-10.802246, 6.334965],
            [-10.802246, 6.321754],
            [-10.812718, 6.321754],
            [-10.812718, 6.334993]]]
# it's recommended to convert the polygon to the smallest rectangle (sides parallel to coordinate axes)
polygon = SDS_tools.smallest_rectangle(polygon)
# date range
dates = ['2015-12-01', '2015-12-31']
# satellite missions
sat_list = ['S2']
# name of the site
sitename = 'WestPoint'
# directory where the data will be stored
filepath = os.path.join(os.getcwd(), 'data')
# put all the inputs into a dictionary
inputs = {'polygon': polygon, 'dates': dates, 'sat_list': sat_list,
          'sitename': sitename, 'filepath': filepath}
```

**Note:** The area of the polygon should not exceed 100 km², so for very long beaches split the region into multiple smaller polygons.
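A quick way to check the size of a polygon before launching a download is sketched below. It is not part of the package: `polygon_area_km2` is a hypothetical helper that approximates the area with a local flat projection and a spherical Earth, which should be accurate enough for regions of a few kilometres.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius (spherical approximation)

def polygon_area_km2(ring):
    """Approximate area (km^2) of a small ring of [lon, lat] vertices."""
    if ring[0] != ring[-1]:
        ring = ring + [ring[0]]  # close the ring if needed
    # mean latitude, used to scale longitudes in the local flat projection
    lat0 = math.radians(sum(lat for _, lat in ring) / len(ring))
    pts = [(math.radians(lon) * EARTH_RADIUS_KM * math.cos(lat0),
            math.radians(lat) * EARTH_RADIUS_KM) for lon, lat in ring]
    # shoelace formula over consecutive vertex pairs
    twice_area = sum(x1 * y2 - x2 * y1
                     for (x1, y1), (x2, y2) in zip(pts, pts[1:]))
    return abs(twice_area) / 2.0

west_point = [[-10.812718, 6.334993],
              [-10.802246, 6.334965],
              [-10.802246, 6.321754],
              [-10.812718, 6.321754],
              [-10.812718, 6.334993]]
area = polygon_area_km2(west_point)
print(f"area = {area:.2f} km2, within limit: {area <= 100}")
```

The West-Point example polygon covers less than 2 km², far below the 100 km² limit.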

### 2.2 Shoreline detection <a class="anchor" id="section_2_2"></a>

To map the shorelines, the following user-defined settings are needed:

* `cloud_thresh`: threshold on maximum cloud cover that is acceptable on the images (value between 0 and 1 - this may require some initial experimentation).
* `output_epsg`: epsg code defining the spatial reference system of the shoreline coordinates. It has to be a cartesian coordinate system (i.e. projected) and not a geographical coordinate system (in latitude and longitude angles). See http://spatialreference.org/ to find the EPSG number corresponding to your local coordinate system. If unsure, use 3857 which is the web-mercator.
* `check_detection`: if set to True the user can quality control each shoreline detection interactively (recommended when mapping shorelines for the first time) and accept/reject each shoreline.
* `adjust_detection`: if users want more control over the detected shorelines, they can set this parameter to True to manually adjust the threshold used to map the shoreline on each image.
* `save_figure`: if set to True a figure of each mapped shoreline is saved under /filepath/sitename/jpg_files/detection, even if the two previous parameters are set to False. Note that this may slow down the process.
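To make the difference between geographic and projected coordinates concrete, the sketch below converts a longitude/latitude pair to Web Mercator (EPSG:3857) metres with the standard spherical formula. The helper `lonlat_to_web_mercator` is only illustrative and is not a function of the package; in practice the reprojection is handled for you once `output_epsg` is set.

```python
import math

WGS84_RADIUS_M = 6378137.0  # sphere radius used by EPSG:3857

def lonlat_to_web_mercator(lon_deg, lat_deg):
    """Convert WGS84 longitude/latitude (degrees) to EPSG:3857 x/y (metres)."""
    x = WGS84_RADIUS_M * math.radians(lon_deg)
    y = WGS84_RADIUS_M * math.log(math.tan(math.pi / 4 + math.radians(lat_deg) / 2))
    return x, y

# one corner of the West-Point polygon
x, y = lonlat_to_web_mercator(-10.812718, 6.334993)
print(f"x = {x:.1f} m, y = {y:.1f} m")
```

Keep in mind that Web Mercator stretches distances by a factor of 1/cos(latitude), so a local projected system (for example the relevant UTM zone) gives more faithful distances at high latitudes.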

There are additional parameters (`min_beach_area`, `buffer_size`, `min_length_sl`, `cloud_mask_issue` and `sand_color`) that can be tuned to optimise the shoreline detection (for advanced users only). For the moment, leave these parameters set to their default values; we will see later how they can be modified.

An example of settings is provided here:

```python
settings = {
    # general parameters:
    'cloud_thresh': 0.5,        # threshold on maximum cloud cover
    'output_epsg': 3857,        # epsg code of spatial reference system desired for the output
    # quality control:
    'check_detection': True,    # if True, shows each shoreline detection to the user for validation
    'adjust_detection': False,  # if True, allows user to adjust the position of each shoreline by changing the threshold
    'save_figure': True,        # if True, saves a figure showing the mapped shoreline for each image
    # [ONLY FOR ADVANCED USERS] shoreline detection parameters:
    'min_beach_area': 4500,     # minimum area (in metres^2) for an object to be labelled as a beach
    'buffer_size': 150,         # radius (in metres) for buffer around sandy pixels considered in the shoreline detection
    'min_length_sl': 1200,      # minimum length (in metres) of shoreline perimeter to be valid
    'cloud_mask_issue': False,  # switch this parameter to True if sand pixels are masked (in black) on many images
    'sand_color': 'default',    # 'default', 'dark' (for grey/black sand beaches) or 'bright' (for white sand beaches)
    # add the inputs defined previously
    'inputs': inputs
}
```

Once all the settings have been defined, the batch shoreline detection can be launched by calling:

`output = extract_shorelines.extract_shorelines(metadata, settings, inputs)`

When `check_detection` is set to `True`, a figure like the one below appears and asks the user to manually accept/reject each detection by pressing **on the keyboard** the right arrow (⇨) to keep the shoreline or left arrow (⇦) to skip the mapped shoreline. The user can break the loop at any time by pressing escape (nothing will be saved though).

![Alt Text](https://user-images.githubusercontent.com/7217258/60766769-fafda480-a0f1-11e9-8f91-419d848ff98d.gif)

When `adjust_detection` is set to `True`, a figure like the one below appears and the user can adjust the position of the shoreline by clicking on the histogram of MNDWI pixel intensities. Once the threshold has been adjusted, press Enter and then accept/reject the image with the keyboard arrows.

![Alt Text](https://github.com/kvos/CoastSat/raw/master/doc/adjust_shorelines.gif)

Once all the shorelines have been mapped, the output is available in two different formats (saved under */filepath/sitename*):

* `sitename_output.pkl`: contains a list with the shoreline coordinates, the exact timestamp at which the image was captured (UTC time), the geometric accuracy and the cloud cover of each individual image. This list can be manipulated with Python; a snippet of code to plot the results is provided in the example script.
* `sitename_output.geojson`: this output can be visualised in a GIS software (e.g., QGIS, ArcGIS).

**Reference shoreline**

Before running the batch shoreline detection, there is the option to manually digitize a reference shoreline on one cloud-free image. This reference shoreline helps to reject outliers and false detections when mapping shorelines as it only considers as valid shorelines the points that are within a defined distance from this reference shoreline.

The user can manually digitize one or several reference shorelines on one of the images by calling:

```python
settings['reference_shoreline'] = SDS_preprocess.get_reference_sl_manual(metadata, settings)
settings['max_dist_ref'] = 100  # max distance (in metres) allowed from the reference shoreline
```

This function allows the user to click points along the shoreline on cloud-free satellite images.

![Alt Text](https://user-images.githubusercontent.com/7217258/70408922-063c6e00-1a9e-11ea-8775-fc62e9855774.gif)

The maximum distance (in metres) allowed from the reference shoreline is defined by the parameter `max_dist_ref`. This parameter is set to a default value of 100 m. If you think that 100 m buffer from the reference shoreline will not capture the shoreline variability at your site, increase the value of this parameter. This may be the case for large nourishments or eroding/accreting coastlines.
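The effect of `max_dist_ref` can be pictured with the minimal sketch below. It is an illustration only, not the package's implementation: `filter_by_reference` is a hypothetical helper that measures the distance to the nearest digitized reference point, whereas the real code may buffer the whole reference line.

```python
import math

def filter_by_reference(shoreline_pts, reference_pts, max_dist_ref=100.0):
    """Keep the shoreline points lying within max_dist_ref (metres) of a reference point."""
    kept = []
    for x, y in shoreline_pts:
        # distance to the closest reference point
        nearest = min(math.hypot(x - rx, y - ry) for rx, ry in reference_pts)
        if nearest <= max_dist_ref:
            kept.append((x, y))
    return kept

# illustrative coordinates in a projected system (metres)
reference = [(0, 0), (100, 0), (200, 0)]
detected = [(50, 20), (100, 150), (210, -90)]
print(filter_by_reference(detected, reference))
```

Here the second detected point lies 150 m from the reference shoreline and is discarded; raising `max_dist_ref` to 150 would keep it.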

**Advanced shoreline detection parameters**

As mentioned above, there are some additional parameters that can be modified to optimise the shoreline detection:

* `min_beach_area`: minimum allowable object area (in metres^2) for the class 'sand'. During the image classification, some features (for example, building roofs) may be incorrectly labelled as sand. To correct this, all the objects classified as sand containing less than a certain number of connected pixels are removed from the sand class. The default value is 4500 m^2, which corresponds to 20 connected pixels of 15 m x 15 m (225 m^2 each). If you are looking at a very small beach (<20 connected pixels on the images), try decreasing the value of this parameter.
* `buffer_size`: radius (in metres) that defines the buffer around sandy pixels that is considered to calculate the sand/water threshold. The default value of buffer_size is 150 m. This parameter should be increased if you have a very wide (>150 m) surf zone or inter-tidal zone.
* `min_length_sl`: minimum length (in metres) of shoreline perimeter to be valid. This can be used to discard small features that are detected but do not correspond to the actual shoreline (clouds for example). The default value is 1200 m. If the shoreline that you are trying to map is shorter than 1200 m, decrease the value of this parameter.
* `cloud_mask_issue`: the cloud mask algorithm applied to Landsat images by USGS, namely CFMASK, sometimes has difficulties with very bright features such as beaches or white-water in the ocean. This may result in pixels corresponding to a beach being identified as clouds and appearing as masked pixels on your images. If this issue seems to be present in a large proportion of images from your local beach, you can switch this parameter to True and CoastSat will remove from the cloud mask the pixels that form very thin linear features, as these are often beaches and not clouds. Only activate this parameter if you observe this very specific cloud mask issue, otherwise leave it at the default value of False.
* `sand_color`: this parameter can take 3 values: `default`, `dark` or `bright`. Only change this parameter if you are seeing that, with the default, the sand pixels are not being classified as sand (in orange). If your beach has dark sand (grey/black sand beaches), you can set this parameter to `dark` and the classifier will be able to pick up the dark sand. On the other hand, if your beach has white sand and the default classifier is not picking it up, switch this parameter to `bright`. At this stage this option is only available for Landsat images (soon for Sentinel-2 as well).
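As a rough aid for choosing `min_beach_area`, the sketch below converts an area into a number of connected pixels. The helper `min_connected_pixels` is hypothetical, and the pixel sizes are assumptions: 15 m for the Landsat images, as implied by the default above, and 10 m for Sentinel-2.

```python
import math

def min_connected_pixels(min_beach_area, pixel_size):
    """Number of connected pixels needed to reach min_beach_area (m^2) for square pixels of pixel_size (m)."""
    return math.ceil(min_beach_area / pixel_size ** 2)

print(min_connected_pixels(4500, 15))  # Landsat, assuming 15 m pixels
print(min_connected_pixels(4500, 10))  # Sentinel-2, assuming 10 m pixels
```

With the default of 4500 m^2 this gives 20 pixels at 15 m and 45 pixels at 10 m, so the same area threshold is stricter in pixel terms on finer imagery.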

**Batch shoreline detection**

The function `extract_shorelines` will detect the 2D shorelines from each image and save them into several files (`.pkl`, `.geojson`, `.shp`). The coordinates are stored in the `output` dictionary together with the exact dates in UTC time, the georeferencing accuracy and the cloud cover. This function also removes duplicates and images with inaccurate georeferencing (threshold at 10 m) and makes a simple plot of the mapped shorelines.

Depending on the settings defined previously, the user can be asked to validate the results, adjust the detection or both.

### 2.3 Shoreline change analysis <a class="anchor" id="section_2_3"></a>

This section shows how to obtain time-series of shoreline change along shore-normal transects. Each transect is defined by two points, its origin and a second point that defines its length and orientation. The origin is always defined first and located landwards, the second point is located seawards. There are 2 options to define the coordinates of the transects:

1. Interactively draw shore-normal transects along the mapped shorelines:

   `transects = SDS_transects.draw_transects(output, settings)`

2. Load the transect coordinates from a .geojson file:

   `transects = SDS_tools.transects_from_geojson(path_to_geojson_file)`

Once the shore-normal transects have been defined, the intersection between the 2D shorelines and the transects is computed with the following function:

```python
settings['along_dist'] = 25
cross_distance = analyze_shoreline.analyze_shoreline(output, transects, settings, plot=True)
```

The parameter `along_dist` defines the along-shore distance around the transect over which shoreline points are selected to compute the intersection. The default value is 25 m, which means that the intersection is computed as the median of the points located within 25 m of the transect (50 m alongshore-median). This helps to smooth out localised water levels in the swash zone.
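The idea behind `along_dist` can be sketched as follows. This is a simplified illustration rather than the package's code: `transect_intersection` is a hypothetical helper that projects the shoreline points onto the transect, keeps those lying within `along_dist` of it, and returns the median of their distances from the transect origin.

```python
import math
from statistics import median

def transect_intersection(shoreline_pts, origin, end, along_dist=25.0):
    """Median cross-shore distance (from origin) of the shoreline points near a transect."""
    ox, oy = origin
    dx, dy = end[0] - ox, end[1] - oy
    length = math.hypot(dx, dy)
    ux, uy = dx / length, dy / length  # unit vector pointing seawards along the transect
    selected = []
    for x, y in shoreline_pts:
        px, py = x - ox, y - oy
        cross_shore = px * ux + py * uy        # distance from the origin along the transect
        alongshore = abs(px * uy - py * ux)    # perpendicular distance to the transect line
        if alongshore <= along_dist and 0 <= cross_shore <= length:
            selected.append(cross_shore)
    return median(selected) if selected else float('nan')

# illustrative transect pointing north, 100 m long
shoreline = [(-30, 50), (-20, 48), (0, 52), (10, 55), (40, 70)]
print(transect_intersection(shoreline, origin=(0, 0), end=(0, 100)))
```

Only the three points within 25 m of the transect contribute, so isolated points further alongshore do not affect the result.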

An example is shown in the animation below:

![Alt Text](https://user-images.githubusercontent.com/7217258/49990925-8b985a00-ffd3-11e8-8c54-57e4bf8082dd.gif)

### 2.4 Tidal correction <a class="anchor" id="section_2_4"></a>

Each satellite image is captured at a different stage of the tide, therefore a tidal correction is necessary to remove the apparent shoreline changes caused by tidal fluctuations.

`cross_distance = correct_tides.correct_tides(cross_distance,settings,output,reference_elevation,beach_slope)`

In order to tidally-correct the time-series of shoreline change you will need the following data:

* Time-series of water/tide level: this can be formatted as a .csv file, an example is provided [here](https://github.com/Space4Dev/coastline-predictor/blob/main/data/WestPoint/WestPoint_tides.csv). Make sure that the dates are in UTC time as the shorelines are always in UTC time. Also the vertical datum needs to be approx. Mean Sea Level (MSL). If those tide values are in Mean Lower Low Water (MLLW), you will need to get the constant value of this datum at your station. 

`reference_elevation = 0` if tides are in MSL or `reference_elevation = MLLW value` otherwise.


* An estimate of the beach-face slope along each transect. If you don't have this data you can estimate it either using CoastSat.slope, see Vos et al. 2020 for more details (preprint available [here](https://www.essoar.org/doi/10.1002/essoar.10502903.1)), or using the global dataset of nearshore slope estimates with a resolution of 1 km made by Athanasiou et al. 2019, see [their article](https://essd.copernicus.org/articles/11/1515/2019/) for more details.

If you already have a beach slope estimate:

`beach_slope = 0.1`

`cross_distance = correct_tides.correct_tides(cross_distance,settings,output,reference_elevation,beach_slope)`

If you want to estimate the beach slope using  CoastSat.slope:

`cross_distance = correct_tides.correct_tides(cross_distance,settings,output,reference_elevation,estimate_slope=True)`

If you want to use the estimation made by Athanasiou et al. 2019:

`cross_distance = correct_tides.correct_tides(cross_distance,settings,output,reference_elevation)`

**Note**: if you don't have measured water levels, it is possible to obtain an estimate of the  time-series of modelled tide levels at the time of image acquisition from the [FES2014](https://www.aviso.altimetry.fr/es/data/products/auxiliary-products/global-tide-fes/description-fes2014.html) global tide model. Instructions on how to install the global tide model are available [here](https://github.com/kvos/CoastSat.slope/blob/master/doc/FES2014_installation.md).

The function `correct_tides` returns the tidally-corrected time-series of shoreline change and we call `reconstruct_shoreline` to recover the corrected shorelines and save them as shapefiles.
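For intuition, the usual tidal correction, as used in CoastSat's own example, is a horizontal shift of each shoreline position by the tide level divided by the beach slope. The sketch below applies that formula to a single transect; whether `correct_tides` does exactly this internally is an assumption, and `tidally_correct` is a hypothetical helper.

```python
def tidally_correct(cross_distance, tide_levels, reference_elevation, beach_slope):
    """Shift each cross-shore distance (m) to the position it would have at reference_elevation."""
    return [dist + (tide - reference_elevation) / beach_slope
            for dist, tide in zip(cross_distance, tide_levels)]

# illustrative values: distances in metres, tide levels in metres above MSL
distances = [100.0, 95.0]
tides = [0.5, -0.2]
corrected = tidally_correct(distances, tides, reference_elevation=0, beach_slope=0.1)
print(corrected)
```

Since the transect origin is landwards, an image taken at high tide shows the waterline closer to the origin, and the correction moves it seawards; the opposite happens at low tide.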

Many websites provide tide predictions, but most of them only cover the last few months. To correct the shorelines properly, we need tide tables covering not only the last months but also the last years. We only found one website that can provide tides for long periods rather quickly. The steps to take are detailed below.

**Extract the tides from the National Oceanic and Atmospheric Administration (NOAA)**

On the [NOAA website](https://tidesandcurrents.noaa.gov/tide_predictions.html), you will find tide tables for the USA, the Caribbean islands and some, but not all, Pacific islands.
You can get the tides annually from 2019 to 2021 or monthly for the years before 2019. If you have a lot of years to retrieve from the website, the fastest way to do it is to modify the download link below:
```
https://tidesandcurrents.noaa.gov/cgi-bin/predictiondownload.cgi?&stnid=STATION_ID&threshold=&thresholdDirection=&bdate=START_DATE&edate=END_DATE&units=metric&timezone=GMT&datum=MLLW&interval=hilo&clock=24hour&type=txt&annual=true
```
In this link, change the following:
* STATION_ID: you can find it in the name of the station you want to extract tides from (for example, stnid=TEC4723 for Santo Domingo or stnid=TPT2707 for Taongi Atoll).
* START_DATE and END_DATE: you can only extract the years one by one, so you must replace START_DATE and END_DATE with January 1st and December 31st of the same year, in the format yyyymmdd. For example, bdate=20010101 and edate=20011231 if you want to extract all the tides for the year 2001.
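If many years are needed, the links can be generated instead of edited by hand. The sketch below only builds the URLs from the template above (it does not download anything); `yearly_urls` is a hypothetical helper, not part of the package.

```python
NOAA_URL = ("https://tidesandcurrents.noaa.gov/cgi-bin/predictiondownload.cgi?"
            "&stnid={station_id}&threshold=&thresholdDirection="
            "&bdate={year}0101&edate={year}1231&units=metric&timezone=GMT"
            "&datum=MLLW&interval=hilo&clock=24hour&type=txt&annual=true")

def yearly_urls(station_id, first_year, last_year):
    """Return one download link per year, from first_year to last_year inclusive."""
    return [NOAA_URL.format(station_id=station_id, year=year)
            for year in range(first_year, last_year + 1)]

# station ID taken from the Santo Domingo example above
for url in yearly_urls("TEC4723", 2001, 2003):
    print(url)
```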

Repeat this operation for every year you want to cover and paste the tides in a blank Excel file by using *Paste Special > Text*. Use the following formula in J1 and then [fill it down](https://support.microsoft.com/en-us/office/copy-a-formula-by-dragging-the-fill-handle-in-excel-for-mac-dd928259-622b-473f-9a33-83aa1a63e218) to keep only the useful information from the tide table:
```
=TEXT(A1;"yyyy-mm-dd") & " " & TEXT(C1;"hh:mm") & "," & SUBSTITUTE(F1/100;",";".")
```
Once you have expanded your formula, copy and paste your column in a new Excel file using the *Values* option when pasting. Then, save your file as .csv.
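The resulting file contains one `yyyy-mm-dd hh:mm,value` line per tide, which is the layout produced by the formula above. A minimal sketch for reading such a file back in Python is given below; `read_tides` is a hypothetical helper and the two sample lines are made-up values used only to show the format.

```python
import csv
import io
from datetime import datetime

def read_tides(csv_file):
    """Read 'yyyy-mm-dd hh:mm,level' lines into a list of (datetime, level in metres)."""
    tides = []
    for row in csv.reader(csv_file):
        if len(row) != 2:
            continue  # skip blank or malformed lines
        try:
            when = datetime.strptime(row[0].strip(), "%Y-%m-%d %H:%M")
            level = float(row[1])
        except ValueError:
            continue  # skip lines that are not data, e.g. a header
        tides.append((when, level))
    return tides

# made-up sample in the expected layout (not real tide data)
sample = io.StringIO("2001-01-01 03:12,0.45\n2001-01-01 09:40,1.02\n")
tides = read_tides(sample)
print(tides)
```

To read a file on disk, pass `open(path, newline='')` instead of the `io.StringIO` object.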

As the tide values retrieved from the NOAA website are given relative to Mean Lower Low Water (MLLW), you need to get the value of this datum at your station. To do so, find your station [here](https://tidesandcurrents.noaa.gov/stations.html?type=Datums) and read the value of the MLLW in the chart.

### 2.5 Time series forecasting <a class="anchor" id="section_2_5"></a>

Several models are implemented to predict the evolution of the time series for each transect. Beforehand, the time series are interpolated so that the intervals between time steps are regular.
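The interpolation step can be pictured with the sketch below, which resamples an irregular series onto a regular grid. It assumes linear interpolation and a fixed step in days, which may differ from what the package does; `interpolate_regular` is a hypothetical helper and the values are illustrative.

```python
from datetime import date

def interpolate_regular(dates, values, step_days=30):
    """Linearly interpolate (dates, values) onto a grid of step_days, starting at the first date.

    Assumes at least two observations with strictly increasing dates.
    """
    days = [(d - dates[0]).days for d in dates]  # days since the first observation
    grid, resampled = [], []
    j = 0
    for day in range(0, days[-1] + 1, step_days):
        # advance to the pair of observations that brackets this grid day
        while days[j + 1] < day:
            j += 1
        weight = (day - days[j]) / (days[j + 1] - days[j])
        grid.append(day)
        resampled.append(values[j] + weight * (values[j + 1] - values[j]))
    return grid, resampled

# illustrative cross-shore distances (m) at irregular acquisition dates
obs_dates = [date(2020, 1, 1), date(2020, 1, 11), date(2020, 3, 1)]
obs_values = [100.0, 90.0, 40.0]
grid, resampled = interpolate_regular(obs_dates, obs_values)
print(grid, resampled)
```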

Time series analysis extracts meaningful information from data ordered in time. Time series forecasting uses a model to predict future values from previously observed values. In other words, the physical drivers of the studied phenomenon, here the displacement of the coastline, are already reflected in the data and are therefore implicitly taken into account by a time series forecasting model.

The function `predict` returns the prediction, and its `model` parameter selects the desired algorithm among:

* a **Holt's Linear Trend** model `'Holt'`

Exponential smoothing is one of the most widely used methods for time series data: it is simple to understand, easy to implement in a short program, and gives reliable forecasts in a wide variety of applications.

Single exponential smoothing (SES) is a well-known forecasting method for stationary time series. However, it is not reliable for non-stationary time series. Cross-shore distances generally tend to decrease over time, so they contain a trend and SES is not suitable in this case. **Holt’s linear trend** method was designed to deal with data that contain a trend. It comprises two smoothing constants, two smoothing equations and one forecast equation.

<a href="https://www.codecogs.com/eqnedit.php?latex=\begin{align*}&space;\hat{y}_{t&plus;k}&space;&=&space;a_t&space;&plus;&space;kc_t&space;\\&space;a_t&space;&=&space;\gamma&space;y_t&space;&plus;&space;(1-\gamma)(a_{t-1}&plus;c_{t-1})&space;\\&space;c_t&space;&=&space;\delta&space;(a_t&space;-&space;a_{t-1})&space;&plus;&space;(1-\delta)c_{t-1}\\&space;\end{align*}" target="_blank"><img src="https://latex.codecogs.com/svg.latex?\begin{align*}&space;\hat{y}_{t&plus;k}&space;&=&space;a_t&space;&plus;&space;kc_t&space;\\&space;a_t&space;&=&space;\gamma&space;y_t&space;&plus;&space;(1-\gamma)(a_{t-1}&plus;c_{t-1})&space;\\&space;c_t&space;&=&space;\delta&space;(a_t&space;-&space;a_{t-1})&space;&plus;&space;(1-\delta)c_{t-1}\\&space;\end{align*}" title="\begin{align*} \hat{y}_{t+k} &= a_t + kc_t \\ a_t &= \gamma y_t + (1-\gamma)(a_{t-1}+c_{t-1}) \\ c_t &= \delta (a_t - a_{t-1}) + (1-\delta)c_{t-1}\\ \end{align*}" /></a>

where the first equation is the forecast equation, the second is the level equation and the third is the trend equation.

γ and δ are smoothing constants for level and trend respectively whose values lie on the interval from 0 to 1.

a and c are estimates of the level and the trend of the time series respectively.

y denotes the observation.
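The three equations translate almost line by line into code. The sketch below is a plain-Python illustration and not the implementation used by `predict`; the initialisation of the level and trend from the first two observations is an assumption, and the function name `holt_forecast` is hypothetical.

```python
def holt_forecast(y, gamma, delta, k):
    """Holt's linear trend method: forecasts for steps 1..k after the last observation.

    gamma and delta are the smoothing constants for the level and the trend (between 0 and 1).
    """
    level = y[0]          # a_t, initialised with the first observation (assumption)
    trend = y[1] - y[0]   # c_t, initialised with the first difference (assumption)
    for obs in y[1:]:
        prev_level = level
        # level equation: a_t = gamma * y_t + (1 - gamma) * (a_{t-1} + c_{t-1})
        level = gamma * obs + (1 - gamma) * (prev_level + trend)
        # trend equation: c_t = delta * (a_t - a_{t-1}) + (1 - delta) * c_{t-1}
        trend = delta * (level - prev_level) + (1 - delta) * trend
    # forecast equation: y_hat_{t+h} = a_t + h * c_t
    return [level + h * trend for h in range(1, k + 1)]

# illustrative series retreating by exactly 2 m per step
series = [10.0, 8.0, 6.0, 4.0, 2.0]
forecast = holt_forecast(series, gamma=0.8, delta=0.2, k=3)
print(forecast)
```

On a perfectly linear series the method simply continues the line, whatever the smoothing constants. On noisy data, γ and δ control how quickly the level and trend react to new observations.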


The characteristics of Holt's linear trend method are listed below:
1. Holt's exponential smoothing method doesn't work with data that show cyclical or seasonal patterns. In our case, this should not be a problem as coastal retreat is not a cyclical physical phenomenon.
2. It is best suited to short-term forecasting, as it assumes that the future trend will follow the current trend. This means that the number of predicted steps must be much smaller than the number of steps used to make the prediction. The further ahead you want to predict, the further back your data must go.
3. It does not give good results on small data sets. In our case, if the oldest images are only three or four years old, the prediction will be rather poor. A range of images covering at least seven years is more reasonable.


* an **AutoRegressive** model `'AR'`
* a **Long Short Term Memory** model `'LSTM'`
* an **AutoRegressive Integrated Moving Average** model `'ARIMA'`

*Note: At this time, the Holt model is recommended as the other three are still under development.* 

You also need to set up the `n_months_further` parameter to define the number of months for which the prediction will be performed.

The function `predict` returns the time-series along each transect with the prediction for the `n_months_further` months added and a dates vector with the predicted dates added.

Some other parameters of the predict function can be specified:

* `smooth_data` (default to *True*): if set to True, the training data is smoothed. This dampens recent outliers and, at the same time, moves back the period on which the learning is focused.
* `smooth_coef` (default to *5*): smoothing strength. The higher the value, the smoother the data.
* `plot`: if set to True, the time series predictions will be plotted.

```python
n_months_further = 24
time_series_pred, dates_pred_m = predict.predict(cross_distance, output, inputs, n_months_further,
                                                 model='Holt', smooth_data=True, smooth_coef=5, plot=True)
```

### 2.6 New shoreline reconstruction <a class="anchor" id="section_2_6"></a>

The function `reconstruct_shoreline` reconstructs the predicted shorelines. It keeps one shoreline per predicted year. These new shorelines are plotted and saved as geojson files and shapefiles.

```python
predicted_sl = reconstruct_shoreline.reconstruct_shoreline(time_series_pred, transects, dates_pred_m,
                                                           output, inputs, settings, n_months_further)
```

### 2.7 Computation of the prediction's generalization error <a class="anchor" id="section_2_7"></a>

The function `cross_validation` computes the **generalization error** of the prediction by splitting the images into train and test samples and comparing the predicted shorelines with hand-drawn shorelines. It also computes the parameters that minimize the root-mean-square error (RMSE) and stores them in a `.pkl` file. If this file exists, the `predict` function reads it and uses these parameters.

As with the reference shoreline, you will be asked to draw manually the different shorelines that will serve as a reference for the error calculation. 

```python
best_param, rmse, mae = pt.cross_validation(cross_distance, metadata, output, settings, model='Holt')
```
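The two error measures returned here are the root-mean-square error (RMSE) and the mean absolute error (MAE). As a reminder of how they are defined, the sketch below computes both on illustrative numbers; `rmse_mae` is a hypothetical helper and says nothing about how `cross_validation` pairs predicted and hand-drawn shorelines.

```python
import math

def rmse_mae(predicted, observed):
    """Return (RMSE, MAE) between predicted and observed cross-shore distances."""
    errors = [p - o for p, o in zip(predicted, observed)]
    rmse_value = math.sqrt(sum(e * e for e in errors) / len(errors))
    mae_value = sum(abs(e) for e in errors) / len(errors)
    return rmse_value, mae_value

# illustrative distances in metres along three transects
predicted = [10.0, 12.0, 14.0]
observed = [11.0, 12.0, 10.0]
rmse_value, mae_value = rmse_mae(predicted, observed)
print(f"RMSE = {rmse_value:.2f} m, MAE = {mae_value:.2f} m")
```

The RMSE penalises large errors more strongly than the MAE, so a big gap between the two points to a few transects with large misfits.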

### 2.8 Threatened population estimation <a class="anchor" id="section_2_8"></a>

By computing the area lost between the current coastline and the predicted one and combining it with the population density, it is possible to estimate the number of people under threat. This estimate is given by the function `estimate_pop`.

The `areas` output contains a low, a median and a high estimate of the surface that is predicted to be lost.
If the `density` parameter is specified, the function also returns the equivalent population figures, stored in `pop` here.

```python
areas, pop = ep.estimate_pop(cross_distance, predicted_sl, time_series_pred, dates_pred_m, n_months_further,
                             transects, output, inputs, settings, density=7600, plot=True)
```
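The conversion from lost area to threatened population is a simple multiplication by the density. The sketch below shows the arithmetic under explicit assumptions: areas in km² and density in inhabitants per km². The units actually expected by `estimate_pop` are not stated here, so check them before reusing these numbers; `threatened_population` is a hypothetical helper and the areas are made-up values.

```python
def threatened_population(areas_km2, density_per_km2):
    """Multiply each area estimate (km^2) by the population density (inhabitants per km^2)."""
    return {name: area * density_per_km2 for name, area in areas_km2.items()}

# made-up low / median / high area estimates, not real results
example_areas = {'low': 0.02, 'median': 0.05, 'high': 0.09}
example_pop = threatened_population(example_areas, density_per_km2=7600)
print(example_pop)
```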

## Issues

Having a problem? Post an issue in the [Issues page](https://github.com/Space4Dev/coastline-predictor/issues).

## References

Vos, K., Splinter, K. D., Harley, M. D., Simmons, J. A., and Turner, I. L.: CoastSat: A Google Earth Engine-enabled Python toolkit to extract shorelines from publicly available satellite imagery, Environmental Modelling & Software, 122, 104528, ISSN 1364-8152, https://doi.org/10.1016/j.envsoft.2019.104528, 2019
(https://www.sciencedirect.com/science/article/pii/S1364815219300490)

Saha, Amit & Sinha, Kanchan. (2020). Usage of Holt's Linear Trend Exponential Smoothing for Time Series Forecasting in Agricultural Research. 
(https://www.researchgate.net/publication/345413376_Usage_of_Holt's_Linear_Trend_Exponential_Smoothing_for_Time_Series_Forecasting_in_Agricultural_Research)

Athanasiou, P., van Dongeren, A., Giardino, A., Vousdoukas, M., Gaytan-Aguilar, S., and Ranasinghe, R.: Global distribution of nearshore slopes with implications for coastal retreat, Earth Syst. Sci. Data, 11, 1515–1529, https://doi.org/10.5194/essd-11-1515-2019, 2019