# **The nearby link approach**
___
<img src="https://github.com/overeem11/RAINLINK/blob/v.1.21/LinksAmsterdam15min201109102015StamenMapsMap.jpeg?raw=true" alt="drawing" width="600"/>

15-minute rainfall map for Amsterdam, the Netherlands, on 10 September 2011, derived from links only. The spatial resolution is approximately 0.9 km². Figure from 
[Overeem et al. 2016](https://doi.org/10.5194/amt-9-2425-2016)    

___  

Maximilian Graf & Erlend Oydvin
___
University of Augsburg & Norwegian University of Life Sciences

In [None]:
%load_ext autoreload
%autoreload 2
%matplotlib inline

import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import xarray as xr

import pycomlink as pycml
import pycomlink.processing.wet_dry.nearby_wetdry as nearby_wetdry
import pycomlink.processing.nearby_rain_retrival as nearby_rain

# Example of the rain event detection and rainfall retrieval using the nearby approach from Overeem et al. 2016

We load example data included in `pycomlink`. One NetCDF file contains the time series of 500 CMLs with two `sublinks/channels` over 10 days. 

In [None]:
pycml.io.examples.get_example_data_path()

data_path = pycml.io.examples.get_example_data_path()

cmls = xr.open_dataset(data_path + "/example_cml_data.nc")
cmls

In [None]:
cmls.sel(cml_id='333',channel_id='channel_1').rsl.plot()
cmls.sel(cml_id='333',channel_id='channel_1').tsl.plot();

## Prepare data
#### Removing default values from CML DAQ system and interpolating small gaps in tsl and rsl time series
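
The idea behind this step can be sketched on a plain NumPy array: fill values are masked as NaN, and only short gaps are filled by linear interpolation. The helper `fill_short_gaps` and its `max_gap_len` argument are hypothetical names for illustration. Unlike `interpolate_na(max_gap="5min")`, which measures gaps in time units, this sketch counts consecutive missing samples, which is a simplification.

```python
import numpy as np

def fill_short_gaps(x, max_gap_len):
    """Linearly interpolate NaN runs of at most max_gap_len samples.

    Illustrative helper, not part of pycomlink. Assumes at least one
    valid sample; gaps touching the array edges are left as NaN.
    """
    x = np.asarray(x, dtype=float)
    isnan = np.isnan(x)
    idx = np.arange(x.size)
    filled = x.copy()
    # Interpolate everything first, then undo the fill for long or edge gaps.
    filled[isnan] = np.interp(idx[isnan], idx[~isnan], x[~isnan])
    start = None
    for i in range(x.size + 1):
        nan_here = i < x.size and isnan[i]
        if nan_here and start is None:
            start = i
        elif not nan_here and start is not None:
            if (i - start) > max_gap_len or start == 0 or i == x.size:
                filled[start:i] = np.nan
            start = None
    return filled

# Synthetic rsl series where -99.9 is the fill value of the DAQ system.
rsl = np.array([-50.0, -99.9, -52.0, -99.9, -99.9, -99.9, -58.0])
rsl_masked = np.where(rsl > -99.9, rsl, np.nan)
rsl_filled = fill_short_gaps(rsl_masked, max_gap_len=2)
```

In this synthetic series the single missing sample is interpolated, while the run of three missing samples stays NaN.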

In [None]:
cmls["rsl"] = cmls["rsl"].where(cmls.rsl > -99.9)
cmls["tsl"] = cmls["tsl"].where(cmls.tsl < 255.0)
cmls["rsl"] = cmls.rsl.interpolate_na(dim="time", method="linear", max_gap="5min")
cmls["tsl"] = cmls.tsl.interpolate_na(dim="time", method="linear", max_gap="5min")

In [None]:
plt.plot(
        [cmls.site_a_longitude, cmls.site_b_longitude],
        [cmls.site_a_latitude, cmls.site_b_latitude],
        color='grey',
        linewidth=1,
    );

#### Instantaneous to min-max data and calculation of attenuation

Transferring the instantaneous example data to 15-minute (`interval`) min-max data. `min_hours` defines the minimum number of hours of data needed within a given time period (`timeperiod`) to classify wet and dry periods in the subsequent step. If no tsl data is available, a constant tsl has to be assumed and incorporated in `cmls`.  

This step also calculates deltaP (attenuation) and deltaPL (specific attenuation).
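
A minimal NumPy sketch of what these four quantities could look like for a single gap-free sublink. It assumes that pmin is the interval minimum of rsl minus tsl, that max_pmin is the maximum of pmin over a trailing window that includes the current interval, and that deltaPL is deltaP divided by the path length. The function `minmax_sketch` and its argument names are hypothetical; check the pycomlink source for the actual implementation.

```python
import numpy as np

def minmax_sketch(rsl, tsl, length_km, samples_per_interval,
                  window_intervals, min_intervals):
    """Illustrative only; not the pycomlink implementation."""
    # Received minus transmitted level removes changes in transmit power.
    p = np.asarray(rsl, dtype=float) - np.asarray(tsl, dtype=float)
    n = p.size // samples_per_interval
    # Minimum received power per interval.
    pmin = p[: n * samples_per_interval].reshape(n, samples_per_interval).min(axis=1)
    # Maximum of pmin over the trailing window (assumed to include the
    # current interval); NaN until enough intervals are available.
    max_pmin = np.full(n, np.nan)
    for i in range(n):
        window = pmin[max(0, i - window_intervals + 1): i + 1]
        if window.size >= min_intervals:
            max_pmin[i] = window.max()
    delta_p = pmin - max_pmin        # attenuation in dB (negative when attenuated)
    delta_pl = delta_p / length_km   # specific attenuation in dB/km
    return pmin, max_pmin, delta_p, delta_pl
```

With 1-minute data and the settings used below, `samples_per_interval` would be 15, `window_intervals` 96 (24 hours) and `min_intervals` 24 (6 hours).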


In [None]:
pmin, max_pmin, deltaP, deltaPL = nearby_wetdry.instantaneous_to_minmax_data(
    rsl=cmls.rsl,
    tsl=cmls.tsl,
    length=cmls.length,
    interval=15,
    timeperiod=24,
    min_hours=6,
)


#### Exercise 1
Plot rsl and tsl for one CML and sublink. Plot pmin, max_pmin, deltaP and deltaPL in a new figure for the same CML and sublink. Check out [`nearby_wetdry.instantaneous_to_minmax_data()`](https://github.com/pycomlink/pycomlink/blob/ca4383987c6fec29630a782854affcc2b5e8df98/pycomlink/processing/wet_dry/nearby_wetdry.py#L86) and explain to each other what pmin, max_pmin, deltaP and deltaPL mean.

In [None]:
# your solution:


In [None]:
if input("Enter 'Solution' to display solutions: ")=='Solution':
    %load hints_solutions/2_1_solution.py

#### Calculate a distance matrix
Calculating the distances between all CML endpoints and plotting the number of neighbors used for the wet-dry classification, depending on the distance.
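
As a rough illustration of the geometry, the sketch below computes great-circle distances with the haversine formula and treats a link as a neighbor only if all four endpoint-to-endpoint distances are below the radius. The functions `haversine_km` and `is_nearby`, the Earth radius value and the coordinates are assumptions for illustration and may differ from what pycomlink uses internally.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, assumed here

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    h = (np.sin((lat2 - lat1) / 2) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(h))

def is_nearby(link, other, r_km):
    """True if both ends of `other` are within r_km of both ends of `link`.

    Each link is a tuple (lat_a, lon_a, lat_b, lon_b) in degrees.
    """
    ends_link = (link[:2], link[2:])
    ends_other = (other[:2], other[2:])
    return all(
        haversine_km(*p, *q) < r_km for p in ends_link for q in ends_other
    )

# Made-up coordinates: one close neighbor and one link about 110 km away.
link = (48.00, 11.00, 48.05, 11.05)
close = (48.02, 11.02, 48.08, 11.10)
far = (49.00, 11.00, 49.05, 11.05)
print(is_nearby(link, close, r_km=15), is_nearby(link, far, r_km=15))
```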

In [None]:
ds_dist = nearby_wetdry.calc_distance_between_cml_endpoints(
    cml_ids=cmls.cml_id.values,
    site_a_latitude=cmls.site_a_latitude,
    site_a_longitude=cmls.site_a_longitude,
    site_b_latitude=cmls.site_b_latitude,
    site_b_longitude=cmls.site_b_longitude,
)

In [None]:
ds_dist.isel(cml_id1=250).a_to_all_a.plot.hist(bins=50);

In [None]:
r=15 # radius in km
ds_dist["within_r"] = (
        (ds_dist.a_to_all_a < r)
        & (ds_dist.a_to_all_b < r)
        & (ds_dist.b_to_all_a < r)
        & (ds_dist.b_to_all_b < r)
)

In [None]:
ds_dist.within_r.sum(dim="cml_id2").plot.hist(bins=int(ds_dist.within_r.sum(dim="cml_id2").max()))
plt.vlines(4,ymin=0,ymax=50,color="red")
plt.grid()
plt.annotate(text="sufficiently dense\nCML network if count>3", xy=(4,45), xytext=(10,43), arrowprops=dict(arrowstyle="<-"))
plt.xlabel("CMLs within radius r")
plt.ylabel("count");

#### Exercise 2
Vary the radius r for the distance between CML endpoints and plot the count of CMLs that lie within this radius.

In [None]:
# your solution:


In [None]:
if input("Enter 'Solution' to display solutions: ")=='Solution':
    %load hints_solutions/2_2_solution.py

## Rain event detection
Using the default parameters from Overeem et al. (2016).
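
The core decision rule can be sketched for a single time step as follows: a link is flagged wet when the medians of deltaP and deltaPL over the nearby links both fall below their thresholds. The function `nearby_wet_sketch` is a hypothetical helper. It assumes that the selected link counts towards `min_links` and it omits any further refinement steps of the full method.

```python
import numpy as np

def nearby_wet_sketch(delta_p, delta_pl, thresh_median_p=-1.4,
                      thresh_median_pl=-0.7, min_links=3):
    """Wet-dry decision for one time step (illustrative only).

    delta_p, delta_pl: values of the selected link and its neighbors
    within radius r. Returns True (wet), False (dry) or None if too
    few links have valid data.
    """
    delta_p = np.asarray(delta_p, dtype=float)
    delta_pl = np.asarray(delta_pl, dtype=float)
    valid = ~np.isnan(delta_p) & ~np.isnan(delta_pl)
    if valid.sum() < min_links:
        return None
    # Both medians have to drop below their thresholds.
    return bool(
        np.median(delta_p[valid]) < thresh_median_p
        and np.median(delta_pl[valid]) < thresh_median_pl
    )
```

Using medians over several links makes the decision robust against a single link with a signal drop that is not related to rain.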

In [None]:
wet, F, medianP_out, medianPL_out = nearby_wetdry.nearby_wetdry(
    pmin=pmin,
    max_pmin=max_pmin,
    deltaP=deltaP,
    deltaPL=deltaPL,
    ds_dist=ds_dist,
    r=15,
    thresh_median_P=-1.4,
    thresh_median_PL=-0.7,
    min_links=3,
)

#### Exercise 3
Plot instantaneous data, pmin, max_pmin, deltaP, deltaPL and the rain event detection for several CMLs for the period from 13 to 15 May 2018.

In [None]:
# your solution


In [None]:
if input("Enter 'Solution' to display solutions: ")=='Solution':
    %load hints_solutions/2_3_solution.py

## Rainfall retrieval

#### Baseline estimation (pref)
Median over the dry time steps from the previous 24 hours
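
A minimal sketch of such a baseline for one gap-free sublink: for each time step, take the median of pmin over the dry time steps in a trailing window. The function `reference_level_sketch` is hypothetical. It assumes that the window includes the current time step and that a single dry value is enough, which may differ from the requirements in pycomlink.

```python
import numpy as np

def reference_level_sketch(pmin, wet, window_intervals=96):
    """Median of dry pmin values in a trailing window (illustrative only).

    window_intervals=96 corresponds to 24 hours of 15-minute data.
    """
    pmin = np.asarray(pmin, dtype=float)
    wet = np.asarray(wet, dtype=bool)
    pref = np.full(pmin.size, np.nan)
    for i in range(pmin.size):
        lo = max(0, i - window_intervals + 1)
        dry_values = pmin[lo: i + 1][~wet[lo: i + 1]]
        if dry_values.size > 0:
            pref[i] = np.median(dry_values)
    return pref
```

Because wet time steps are excluded, the baseline stays at the dry level while the signal drops during rain.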

In [None]:
pref = nearby_rain.nearby_determine_reference_level(wet, pmin)

In [None]:
t_start, t_end = "2018-05-13", "2018-05-15"
for cmlid in ["5"]:
    pmin.sel(cml_id=cmlid, time=slice(t_start, t_end)).isel(
        channel_id=0
    ).plot(figsize=(10, 4),label="pmin",)
    pref.sel(cml_id=cmlid, time=slice(t_start, t_end)).isel(
        channel_id=0
    ).plot(label="pref")
    (
        (
            wet.isel(channel_id=0)
            .sel(cml_id=cmlid, time=slice(t_start, t_end))
            * 50
        )
        - 100
    ).plot(label="wet", alpha=0.5)
plt.legend();

#### Correction of pmin and pmax
To prevent rainfall estimates during dry intervals, a corrected minimum (p_c_min) and maximum (p_c_max) received power are calculated by setting the signals to the baseline (pref) for dry intervals.
  
Note that pmax data should be used here if available. If no pmax data is available, pmin will be used for both cases instead.  

For pmin:  
*if (pmin < pref) & (wet == 1) --> p_c_min = pmin, otherwise p_c_min = pref*  
For pmax:  
*if (p_c_min < pref) & (pmax < pref) --> p_c_max = pmax, otherwise p_c_max = pref*
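
The two rules can be written compactly with `np.where`. In this sketch the second rule tests pmax against pref, as in Overeem et al. (2016); when pmin is passed for both inputs, as in this notebook, this is identical to testing pmin. The function `correct_received_signals_sketch` and the example values are hypothetical and only illustrate the rules.

```python
import numpy as np

def correct_received_signals_sketch(pmin, pmax, wet, pref):
    """Apply the two correction rules element-wise (illustrative only)."""
    pmin, pmax, pref = (np.asarray(v, dtype=float) for v in (pmin, pmax, pref))
    wet = np.asarray(wet) == 1
    # Keep pmin only where it is below the baseline during a wet interval.
    p_c_min = np.where((pmin < pref) & wet, pmin, pref)
    # Keep pmax only where both corrected pmin and pmax are below the baseline.
    p_c_max = np.where((p_c_min < pref) & (pmax < pref), pmax, pref)
    return p_c_min, p_c_max

# Made-up values in dB for four time steps.
pref_ex = [-50.0, -50.0, -50.0, -50.0]
pmin_ex = [-55.0, -55.0, -49.0, -56.0]
pmax_ex = [-52.0, -52.0, -48.0, -49.0]
wet_ex = [1, 0, 1, 1]
p_c_min_ex, p_c_max_ex = correct_received_signals_sketch(pmin_ex, pmax_ex, wet_ex, pref_ex)
```

In the second time step the signal is below the baseline but the interval is dry, so both corrected values fall back to pref and no rainfall is estimated.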


In [None]:
p_c_min, p_c_max = nearby_rain.nearby_correct_recieved_signals(
            pmin, wet, pref)

#### Calculate rain rates from attenuation data 
* calculating minimum and maximum rain-induced attenuation
* retrieving rainfall intensities
* correcting for wet antenna attenuation 
* weighted mean path-averaged rainfall intensity: setting the alpha value which defines how close to the minimum attenuation of each interval the rain rate should be set
* using the F-score (F) for outlier detection
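
The attenuation-to-rain-rate part of these steps can be sketched with a power law between specific attenuation k (dB/km) and rain rate R (mm/h). This sketch assumes the form k = a * R**b and treats wet antenna attenuation as a constant offset, which is a simplification. The function `rain_rate_sketch` and the coefficients `a` and `b` are hypothetical placeholders; real coefficients depend on frequency and polarization. The alpha weighting and the F-score filter are not covered here.

```python
import numpy as np

def rain_rate_sketch(attenuation_db, length_km, a, b, waa_db=2.3):
    """Path-averaged rain rate in mm/h from total attenuation (illustrative only)."""
    # Remove wet antenna attenuation; rain-induced attenuation cannot be negative.
    a_rain = np.clip(np.asarray(attenuation_db, dtype=float) - waa_db, 0.0, None)
    # Specific attenuation along the path in dB/km.
    k = a_rain / length_km
    # Invert the assumed power law k = a * R**b.
    return (k / a) ** (1.0 / b)

# Placeholder coefficients, chosen only to make the example run.
a_ex, b_ex = 0.2, 1.1
print(rain_rate_sketch([1.0, 5.0, 15.0], length_km=5.0, a=a_ex, b=b_ex))
```

Attenuation values below the wet antenna offset yield a rain rate of zero, which shows how strongly this offset affects light rain.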

In [None]:
R = nearby_rain.nearby_rainfall_retrival(
    pref,
    p_c_min,
    p_c_max,
    F,
    length=pmin.length,
    f_GHz=pmin.frequency/1e9,
    pol=pmin.polarization,
    waa_max=2.3,
    alpha=0.99,
    F_value_correction=True)

#### Compare derived rain rates with reference data
As reference, path-averaged rain rates along the CML paths from RADKLIM-YW are provided. This data has a temporal resolution of 5 minutes and is resampled to 15-minute rainfall intensities. Here, each CML time series is compared individually against its reference time series. 
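
The unit conversion behind this resampling can be sketched as follows, assuming the reference holds 5-minute rainfall sums in mm: three consecutive sums give a 15-minute amount, and multiplying by 4 converts it to a rate in mm/h. The function `amounts_5min_to_rate_15min` is a hypothetical helper for illustration.

```python
import numpy as np

def amounts_5min_to_rate_15min(amount_mm):
    """Convert 5-minute rainfall sums (mm) to 15-minute rain rates (mm/h)."""
    amount_mm = np.asarray(amount_mm, dtype=float)
    n = amount_mm.size // 3
    # Sum three consecutive 5-minute amounts to one 15-minute amount.
    amount_15min = amount_mm[: n * 3].reshape(n, 3).sum(axis=1)
    # There are four 15-minute intervals per hour.
    return amount_15min * 4.0

# 1 mm in every 5 minutes corresponds to 12 mm/h.
print(amounts_5min_to_rate_15min([1.0, 1.0, 1.0, 0.0, 0.5, 0.0]))
```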

In [None]:
path_ref = xr.open_dataset(data_path + '/example_path_averaged_reference_data.nc')

In [None]:
for i in ["0", "12", "57"]:
    # Plot reference rainfall: 5-minute amounts (assumed to be in mm) summed
    # to 15 minutes, then multiplied by 4 to get a rainfall rate in mm/h
    (path_ref.sel(cml_id=i).rainfall_amount.resample(time='15min').sum() * 4).plot(
        label="RADKLIM_YW", color='C3', figsize=(12,3)
    )
    # Plot 15-minute mean rainfall rates from CMLs
    (R.sel(cml_id=i,channel_id="channel_1")).plot(
        x="time", label="CML_nearby", color='C0'
    )
    
    plt.xlim(np.datetime64('2018-05-13'), np.datetime64('2018-05-15'))
    plt.ylabel('15-min rainfall rate (mm/h)')
    plt.legend();

#### Exercise 4
Discuss what influence the wet antenna attenuation and the scaling factor alpha have, e.g. when plotting the time series from above. Check your assumptions by recalculating the rain rates and checking the differences against the reference.

In [None]:
# your solution:


In [None]:
if input("Enter 'Solution' to display solutions: ")=='Solution':
    %load hints_solutions/2_4_solution.py

#### Exercise 5
Test several different radii r and how they affect the rain rates from CMLs. Discuss what implications different types of rainfall regimes might have for the choice of the radius r.

In [None]:
# your solution:


In [None]:
if input("Enter 'Solution' to display solutions: ")=='Solution':
    %load hints_solutions/2_5_solution.py