# **Evaluation of CML data - From processing one CML to evaluating many CMLs**
___
<img src="https://hess.copernicus.org/articles/24/2931/2020/hess-24-2931-2020-f06-web.png" alt="drawing" width="1300"/>

Evaluation of rainfall estimates from one year of CML data in Germany against RADOLAN-RW, a gauge-adjusted radar product from the German Weather Service, for three temporal aggregations ([Graf et al. 2020](https://hess.copernicus.org/articles/24/2931/2020/)).

___  

Maximilian Graf & Erlend Oydvin
___
University of Augsburg & Norwegian University of Life Sciences

In [None]:
%load_ext autoreload
%autoreload 2
%matplotlib inline

import xarray as xr
import matplotlib.pyplot as plt
import numpy as np
import tqdm

import pycomlink as pycml

## 1. Process many CMLs with a time series-based approach

In [None]:
# load dataset with hundreds of CMLs
cmls = xr.open_dataset(".././data/cml/openMRG_example.nc").load()
cmls

In [None]:
# calculate total loss
cmls["tl"] = cmls.tsl - cmls.rsl

# separate periods of rain from dry time steps
cmls["wet"] = cmls.tl.rolling(time=60, center=True).std(skipna=False) > 0.4

# estimate the baseline during rain events
cmls["baseline"] = pycml.processing.baseline.baseline_constant(
    trsl=cmls.tl,
    wet=cmls.wet,
    n_average_last_dry=5,
)

# compensate for wet antenna attenuation
cmls["waa"] = pycml.processing.wet_antenna.waa_schleiss_2013(
    rsl=cmls.tl,
    baseline=cmls.baseline,
    wet=cmls.wet,
    waa_max=2.2,
    delta_t=1,
    tau=15,
)

# calculate attenuation caused by rain and remove negative attenuation
cmls["A"] = cmls.tl - cmls.baseline - cmls.waa
cmls["A"].values[cmls.A < 0] = 0

# derive rain rate via the k-R relation
cmls["R"] = pycml.processing.k_R_relation.calc_R_from_A(
    A=cmls.A,
    L_km=cmls.length.astype(float) / 1000,  # convert to km
    f_GHz=cmls.frequency / 1000,  # convert to GHz
    pol=cmls.polarization,
)

In [None]:
cmls

In [None]:
cmls = cmls.isel(sublink_id=0)
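
The last step of the processing above converts path attenuation into a rain rate with the k-R power law, $k = a R^b$, where $k = A / L$ is the specific attenuation in dB/km. The sketch below inverts that relation with plain NumPy. The function name and the coefficients `a` and `b` are hypothetical placeholders for illustration; in practice the coefficients depend on frequency and polarization, which is why these are passed to `calc_R_from_A`.

```python
import numpy as np


def rain_rate_from_attenuation(A_dB, L_km, a, b):
    """Invert k = a * R**b, with k = A / L the specific attenuation in dB/km."""
    k = np.asarray(A_dB, dtype=float) / L_km
    return (k / a) ** (1.0 / b)


# Placeholder coefficients, NOT values for any real frequency or polarization
a, b = 0.1, 1.1
L_km = 2.0

A = np.array([0.0, 1.0, 4.0])  # rain-induced path attenuation in dB
R = rain_rate_from_attenuation(A, L_km=L_km, a=a, b=b)  # rain rate in mm/h

# Applying the forward relation recovers the attenuation
A_back = a * R**b * L_km
print(R)
```

Because the relation is a power law, doubling the attenuation does not exactly double the rain rate unless `b` equals 1.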

## 2. Link based evaluation and performance metrics

As reference, path-averaged rain rates along the CML paths from RADKLIM-YW are provided. These data have a temporal resolution of 5 minutes. First, we compare a single CML time series, aggregated to 5 minutes, against its reference time series. Then we resample all CML data and prepare a scatter plot of CML versus reference data. Finally, some metrics are calculated. For simplicity, only channel 1 is evaluated here.

In [None]:
# load reference data
ds_radar = xr.open_dataset(".././data/cml/openMRG_example_rad.nc")
ds_radar

In [None]:
ds_radar.R.sel(time=slice("2015-08-27T01:00:00", "2015-08-27T02:35:00")).plot(
    x="x", y="y", col="time", col_wrap=5, cmap="YlGnBu", levels=10,
);

In [None]:
import pyproj

# build 2D x and y grids from the 1D x and y coordinates
x_grid, y_grid = np.meshgrid(ds_radar.x.values, ds_radar.y.values)

# transform original radar projection to WGS84 (EPSG:4326)
transformer = pyproj.Transformer.from_crs(
    "+proj=stere +lat_ts=60 +ellps=bessel +lon_0=14 +lat_0=90",
    "EPSG:4326",
    always_xy=True,
)
lon_grid, lat_grid = transformer.transform(xx=x_grid, yy=y_grid)

# add the lon and lat grid as coordinates to the radar dataset
ds_radar.coords["lon"] = (("y", "x"), lon_grid)
ds_radar.coords["lat"] = (("y", "x"), lat_grid)

In [None]:
# map of rainfall sum over all time steps
ds_radar.R.resample(time="60min").mean().sum(dim="time").plot.pcolormesh(
    x="lon", y="lat", cmap="YlGnBu"
)
for lon1, lat1, lon2, lat2 in zip(
    cmls.site_0_lon, cmls.site_0_lat, cmls.site_1_lon, cmls.site_1_lat
):
    plt.plot([lon1, lon2], [lat1, lat2], "-", c="black", alpha=0.5)


### Radar along CML path

<img src="./hints_solutions/radar_along_cml.png" style="height: 200px;"/>
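
The idea behind the path average can be sketched with plain NumPy: each CML gets one weight per radar pixel, and the path-averaged rain rate is the weighted sum of the pixel values. This toy example assumes the weights are the fraction of the path length inside each pixel, so they sum to 1 per CML. The arrays are made up for illustration, and this is not how pycomlink implements the step internally (it uses a sparse matrix).

```python
import numpy as np

# Toy radar field with dimensions (time, y, x), rain rate in mm/h
grid = np.array(
    [
        [[0.0, 1.0, 2.0], [3.0, 4.0, 5.0]],
        [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]],
    ]
)

# Made-up weights with dimensions (cml, y, x); each CML's weights sum to 1
weights = np.zeros((2, 2, 3))
weights[0, 0, 0] = 0.25  # CML 0: a quarter of its path in pixel (y=0, x=0)
weights[0, 0, 1] = 0.75  # CML 0: the rest in pixel (y=0, x=1)
weights[1, 1, 1] = 0.5   # CML 1: split evenly over two pixels
weights[1, 1, 2] = 0.5

# Weighted sum over the grid dimensions gives one time series per CML
path_avg = np.einsum("tyx,cyx->tc", grid, weights)
print(path_avg)
```

A uniform field, as in the second time step, yields the same value for every CML regardless of the weights, which is a quick sanity check for any weighting scheme.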


In [None]:
# calculate the intersection weights with a sparse matrix
da_intersect_weights = pycml.spatial.grid_intersection.calc_sparse_intersect_weights_for_several_cmls(
    x1_line=cmls.site_0_lon.values,
    y1_line=cmls.site_0_lat.values,
    x2_line=cmls.site_1_lon.values,
    y2_line=cmls.site_1_lat.values,
    cml_id=cmls.cml_id.values,
    x_grid=ds_radar.lon.values,
    y_grid=ds_radar.lat.values,
    grid_point_location='center',
)

In [None]:
# get the radar values along the CMLs weighted with the intersection weights
da_radar_along_cmls = (
    pycml.spatial.grid_intersection.get_grid_time_series_at_intersections(
        grid_data=ds_radar.R,
        intersect_weights=da_intersect_weights,
    )
)

There is [an example notebook](https://github.com/pycomlink/pycomlink/blob/master/notebooks/Get%20radar%20rainfall%20along%20CML%20paths.ipynb) within pycomlink that describes the grid intersection step by step.


In [None]:
# plot one CML and its radar reference as 1 and 5 minute rainfall intensities
(cmls.sel(cml_id=10222).R).plot(
    x="time",
    figsize=(16, 3),
    label="CML 1-minute rainfall intensities",
    color="darkblue",
    alpha=0.5,
    add_legend=True,
)

da_radar_along_cmls.sel(cml_id=10222).plot(
    alpha=0.75, label="Radar along CML 5-minute rainfall intensities", color="green"
)
plt.legend()

# .. and as hourly rainfall sums
(cmls.sel(cml_id=10222).R.resample(time="60min").mean()).plot(
    x="time",
    figsize=(16, 3),
    label="CML 1h rainfall sum",
    color="darkblue",
    alpha=0.5,
    add_legend=True,
)

da_radar_along_cmls.resample(time="60min").mean().sel(cml_id=10222).plot(
    alpha=0.75, label="Radar along CML 1h rainfall sum", color="green"
)
plt.legend();

##### Q: How to compare the CML rainfall estimates with the radar reference (along the CML paths)?
* scatter plots
* metrics 

## 3. Preparation of the CML data

### Exercise 1
Resample the CML rainfall estimates to 5-minute rainfall intensities.

In [None]:
# enter your solution


In [None]:
if input("Enter 'Solution' to display solutions: ")=='Solution':
    %load hints_solutions/3_1_solution.py

### Exercise 2
Compare the mean rainfall intensity over all CMLs and radar along CMLs at this 5-minute resolution.


In [None]:
# enter your solution


In [None]:
if input("Enter 'Solution' to display solutions: ")=='Solution':
    %load hints_solutions/3_2_solution.py

### Scatterplots

In [None]:
fig, ax = plt.subplots(figsize=(4, 3.5))
hx = ax.hexbin(
    cmls_5min.sel(time=da_radar_along_cmls.time).values.T.flatten(),
    da_radar_along_cmls.values.flatten(),
    mincnt=1,
    bins="log",
    gridsize=45,
    extent=(0, 100, 0, 100),
)
ax.plot([0,100],[0,100],'--',color='black',alpha=.5)
ax.set_xlabel("CML 5-minute rainfall intensity")
ax.set_ylabel("Radar along CML 5-minute rainfall intensity")
cbar = fig.colorbar(hx)
cbar.set_label("count")

In [None]:
fig, ax = plt.subplots(figsize=(4, 3.5))
hx = ax.hexbin(
    cmls_5min.sel(time=da_radar_along_cmls.time).resample(time="60min").mean().values.T.flatten(),
    da_radar_along_cmls.resample(time="60min").mean().values.flatten(),
    mincnt=1,
    bins="log",
    gridsize=45,
    extent=(0, 12, 0, 12),
)
ax.plot([0,12],[0,12],'--',color='black',alpha=.5)
ax.set_xlabel("CML 1h sums (mm)")
ax.set_ylabel("Radar along CML 1h sums (mm)")
cbar = fig.colorbar(hx)
cbar.set_label("count")

### Performance metrics
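
To make the kind of numbers computed in this section more tangible, here is a minimal NumPy sketch of a few common error metrics on made-up data. The function name, the selection of metrics, and their exact definitions are assumptions for illustration; the pycomlink function used below may define or name its metrics differently.

```python
import numpy as np


def simple_error_metrics(reference, estimate, wet_threshold=0.1):
    """Hypothetical helper: a few common rainfall error metrics."""
    reference = np.asarray(reference, dtype=float)
    estimate = np.asarray(estimate, dtype=float)

    # only use time steps where both series have valid data
    valid = ~(np.isnan(reference) | np.isnan(estimate))
    ref, est = reference[valid], estimate[valid]

    # wet/dry classification based on the rainfall threshold
    ref_wet = ref >= wet_threshold
    est_wet = est >= wet_threshold

    return {
        "pearson_r": np.corrcoef(ref, est)[0, 1],
        "rmse": np.sqrt(np.mean((est - ref) ** 2)),
        "relative_bias": (est.mean() - ref.mean()) / ref.mean(),
        "false_wet_rate": np.sum(est_wet & ~ref_wet) / np.sum(~ref_wet),
        "missed_wet_rate": np.sum(~est_wet & ref_wet) / np.sum(ref_wet),
    }


# made-up rain rates in mm/h; the NaN pair is dropped before the calculation
ref = [0.0, 0.0, 1.0, 2.0, np.nan, 4.0]
est = [0.0, 0.5, 1.0, 3.0, 1.0, 4.0]
metrics = simple_error_metrics(ref, est)
for name, value in metrics.items():
    print(name, value)
```

The wet/dry rates depend directly on the chosen threshold, so they should always be reported together with it.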

In [None]:
error_stats = pycml.validation.stats.calc_rain_error_performance_metrics(
    cmls_5min.sel(time=da_radar_along_cmls.time).values.T.flatten(),
    da_radar_along_cmls.values.flatten(),
    rainfall_threshold_wet=0.1,
)

In [None]:
for stat, field in zip(error_stats, error_stats._fields):
    print(field, stat)

In [None]:
error_stats = pycml.validation.stats.calc_rain_error_performance_metrics(
    cmls_5min.sel(time=da_radar_along_cmls.time).resample(time="60min").mean().values.T.flatten(),
    da_radar_along_cmls.resample(time="60min").mean().values.flatten(),
    rainfall_threshold_wet=0.1,
)
for stat, field in zip(error_stats, error_stats._fields):
    print(field, stat)

### Optional Exercise 3
Change the threshold in the rain event detection to values between 0.1 and 3. What do you expect? How do the metrics change?  

Hint: rain event detection step in the processing:   
`cmls["wet"] = cmls.tl.rolling(time=60, center=True).std(skipna=False) > 0.4`
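
Before rerunning the full processing, it can help to see how the threshold acts on a single synthetic signal. The sketch below mimics the rolling standard deviation with plain NumPy on a made-up total-loss series. The noise level, event shape, and variable names are assumptions, and the window handling is simplified: only complete windows are used, without centering.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 600

# made-up total loss in dB: constant level plus dry-weather noise
tl = 50 + rng.normal(0, 0.1, n)
# one synthetic rain event that adds up to 5 dB of attenuation
tl[200:300] += 5 * np.sin(np.linspace(0, np.pi, 100))

# standard deviation in every complete 60-step window
windows = np.lib.stride_tricks.sliding_window_view(tl, 60)
rolling_std = windows.std(axis=-1)

thresholds = [0.1, 0.4, 1.0, 3.0]
wet_fractions = []
for threshold in thresholds:
    wet = rolling_std > threshold
    wet_fractions.append(float(wet.mean()))
    print(threshold, wet_fractions[-1])
```

A higher threshold can only reduce the number of time steps classified as wet, so fewer false alarms come at the price of more missed rain.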

In [None]:
# enter your solution


In [None]:
if input("Enter 'Solution' to display solutions: ")=='Solution':
    %load hints_solutions/3_3_solution.py