# TOC Thematic Report - February 2019 (Part 6: summary for Task Force meeting)

This notebook summarises the work done so far in preparation for the Task Force meeting in Helsinki in June.

## 1. Datasets

The ICPW programme comprises two main datasets: a "core" selection of 264 stations and a broader "trends" grouping of 432 stations (as of May 2019). These are organised separately in the database, with some duplication that complicates processing. Because the two datasets overlap, there are currently **556 unique stations** within the programme as a whole.

Analysis for the Thematic Report initially considered data from all 556 stations, but selection criteria based on time series length and completeness have since been applied (see Section 2). The results so far are therefore based (hopefully!) on a high quality subset of the full dataset.

Where stations are present in both the "core" and "trends" datasets, I have preferentially chosen to work with data from the "trends" stations. This is because these stations have already been through several iterations of quality control for the TOC Trends paper drafted by Heleen, Don and John.
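The "prefer trends over core" rule can be sketched with pandas as below. This is only an illustration: the column name `station_code`, the `dataset` label and the function `combine_datasets` are hypothetical, and it assumes a single key is enough to recognise the same station in both datasets, which may not match how duplicates are actually identified in the database.

```python
import pandas as pd


def combine_datasets(core, trends, key="station_code"):
    """Merge two station tables, keeping the "trends" row for duplicates.

    `key` is a hypothetical column identifying the same station in both tables.
    """
    # Put "trends" first so that keep="first" retains it where stations overlap
    combined = pd.concat(
        [trends.assign(dataset="trends"), core.assign(dataset="core")],
        ignore_index=True,
    )
    return combined.drop_duplicates(subset=key, keep="first").reset_index(drop=True)


# Toy example: station "C" appears in both datasets
core = pd.DataFrame({"station_code": ["A", "B", "C"]})
trends = pd.DataFrame({"station_code": ["C", "D"]})
merged = combine_datasets(core, trends)
print(merged)
```

In the toy example, station "C" ends up labelled as a "trends" station and the merged table holds four unique stations.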

## 2. Time period and selection criteria

The main time period of interest is **1990 to 2016**. To be considered in the analysis, each series for each station and parameter must **begin before 1995**, **end after 2011** and be at least **75% complete** in the interval between these years. Completeness is assessed based on **monthly** resampled values for rivers and **seasonal** (i.e. quarterly) resampled values for lakes. The justification for this is that robust annual estimates of concentration can be obtained from fewer samples for lakes than for rivers (i.e. about 4 samples per year are required for lakes compared to ~12 for rivers).
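As a rough sketch, the criteria above could be checked for a single station-parameter series as follows. The function name `meets_criteria` and the input layout (a pandas Series with a DatetimeIndex) are hypothetical. The code also assumes that completeness is measured over the whole 1990 to 2016 period, which is one possible reading of the rule; the window is exposed as `first_year` and `last_year` so it can be changed.

```python
import pandas as pd


def meets_criteria(series, water_body, first_year=1990, last_year=2016,
                   start_before=1995, end_after=2011, min_complete=0.75):
    """Return True if a series passes the length and completeness checks.

    series: values indexed by sampling date (hypothetical layout).
    water_body: "river" (monthly periods) or "lake" (quarterly periods).
    """
    freq, n_per_year = {"river": ("M", 12), "lake": ("Q", 4)}[water_body]

    # Keep only non-missing values within the period of interest
    obs = series.dropna()
    years = obs.index.year
    obs = obs[(years >= first_year) & (years <= last_year)]
    if obs.empty:
        return False

    # Must begin before 1995 and end after 2011
    if obs.index.min().year >= start_before or obs.index.max().year <= end_after:
        return False

    # Completeness: share of months (or quarters) with at least one sample
    n_observed = obs.index.to_period(freq).nunique()
    n_expected = (last_year - first_year + 1) * n_per_year
    return n_observed / n_expected >= min_complete


# Toy series with one sample per quarter from 1990 to 2016
dates = pd.to_datetime(
    [f"{y}-{m:02d}-15" for y in range(1990, 2017) for m in (2, 5, 8, 11)]
)
quarterly = pd.Series(1.0, index=dates)
print(meets_criteria(quarterly, "lake"))   # quarterly sampling suits a lake
print(meets_criteria(quarterly, "river"))  # but is too sparse for a river
```

The same quarterly record passes as a lake and fails as a river, which reflects the different sampling requirements described above.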

Using these criteria, 231 stations have at least some relevant data, although only a few of these have suitable records for *all* the parameters of interest.

## 3. Parameters of interest

The following parameters have been considered (calculated from the raw data as required):

 * **TOC ($mgC/l$)**. Some Focal Centres report DOC, which is assumed to be equal to TOC for the analysis here
 * **EH ($\mu eq/l$)**. Calculated from pH as $10^6 * 10^{-pH}$
 * **ESO4X ($\mu eq/l$)**. Sea-salt corrected sulphate (see below for details of the correction applied)
 * **ECaX_EMgX ($\mu eq/l$)**. The sum of sea-salt corrected calcium and magnesium (see below for details of the correction applied)
 * **ENO3 ($\mu eq/l$)**. Nitrate
 * **ANC ($\mu eq/l$)**. Acid Neutralising Capacity, calculated as $(ECa+EMg+EK+ENa+ENH4) - (ECl+ESO4+ENO3)$. Note that NH4 is often not reported, in which case it is assumed to be negligible (i.e. zero). All the other species in the calculation are treated as mandatory: if any of them is missing, no valid ANC value can be estimated
 * **ALK-E ($\mu eq/l$)**. Alkalinity. Usually reported as "gran alkalinity", but there are many method variants between the various Focal Centres. These values should probably be used with caution
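
The EH and ANC calculations listed above can be sketched as follows. The function name `add_eh_and_anc` and the table layout (one column per species, already in $\mu eq/l$, plus a `pH` column) are assumptions made for illustration, not the code used in the analysis.

```python
import numpy as np
import pandas as pd


def add_eh_and_anc(df):
    """Add EH and ANC columns (both in ueq/l) to a table of water chemistry.

    Assumes hypothetical columns pH, ECa, EMg, EK, ENa, ECl, ESO4, ENO3 and,
    optionally, ENH4.
    """
    out = df.copy()

    # EH = 10^6 * 10^(-pH)
    out["EH"] = 1e6 * 10.0 ** (-out["pH"])

    # Missing NH4 is treated as zero; all other species are mandatory, so a
    # missing value in any of them propagates to a missing ANC
    enh4 = out["ENH4"].fillna(0) if "ENH4" in out else 0
    cations = out["ECa"] + out["EMg"] + out["EK"] + out["ENa"] + enh4
    anions = out["ECl"] + out["ESO4"] + out["ENO3"]
    out["ANC"] = cations - anions
    return out


# Toy data: NH4 missing in the first row, Ca missing in the second
chem = pd.DataFrame({
    "pH": [5.0, 6.0],
    "ECa": [50.0, np.nan], "EMg": [20.0, 20.0],
    "EK": [5.0, 5.0], "ENa": [40.0, 40.0], "ENH4": [np.nan, 2.0],
    "ECl": [45.0, 45.0], "ESO4": [30.0, 30.0], "ENO3": [10.0, 10.0],
})
result = add_eh_and_anc(chem)
print(result[["EH", "ANC"]])
```

In the toy data, the first row still gets an ANC value despite the missing NH4, while the second row does not because calcium is missing.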
 
### 3.1. Sea-salt correction

Corrections (denoted by `X`) are calculated as

$$EParX = EPar_{sample} - \left[ \left( \frac{EPar}{ECl} \right)_{ref} * ECl_{sample} \right]$$

Where $\left( \frac{EPar}{ECl} \right)_{ref}$ is a reference ratio assumed to be constant for all locations:

| Species | Molar mass | Valency | Ref. ratio |
|:-------:|:----------:|:-------:|:----------:|
|   SO4   |         96 |       2 |      0.103 |
|    Cl   |         35 |       1 |          1 |
|    Ca   |         40 |       2 |      0.037 |
|    Mg   |         24 |       2 |      0.196 |
|  NO3-N  |         14 |       1 |        N/A |

**Note:** Using the same reference ratios everywhere produces some strange effects, especially in countries like Germany or the Czech Republic where chloride is not necessarily marine. This probably needs revisiting, as the "one-size-fits-all" approach taken here is clearly not appropriate in some locations.
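
A minimal sketch of the correction, using the reference ratios, molar masses and valencies from the table above. The helper names `mg_to_ueq` and `sea_salt_correct` are hypothetical, and the unit conversion assumes raw concentrations in mg/l, which may not hold for every parameter in the database.

```python
# Values taken from the table above: (molar mass, valency, reference ratio)
SPECIES = {
    "SO4": (96, 2, 0.103),
    "Cl": (35, 1, 1.0),
    "Ca": (40, 2, 0.037),
    "Mg": (24, 2, 0.196),
}


def mg_to_ueq(conc_mg_l, species):
    """Convert mg/l to ueq/l (assumes the raw data are in mg/l)."""
    molar_mass, valency, _ = SPECIES[species]
    return conc_mg_l * 1000 * valency / molar_mass


def sea_salt_correct(e_par, e_cl, species):
    """EParX = EPar_sample - (EPar/ECl)_ref * ECl_sample, all in ueq/l."""
    ref_ratio = SPECIES[species][2]
    return e_par - ref_ratio * e_cl


# Toy sample in ueq/l
e_so4, e_ca, e_mg, e_cl = 100.0, 60.0, 30.0, 50.0
eso4x = sea_salt_correct(e_so4, e_cl, "SO4")
ecax_emgx = sea_salt_correct(e_ca, e_cl, "Ca") + sea_salt_correct(e_mg, e_cl, "Mg")
print(eso4x, ecax_emgx)
```

Because the same ratio is applied wherever chloride is present, non-marine chloride is also subtracted, and the corrected value can even become negative. This is the kind of strange effect mentioned in the note.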

## 4. Workflow overview

The following notebooks describe different stages of the analysis so far:

 1. **[Update "trends" dataset](https://github.com/JamesSample/icpw#toc-trends-paper)**. The latest work using the "trends" dataset is documented [here](https://github.com/JamesSample/icpw#toc-trends-paper) under the heading *"TOC Trends paper"*
 
 2. **[Update "core" dataset](https://nbviewer.jupyter.org/github/JamesSample/icpw/blob/master/toc_report_feb_2019_part1.ipynb)**. Adding recent data for the "core" stations and dealing with data issues
 
 3. **[Combining the "core" and "trends" datasets](https://nbviewer.jupyter.org/github/JamesSample/icpw/blob/master/toc_report_feb_2019_part3.ipynb)**. An overview of the unified ICPW dataset of 556 stations
 
 4. **[Scatterplots of annual data](https://nbviewer.jupyter.org/github/JamesSample/icpw/blob/master/toc_report_feb_2019_part4_wge_plots.ipynb)**. A high-level overview of the raw data, aggregated to annual medians
 
 5. **[Stations with high frequency monitoring](https://nbviewer.jupyter.org/github/JamesSample/icpw/blob/master/toc_report_feb_2019_part5_hi_freq.ipynb)**. Ten stations have substantially more detailed monitoring than the others (approximately 25 to 100 samples per year from 1990 to 2016). This notebook performs trend and change point analyses based on *monthly* data, using algorithms that are too "data hungry" to be applied elsewhere
 
 6. **[Stations with "standard" monitoring](https://nbviewer.jupyter.org/github/JamesSample/icpw/blob/master/toc_report_feb_2019_part6.ipynb)**. Trend and change point analyses using *annually* aggregated data for the 231 stations with time series matching the criteria defined in section 2, above