# Tuning Factor Boundaries

This vignette calculates boundaries (minimum and maximum values) for the pollen tuning factors.
COSMO adjusts these tuning factors in real time based on measurements.
To prevent unforeseen extreme values, the boundaries are derived from the pollen climatology for 2010-2020.

In [None]:
renv::load(here::here())
library(readr)
library(dplyr)
library(tidyr)
library(stringr)
library(magrittr)
library(padr)
library(kableExtra)
library(purrr)
library(ggplot2)
library(ggthemr)
library(here)
library(lubridate)
library(AeRobiology)

ggthemr("fresh")
devtools::load_all()

load(paste0(here(), "/data/other/species.RData"))
load(paste0(here(), "/data/other/stations.RData"))
species %<>%
  mutate(fieldextra_taxon = str_replace_all(fieldextra_taxon, "1", "24"))
data_dwh <- import_data_dwh(paste0(here(), "/data/dwh/pollen_dwh_daily.txt"))


We work with the most recent years of pollen climatology (2010-2020), because the seasons are
affected by climate change: for many species, the pollen season may start earlier and become more
intense.
The values below were historically used as fixed tuning factors, as provided in the lm_c namelist.
The new 2D-tune fields, which are adjusted in the `update_strength_realtime()` subroutine of the
pol_seasons.f90 module in the COSMO source code, start out with this factor and are then adjusted
in real time.

In [None]:
pollen_split <- data_dwh %>%
  filter(date >= as.Date("2010-01-01"),
         date <= as.Date("2020-12-31")) %>%
  mutate(year = year(date),
         tuning_factor_orig = case_when(
           taxon == "Corylus" ~ 1.0, # This factor is a rough approximation as it has never been exactly defined
           taxon == "Alnus" ~ 1.0,
           taxon == "Betula" ~ 1.0,
           taxon == "Poaceae" ~ 1.0
         )) %>%
  select(taxon, station, year, datetime, value, tuning_factor_orig) %>%
  split(.$taxon) %>%
  map(~ .x %>% split(list(.$year, .$station)))
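
The nested split above can be hard to picture. The following is a minimal sketch in base R on made-up data (the taxa, stations and values are assumptions for illustration only) showing the resulting list structure and how flattening one level yields one data frame per taxon, year and station.

```r
# Toy data; stations and values are made up for illustration.
toy <- data.frame(
  taxon   = rep(c("Alnus", "Betula"), each = 4),
  station = rep(c("A", "B"), times = 4),
  year    = rep(c(2010, 2010, 2011, 2011), times = 2),
  value   = 1:8
)

# First split by taxon, then split each taxon by year and station.
toy_split <- lapply(split(toy, toy$taxon), function(df) {
  split(df, list(df$year, df$station))
})

names(toy_split)        # one entry per taxon
names(toy_split$Alnus)  # one entry per year-station combination

# Flattening one level gives one data frame per taxon-year-station,
# with names of the form "taxon.year.station".
toy_flat <- unlist(toy_split, recursive = FALSE)
names(toy_flat)
```

Because the flattened list is ordered by taxon first, any per-season results that are later combined with it need to follow the same taxon order.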

The pollen seasons are determined with `calculate_ps()` from the AeRobiology package using the "clinical" method, based on the definition used in XXXX

In [None]:

# Clinical season definition shared by all taxa; only the thresholds differ.
clinical_season <- function(data, th_pollen, th_sum) {
  data %>%
    select(datetime, value) %>%
    calculate_ps(
      method = "clinical",
      n.clinical = 5,
      window.clinical = 7,
      th.pollen = th_pollen,
      th.sum = th_sum,
      plot = FALSE,
      result = "table"
    )
}

# Tree taxa share the same thresholds
season_def_alnu <- map(pollen_split$Alnus, clinical_season, th_pollen = 10, th_sum = 100)
season_def_betu <- map(pollen_split$Betula, clinical_season, th_pollen = 10, th_sum = 100)
season_def_cory <- map(pollen_split$Corylus, clinical_season, th_pollen = 10, th_sum = 100)

# Grasses use lower thresholds
season_def_poac <- map(pollen_split$Poaceae, clinical_season, th_pollen = 3, th_sum = 30)

season_def <- c(season_def_alnu, season_def_betu, season_def_cory, season_def_poac)

season_start <- map(season_def, ~ .x$st.jd) %>% unlist
season_end <- map(season_def, ~ .x$en.jd) %>% unlist

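pollen_clean <- unlist(pollen_split, recursive = FALSE)
# Keep only seasons with a defined start and end. A single logical mask keeps the three
# objects aligned and, unlike `x[-which(...)]`, does not drop everything when no NA is present.
has_season <- !is.na(season_start) & !is.na(season_end)
pollen_clean <- pollen_clean[has_season]
season_start <- season_start[has_season]
season_end <- season_end[has_season]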

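# Keep only the rows between season start and end.
# Slicing by row position assumes one row per day, starting on 1 January of each year.
pollen_in_season <- pmap(
  list(pollen_clean, season_start, season_end),
  ~ ..1 %>% slice(..2:..3)
) %>%
  bind_rows()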
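
When seasons without a detected start are dropped, the way the elements are selected matters. The sketch below uses made-up start days (all names and values are hypothetical) to show that negative indexing with `which()` selects nothing when no `NA` is present, whereas a logical mask behaves as intended in both cases.

```r
# Hypothetical season start days; NA marks a year without a detected season.
start_with_na <- c(a = 45, b = NA, c = 60)
start_no_na   <- c(a = 45, b = 50, c = 60)
series <- list(a = "series a", b = "series b", c = "series c")

# Negative indexing works while at least one NA is present ...
neg_with_na <- series[-which(is.na(start_with_na))]
# ... but which() returns integer(0) when nothing is NA,
# and x[-integer(0)] selects nothing at all.
neg_no_na <- series[-which(is.na(start_no_na))]

# A logical mask keeps the intended elements in both cases.
mask_with_na <- series[!is.na(start_with_na)]
mask_no_na   <- series[!is.na(start_no_na)]

c(length(neg_with_na), length(neg_no_na), length(mask_with_na), length(mask_no_na))
```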


An average seasonal pollen sum is calculated for the period 2010-2020. Every year is then compared
with this average season to identify the range of historic tuning factors. Because we expect season
intensity to increase due to climate change, and to account for statistical variance, the boundaries
are widened slightly to allow for more extreme seasons. The lower boundary is additionally decreased
to counteract the impact of erroneous precipitation events in the model. On those days, the pollen
emission should be allowed to reach almost zero.
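
As read from the code in the following cell, the calculation can be summarised as below. The symbols are introduced here for illustration only: $S_{t,s,y}$ is the in-season pollen sum of taxon $t$ at station $s$ in year $y$, $N$ is the number of distinct years in the data set, and $f_t^{\mathrm{orig}}$ is the original tuning factor.

$$
\bar{S}_{t,s} = \frac{1}{N} \sum_{y} S_{t,s,y},
\qquad
f_{t,s,y} = f_t^{\mathrm{orig}} \, \frac{S_{t,s,y}}{\bar{S}_{t,s}}
$$

$$
f_t^{\max} = 1.2 \, \max_{s,y} f_{t,s,y},
\qquad
f_t^{\min} = 0.9 \, \min_{s,y} f_{t,s,y}
$$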

In [None]:
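pollen_in_season %>%
  # Average seasonal pollen sum per taxon and station over all years in the data
  group_by(taxon, station, tuning_factor_orig) %>%
  mutate(average_season = sum(value, na.rm = TRUE) / length(unique(pollen_in_season$year))) %>%
  # Seasonal pollen sum of each individual year
  group_by(taxon, station, year, tuning_factor_orig, average_season) %>%
  summarise(current_season = sum(value, na.rm = TRUE)) %>%
  ungroup() %>%
  # Scale the original factor by the ratio of the yearly sum to the average sum
  mutate(tuning_factor_current = tuning_factor_orig / average_season * current_season) %>%
  group_by(taxon) %>%
  # Widen the historic range: +20 % on the maximum, -10 % on the minimum
  summarise(
    max = max(tuning_factor_current) * 1.2,
    min = min(tuning_factor_current) * 0.9
  )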
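
To make the arithmetic concrete, here is a minimal base R sketch for a single taxon at a single station. The daily values are made up and the variable names `toy_season` and `boundaries` are assumptions for illustration; only the scaling factors 1.2 and 0.9 are taken from the cell above.

```r
# Hypothetical in-season daily values for one taxon at one station.
toy_season <- data.frame(
  year  = rep(2010:2012, each = 3),
  value = c(10, 20, 30,   # 2010: seasonal sum 60
            20, 40, 60,   # 2011: seasonal sum 120
            30, 60, 90)   # 2012: seasonal sum 180
)
tuning_factor_orig <- 1.0

# Seasonal pollen sum per year and the average over all years.
seasonal_sum   <- tapply(toy_season$value, toy_season$year, sum)
average_season <- sum(seasonal_sum) / length(seasonal_sum)

# Yearly factor: original factor scaled by yearly sum relative to the average.
tuning_factor_current <- tuning_factor_orig * seasonal_sum / average_season

# Widen the historic range as in the cell above.
boundaries <- c(
  min = min(tuning_factor_current) * 0.9,
  max = max(tuning_factor_current) * 1.2
)
boundaries
```

In this toy case the yearly factors are 0.5, 1.0 and 1.5, so the widened boundaries come out at 0.45 and 1.8.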
