# Time aggregation

For a climatology, there are different ways to aggregate data in time. Common ways are:
* monthly climatology, aggregating all observations per month
* seasonal climatology
* yearly climatology
* decadal climatology

If the data coverage is sufficient, one can also compute a seasonal climatology per decade, which resolves both the seasonal cycle and long-term changes.

In `DIVAnd`, the temporal aggregation is represented by a structure called a time selector. The most common one is `TimeSelectorYearListMonthList`, which behaves similarly to the `yearlist` and `monthlist` files of the Fortran version of DIVA.
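The aggregation schemes listed above differ only in how the year ranges and month ranges are chosen. The following is a rough sketch in plain Julia, without any call into `DIVAnd`. The names `schemes` and `nslices` and the chosen years are illustrative assumptions, and the slice count assumes one time slice per combination of year range and month range.

```julia
# Illustrative year/month range lists for the aggregation schemes.
# Each entry is (name, yearlist, monthlist); the years are arbitrary examples.
schemes = [
    ("monthly",  [1900:2017],                    [m:m for m in 1:12]),
    ("seasonal", [1900:2017],                    [1:3, 4:6, 7:9, 10:12]),
    ("yearly",   [y:y for y in 1990:2000],       [1:12]),
    ("decadal",  [y:y+9 for y in 1950:10:2000],  [1:12]),
]

# Hypothetical helper (not a DIVAnd function): assumes one time slice
# per combination of a year range and a month range.
nslices(yearlist, monthlist) = length(yearlist) * length(monthlist)

for (name, yearlist, monthlist) in schemes
    println(name, ": ", nslices(yearlist, monthlist), " time slice(s)")
end
```

Lists in this form (a vector of year ranges and a vector of month ranges) are what the time selector constructor takes as arguments.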

In [1]:
using Dates
using DIVAnd
using JupyterFormatter
enable_autoformat()

In [2]:
# ?TimeSelectorYearListMonthList

In [3]:
yearlist = [1900:2017]
monthlist = [1:3,4:6,7:9,10:12]

TS = DIVAnd.TimeSelectorYearListMonthList(yearlist,monthlist)

TimeSelectorYearListMonthList{Vector{UnitRange{Int64}}, Vector{UnitRange{Int64}}}(UnitRange{Int64}[1900:2017], UnitRange{Int64}[1:3, 4:6, 7:9, 10:12])

The number of time instances defined in this time selector is 4:

In [4]:
length(TS)

4

Assume that we have a time vector with these dates:

In [5]:
obstime = [DateTime(2001,4,1),DateTime(2002,2,1),DateTime(2018,3,1)]

3-element Vector{DateTime}:
 2001-04-01T00:00:00
 2002-02-01T00:00:00
 2018-03-01T00:00:00

Which observations would be used for the first time instance (winter, January to March)?

In [6]:
sel = DIVAnd.select(TS,1,obstime)

3-element BitVector:
 0
 1
 0

In [7]:
obstime[sel]

1-element Vector{DateTime}:
 2002-02-01T00:00:00
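The selection above can be reproduced with a simplified stand-in written with the `Dates` standard library only. This is a sketch of the idea, not `DIVAnd`'s actual implementation. The helper `in_slice` is hypothetical, and it assumes an observation belongs to a time instance when its year lies in the year range and its month lies in the month range.

```julia
using Dates

# Hypothetical helper, not part of DIVAnd: an observation is selected when
# both its year and its month fall inside the ranges of the time instance.
in_slice(t, years, months) = year(t) in years && month(t) in months

obstime = [DateTime(2001, 4, 1), DateTime(2002, 2, 1), DateTime(2018, 3, 1)]

# First time instance of the selector: years 1900:2017, months 1:3
sel = [in_slice(t, 1900:2017, 1:3) for t in obstime]

println(sel)
println(obstime[sel])
```

The first date fails the month test (April), the third fails the year test (2018), so only the second date remains.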

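Note that the observation from 2018-03-01 is not selected: March belongs to the first season, but the year 2018 lies outside the year range 1900:2017.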

A time instance at the "center" of a given time interval is returned by `DIVAnd.ctimes(TS)`. These dates are saved in the NetCDF file together with the `climatology_bounds` from the [NetCDF CF convention](http://cfconventions.org/Data/cf-conventions/cf-conventions-1.7/cf-conventions.html#climatological-statistics).

In [8]:
DIVAnd.ctimes(TS)

4-element Vector{DateTime}:
 1958-02-16T00:00:00
 1958-05-16T00:00:00
 1958-08-16T00:00:00
 1958-11-16T00:00:00

In [9]:
yearlist = [y:y+9 for y in 1950:10:2000]

6-element Vector{UnitRange{Int64}}:
 1950:1959
 1960:1969
 1970:1979
 1980:1989
 1990:1999
 2000:2009

Note that every year range spans 10 years because the upper bound is inclusive. The last year range covers these 10 years:

In [10]:
collect(yearlist[end])'

1×10 adjoint(::Vector{Int64}) with eltype Int64:
 2000  2001  2002  2003  2004  2005  2006  2007  2008  2009

In [11]:
TS = DIVAnd.TimeSelectorYearListMonthList(yearlist,monthlist);

For this time selector, there are now $4 \times 6 = 24$ time slices:

In [12]:
length(TS)

24

In [13]:
DIVAnd.ctimes(TS)[1:3]

3-element Vector{DateTime}:
 1954-02-16T00:00:00
 1954-05-16T00:00:00
 1954-08-16T00:00:00
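The first three central times all fall in the first decade but in different seasons, which suggests that the month ranges vary fastest along the time index. Under that assumption (inferred from the output, not taken from the `DIVAnd` source), a time index can be mapped back to its year range and month range as sketched here. The helper `slice_ranges` is hypothetical.

```julia
yearlist = [y:y+9 for y in 1950:10:2000]
monthlist = [1:3, 4:6, 7:9, 10:12]

# Hypothetical helper, not part of DIVAnd: assumes the month ranges vary
# fastest, so consecutive indices step through the seasons of one decade.
function slice_ranges(n, yearlist, monthlist)
    nm = length(monthlist)
    return (yearlist[fld1(n, nm)], monthlist[mod1(n, nm)])
end

for n in (1, 2, 5, 24)
    years, months = slice_ranges(n, yearlist, monthlist)
    println("time slice $n: years $years, months $months")
end
```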

# Overlapping years

Sometimes it is desirable to have overlapping year ranges, so that the climatology resembles a running average. This can be achieved by a suitable definition of `yearlist`:

In [14]:
yearlist = [y:y+5 for y in 1990:2000]

11-element Vector{UnitRange{Int64}}:
 1990:1995
 1991:1996
 1992:1997
 1993:1998
 1994:1999
 1995:2000
 1996:2001
 1997:2002
 1998:2003
 1999:2004
 2000:2005

Every time slice is a 6-year average of data from the same season, and there are $4 \times 11 = 44$ time slices in this example.

In [15]:
TS = DIVAnd.TimeSelectorYearListMonthList(yearlist,monthlist);
length(TS)

44

Since the year ranges overlap, the same observations are used in multiple time instances:

In [16]:
obstime = [DateTime(2000,1,1)]
for n = 1:length(TS)
    nobs = sum(DIVAnd.select(TS,n,obstime))
    if nobs > 0
        println("$nobs observation(s) are used in time slice $n")
    end
end

1 observation(s) are used in time slice 21
1 observation(s) are used in time slice 25
1 observation(s) are used in time slice 29
1 observation(s) are used in time slice 33
1 observation(s) are used in time slice 37
1 observation(s) are used in time slice 41


As expected, each observation is used 6 times, once for every year range that contains its year.
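The slice numbers printed above can be cross-checked without `DIVAnd`. The sketch below uses only the `Dates` standard library and assumes that the time index runs over the month ranges first and then over the year ranges, an ordering inferred from the printed output rather than from the `DIVAnd` source.

```julia
using Dates

yearlist = [y:y+5 for y in 1990:2000]
monthlist = [1:3, 4:6, 7:9, 10:12]
t = DateTime(2000, 1, 1)

nm = length(monthlist)

# Collect the linear index of every (year range, month range) pair that
# contains the observation, assuming the month ranges vary fastest.
slices = [(iy - 1) * nm + im
          for iy in eachindex(yearlist)
          for im in eachindex(monthlist)
          if year(t) in yearlist[iy] && month(t) in monthlist[im]]

println(slices)
println(length(slices), " time slices use this observation")
```

January 2000 lies in the first season and in the six year ranges from 1995:2000 to 2000:2005, which gives six indices spaced four apart.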