# Time representation in `Pyleoclim`

For a field as dependent on chronological information as the paleosciences, you would think that our community would have long ago established standards on how to represent time in various records. Yet, like many aspects of paleo, this is a veritable tower of Babel, with scientists using all kinds of units (e.g. 'ky BP', 'years CE', 'Ma'), often without a clear reference point. Consider, for instance, the ubiquitous "before present" (BP) notation.  
As pointed out by Wolff (2007), the BP notation has become a de facto standard in the community, although “present” means different things to different people. It is often taken to mean "Common Era (CE) 1950" (especially within the radiocarbon community), but it may also be left undefined or defined as some other date (e.g., CE 2000 in the speleothem community). Still others take it to be the year the study was performed or published (which every user of the dataset must then figure out for themselves).

For studies spanning several million years with age uncertainties in excess of 1,000 years, a difference of a few decades is immaterial. However, for studies working at higher resolution (e.g., decadal to sub-annual) and concentrating on recent millennia, this difference is extremely consequential. It is thus necessary to standardize the representation of time among various records, especially if they are to be treated jointly in some analysis (e.g. principal components).

The most common standard for time and date reporting (ISO 8601) does not accommodate geologic time. The [OWL time ontology](https://www.w3.org/TR/owl-time/) draws on the work of Cox and Richards (2015) and includes geologic time. However, these authors offer no finer division of geologic time than eras. This means that the vast majority of archived paleoclimate data sets (particularly, the totality of data sets archived on the LinkedEarth platform) would be represented by a single time point (the Quaternary era). Also, there is no easy way to go from geologic to astronomic time, or to some more familiar version (e.g. ISO 8601). To remedy this gap between ISO 8601 and the OWL time representation, `Pyleoclim` uses a precise mechanism to report the time axis in paleoclimate data sets. Time (or age) is expressed as:

$$ \mathrm{significand} \times 10^{\mathrm{exponent}} \; \text{units} \; \text{direction} \; \text{datum} $$

For instance, in the date "21 ky BP", 
- _significand_ refers to 21
- _units_ are years, so expressing the value in kiloyears (ky) requires an _exponent_ of 3
- the _direction_ is retrograde (counting back from a _datum_, in this case the "present" defined as CE 1950)
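
The decomposition above can be sketched as a small arithmetic helper. This is only an illustration of the notation, assuming a datum expressed in year CE with astronomical year numbering; the function `to_year_ce` and its parameter names are hypothetical and are not part of `Pyleoclim`.

```python
def to_year_ce(significand, exponent=0, direction="prograde", datum=0):
    """Convert significand x 10^exponent (in years) to a year CE.

    Hypothetical helper for illustration only, not a Pyleoclim function.
    `datum` is the reference year, expressed in year CE.
    """
    elapsed_years = significand * 10 ** exponent
    if direction == "retrograde":
        # Counting backward from the datum (e.g. "BP")
        return datum - elapsed_years
    if direction == "prograde":
        # Counting forward from the datum (e.g. "year CE")
        return datum + elapsed_years
    raise ValueError(f"Unknown direction: {direction!r}")


# "21 ky BP": significand 21, exponent 3, retrograde, datum CE 1950
print(to_year_ce(21, exponent=3, direction="retrograde", datum=1950))  # -19050
```

In other words, under these assumptions "21 ky BP" corresponds to the year -19050 in astronomical year numbering.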

This notebook illustrates how this time representation works, and how one can convert between various time conventions within this framework.



## Load test data

In [1]:
%load_ext autoreload
%autoreload 2
    
import pyleoclim as pyleo
import scipy.io as sio
import numpy as np

data = sio.loadmat('../example_data/wtc_test_data_nino.mat')
air = data['air'][:, 0]
nino = data['nino'][:, 0]
t = data['datayear'][:, 0]

## Define `Series`

We place the data into `Series` objects (`Pyleoclim`'s basic data structure):

In [3]:
ts_air = pyleo.Series(time=t, value=air, time_unit='year', value_name='air')
ts_nino = pyleo.Series(time=t, value=nino, time_unit='year', value_name='nino')

## Test conversion to different target units

As alluded to above, the paleoscience literature employs a wide variety of conventions and abbreviations to denote time-related concepts. Currently supported conventions include:
            {
                'year', 'years', 'yr', 'yrs',
                'y BP', 'yr BP', 'yrs BP', 'year BP', 'years BP',
                'ky BP', 'kyr BP', 'kyrs BP', 'ka BP', 'ka',
                'my BP', 'myr BP', 'myrs BP', 'ma BP', 'ma',
            }
            
When these units are provided as part of a `Series`, `Pyleoclim` will interpret them as you, a human with knowledge of the paleosciences, would. If you give it anything else, it will not understand, so you can either express your data in one of those forms or help expand `Pyleoclim`'s repertoire by submitting a [GitHub issue](https://github.com/LinkedEarth/Pyleoclim_util/issues). If no unit is provided, `Pyleoclim` assumes that time is expressed in "year CE", that is, prograde, with a datum at year 0 (which exists in astronomical year numbering and is implicitly assumed by most paleoscientists, even though it does not exist in the Gregorian calendar).
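
To make the link between these labels and the exponent/direction/datum description concrete, here is a minimal sketch of a lookup table. It is an assumption about how such labels could be interpreted, with the BP datum taken as CE 1950 as stated earlier; `UNIT_TABLE` and `describe_unit` are hypothetical names, and `Pyleoclim`'s internal handling may differ.

```python
# Hypothetical lookup: label -> (exponent, direction, datum in year CE).
# It mirrors the conventions listed above; it is not Pyleoclim's own table.
_GROUPS = [
    (("year", "years", "yr", "yrs"), (0, "prograde", 0)),
    (("y BP", "yr BP", "yrs BP", "year BP", "years BP"), (0, "retrograde", 1950)),
    (("ky BP", "kyr BP", "kyrs BP", "ka BP", "ka"), (3, "retrograde", 1950)),
    (("my BP", "myr BP", "myrs BP", "ma BP", "ma"), (6, "retrograde", 1950)),
]
UNIT_TABLE = {label: spec for labels, spec in _GROUPS for label in labels}


def describe_unit(label=None):
    """Return (exponent, direction, datum) for a time-unit label."""
    if label is None:
        # No unit given: assume "year CE" (prograde, datum at year 0)
        return (0, "prograde", 0)
    try:
        return UNIT_TABLE[label]
    except KeyError:
        raise ValueError(f"Unrecognized time unit: {label!r}") from None


print(describe_unit("ky BP"))  # (3, 'retrograde', 1950)
print(describe_unit())         # (0, 'prograde', 0)
```

An unrecognized label raises an error in this sketch rather than being guessed at, which matches the behavior described above.
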
This is how it works:          

In [4]:
print('Original timeseries:')
print('time unit:', ts_air.time_unit)
print('time:', ts_air.time)
print()
new_ts_air = ts_air.convert_time_unit(time_unit='years')
print('Converted timeseries:')
print('time unit:', new_ts_air.time_unit)
print('time:', new_ts_air.time)
print('-------------------------------')

for tu in ['yrs BP', 'ky BP', 'my BP', 'ka', 'ma']:
    print('Original timeseries:')
    print('time unit:', ts_nino.time_unit)
    print('time:', ts_nino.time)
    print()
    new_ts_nino = ts_nino.convert_time_unit(time_unit=tu)
    print('Converted timeseries:')
    print('time unit:', new_ts_nino.time_unit)
    print('time:', new_ts_nino.time)
    print('-------------------------------')

Original timeseries:
time unit: year
time: [1871.   1871.08 1871.17 ... 2003.75 2003.83 2003.92]

Converted timeseries:
time unit: years
time: [1871.   1871.08 1871.17 ... 2003.75 2003.83 2003.92]
-------------------------------
Original timeseries:
time unit: year
time: [1871.   1871.08 1871.17 ... 2003.75 2003.83 2003.92]

The time axis has been adjusted to be ascending!
Converted timeseries:
time unit: yrs BP
time: [-53.92 -53.83 -53.75 ...  78.83  78.92  79.  ]
-------------------------------
Original timeseries:
time unit: year
time: [1871.   1871.08 1871.17 ... 2003.75 2003.83 2003.92]

The time axis has been adjusted to be ascending!
Converted timeseries:
time unit: ky BP
time: [-0.05392 -0.05383 -0.05375 ...  0.07883  0.07892  0.079  ]
-------------------------------
Original timeseries:
time unit: year
time: [1871.   1871.08 1871.17 ... 2003.75 2003.83 2003.92]

The time axis has been adjusted to be ascending!
Converted timeseries:
time unit: my BP
time: [-5.392e-05 -5.383e-05

Notice that the time axis is always converted internally to be ascending. Despite geologists' insistence that time goes from recent to old, most normally-constituted people read time as flowing from old to new, which, in the Western reading convention, is left to right. `Pyleoclim` follows this convention internally so that all records can be treated uniformly.
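
The arithmetic behind the outputs above can be reproduced with plain NumPy. The sketch below is not `Pyleoclim`'s implementation; `ce_to_bp` is a hypothetical helper that assumes a datum of CE 1950 and reorders the values together with the time axis so that the pairing between them is preserved.

```python
import numpy as np


def ce_to_bp(year_ce, value, exponent=0, datum=1950):
    """Convert year CE to ages before `datum`, sorted in ascending order.

    Hypothetical helper for illustration only. `exponent` selects the
    output unit: 0 for years, 3 for ky, 6 for my.
    """
    year_ce = np.asarray(year_ce, dtype=float)
    value = np.asarray(value)
    age = (datum - year_ce) / 10 ** exponent
    # A retrograde axis reverses the order, so sort time and values together
    order = np.argsort(age)
    return age[order], value[order]


t = np.array([1871.0, 1871.08, 2003.92])
v = np.array([10.0, 20.0, 30.0])

age, v_sorted = ce_to_bp(t, v)
print(age)       # approximately [-53.92, 78.92, 79.0]
print(v_sorted)  # values reordered to follow the new time axis: 30, 20, 10
```

Negative ages simply denote times after the datum, which is why the post-1950 part of the record appears as negative "yrs BP" in the output above.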

## Conversion in `MultipleSeries`

When defining a `MultipleSeries`, the argument `time_unit` can be specified to convert the time unit of every `Series` in the list. If `time_unit` is left at its default value of `None`, no conversion is applied.

In [4]:
ms = pyleo.MultipleSeries([new_ts_air, new_ts_nino], time_unit=None)
print('Original timeseries:')
for ts in ms.series_list:
    print('value_name:', ts.value_name)
    print('time_unit:', ts.time_unit)
    print('time', ts.time)

Original timeseries:
value_name: air
time_unit: years
time [1871.   1871.08 1871.17 ... 2003.75 2003.83 2003.92]
value_name: nino
time_unit: ma
time [-5.392e-05 -5.383e-05 -5.375e-05 ...  7.883e-05  7.892e-05  7.900e-05]


In [5]:
ms = pyleo.MultipleSeries([new_ts_air, new_ts_nino], time_unit='yrs BP')
print('Converted timeseries:')
for ts in ms.series_list:
    print('value_name:', ts.value_name)
    print('time_unit:', ts.time_unit)
    print('time', ts.time)

The time axis has been adjusted to be ascending!
Converted timeseries:
value_name: air
time_unit: yrs BP
time [-53.92 -53.83 -53.75 ...  78.83  78.92  79.  ]
value_name: nino
time_unit: yrs BP
time [-53.92 -53.83 -53.75 ...  78.83  78.92  79.  ]


In [6]:
ms = pyleo.MultipleSeries([new_ts_air, new_ts_nino], time_unit='years')
print('Converted timeseries:')
for ts in ms.series_list:
    print('value_name:', ts.value_name)
    print('time_unit:', ts.time_unit)
    print('time', ts.time)

The time axis has been adjusted to be ascending!
Converted timeseries:
value_name: air
time_unit: years
time [1871.   1871.08 1871.17 ... 2003.75 2003.83 2003.92]
value_name: nino
time_unit: years
time [1871.   1871.08 1871.17 ... 2003.75 2003.83 2003.92]


In [7]:
ms = pyleo.MultipleSeries([new_ts_air, new_ts_nino], time_unit='yr')
print('Converted timeseries:')
for ts in ms.series_list:
    print('value_name:', ts.value_name)
    print('time_unit:', ts.time_unit)
    print('time', ts.time)

The time axis has been adjusted to be ascending!
Converted timeseries:
value_name: air
time_unit: yr
time [1871.   1871.08 1871.17 ... 2003.75 2003.83 2003.92]
value_name: nino
time_unit: yr
time [1871.   1871.08 1871.17 ... 2003.75 2003.83 2003.92]


We can also do the conversion via the `convert_time_unit` method of `MultipleSeries`.

In [9]:
new_ms = ms.convert_time_unit('yr BP')
for ts in new_ms.series_list:
    print('value_name:', ts.value_name)
    print('time_unit:', ts.time_unit)
    print('time', ts.time)

The time axis has been adjusted to be ascending!
The time axis has been adjusted to be ascending!
value_name: air
time_unit: yr BP
time [-53.92 -53.83 -53.75 ...  78.83  78.92  79.  ]
value_name: nino
time_unit: yr BP
time [-53.92 -53.83 -53.75 ...  78.83  78.92  79.  ]
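
Each conversion in this notebook, whether applied to one `Series` or to every member of a `MultipleSeries`, can be pictured as a two-step mapping: from the source convention to year CE, then from year CE to the target convention. The sketch below illustrates that idea under stated assumptions; `convert_time` and the `(exponent, direction, datum)` tuples are hypothetical and do not describe `Pyleoclim`'s internals.

```python
import numpy as np

# Hypothetical (exponent, direction, datum) descriptions of three conventions
MA = (6, "retrograde", 1950)
YRS_BP = (0, "retrograde", 1950)
YEAR_CE = (0, "prograde", 0)


def convert_time(time, src, dst):
    """Convert a time axis from `src` to `dst`, using year CE as a pivot.

    Returns the ascending time axis and the index order that produced it,
    so that the same reordering can be applied to the data values.
    """
    def sign(direction):
        return -1.0 if direction == "retrograde" else 1.0

    src_exp, src_dir, src_datum = src
    dst_exp, dst_dir, dst_datum = dst
    time = np.asarray(time, dtype=float)
    # Step 1: source convention -> year CE
    year_ce = src_datum + sign(src_dir) * time * 10.0 ** src_exp
    # Step 2: year CE -> target convention
    out = sign(dst_dir) * (year_ce - dst_datum) / 10.0 ** dst_exp
    order = np.argsort(out)
    return out[order], order


t_ma = np.array([-5.392e-05, 7.9e-05])
bp, _ = convert_time(t_ma, MA, YRS_BP)
ce, order = convert_time(t_ma, MA, YEAR_CE)
print(bp)  # approximately [-53.92, 79.0]
print(ce)  # approximately [1871.0, 2003.92]
```

Using a single pivot keeps the number of conversion rules proportional to the number of conventions rather than to the number of pairs of conventions.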
