# 02. Common DataStore functions
Examples of how to use some of the more common `DataStore` functions:

1. mean, min, max, std
2. Selecting
3. Selecting by index
4. Downsample (time dimension)
5. Upsample / Interpolation (length and time dimension)

In [1]:
import os

from dtscalibration import read_silixa_files

In [2]:
try:
    wd = os.path.dirname(os.path.realpath(__file__))
except NameError:  # __file__ is not defined when running in a notebook
    wd = os.getcwd()

filepath = os.path.join(wd, '..', '..', 'tests', 'data', 'single_ended')
timezone_netcdf = 'UTC'
timezone_ultima_xml = 'Europe/Amsterdam'
file_ext = '*.xml'

ds = read_silixa_files(
    directory=filepath,
    timezone_netcdf=timezone_netcdf,
    timezone_ultima_xml=timezone_ultima_xml,
    file_ext=file_ext)

3 files were found, each representing a single timestep
4 recorded vars were found: LAF, ST, AST, TMP
Recorded at 1461 points along the cable
The measurement is single ended


## 1 mean, min, max, std
The first argument is the dimension along which the function is applied. `dim` can be any dimension (e.g., `time`, `x`). The returned `DataStore` no longer contains that dimension.

Normally, you would like to keep the attributes (the informative texts from the loaded files), so set `keep_attrs` to `True`.

Note that the sections are also stored as an attribute. If you delete the attributes, you have to redefine the sections.

In [3]:
ds_mean = ds.mean(dim='time', keep_attrs=True)

In [4]:
ds_max = ds.max(dim='x', keep_attrs=True)

In [5]:
ds_std = ds.std(dim='time', keep_attrs=True)
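
As a rough illustration of what reducing along a dimension does to the shape of the data, the sketch below uses a plain NumPy array as a stand-in for a single data variable with dimensions `(time, x)`. The values and the names `tmp`, `tmp_mean_over_time` and `tmp_max_over_x` are made up for this example; it does not use the `DataStore` API.

```python
import numpy as np

# Hypothetical stand-in for one variable with dims (time, x):
# 3 time steps (rows) and 4 locations along the cable (columns).
tmp = np.array([[10.0, 11.0, 12.0, 13.0],
                [12.0, 13.0, 14.0, 15.0],
                [14.0, 15.0, 16.0, 17.0]])

# Reducing along time (axis 0) leaves one value per x location.
tmp_mean_over_time = tmp.mean(axis=0)

# Reducing along x (axis 1) leaves one value per time step.
tmp_max_over_x = tmp.max(axis=1)

print(tmp_mean_over_time.shape)  # (4,)
print(tmp_max_over_x.shape)      # (3,)
```

In both cases the dimension that was reduced over is gone from the result, which is the behaviour described above for `dim`.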

## 2 Selecting
What if you would like to get the maximum temperature between $x = 20$ m and $x = 35$ m for each time step? We first have to select a section along the cable. Label-based slices include both end points, so the selection covers $20 \leq x \leq 35$ m.

In [6]:
section = slice(20., 35.)
section_of_interest = ds.sel(x=section)

In [7]:
section_of_interest_max = section_of_interest.max(dim='x')

What if you would like to have the measurement at approximately $x=20$ m?

In [8]:
section_of_interest = ds.sel(x=20., method='nearest')
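
The two kinds of label-based selection can be sketched with plain NumPy. This is only a stand-in under stated assumptions: the coordinates and temperatures are made up, and the range selection assumes that both end points of the slice are included, as is the case for label-based slicing in xarray.

```python
import numpy as np

# Hypothetical cable coordinates (m) and one temperature value per location.
x = np.array([19.9, 20.0, 20.2, 27.5, 35.0, 35.1])
tmp = np.array([5.0, 6.0, 9.0, 7.0, 8.0, 50.0])

# Range selection: keep the locations with 20 <= x <= 35 (both ends included).
in_section = (x >= 20.0) & (x <= 35.0)
section_max = tmp[in_section].max()

# Nearest selection: pick the location closest to the requested x = 20.15 m.
nearest_index = np.abs(x - 20.15).argmin()

print(x[in_section])     # the four locations from 20.0 to 35.0
print(section_max)       # 9.0
print(x[nearest_index])  # 20.2
```

The value 50.0 at $x = 35.1$ m lies just outside the section, so it does not affect the maximum.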

## 3 Selecting by index
What if you would like to select values by position instead of by coordinate value, for example the first location along the cable or the first two time steps? We can use `isel` (index select).

In [9]:
section_of_interest = ds.isel(x=0)

In [10]:
section_of_interest = ds.isel(time=slice(0, 2))  # The first two time steps
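
Positional selection behaves like ordinary array indexing. A minimal NumPy sketch, with a made-up array `tmp` standing in for one variable with dimensions `(time, x)`:

```python
import numpy as np

# Hypothetical variable with dims (time, x): 3 time steps, 4 locations.
tmp = np.arange(12.0).reshape(3, 4)

# Comparable to isel(x=0): all time steps at the first location.
first_location = tmp[:, 0]

# Comparable to isel(time=slice(0, 2)): the stop index is excluded,
# so this keeps time steps 0 and 1.
first_two_steps = tmp[0:2, :]

print(first_location)         # [0. 4. 8.]
print(first_two_steps.shape)  # (2, 4)
```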

## 4 Downsample (time dimension)
We currently have measurements at 3 time steps, roughly 30 seconds apart. For our next exercise we would like to downsample the measurements to 2 time steps, 47 seconds apart. The calculated variances are no longer valid. We use the function `resample_datastore`.

In [11]:
ds.time.data

array(['2018-05-04T10:22:17.710000000', '2018-05-04T10:22:47.702000000',
       '2018-05-04T10:23:18.716000000'], dtype='datetime64[ns]')

In [12]:
ds.resample_datastore(how='mean', time="47S")

<dtscalibration.DataStore>
Sections:    ()
Dimensions:                (time: 2, x: 1461)
Coordinates:
  * time                   (time) datetime64[ns] 2018-05-04T10:21:58 2018-05-04T10:22:45
  * x                      (x) float64 -80.74 -80.62 -80.49 ... 104.7 104.8
Data variables:
    ST                     (time, x) float64 -0.8058 -0.4589 ... 37.89 28.32
    AST                    (time, x) float64 -0.2459 0.3748 ... 50.28 35.43
    TMP                    (time, x) float64 0.0 0.0 0.0 ... 21.57 131.0 112.3
    acquisitionTime        (time) float32 30.71 30.709
    referenceTemperature   (time) float32 24.5187 24.5153
    probe1Temperature      (time) float32 18.0204 18.02135
    probe2Temperature      (time) float32 6.61986 6.616935
    referenceProbeVoltage  (time) float32 0.123199 0.123198
    probe1Voltage          (time) float32 0.12 0.12
    probe2Voltage          (time) float32 0.115 0.115
    userAcquisitionTimeFW  (time) float32 30.0 30.0
Attributes:
    uid:                
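
To see why three time steps end up in two bins, the sketch below bins the three time stamps shown above into 47-second intervals with plain NumPy and averages a made-up value per bin. It assumes that bins are counted from midnight of the first day. That alignment is inferred from the resampled `time` coordinate in the output above; it is not taken from the `resample_datastore` documentation.

```python
import numpy as np

# Time stamps taken from the output of ds.time.data above.
times = np.array(['2018-05-04T10:22:17.710',
                  '2018-05-04T10:22:47.702',
                  '2018-05-04T10:23:18.716'], dtype='datetime64[ms]')
values = np.array([10.0, 20.0, 40.0])  # made-up measurements, one per time step

bin_ms = 47_000  # bin width of 47 s, expressed in milliseconds

# Assumption: bins are counted from midnight of the first day.
day_start = times[0].astype('datetime64[D]')
elapsed_ms = (times - day_start).astype('timedelta64[ms]').astype(np.int64)
bin_index = elapsed_ms // bin_ms

# Label each occupied bin by its left edge and average the values inside it.
unique_bins, inverse = np.unique(bin_index, return_inverse=True)
bin_labels = day_start + (unique_bins * bin_ms).astype('timedelta64[ms]')
bin_means = np.bincount(inverse, weights=values) / np.bincount(inverse)

print(bin_labels)  # left edges 10:21:58 and 10:22:45
print(bin_means)   # means 10.0 and 30.0
```

The first bin holds only the first measurement, while the second and third measurements share the second bin and are averaged.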

## 5 Upsample / Interpolation (length and time dimension)
We have a measurement roughly every 12 cm along the cable. What if we would like to change our coordinate system to a grid with the same spacing that is offset by 0.05 m? We use (linear) interpolation; extrapolation is not supported, so the last point is dropped. The calculated variances are no longer valid.

In [14]:
x_old = ds.x.data
x_new = x_old[:-1] + 0.05  # drop the last point to avoid extrapolation
ds_xinterped = ds.interp(coords={'x': x_new})
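
The effect of interpolating onto a shifted grid can be checked on a small made-up example. The sketch uses `np.interp` as a stand-in for linear interpolation; the coordinates and values are hypothetical.

```python
import numpy as np

x_old = np.array([0.0, 1.0, 2.0, 3.0])       # hypothetical coordinates (m)
tmp_old = np.array([0.0, 10.0, 20.0, 30.0])  # made-up values at x_old

# Shift the grid by half a step and drop the last point, so every new
# coordinate lies between two old ones and nothing has to be extrapolated.
x_new = x_old[:-1] + 0.5
tmp_new = np.interp(x_new, x_old, tmp_old)

print(x_new)    # coordinates 0.5, 1.5, 2.5
print(tmp_new)  # values 5, 15, 25
```

Without `[:-1]`, the last new coordinate would be 3.5, which lies beyond the last old coordinate and would require extrapolation.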

We can do the same in the time dimension:

In [15]:
import numpy as np
time_old = ds.time.data
time_new = time_old[:-1] + np.timedelta64(10, 's')  # drop the last step to avoid extrapolation
ds_tinterped = ds.interp(coords={'time': time_new})
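
Interpolation in time works the same way once the time stamps are expressed as numbers. A minimal NumPy sketch with made-up time stamps and values, dropping the last time step so that the shifted times stay inside the measured range:

```python
import numpy as np

# Hypothetical time stamps, 30 s apart, with one made-up value per time step.
time_old = np.array(['2018-05-04T10:22:00',
                     '2018-05-04T10:22:30',
                     '2018-05-04T10:23:00'], dtype='datetime64[s]')
tmp_old = np.array([10.0, 13.0, 16.0])

# Shift by 10 s and drop the last step to avoid extrapolation.
time_new = time_old[:-1] + np.timedelta64(10, 's')

# np.interp needs plain numbers: use seconds since the first time stamp.
seconds_old = (time_old - time_old[0]).astype(np.int64)
seconds_new = (time_new - time_old[0]).astype(np.int64)
tmp_new = np.interp(seconds_new, seconds_old, tmp_old)

print(tmp_new)  # values 11 and 14
```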