# Computing climate indicators

This notebook will get you started on the use of `xclim` to subset netCDF arrays and compute climate indicators, taking advantage of parallel processing capabilities offered by `xarray` and `dask`. 

`xarray` is a Python package that makes it easy to work with n-dimensional arrays. It labels axes with their names (time, lat, lon, level) instead of indices (0, 1, 2, 3), reducing the likelihood of bugs and making the code easier to understand. One of the key strengths of `xarray` is that it knows how to deal with non-standard calendars (I'm looking at you, `360_day`) and can easily resample daily time series to weekly, monthly, seasonal or annual periods. Finally, `xarray` is tightly integrated with `dask`, a package that can automatically parallelize operations.
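
As a minimal illustration of labelled axes, the sketch below builds a small synthetic array and then selects and reduces by dimension name rather than by axis position. The variable name `tas` and the coordinate values are made up for this example.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Synthetic daily data on a tiny 2 x 3 grid (values are arbitrary).
time = pd.date_range("2001-01-01", periods=10, freq="D")
da = xr.DataArray(
    np.arange(10 * 2 * 3, dtype="float64").reshape(10, 2, 3),
    dims=("time", "lat", "lon"),
    coords={"time": time, "lat": [45.0, 46.0], "lon": [-75.0, -74.0, -73.0]},
    name="tas",
)

# Select by coordinate label instead of integer position.
point = da.sel(lat=46.0, lon=-74.0)

# Reduce over a named dimension instead of remembering that time is axis 0.
time_mean = da.mean(dim="time")

print(point.dims)      # only the time dimension remains
print(time_mean.dims)  # lat and lon remain
```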


In [1]:
# XCLIM and xarray
import xclim.indices as indices
import xclim.atmos as atmos
import numpy as np
import xarray as xr
import dask

# file handling libraries
import os
import glob
import time
import tempfile
from pathlib import Path

# Output folder
outfolder = Path(tempfile.mkdtemp()) 




## 1. Setting up the Dask client - Parallel processing / workers

First we create a pool of workers that will wait for jobs. The `xarray` library will automatically connect to these workers and dispatch jobs to them that can be run in parallel.

The dashboard link lets you see in real time how busy those workers are. 

In [2]:
from distributed import Client
client=Client(n_workers=2, threads_per_worker=10, dashboard_address=8788, memory_limit='6GB') 
#client=Client(n_workers=1)
client



Client: Scheduler: tcp://127.0.0.1:46612, Dashboard: http://127.0.0.1:8788/status
Cluster: Workers: 2, Cores: 20, Memory: 12.00 GB


## 2. Finding data files 

In [3]:
infolder = '<path_to_data>/cb-oura-1.0/'

# Get list of files for tasmax
rcps = ['rcp45','rcp85']
v = 'tasmax'
r = rcps[0]
search_str = os.path.join(infolder, '{v}*CanESM*{r}*.nc'.format(v=v,r=r))
sim_files= sorted(glob.glob(search_str))
print(len(sim_files))

151


## 3. Creating xarray datasets

To open a netCDF file with `xarray`, we use `xr.open_dataset(<path to file>)`. But by default, the entire file is stored in one chunk, so there is no parallelism. To trigger parallel computations, we need to explicitly specify the *chunk* size. 

`dask`'s parallelism is based on memory chunks. We need to tell `xarray` to split our netCDF array into chunks of a given size, and operations on each chunk of the array will automatically be dispatched to the workers.
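
The chunking idea can be tried without any data files. The sketch below is only an illustration on a synthetic in-memory array (the grid size and the variable name are made up, and `dask` must be installed): it splits one year of daily data into blocks of at most 31 time steps and prints the resulting block lengths.

```python
import numpy as np
import pandas as pd
import xarray as xr

# One year of synthetic daily data on a small grid.
time = pd.date_range("2001-01-01", periods=365, freq="D")
da = xr.DataArray(
    np.zeros((365, 4, 5), dtype="float32"),
    dims=("time", "lat", "lon"),
    coords={"time": time},
    name="tasmax",
)

# Split the array into blocks of at most 31 time steps.
chunked = da.chunk({"time": 31})

# One tuple of block lengths per dimension.
print(chunked.chunks)
```

The last block along `time` is shorter than the others because 365 is not a multiple of 31.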

In [5]:
# This file is opened as one big chunk: no parallel processing. 
ds = xr.open_dataset(sim_files[0])
print(ds.tasmax)

<xarray.DataArray 'tasmax' (time: 365, lat: 700, lon: 1064)>
[271852000 values with dtype=float32]
Coordinates:
  * time     (time) object 1950-01-01 00:00:00 ... 1950-12-31 00:00:00
  * lat      (lat) float32 83.28931 83.20598 83.12265 ... 25.12497 25.04164
  * lon      (lon) float32 -141.04314 -140.9598 ... -52.54667 -52.46334
Attributes:
    units:          K
    long_name:      air_temperature
    standard_name:  air_temperature


In [6]:
# Chunked in memory along the time dimension.
# Note that the data type is a 'dask.array'. xarray will automatically use client workers 
ds = xr.open_dataset(sim_files[0], chunks={'time': 31})
print(ds.tasmax)
ds.tasmax.chunks

<xarray.DataArray 'tasmax' (time: 365, lat: 700, lon: 1064)>
dask.array<shape=(365, 700, 1064), dtype=float32, chunksize=(31, 700, 1064)>
Coordinates:
  * time     (time) object 1950-01-01 00:00:00 ... 1950-12-31 00:00:00
  * lat      (lat) float32 83.28931 83.20598 83.12265 ... 25.12497 25.04164
  * lon      (lon) float32 -141.04314 -140.9598 ... -52.54667 -52.46334
Attributes:
    units:          K
    long_name:      air_temperature
    standard_name:  air_temperature


((31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 31, 24), (700,), (1064,))

### 3.1. Multifile dataset
netCDF files are often split into periods to keep file size manageable. A single dataset can be split into dozens of individual files. `xarray` has a function, `open_mfdataset`, that can open and aggregate a list of files into a single *logical* dataset. `open_mfdataset` can aggregate files over coordinates (time, lat, lon) and variables.

Note that opening a multi-file dataset automatically chunks the array (one chunk per file).

Note also that because `xarray` reads the metadata of every file to place it in a logical order, it can take a while to load.
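
To make the idea of a *logical* dataset concrete, here is a small sketch that aggregates in-memory pieces by their coordinates with `xr.combine_by_coords`. It is not what `open_mfdataset` does internally (there is no file I/O or lazy loading here), and the helper `make_year` is hypothetical; it only stands in for one yearly file.

```python
import numpy as np
import pandas as pd
import xarray as xr


def make_year(year):
    """Build a synthetic one-year dataset standing in for one netCDF file."""
    time = pd.date_range(f"{year}-01-01", f"{year}-12-31", freq="D")
    data = np.full((time.size, 2, 2), float(year))
    return xr.Dataset(
        {"tasmax": (("time", "lat", "lon"), data)},
        coords={"time": time, "lat": [45.0, 46.0], "lon": [-75.0, -74.0]},
    )


# Deliberately out of order: the pieces are arranged using their time coordinate.
parts = [make_year(2002), make_year(2001)]
combined = xr.combine_by_coords(parts)

print(combined.sizes["time"])
print(combined.time.values[0], combined.time.values[-1])
```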

In [8]:
# Create multi-file data & chunks 
ds = xr.open_mfdataset(sim_files, chunks={'time':365, 'lat':50*2, 'lon':56*2})
ds = ds.drop('time_vectors')
ds = ds.drop('ts')
print(ds)

<xarray.Dataset>
Dimensions:  (lat: 700, lon: 1064, time: 55115)
Coordinates:
  * lat      (lat) float32 83.28931 83.20598 83.12265 ... 25.12497 25.04164
  * lon      (lon) float32 -141.04314 -140.9598 ... -52.54667 -52.46334
  * time     (time) object 1950-01-01 00:00:00 ... 2100-12-31 00:00:00
Data variables:
    tasmax   (time, lat, lon) float32 dask.array<shape=(55115, 700, 1064), chunksize=(365, 100, 112)>
Attributes:
    Conventions:     CF-1.5
    title:           CanESM2 model output prepared for CMIP5 historical
    history:         2011-04-14T00:21:01Z CMOR rewrote data to comply with CF...
    institution:     CCCma (Canadian Centre for Climate Modelling and Analysi...
    source:          CanESM2 2010 atmosphere: CanAM4 (AGCM15i, T63L35) ocean:...
    redistribution:  Redistribution prohibited. For internal use only.


## 4. Subsetting utilities

### subset_bbox : using a latitude-longitude bounding box

In [9]:
from xclim import subset
lat_bnds = [45, 60]
lon_bnds = [-55, -82]

ds1 = subset.subset_bbox(ds,lat_bnds=lat_bnds,lon_bnds=lon_bnds)
print(ds1)

<xarray.Dataset>
Dimensions:  (lat: 180, lon: 324, time: 55115)
Coordinates:
  * lat      (lat) float64 59.96 59.87 59.79 59.71 ... 45.29 45.21 45.12 45.04
  * lon      (lon) float64 -81.96 -81.88 -81.8 -81.71 ... -55.21 -55.13 -55.05
  * time     (time) object 1950-01-01 00:00:00 ... 2100-12-31 00:00:00
Data variables:
    tasmax   (time, lat, lon) float32 dask.array<shape=(55115, 180, 324), chunksize=(365, 20, 75)>
Attributes:
    Conventions:     CF-1.5
    title:           CanESM2 model output prepared for CMIP5 historical
    history:         2011-04-14T00:21:01Z CMOR rewrote data to comply with CF...
    institution:     CCCma (Canadian Centre for Climate Modelling and Analysi...
    source:          CanESM2 2010 atmosphere: CanAM4 (AGCM15i, T63L35) ocean:...
    redistribution:  Redistribution prohibited. For internal use only.


### Add start and/or end years

Note that in the next release, we'll use datetime objects instead of a year integer to specify start and end points.

In [10]:
ds2 = subset.subset_bbox(ds,lat_bnds=lat_bnds,lon_bnds=lon_bnds, start_yr=1981, end_yr=2010)
print(ds2)
print(' ')

# subset years only
ds2 = subset.subset_bbox(ds, start_yr=1981, end_yr=2010)
print(ds2)

<xarray.Dataset>
Dimensions:  (lat: 180, lon: 324, time: 10950)
Coordinates:
  * lat      (lat) float64 59.96 59.87 59.79 59.71 ... 45.29 45.21 45.12 45.04
  * lon      (lon) float64 -81.96 -81.88 -81.8 -81.71 ... -55.21 -55.13 -55.05
  * time     (time) object 1981-01-01 00:00:00 ... 2010-12-31 00:00:00
Data variables:
    tasmax   (time, lat, lon) float32 dask.array<shape=(10950, 180, 324), chunksize=(365, 20, 75)>
Attributes:
    Conventions:     CF-1.5
    title:           CanESM2 model output prepared for CMIP5 historical
    history:         2011-04-14T00:21:01Z CMOR rewrote data to comply with CF...
    institution:     CCCma (Canadian Centre for Climate Modelling and Analysi...
    source:          CanESM2 2010 atmosphere: CanAM4 (AGCM15i, T63L35) ocean:...
    redistribution:  Redistribution prohibited. For internal use only.
 
<xarray.Dataset>
Dimensions:  (lat: 700, lon: 1064, time: 10950)
Coordinates:
  * lat      (lat) float64 83.29 83.21 83.12 83.04 ... 25.29 25.21 25.12 

### Select a single grid point 

In [11]:
lon_pt = -70.0
lat_pt = 50.0

ds3 = subset.subset_gridpoint(ds,lon=lon_pt,lat=lat_pt, start_yr=1981)
print(ds3)

<xarray.Dataset>
Dimensions:  (time: 43800)
Coordinates:
    lat      float32 50.04064
    lon      float32 -69.96264
  * time     (time) object 1981-01-01 00:00:00 ... 2100-12-31 00:00:00
Data variables:
    tasmax   (time) float32 dask.array<shape=(43800,), chunksize=(365,)>
Attributes:
    Conventions:     CF-1.5
    title:           CanESM2 model output prepared for CMIP5 historical
    history:         2011-04-14T00:21:01Z CMOR rewrote data to comply with CF...
    institution:     CCCma (Canadian Centre for Climate Modelling and Analysi...
    source:          CanESM2 2010 atmosphere: CanAM4 (AGCM15i, T63L35) ocean:...
    redistribution:  Redistribution prohibited. For internal use only.
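
For readers curious about what these selections amount to, the sketch below reproduces a bounding-box and a nearest-grid-point selection with plain `xarray` on a synthetic grid. It is not xclim's implementation: the helper `label_slice` is hypothetical, and time subsetting is left out.

```python
import numpy as np
import xarray as xr

# Synthetic grid with latitude stored from north to south, as in the files above.
lat = np.arange(60.0, 39.0, -5.0)    # 60, 55, 50, 45, 40
lon = np.arange(-90.0, -49.0, 10.0)  # -90, -80, -70, -60, -50
da = xr.DataArray(
    np.random.default_rng(0).random((lat.size, lon.size)),
    dims=("lat", "lon"),
    coords={"lat": lat, "lon": lon},
)


def label_slice(coord, bounds):
    """Return a slice ordered to match the direction of the coordinate."""
    lo, hi = min(bounds), max(bounds)
    ascending = bool(coord[0] < coord[-1])
    return slice(lo, hi) if ascending else slice(hi, lo)


# Bounding box: the bounds can be given in either order.
box = da.sel(lat=label_slice(da.lat, [45, 55]), lon=label_slice(da.lon, [-55, -82]))

# Single grid point: pick the cell whose centre is closest to the target.
point = da.sel(lat=51.0, lon=-68.0, method="nearest")

print(box.lat.values, box.lon.values)
print(float(point.lat), float(point.lon))
```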


### Nothing has been computed so far !

If you look at the output of those operations, the arrays are identified as `dask.array` objects. What happens is that `dask` builds a chain of operations that, when executed, will yield the values we want. But as long as we don't explicitly ask for a value, no computation will occur.

You can trigger computations by using the `load` or `compute` method, or writing the output to disk. 
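
A small sketch of this lazy behaviour on synthetic data (the values and grid are made up, and `dask` must be installed): the chunked expression stays a recipe until `compute` is called.

```python
import numpy as np
import pandas as pd
import xarray as xr

time = pd.date_range("2001-01-01", periods=365, freq="D")
values = 250.0 + np.arange(365 * 2 * 2, dtype="float64").reshape(365, 2, 2)
da = xr.DataArray(values, dims=("time", "lat", "lon"), coords={"time": time})

# Chunking turns the data into a dask array; the operations are only recorded.
lazy = (da.chunk({"time": 31}) - 273.15).max(dim="time")

# Still a recipe: the result is chunked and holds no computed values yet.
print(lazy.chunks is not None)

# compute() runs the task graph and returns a NumPy-backed array.
result = lazy.compute()
print(type(result.data).__name__)
```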

## 5. Climate index calculation & resampling frequencies

`xclim` has two layers for the calculation of indicators. The bottom layer is composed of a list of functions that take an `xarray.DataArray` as input and return an `xarray.DataArray` as output. You'll find these functions in `xclim.indices`. The indicator's logic is contained in this function, as well as potential unit conversions, but it doesn't check whether the time frequency is daily, and does not adjust the metadata of the output array.

The second layer consists of class instances that you'll find organized by *realm*. So far, there is only one realm (atmospheric) available in `xclim.atmos`, but we'll be working on `ice` and `land` indicators in 2020. These classes wrap the computation with a few checks:
1. Before the computation, they check that the input data is a daily series of the expected variable. If an indicator expects a daily mean and you pass it a daily max, a `warning` will be raised.
2. After the computation, they check the number of values per period to make sure there are no missing values or `NaN` in the input data. If there are, the output is set to `NaN`.
3. The output units are set correctly, as well as other properties of the output array, complying as much as possible with CF conventions.

For new users, we suggest you use the classes found in `xclim.atmos`. If you know what you're doing and you want to circumvent the built-in checks, then you can use the functions in `xclim.indices` directly.
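
To give a feel for what the two layers contribute, here is a simplified plain-`xarray` sketch of a "days above threshold" count combined with a missing-value mask. It is an illustration under stated assumptions, not xclim's code: the helper `days_above` is hypothetical, it does no unit conversion, and it sets no CF metadata.

```python
import numpy as np
import pandas as pd
import xarray as xr


def days_above(tasmax, thresh, freq="YS"):
    """Count days above `thresh` per period; mask periods with missing input.

    Hypothetical helper for illustration only, not part of xclim.
    `thresh` must be in the same units as `tasmax`.
    """
    count = (tasmax > thresh).resample(time=freq).sum()
    n_missing = tasmax.isnull().resample(time=freq).sum()
    return count.where(n_missing == 0)


# Two years of synthetic daily maxima in Kelvin, with 10 warm days per year.
time = pd.date_range("2001-01-01", "2002-12-31", freq="D")
values = np.full(time.size, 290.0)
values[180:190] = 300.0              # warm spell in mid-2001
values[365 + 180:365 + 190] = 300.0  # warm spell in mid-2002
values[5] = np.nan                   # one missing day in 2001
tasmax = xr.DataArray(values, dims="time", coords={"time": time}, name="tasmax")

out = days_above(tasmax, thresh=298.15, freq="YS")
print(out.values)
```

The first year is masked because of the single missing day, while the second year keeps its count.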

All `xclim` indicators convert daily data to lower time frequencies, such as monthly or annual values. This is done using the `xarray.DataArray.resample` method. Resampling creates a grouped object over which you apply a reduction operation (e.g. mean, min, max). The list of available frequencies is given in the link below, but the most often used are:

- YS: annual starting in January
- YS-JUL: annual starting in July
- MS: monthly
- QS-DEC: seasonal starting in December
- 7D: 7 day (weekly)


http://pandas.pydata.org/pandas-docs/stable/user_guide/timeseries.html#timeseries-offset-aliases  
Note: not all offsets in the link are supported by `cftime` objects in `xarray`.
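
The frequency strings can be tried on a synthetic series. This sketch assumes a standard calendar with pandas timestamps (the model data in this notebook uses `cftime` dates instead) and uses a plain day counter as data.

```python
import numpy as np
import pandas as pd
import xarray as xr

# Two years of synthetic daily data; the value is simply the day counter.
time = pd.date_range("2001-01-01", "2002-12-31", freq="D")
da = xr.DataArray(
    np.arange(time.size, dtype="float64"), dims="time", coords={"time": time}
)

annual_max = da.resample(time="YS").max()        # one value per calendar year
monthly_mean = da.resample(time="MS").mean()     # one value per month
seasonal_max = da.resample(time="QS-DEC").max()  # DJF, MAM, JJA, SON

print(annual_max.values)
print(monthly_mean.sizes["time"])
print(seasonal_max.time.values)
```

With `QS-DEC`, the first label falls in December of the year before the data starts, because January and February belong to the season that began that December.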


In the example below, we're computing the **annual maximum of the daily maximum temperature (tx_max)**.

In [12]:
fr = 'YS'
ds1.tasmax.attrs['cell_methods'] = 'time: maximum within days'
out = atmos.tx_max(ds1.tasmax, freq=fr)
print('Number of time-steps using freq == ', fr, ' : ', len(out.time),'\n')
print(out.time)

Number of time-steps using freq ==  YS  :  151 

<xarray.DataArray 'time' (time: 151)>
array([cftime.DatetimeNoLeap(1950, 1, 1, 0, 0, 0, 0, 4, 1),
       cftime.DatetimeNoLeap(1951, 1, 1, 0, 0, 0, 0, 5, 1),
       cftime.DatetimeNoLeap(1952, 1, 1, 0, 0, 0, 0, 6, 1),
       cftime.DatetimeNoLeap(1953, 1, 1, 0, 0, 0, 0, 0, 1),
       cftime.DatetimeNoLeap(1954, 1, 1, 0, 0, 0, 0, 1, 1),
       cftime.DatetimeNoLeap(1955, 1, 1, 0, 0, 0, 0, 2, 1),
       cftime.DatetimeNoLeap(1956, 1, 1, 0, 0, 0, 0, 3, 1),
       cftime.DatetimeNoLeap(1957, 1, 1, 0, 0, 0, 0, 4, 1),
       cftime.DatetimeNoLeap(1958, 1, 1, 0, 0, 0, 0, 5, 1),
       cftime.DatetimeNoLeap(1959, 1, 1, 0, 0, 0, 0, 6, 1),
       cftime.DatetimeNoLeap(1960, 1, 1, 0, 0, 0, 0, 0, 1),
       cftime.DatetimeNoLeap(1961, 1, 1, 0, 0, 0, 0, 1, 1),
       cftime.DatetimeNoLeap(1962, 1, 1, 0, 0, 0, 0, 2, 1),
       cftime.DatetimeNoLeap(1963, 1, 1, 0, 0, 0, 0, 3, 1),
       cftime.DatetimeNoLeap(1964, 1, 1, 0, 0, 0, 0, 4, 1),
       cftime

### Example output using `atmos` vs `indices` modules
The `atmos` module adds CF metadata attributes to the output variable.

In [13]:
out30d = atmos.tx_days_above(ds1.tasmax,thresh='25 C',freq=fr)
print('output atmos : \n', out30d,'\n\n\n',)

out30d = indices.tx_days_above(ds1.tasmax,thresh='25 C',freq=fr)
print('output indices : \n', out30d)

output atmos : 
 <xarray.DataArray 'txgt_25 C' (time: 151, lat: 180, lon: 324)>
dask.array<shape=(151, 180, 324), dtype=float64, chunksize=(1, 20, 75)>
Coordinates:
  * time     (time) object 1950-01-01 00:00:00 ... 2100-01-01 00:00:00
  * lat      (lat) float64 59.96 59.87 59.79 59.71 ... 45.29 45.21 45.12 45.04
  * lon      (lon) float64 -81.96 -81.88 -81.8 -81.71 ... -55.21 -55.13 -55.05
Attributes:
    units:          days
    history:        [2019-05-14 11:43:52] txgt_25 C(tasmax, thresh='25.0 degC...
    cell_methods:   time: maximum within days time: maximum within days time:...
    abstract:       Number of days where daily maximum temperature exceed a t...
    keywords:       
    standard_name:  number_of_days_with_air_temperature_above_threshold
    long_name:      Number of days with Tmax > 25 CC
    description:    Annual number of days where daily maximum temperature exc...
    comment:        
    references:     
    notes:          \nLet :math:`TX_{ij}` be the daily ma

In [14]:
# We have created an xarray data-array - We can insert this into an output dataset object
# Create an xarray dataset object - copy original dataset global attrs
dsOut = xr.Dataset(data_vars=None, coords=out.coords, attrs=ds1.attrs)
# Add our climate index as a data variable to the dataset
dsOut[out.name] = out
print(dsOut)

<xarray.Dataset>
Dimensions:  (lat: 180, lon: 324, time: 151)
Coordinates:
  * time     (time) object 1950-01-01 00:00:00 ... 2100-01-01 00:00:00
  * lat      (lat) float64 59.96 59.87 59.79 59.71 ... 45.29 45.21 45.12 45.04
  * lon      (lon) float64 -81.96 -81.88 -81.8 -81.71 ... -55.21 -55.13 -55.05
Data variables:
    tx_max   (time, lat, lon) float32 dask.array<shape=(151, 180, 324), chunksize=(1, 20, 75)>
Attributes:
    Conventions:     CF-1.5
    title:           CanESM2 model output prepared for CMIP5 historical
    history:         2011-04-14T00:21:01Z CMOR rewrote data to comply with CF...
    institution:     CCCma (Canadian Centre for Climate Modelling and Analysi...
    source:          CanESM2 2010 atmosphere: CanAM4 (AGCM15i, T63L35) ocean:...
    redistribution:  Redistribution prohibited. For internal use only.


## 6. xclim computations are *lazy*

Up until now we have only created a schedule of tasks with a small preview, and have not done any actual computations. As mentioned above, writing the output to disk will trigger the cascade of computations on all the chunks.

In [15]:
outfile = outfolder / 'test_tx_max.nc'
start= time.time()
dsOut.to_netcdf(outfile, format='NETCDF4')
end = time.time()
print('calculation took ',end-start, 's')

calculation took  147.816965341568 s


### Optimizing the chunk size

You can improve performance by being smart about chunk sizes. If chunks are too small, there is a lot of time lost in overhead. If chunks are too large, you may end up exceeding the individual worker memory limit. 
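
A rough way to reason about this trade-off is to compute the memory footprint of one chunk and the total number of chunks. The sketch below assumes a regular chunking of `float32` data (the chunks produced by subsetting are not perfectly regular), and `chunk_stats` is a hypothetical helper.

```python
import math

import numpy as np


def chunk_stats(shape, chunks, dtype="float32"):
    """Return (bytes per full chunk, number of chunks) for a regular chunking."""
    itemsize = np.dtype(dtype).itemsize
    chunk_bytes = math.prod(chunks) * itemsize
    n_chunks = math.prod(math.ceil(s / c) for s, c in zip(shape, chunks))
    return chunk_bytes, n_chunks


shape = (55115, 180, 324)  # (time, lat, lon) of the subset used above

for chunks in [(365, 20, 75), (365, 180, 324)]:
    chunk_bytes, n_chunks = chunk_stats(shape, chunks)
    print(chunks, f"{chunk_bytes / 1e6:.1f} MB per chunk, {n_chunks} chunks")
```

Under these assumptions, the small chunks are a couple of megabytes each but number in the thousands, while one chunk per year covering the whole domain gives far fewer, larger chunks that still sit well below the worker memory limit.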

In [16]:
print(ds1)

<xarray.Dataset>
Dimensions:  (lat: 180, lon: 324, time: 55115)
Coordinates:
  * lat      (lat) float64 59.96 59.87 59.79 59.71 ... 45.29 45.21 45.12 45.04
  * lon      (lon) float64 -81.96 -81.88 -81.8 -81.71 ... -55.21 -55.13 -55.05
  * time     (time) object 1950-01-01 00:00:00 ... 2100-12-31 00:00:00
Data variables:
    tasmax   (time, lat, lon) float32 dask.array<shape=(55115, 180, 324), chunksize=(365, 20, 75)>
Attributes:
    Conventions:     CF-1.5
    title:           CanESM2 model output prepared for CMIP5 historical
    history:         2011-04-14T00:21:01Z CMOR rewrote data to comply with CF...
    institution:     CCCma (Canadian Centre for Climate Modelling and Analysi...
    source:          CanESM2 2010 atmosphere: CanAM4 (AGCM15i, T63L35) ocean:...
    redistribution:  Redistribution prohibited. For internal use only.


In [17]:
ds1 = ds1.chunk(chunks={'time':365, 'lon':-1, 'lat':-1})
print(ds1)

<xarray.Dataset>
Dimensions:  (lat: 180, lon: 324, time: 55115)
Coordinates:
  * lat      (lat) float64 59.96 59.87 59.79 59.71 ... 45.29 45.21 45.12 45.04
  * lon      (lon) float64 -81.96 -81.88 -81.8 -81.71 ... -55.21 -55.13 -55.05
  * time     (time) object 1950-01-01 00:00:00 ... 2100-12-31 00:00:00
Data variables:
    tasmax   (time, lat, lon) float32 dask.array<shape=(55115, 180, 324), chunksize=(365, 180, 324)>
Attributes:
    Conventions:     CF-1.5
    title:           CanESM2 model output prepared for CMIP5 historical
    history:         2011-04-14T00:21:01Z CMOR rewrote data to comply with CF...
    institution:     CCCma (Canadian Centre for Climate Modelling and Analysi...
    source:          CanESM2 2010 atmosphere: CanAM4 (AGCM15i, T63L35) ocean:...
    redistribution:  Redistribution prohibited. For internal use only.


In [18]:
out = atmos.tx_max(ds1.tasmax, freq=fr)
dsOut = xr.Dataset(data_vars=None, coords=out.coords, attrs=ds1.attrs)
dsOut[out.name] = out

start= time.time()

dsOut.to_netcdf( outfile,format='NETCDF4')

end = time.time()
print('calculation took ',end-start, 's')

calculation took  84.12525868415833 s


### XCLIM unit handling 

A lot of effort has been put into automatic handling of input data units. `xclim` will automatically detect the units of the input variable(s) (e.g. °C versus K, or mm/s versus mm/day) and adjust on the fly in order to calculate indices in a consistent manner. This comes with the obvious caveat that the input data must carry a `units` metadata attribute.

In the example below, we compute monthly total precipitation in mm using inputs of mm/s and mm/d.
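
The arithmetic behind this conversion is simple enough to sketch with NumPy. The lookup table, the unit strings and the helper `accumulate_mm` below are hypothetical and only illustrate the idea that both inputs should lead to the same total; this is not how `xclim` handles units internally.

```python
import numpy as np

SECONDS_PER_DAY = 24 * 3600

# Hypothetical lookup table: factor converting a rate into mm per day.
TO_MM_PER_DAY = {"mm s-1": SECONDS_PER_DAY, "mm d-1": 1.0}


def accumulate_mm(rate, units):
    """Total precipitation in mm for daily values given as a rate."""
    daily_mm = np.asarray(rate, dtype="float64") * TO_MM_PER_DAY[units]
    return daily_mm.sum()


# Made-up daily precipitation rates in mm/s, and the same data in mm/day.
pr_mm_s = np.array([0.0, 1.0e-5, 3.5e-4, 0.0, 2.0e-5])
pr_mm_d = pr_mm_s * SECONDS_PER_DAY

total_from_s = accumulate_mm(pr_mm_s, "mm s-1")
total_from_d = accumulate_mm(pr_mm_d, "mm d-1")
print(total_from_s, total_from_d)
```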

In [19]:
dsPr = xr.open_dataset(sim_files[0].replace('tasmax','pr'),chunks={'time':31}).drop(['ts','time_vectors'])
dsPr = subset.subset_gridpoint(dsPr,lon=lon_pt,lat=lat_pt)

# Create a copy of the data converted to mm d-1
dsPr_mmd = dsPr.copy()
dsPr_mmd['pr'].values = dsPr.pr.values * 3600 *24
dsPr_mmd.pr.attrs['units'] = 'mm d-1'

print(dsPr.pr.values[0:31],'\n')
print(dsPr_mmd.pr.values[0:31])

[0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
 0.00000000e+00 9.66575590e-06 3.50769289e-04 1.34260035e-05
 9.91609704e-06 0.00000000e+00 0.00000000e+00 5.53925656e-06
 0.00000000e+00 0.00000000e+00 0.00000000e+00 0.00000000e+00
 0.00000000e+00 0.00000000e+00 1.05151732e-04 1.17095777e-04
 1.53630954e-05 4.42854389e-06 6.87591637e-06 5.46419369e-06
 8.52907669e-06 1.59427127e-05 2.47019962e-05 0.00000000e+00
 1.95580651e-05 3.90298010e-05 1.75301684e-05] 

[ 0.          0.          0.          0.          0.          0.83512133
 30.306467    1.1600066   0.8567507   0.          0.          0.4785918
  0.          0.          0.          0.          0.          0.
  9.08511    10.117075    1.3273714   0.38262618  0.59407914  0.47210634
  0.7369122   1.3774505   2.1342525   0.          1.689817    3.3721747
  1.5146066 ]


In [20]:
out1 = atmos.precip_accumulation(dsPr.pr,freq='MS')
print('1. results using inputs in mm/s : \n\n','units :', out1.units,'\n',out1.values,'\n')

out2 = atmos.precip_accumulation(dsPr_mmd.pr,freq='MS')
print('2. results using inputs in mm/d : \n\n','units :', out2.units,'\n',out2.values,'\n')
   

1. results using inputs in mm/s : 

 units : mm 
 [ 66.44051   54.20339   36.376373 101.53389   33.530552  87.39966
 104.95978  133.99432   89.70896   56.47059   75.21128   49.83245 ] 

2. results using inputs in mm/d : 

 units : mm 
 [ 66.44052   54.203384  36.376373 101.5339    33.53055   87.39966
 104.959785 133.99434   89.70897   56.470596  75.21129   49.83245 ] 



#### Threshold indices

`xclim` unit handling also applies to threshold indicators. Users can provide a threshold in the units of their choice and `xclim` will adjust automatically. For example, to determine the number of days with tasmax > 20 °C, users can define a threshold input of '20 C' or '20 degC' even if the input data is in Kelvin. Alternatively, users could provide a threshold in Kelvin, '293.15 K' (if they really wanted to).
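
The principle can be sketched by converting both the data and the threshold to a common unit before comparing. The helpers `to_kelvin` and `count_days_above` are hypothetical and support only the unit strings shown; they are not xclim functions.

```python
import numpy as np

KELVIN_OFFSET = 273.15


def to_kelvin(values, units):
    """Convert temperatures to Kelvin; only 'K', 'C' and 'degC' are handled."""
    values = np.asarray(values, dtype="float64")
    if units == "K":
        return values
    if units in ("C", "degC"):
        return values + KELVIN_OFFSET
    raise ValueError(f"unsupported units: {units}")


def count_days_above(data, data_units, thresh):
    """Count values above a threshold written like '20 degC' or '293.15 K'."""
    magnitude, thresh_units = thresh.split()
    thresh_k = to_kelvin(float(magnitude), thresh_units)
    return int((to_kelvin(data, data_units) > thresh_k).sum())


# Made-up daily maxima in Kelvin and the same values in Celsius.
tasmax_k = np.array([280.0, 290.0, 295.0, 300.0])
tasmax_c = tasmax_k - KELVIN_OFFSET

counts = [
    count_days_above(tasmax_k, "K", "20 degC"),
    count_days_above(tasmax_c, "C", "20 C"),
    count_days_above(tasmax_c, "C", "293.15 K"),
]
print(counts)
```

All three combinations give the same count, which is the behaviour the cells below demonstrate with `xclim` itself.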

In [21]:
# Original data in Kelvin
dsTasmax = xr.open_dataset(sim_files[0],chunks={'time':31}).drop(['ts','time_vectors'])
dsTasmax.tasmax.attrs['cell_methods'] = 'time: maximum within days'
dsTasmax = subset.subset_gridpoint(dsTasmax,lon=lon_pt,lat=lat_pt)

# Create a copy of the data converted to C
dsTasmax_C = dsTasmax.copy()
dsTasmax_C['tasmax'].values = dsTasmax.tasmax.values - 273.15
dsTasmax_C.tasmax.attrs['units'] = 'C'

print(dsTasmax.tasmax.values[0:31],'\n')
print(dsTasmax_C.tasmax.values[0:31])


[273.5242  271.8584  247.72333 252.86673 264.26947 273.6165  274.6312
 266.9602  249.18898 255.07526 263.65112 265.15714 253.79906 260.2527
 261.9956  267.89328 268.66718 268.61023 268.0569  267.94894 256.55618
 248.05493 247.59552 248.5298  248.72592 248.59483 250.02737 247.43605
 249.71368 250.91469 249.64064] 

[  0.37420654  -1.2915955  -25.426666   -20.283264    -8.880524
   0.4664917    1.4812012   -6.189789   -23.961014   -18.074738
  -9.498871    -7.992859   -19.350937   -12.897308   -11.154388
  -5.256714    -4.4828186   -4.5397644   -5.093109    -5.20105
 -16.593811   -25.095062   -25.554474   -24.620193   -24.424072
 -24.55516    -23.12262    -25.713943   -23.43631    -22.235306
 -23.509354  ]


In [22]:
# Using Kelvin data
out1 = atmos.tx_days_above(dsTasmax.tasmax,thresh='20 C', freq='MS')
print('1. results using inputs in °K  : threshold in °C: \n\n',out1.values,'\n')

# Using Celsius data
out2 = atmos.tx_days_above(dsTasmax_C.tasmax,thresh='20 C', freq='MS')
print('2. results using inputs in °C : threshold in °C\n\n',out2.values)

# Using Celsius but with threshold in Kelvin
out3 = atmos.tx_days_above(dsTasmax_C.tasmax,thresh='293.15 K', freq='MS')
print('\n3. results using inputs in °C : threshold in °K : \n\n',out3.values)


1. results using inputs in °K  : threshold in °C: 

 [ 0.  0.  0.  0.  4. 14. 16. 16.  3.  0.  0.  0.] 

2. results using inputs in °C : threshold in °C

 [ 0.  0.  0.  0.  4. 14. 16. 16.  3.  0.  0.  0.]

3. results using inputs in °C : threshold in °K : 

 [ 0.  0.  0.  0.  4. 14. 16. 16.  3.  0.  0.  0.]
