# Bioclimatic variables

This notebook computes the bioclimatic variables as defined in https://pubs.usgs.gov/ds/691/ds691.pdf. The corresponding WorldClim variables, which can be used to compare results, are available at https://worldclim.org/data/worldclim21.html.

In [None]:
import xarray as xr
import matplotlib.pyplot as plt
import numpy as np
from os.path import isfile, exists
from os import mkdir

In [None]:
dsspat = xr.open_zarr("https://openstack.cebitec.uni-bielefeld.de:8080/swift/v1/DWDCube/cube_spatial.zarr/")
dstime = xr.open_zarr("https://openstack.cebitec.uni-bielefeld.de:8080/swift/v1/DWDCube/cube_temporal.zarr/")
subset = dstime.isel(Time=range(0, 2*365+1))  # two-year subset for faster testing

In [None]:
# Let's look at the data
dstime

In [None]:
# Check for a folder to save the data
if not exists("./datasets"):
    mkdir("./datasets")

# Intermediate variables

The original definitions are based on daily maximum and minimum temperatures, which our data do not include; we only have daily mean temperatures. There will therefore be an inherent difference between our calculations and the WorldClim data.
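
To get a feeling for the direction of this difference, here is a minimal sketch with synthetic numbers (not taken from the DWD cube). It assumes, purely for illustration, that the daily mean is the midpoint between the daily minimum and maximum. Extremes derived from daily means can never exceed those derived from the daily extremes themselves, so temperature ranges computed from daily means are compressed.

```python
import numpy as np

rng = np.random.default_rng(0)  # synthetic data, not from the DWD cube

# Hypothetical daily minimum and maximum temperatures (°C) for one 30-day month
tmin = rng.normal(loc=5.0, scale=3.0, size=30)
tmax = tmin + rng.uniform(4.0, 12.0, size=30)  # daily maximum always above the minimum
tmean = (tmin + tmax) / 2.0                    # assumed proxy for the daily mean

# Monthly extremes taken from the daily extremes
true_max, true_min = tmax.max(), tmin.min()
# Monthly extremes taken from the daily means only (what we can compute here)
mean_max, mean_min = tmean.max(), tmean.min()

print(f"range from daily extremes: {true_max - true_min:.2f} °C")
print(f"range from daily means:    {mean_max - mean_min:.2f} °C")
```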

### Temperature

In [None]:
# Monthly max:
if isfile("./datasets/tas_mmax.nc"):
    tas_mmax = xr.open_dataarray("./datasets/tas_mmax.nc")
else:
    tas_mmax = dstime["tas"].resample(Time="1MS").max(dim="Time")
    tas_mmax.to_netcdf("./datasets/tas_mmax.nc")
    
    
# Monthly min:
if isfile("./datasets/tas_mmin.nc"):
    tas_mmin = xr.open_dataarray("./datasets/tas_mmin.nc")
else:
    tas_mmin = dstime["tas"].resample(Time="1MS").min(dim="Time")
    tas_mmin.to_netcdf("./datasets/tas_mmin.nc")
    
    
# Monthly mean:
if isfile("./datasets/tas_mmean.nc"):
    tas_mmean = xr.open_dataarray("./datasets/tas_mmean.nc")
else:
    tas_mmean = dstime["tas"].resample(Time="1MS").mean(dim="Time")
    tas_mmean.to_netcdf("./datasets/tas_mmean.nc")

### Precipitation

In [None]:
# Monthly total:
if isfile("./datasets/prec_mtot.nc"):
    prec_mtot = xr.open_dataarray("./datasets/prec_mtot.nc")
else:
    prec_mtot = dstime["pr"].resample(Time="1MS").sum(dim="Time") * 86400 #convert from flux to total in mm
    prec_mtot = prec_mtot.assign_attrs({"standard_name": "total_precipitation", "units": "mm"})
    prec_mtot.to_netcdf("./datasets/prec_mtot.nc")
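
The factor 86400 in the cell above is the number of seconds per day. As a small worked example, assuming that `pr` is a daily mean precipitation flux in kg m⁻² s⁻¹ (an assumption suggested by the conversion factor, not checked against the cube's metadata), the conversion looks like this with made-up values:

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86400

# Hypothetical daily mean precipitation fluxes in kg m-2 s-1 for three days
daily_flux = [2.0e-5, 0.0, 5.0e-5]

# 1 kg of water spread over 1 m2 forms a 1 mm layer,
# so flux * seconds per day gives mm per day
daily_mm = [flux * SECONDS_PER_DAY for flux in daily_flux]
total_mm = sum(daily_mm)

# Summing the fluxes first and converting afterwards gives the same total
total_mm_alt = sum(daily_flux) * SECONDS_PER_DAY

print(daily_mm)
print(total_mm, total_mm_alt)
```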

# [BIO1] Annual Mean Temperature

The annual mean temperature is defined as the average of all the 12 monthly averages.

In [None]:
ds_bio1 = tas_mmean.groupby("Time.year").mean("Time")-273.15
ds_bio1.rename("BIO1").assign_attrs(name="BIO1", standard_name="BIO1", long_name="Annual Mean Temperature", units="°C").to_netcdf("./datasets/bio1.nc")

# [BIO2] Mean Diurnal Range

The mean diurnal range is defined as the annual mean of the monthly differences between maximum and minimum temperature, i.e.: MDR $= \frac{1}{12} \sum_{i=1}^{12} \left(T_{\text{max},i} - T_{\text{min},i}\right)$

In [None]:
monthly_range = tas_mmax - tas_mmin

ds_bio2 = monthly_range.groupby("Time.year").mean(dim="Time")
ds_bio2.rename("BIO2").assign_attrs(name="BIO2", standard_name="BIO2", long_name="Annual Mean Diurnal Range",units="°C").to_netcdf("./datasets/bio2.nc")

# [BIO4] Temperature Seasonality

Temperature seasonality is defined as the standard deviation (over one year) of monthly temperature averages.

In [None]:
ds_bio4 = tas_mmean.groupby("Time.year").std(dim="Time")
ds_bio4.rename("BIO4").assign_attrs(name="BIO4", standard_name="BIO4", long_name="Temperature Seasonality", units="°C").to_netcdf("./datasets/bio4.nc")
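
Unlike BIO1, the cell above does not subtract 273.15, although the units are given as °C. A standard deviation does not change when a constant offset is removed, so the value in K equals the value in °C. The sketch below checks this with hypothetical monthly means and also shows the difference between the population (`ddof=0`, NumPy's default) and the sample (`ddof=1`) standard deviation, since which of the two the reference intends is not verified here.

```python
import numpy as np

# Hypothetical monthly mean temperatures (K) for one year
monthly_mean = np.array([272.1, 273.4, 277.0, 281.6, 286.2, 289.5,
                         291.3, 290.9, 287.0, 282.1, 276.8, 273.3])

std_population = monthly_mean.std(ddof=0)  # divides by N (NumPy's default)
std_sample = monthly_mean.std(ddof=1)      # divides by N - 1

# Converting K -> °C only shifts the values, so the spread is unchanged
std_celsius = (monthly_mean - 273.15).std(ddof=0)

print(f"population std: {std_population:.3f}")
print(f"sample std:     {std_sample:.3f}")
print(f"std in °C:      {std_celsius:.3f}")
```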

# [BIO5] Max Temperature of Warmest Month

In [None]:
ds_bio5 = tas_mmax.groupby("Time.year").max(dim="Time")-273.15
ds_bio5.rename("BIO5").assign_attrs(name="BIO5", standard_name="BIO5", long_name="Max Temperature of Warmest Month", units="°C").to_netcdf("./datasets/bio5.nc")

# [BIO6] Min Temperature of Coldest Month

In [None]:
ds_bio6 = tas_mmin.groupby("Time.year").min(dim="Time")-273.15
ds_bio6.rename("BIO6").assign_attrs(name="BIO6", standard_name="BIO6", long_name="Min Temperature of Coldest Month", units="°C").to_netcdf("./datasets/bio6.nc")

# [BIO7] Temperature Annual Range

Defined as: $\text{BIO5} - \text{BIO6}$

In [None]:
ds_bio7 = ds_bio5 - ds_bio6 
ds_bio7.rename("BIO7").assign_attrs(name="BIO7", standard_name="BIO7", long_name="Annual Temperature Range", units="°C").to_netcdf("./datasets/bio7.nc")

# [BIO3] Isothermality

Isothermality quantifies how large the day-to-night temperature oscillations are relative to the summer-to-winter (annual) oscillations.
Defined as: $\frac{\text{BIO2}}{\text{BIO7}} \cdot 100$

It does not make much sense to compute this since we don't have daily maximum and minimum temperatures, but we'll do it anyway.

Run the cell below only after computing BIO7.

In [None]:
ds_bio3 = ds_bio2 / ds_bio7 * 100.
ds_bio3.rename("BIO3").assign_attrs(name="BIO3", standard_name="BIO3", long_name="Isothermality", units="%").to_netcdf("./datasets/bio3.nc")
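
As a small worked example of the ratio, here is a sketch for a single grid cell and year with hypothetical monthly maxima and minima. It follows the way BIO2 and BIO7 are built in this notebook: every monthly range is at most as large as the annual range, so the resulting value cannot exceed 100 %.

```python
import numpy as np

# Hypothetical monthly maxima and minima (°C) for one grid cell and year
monthly_max = np.array([4., 6., 11., 16., 21., 24., 26., 25., 21., 15., 9., 5.])
monthly_min = np.array([-8., -6., -2., 2., 7., 11., 13., 12., 8., 3., -1., -6.])

bio2 = (monthly_max - monthly_min).mean()     # mean of the 12 monthly ranges
bio7 = monthly_max.max() - monthly_min.min()  # warmest maximum minus coldest minimum
bio3 = bio2 / bio7 * 100.0                    # isothermality in percent

print(f"BIO2 = {bio2:.1f} °C, BIO7 = {bio7:.1f} °C, BIO3 = {bio3:.1f} %")
```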

# [BIO8/9] Mean Temperature of Wettest/Driest Quarter

In our [reference](https://pubs.usgs.gov/ds/691/ds691.pdf), the quarters are defined as 12 forward-looking rolling three-month windows, i.e., Jan-Feb-Mar, Feb-Mar-Apr, ..., Oct-Nov-Dec, Nov-Dec-Jan, Dec-Jan-Feb. xarray's rolling functionality only supports trailing (backward-looking) or centered windows. To get forward-looking windows, we reverse the data along the time dimension, compute the rolling sum/mean, and reverse the result again.

In [None]:
quarterly_precsum = prec_mtot[::-1].rolling({"Time":3}, min_periods=1).sum()[::-1]
quarterly_tasmean = tas_mmean[::-1].rolling({"Time":3}, min_periods=1).mean()[::-1]

In [None]:
# argmax and argmin are not available for groupby so we have to apply them with a workaround
wettest_quarters = quarterly_precsum.groupby("Time.year").apply(lambda c: c.argmax(dim="Time"))
driest_quarters = quarterly_precsum.groupby("Time.year").apply(lambda c: c.argmin(dim="Time"))

In [None]:
# we need to select the corresponding index of wettest_quarters or driest_quarters for each year
# let's make a function to do that
# The function extracts the year of each group, and selects the index of the group based on the index of the corresponding year
def isel_index(group, index):
    yy = group.Time.dt.year.values[0]
    return group.isel(Time=index.sel(year=yy))
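
The idea behind this selection can be illustrated without xarray. The following NumPy sketch uses synthetic arrays of shape (year, month) for a single grid point; all names and values are made up for illustration. It picks, for each year, the temperature at the position of that year's wettest quarter.

```python
import numpy as np

rng = np.random.default_rng(1)  # synthetic data for illustration

n_years, n_months = 3, 12
# Hypothetical quarterly precipitation sums and temperature means, shape (year, month)
prec = rng.uniform(0.0, 300.0, size=(n_years, n_months))
temp = rng.uniform(-5.0, 25.0, size=(n_years, n_months))

# Index of the wettest quarter within each year, shape (year,)
wettest = prec.argmax(axis=1)

# Select the temperature at that index for every year
temp_wettest = np.take_along_axis(temp, wettest[:, None], axis=1)[:, 0]

print(wettest)
print(temp_wettest)
```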

In [None]:
ds_bio8 = quarterly_tasmean.groupby("Time.year").apply(isel_index, index=wettest_quarters)-273.15
ds_bio9 = quarterly_tasmean.groupby("Time.year").apply(isel_index, index=driest_quarters)-273.15

Let's check that it worked as expected. BIO8 for the fifth year at an arbitrary grid point (500, 500) is:


In [None]:
ds_bio8[4,500,500]

The wettest quarter in the fifth year at that point is:

In [None]:
wettest_quarters[4, 500, 500].values

Let's verify this:

In [None]:
quarterly_precsum[12*4:12*5, 500, 500]

Indeed, the value at index 7 is the maximum. The mean temperature at index 7 of the fifth year is then:

In [None]:
quarterly_tasmean[12*4+7, 500, 500]-273.15

This matches the value in BIO8, up to rounding error.

Let's save the datasets:

In [None]:
ds_bio8.rename("BIO8").assign_attrs(name="BIO8", standard_name="BIO8", long_name="Mean Temperature of Wettest Quarter", units="°C").to_netcdf("./datasets/bio8.nc")
ds_bio9.rename("BIO9").assign_attrs(name="BIO9", standard_name="BIO9", long_name="Mean Temperature of Driest Quarter", units="°C").to_netcdf("./datasets/bio9.nc")

# [BIO10/11] Mean Temperature of Warmest/Coldest Quarter

Analogous to BIO8/9:

In [None]:
# argmax and argmin are not available for groupby so we have to apply them with a workaround
warmest_quarters = quarterly_tasmean.groupby("Time.year").apply(lambda c: c.argmax(dim="Time"))
coldest_quarters = quarterly_tasmean.groupby("Time.year").apply(lambda c: c.argmin(dim="Time"))

ds_bio10 = quarterly_tasmean.groupby("Time.year").apply(isel_index, index=warmest_quarters)-273.15
ds_bio11 = quarterly_tasmean.groupby("Time.year").apply(isel_index, index=coldest_quarters)-273.15

(The original definition is based on the sum of the monthly values in each quarter. Since every complete quarter spans three months, the maximum of the quarterly sums and the maximum of the quarterly means occur at the same position, so we reuse the quarterly means that we already calculated.)
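
A quick sketch with made-up monthly values confirms this for complete three-month windows: dividing every window sum by the same constant does not change where the maximum or minimum is located.

```python
import numpy as np

# Hypothetical monthly means: 12 months plus January and February of the
# following year, so that all 12 forward-looking windows are complete
monthly = np.array([3., 8., 15., 9., 4., 1., 0., 2., 7., 12., 6., 5., 3., 8.])

window_sums = np.array([monthly[i:i + 3].sum() for i in range(12)])
window_means = window_sums / 3.0  # same windows, divided by a constant

print(window_sums.argmax(), window_means.argmax())
print(window_sums.argmin(), window_means.argmin())
```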

In [None]:
# Save:
ds_bio10.rename("BIO10").assign_attrs(name="BIO10", standard_name="BIO10", long_name="Mean Temperature of Warmest Quarter", units="°C").to_netcdf("./datasets/bio10.nc")
ds_bio11.rename("BIO11").assign_attrs(name="BIO11", standard_name="BIO11", long_name="Mean Temperature of Coldest Quarter", units="°C").to_netcdf("./datasets/bio11.nc")

# [BIO12] Annual Precipitation

In [None]:
ds_bio12 = prec_mtot.groupby("Time.year").sum(dim="Time")
ds_bio12.rename("BIO12").assign_attrs(name="BIO12", standard_name="BIO12", long_name="Annual Precipitation", units="mm").to_netcdf("./datasets/bio12.nc")

# [BIO13] Precipitation of Wettest Month

In [None]:
ds_bio13 = prec_mtot.groupby("Time.year").max(dim="Time")
ds_bio13.rename("BIO13").assign_attrs(name="BIO13", standard_name="BIO13", long_name="Precipitation of Wettest Month", units="mm").to_netcdf("./datasets/bio13.nc")

# [BIO14] Precipitation of Driest Month

In [None]:
ds_bio14 = prec_mtot.groupby("Time.year").min(dim="Time")
ds_bio14.rename("BIO14").assign_attrs(name="BIO14", standard_name="BIO14", long_name="Precipitation of Driest Month", units="mm").to_netcdf("./datasets/bio14.nc")

# [BIO15] Precipitation Seasonality

Precipitation seasonality in a given year is defined as the coefficient of variation of the monthly precipitation totals $P_i$:
$$
\frac{\sigma(P_i)}{1 + \text{BIO12}\,/\,12} \cdot 100 \quad \text{for}\ i \in \{1, \dots, 12\}
$$

In [None]:
ds_bio15 = prec_mtot.groupby("Time.year").std(dim="Time") / (1. + ds_bio12/12.) * 100
ds_bio15.rename("BIO15").assign_attrs(name="BIO15", standard_name="BIO15", long_name="Precipitation Seasonality (CV)", units="%").to_netcdf("./datasets/bio15.nc")
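
For a single grid cell, the formula can be worked through as follows. The monthly totals are hypothetical, and the sketch uses the population standard deviation (`ddof=0`, NumPy's default). One effect of the 1 in the denominator is that a completely dry year yields 0 instead of an undefined 0/0.

```python
import numpy as np

# Hypothetical monthly precipitation totals (mm) for one year
monthly_prec = np.array([60., 45., 50., 40., 70., 85., 90., 80., 55., 50., 65., 70.])

bio12 = monthly_prec.sum()  # annual precipitation in mm
sigma = monthly_prec.std()  # standard deviation of the monthly totals
bio15 = sigma / (1.0 + bio12 / 12.0) * 100.0

# A year without any precipitation: the denominator is 1, so the result is 0
dry = np.zeros(12)
bio15_dry = dry.std() / (1.0 + dry.sum() / 12.0) * 100.0

print(f"BIO12 = {bio12:.0f} mm, BIO15 = {bio15:.1f} %")
print(f"BIO15 for a dry year = {bio15_dry:.1f} %")
```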

# [BIO16/17] Precipitation of Wettest/Driest Quarter

In [None]:
# analogous to BIO8-11
ds_bio16 = quarterly_precsum.groupby("Time.year").apply(isel_index, index=wettest_quarters)
ds_bio17 = quarterly_precsum.groupby("Time.year").apply(isel_index, index=driest_quarters)

In [None]:
# Save:
ds_bio16.rename("BIO16").assign_attrs(name="BIO16", standard_name="BIO16", long_name="Precipitation of Wettest Quarter", units="mm").to_netcdf("./datasets/bio16.nc")
ds_bio17.rename("BIO17").assign_attrs(name="BIO17", standard_name="BIO17", long_name="Precipitation of Driest Quarter", units="mm").to_netcdf("./datasets/bio17.nc")

# [BIO18/19] Precipitation of Warmest/Coldest Quarter

In [None]:
# analogous to BIO8-11
ds_bio18 = quarterly_precsum.groupby("Time.year").apply(isel_index, index=warmest_quarters)
ds_bio19 = quarterly_precsum.groupby("Time.year").apply(isel_index, index=coldest_quarters)

In [None]:
# Save:
ds_bio18.rename("BIO18").assign_attrs(name="BIO18", standard_name="BIO18", long_name="Precipitation of Warmest Quarter", units="mm").to_netcdf("./datasets/bio18.nc")
ds_bio19.rename("BIO19").assign_attrs(name="BIO19", standard_name="BIO19", long_name="Precipitation of Coldest Quarter", units="mm").to_netcdf("./datasets/bio19.nc")