# Worldclim pre-process to arctic

This demo documents the preprocessing steps we take for **Worldclim** data, which is used as a
**climate reference** in the delta downscaling process. Our downscaling tools require that data 
be formatted with standardized **units**. 

Additionally, source Worldclim files are global in extent, so we preprocess to reduce the data 
to just the area we are interested in, the arctic in this case. 

## Source Worldclim data 

Source Worldclim data is retrieved per variable in `.zip` archives. Each archive has 12 monthly 
**high resolution** `.tif` raster files. These data have a global extent with **EPSG:4326** as the CRS. 
Additional information  on the format of the source data can be 
found [here](https://www.worldclim.org/data/worldclim21.html).

Note: By default we use the 30 arc-second (~1 km^2) data.

## Preprocessing 

Before downscaling we perform 2 preprocessing steps. First we transform the data spatially. Second 
we standardize the units.

For the first step, the target extent, resolution, and CRS are determined from a raster file. We use a GeoTIFF 
raster file with an extent covering the arctic with 4km^2 resolution in **EPSG:6931**. **GDAL** is used 
to convert each monthly raster file to the target extent, resolution, and CRS. The resulting data is combined into
an **xarray** dataset. 
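
To make the spatial step concrete, here is a minimal sketch of how one monthly raster could be warped with the `gdalwarp` command line tool. It only builds the argument list and does not run GDAL. The helper `build_gdalwarp_args`, the file names, the extent values, and the bilinear resampling choice are all assumptions for illustration; the actual TEMDS code may call GDAL differently.

```python
from typing import List, Tuple


def build_gdalwarp_args(
    src_path: str,
    dst_path: str,
    target_crs: str,
    extent: Tuple[float, float, float, float],
    resolution: Tuple[float, float],
    resampling: str = "bilinear",
) -> List[str]:
    """Return the argument list for one gdalwarp call (nothing is executed)."""
    xmin, ymin, xmax, ymax = extent
    xres, yres = resolution
    return [
        "gdalwarp",
        "-t_srs", target_crs,                               # target CRS
        "-te", str(xmin), str(ymin), str(xmax), str(ymax),  # target extent
        "-tr", str(xres), str(yres),                        # target resolution
        "-r", resampling,                                   # resampling method
        "-of", "GTiff",                                     # output format
        src_path,
        dst_path,
    ]


# Hypothetical file names and extent, for illustration only.
example_args = build_gdalwarp_args(
    "wc2.1_30s_tavg_01.tif",
    "tavg_01_arctic.tif",
    "EPSG:6931",
    extent=(-4_000_000.0, -4_000_000.0, 4_000_000.0, 4_000_000.0),
    resolution=(4000.0, 4000.0),
)
print(" ".join(example_args))
```

In practice the extent and resolution would be read from the extent raster rather than typed in by hand, and the list could be passed to `subprocess.run` once GDAL is installed.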

After the spatial transformation, the units are converted to the units that our downscaling processes 
are set up to support. 
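
As a rough illustration of what unit standardization can look like, the sketch below converts two Worldclim variables with simple scale factors. The source units (solar radiation in kJ m^-2 day^-1, vapor pressure in kPa) follow the Worldclim 2.1 documentation. The target units (W m^-2 and hPa) and the function names are assumptions chosen for the example, not necessarily the units our downscaling tools expect.

```python
from typing import Callable, Dict, Iterable, List

SECONDS_PER_DAY = 86_400


def kj_per_m2_day_to_w_per_m2(value: float) -> float:
    """Convert a daily energy total (kJ m^-2 day^-1) to a mean flux (W m^-2)."""
    return value * 1000.0 / SECONDS_PER_DAY


def kpa_to_hpa(value: float) -> float:
    """Convert pressure from kilopascals to hectopascals."""
    return value * 10.0


# Assumed mapping from Worldclim variable name to a conversion function.
CONVERSIONS: Dict[str, Callable[[float], float]] = {
    "srad": kj_per_m2_day_to_w_per_m2,
    "vapr": kpa_to_hpa,
}


def standardize(var: str, values: Iterable[float]) -> List[float]:
    """Apply the conversion for `var`; variables without one pass through."""
    convert = CONVERSIONS.get(var)
    if convert is None:
        return list(values)
    return [convert(v) for v in values]


print(standardize("srad", [8640.0, 17280.0]))
print(standardize("tavg", [-12.5, 3.0]))
```

A lookup table like this keeps the per-variable conversions in one place, so adding a variable means adding one entry.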

## Result

The result of the Worldclim preprocessing is a **TEMDataset**. `TEMDataset` objects are a wrapper around 
`xarray.Dataset` objects. Our wrapper gives us a common interface to tools for clipping, saving, and loading
data, among others. Our intermediate data, like this arctic Worldclim data, is saved to disk as NetCDF files.  

By convention we keep data in directories based on how much processing we have done on it. Here we keep the
data for the arctic in a directory called `02-arctic`.

# Set up a log for the process

Our tools use a logger object to record and display the messages they create. Each logged message has a priority.
All messages are saved, but only messages with a priority listed in `verbose_levels` are printed to the screen or console.
For this demo we will display all **INFO** messages, but if you are having issues it may be useful to display all **DEBUG** messages. 

In [None]:
from temds import logger
log = logger.Logger(verbose_levels=logger.INFO)
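
The save-everything, print-selectively behavior described above can be sketched in a few lines. This is not the `temds.logger` implementation; the class `SketchLogger`, its `log` method, and the level constants are hypothetical names used only to illustrate the idea.

```python
from typing import Iterable, List, Tuple

# Hypothetical priority names for this sketch.
DEBUG, INFO, WARN, ERROR = "DEBUG", "INFO", "WARN", "ERROR"


class SketchLogger:
    """Keep every message, but print only the selected priorities."""

    def __init__(self, verbose_levels: Iterable[str]):
        self.verbose_levels = set(verbose_levels)
        self.messages: List[Tuple[str, str]] = []

    def log(self, level: str, text: str) -> None:
        # Every message is stored, whatever its priority.
        self.messages.append((level, text))
        # Only messages at a selected priority reach the console.
        if level in self.verbose_levels:
            print(f"{level}: {text}")


sketch_log = SketchLogger(verbose_levels=[INFO])
sketch_log.log(DEBUG, "reading source raster")     # saved, not printed
sketch_log.log(INFO, "warping to target extent")   # saved and printed
```

Because the saved messages are kept regardless of the display setting, the full record remains available for troubleshooting after a run.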

# Extent raster file

Define the extent raster file

In [None]:
extent_file = 'working/00-aoi/aoi-5km-buffer-6931.tif' 

# Preprocessing source data

The **TEMDataset** class has a static method **from_worldclim** that allows a `TEMDataset` to be created from source Worldclim data.
The required argument to this method is the directory where the source data is or will be downloaded to. Important optional arguments
used here include `download`, `extent_raster`, and `logger`.

* download: If you have already downloaded the data you can set this to `False`
* extent_raster: If you do not provide this, the extent, resolution, and CRS are determined from the source data
* logger: This is the previously discussed logging object. If not provided a logger is created that does not display messages. 

Other optional arguments set which variables to use, the Worldclim version, and the resolution. See the code documentation 
for details.

In [None]:
from temds.datasources import dataset

worldclim_arctic = dataset.TEMDataset.from_worldclim(
            'working/01-download/worldclim',
            download=True, 
            extent_raster=extent_file, 
            logger=log,
)

# Takes about 20 minutes on an 8-core machine with 32 GB of memory.



# Viewing data

We can view the data with the `.dataset` attribute

In [None]:
worldclim_arctic.dataset

The units in the dataset can be quickly viewed with the `units` property.

In [None]:
worldclim_arctic.units

`TEMDataset` objects also have the properties `crs`, `resolution`, `vars`, and `transform`.

# Saving 

The **TEMDataset** class has a `save` method. **NetCDF** files created with `save` will have a global attribute with the TEMDS code version. See the documentation for additional arguments. 

In [None]:
worldclim_arctic.save('working/02-arctic/worldclim/worldclim-arctic.nc', overwrite=True)

