# Introduction to xcube's "zenodo" data store

This notebook shows how to access a TIFF and a NetCDF file published on [https://zenodo.org](https://zenodo.org).

### Setup
In order to run this notebook, you need an access token for the Zenodo API, which you can create by following the [documentation](https://zenodo.org/login/?next=%2Faccount%2Fsettings%2Fapplications%2Ftokens%2Fnew%2F). Furthermore, make sure that [`xcube_zenodo`](https://github.com/xcube-dev/xcube-zenodo) is installed. You may install it directly from the git repository by cloning the repository, changing into the `xcube-zenodo` directory, and following the steps below:

```bash
conda env create -f environment.yml
conda activate xcube-zenodo
pip install .
```

Note that [`xcube_zenodo`](https://github.com/xcube-dev/xcube-zenodo) is a plugin for [`xcube`](https://xcube.readthedocs.io/en/latest/); `xcube` itself is included in the `environment.yml`.
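
Before continuing, it can help to confirm that both packages are importable in the active environment. The following is a minimal sketch using only the Python standard library; the helper name `is_installed` is hypothetical and not part of xcube.

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Module names as used in this notebook; the helper is illustrative only.
for name in ("xcube", "xcube_zenodo"):
    print(f"{name}: {'found' if is_installed(name) else 'missing'}")
```

If either module is reported as missing, the installation steps above were likely run in a different environment than the one executing the notebook.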

We start by importing everything we need:

In [1]:
%%time
from xcube.core.store import new_data_store
from xcube.core.store import get_data_store_params_schema

CPU times: user 7.59 s, sys: 1.06 s, total: 8.65 s
Wall time: 7.56 s


Next, we retrieve the schema of the store parameters needed to initialize a zenodo [data store](https://xcube.readthedocs.io/en/latest/dataaccess.html#data-store-framework).

In [2]:
%%time
store_params = get_data_store_params_schema("zenodo")
store_params

CPU times: user 188 ms, sys: 30.3 ms, total: 218 ms
Wall time: 218 ms


<xcube.util.jsonschema.JsonObjectSchema at 0x762af8b03440>

We initialize a zenodo [data store](https://xcube.readthedocs.io/en/latest/dataaccess.html#data-store-framework). The first argument of the `new_data_store` function, `"zenodo"`, identifies the store; it is recognized once the `xcube-zenodo` plugin is installed.

In [3]:
%%time
store = new_data_store("zenodo")

CPU times: user 28.5 ms, sys: 2.96 ms, total: 31.4 ms
Wall time: 31.4 ms


Data IDs have the form `"<record_id>/<file_name>"`. For example, for the record [Canopy height and biomass map for Europe](https://zenodo.org/records/8154445), the data ID of the file "planet_canopy_cover_30m_v0.1.tif" is `"8154445/planet_canopy_cover_30m_v0.1.tif"`. The record ID can be found in the URL of the Zenodo record page.
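
The data ID format can be assembled from a record URL and a file name. The following is a minimal sketch, assuming record URLs of the form `https://zenodo.org/records/<record_id>` as in the link above; the helper `make_data_id` is hypothetical and not part of `xcube_zenodo`.

```python
from urllib.parse import urlparse


def make_data_id(record_url: str, file_name: str) -> str:
    """Build "<record_id>/<file_name>" from a Zenodo record URL."""
    # Assumption: the record ID is the last non-empty path segment of the URL.
    segments = [s for s in urlparse(record_url).path.split("/") if s]
    if not segments or not segments[-1].isdigit():
        raise ValueError(f"no numeric record ID found in {record_url!r}")
    return f"{segments[-1]}/{file_name}"


data_id = make_data_id(
    "https://zenodo.org/records/8154445",
    "planet_canopy_cover_30m_v0.1.tif",
)
print(data_id)
```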

After selecting a specific dataset, we can describe it using the `describe_data` method.

In [4]:
store.describe_data("8154445/planet_canopy_cover_30m_v0.1.tif")

DataStoreError: Data resource "records/8154445/files/planet_canopy_cover_30m_v0.1.tif" does not exist in store
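
The cell above raised a `DataStoreError` for the requested data ID. One defensive pattern is to ask the store whether it knows a data ID before describing it. The sketch below assumes that the store implements the `has_data` method of xcube's data store interface; the helper `describe_if_available`, the stand-in class `FakeStore`, and the placeholder data IDs are hypothetical and exist only so the example runs without network access. Whether `has_data` returns `False` in exactly the cases where `describe_data` fails for the zenodo store has not been verified here.

```python
from typing import Any, Optional


def describe_if_available(store: Any, data_id: str) -> Optional[Any]:
    """Describe data_id if the store reports it as available, else return None."""
    if not store.has_data(data_id):
        return None
    return store.describe_data(data_id)


class FakeStore:
    """Hypothetical stand-in for a data store, used only for this illustration."""

    def __init__(self, known_ids):
        self._known_ids = set(known_ids)

    def has_data(self, data_id):
        return data_id in self._known_ids

    def describe_data(self, data_id):
        return {"data_id": data_id}


# Placeholder IDs, not real Zenodo records.
fake = FakeStore(["0000000/example.tif"])
print(describe_if_available(fake, "0000000/example.tif"))
print(describe_if_available(fake, "0000000/missing.tif"))
```

With the store created in this notebook, the equivalent call would be `describe_if_available(store, "8154445/planet_canopy_cover_30m_v0.1.tif")`.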

Next, we can open the data. We first view the available opening parameters, which can be passed to the `open_data` method in the subsequent cell.

In [None]:
%%time
open_params = store.get_open_data_params_schema(data_id="8154445/planet_canopy_cover_30m_v0.1.tif")
open_params

In [None]:
%%time
ds = store.open_data(
    "8154445/planet_canopy_cover_30m_v0.1.tif",
    tile_size=(1024, 1024)
)
ds

We plot a subset of the opened data as an example below. The data shows the canopy cover fraction in the range [0, 100].

In [None]:
%%time
ds.band_1[100000:102000, 100000:102000].plot(vmin=0, vmax=100)

We can also open a TIFF as an [xcube multi-resolution dataset](https://xcube.readthedocs.io/en/latest/mldatasets.html#xcube-multi-resolution-datasets), where we can select the level of resolution. The opened dataset, however, is not cloud optimized and thus consists of only one level.

In [None]:
%%time
mlds = store.open_data(
    "8154445/planet_canopy_cover_30m_v0.1.tif",
    tile_size=(1024, 1024),
    data_type="mldataset"
)
mlds.num_levels

In [None]:
ds = mlds.get_dataset(0)
ds
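
To illustrate what the levels of a multi-resolution dataset represent, the sketch below computes the image shapes of a power-of-two pyramid, where level 0 has full resolution and each further level halves it. This is an illustration under assumed inputs, not xcube's own algorithm; the function `pyramid_shapes`, its stopping rule, and the example image size are hypothetical.

```python
def pyramid_shapes(height, width, tile_size=(1024, 1024)):
    """Return (height, width) per level, halving until one tile covers the image."""
    # Assumption: tile_size is given as (width, height).
    tile_w, tile_h = tile_size
    shapes = [(height, width)]
    while shapes[-1][0] > tile_h or shapes[-1][1] > tile_w:
        h, w = shapes[-1]
        # Round up so odd sizes do not lose the last row or column.
        shapes.append(((h + 1) // 2, (w + 1) // 2))
    return shapes


for level, shape in enumerate(pyramid_shapes(4096, 4096)):
    print(level, shape)
```

A dataset with a single level, such as the TIFF opened above, corresponds to a pyramid that contains only the level 0 entry.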

---

We can also use the zenodo data store to open NetCDF files, as shown in the following cells. Note that if `chunks` is given, the dataset is loaded lazily as a [chunked xr.Dataset](https://xarray.pydata.org/en/v0.10.2/dask.html).

In [None]:
%%time
ds = store.open_data(
    "13882297/gridded_tidestats_ERA5weather.nc",
    chunks={}
)
ds

We plot the Mean Low Water (MLW) data as an example. 

In [None]:
%%time
ds.MLW.plot()