# Introduction to xcube's "zenodo" data store and its `preload_data` method

This notebook shows how to preload a Zarr dataset that is published on [https://zenodo.org](https://zenodo.org) as a compressed zip archive whose file name lacks the `.zarr` extension. The archive is downloaded and unpacked, and the resulting Zarr dataset is made available so that the data store can subsequently use it as usual.

### Setup
To run this notebook, you need to install [`xcube_zenodo`](https://github.com/xcube-dev/xcube-zenodo). You can install it directly from the git repository by cloning the repository, changing into the `xcube-zenodo` directory, and following the steps below:

```bash
conda env create -f environment.yml
conda activate xcube-zenodo
pip install .
```

Note that [`xcube_zenodo`](https://github.com/xcube-dev/xcube-zenodo) is a plugin for [`xcube`](https://xcube.readthedocs.io/en/latest/); `xcube` itself is included in the `environment.yml`.

We start by importing everything we need:

In [1]:
%%time
from xcube.core.store import new_data_store
from xcube.core.store import get_data_store_params_schema

CPU times: user 3.15 s, sys: 264 ms, total: 3.42 s
Wall time: 1.57 s


Next, we get the schema of the store parameters needed to initialize a zenodo [data store](https://xcube.readthedocs.io/en/latest/dataaccess.html#data-store-framework).

In [2]:
%%time
store_params = get_data_store_params_schema("zenodo")
store_params

CPU times: user 46.7 ms, sys: 11 ms, total: 57.7 ms
Wall time: 57.9 ms


<xcube.util.jsonschema.JsonObjectSchema at 0x7fa82c407530>
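The output above is only the default object representation and does not reveal the parameters themselves. Assuming the schema object can be converted to a plain dictionary (xcube's JSON schema classes are expected to offer a `to_dict()` method for this), a small helper such as the following sketch could list each parameter with its type and default. The helper name `summarize_schema` and the stand-in dictionary are illustrative assumptions; the real schema may contain different entries.

```python
def summarize_schema(schema_dict):
    """Return one (name, type, default) tuple per property of a JSON-schema dict."""
    properties = schema_dict.get("properties", {})
    return [
        (name, spec.get("type", "any"), spec.get("default"))
        for name, spec in properties.items()
    ]


# Hand-written stand-in for `store_params.to_dict()`; the real schema may differ.
example_schema = {
    "type": "object",
    "properties": {
        "cache_store_id": {"type": "string", "default": "file"},
        "cache_store_params": {"type": "object"},
    },
}

for name, type_, default in summarize_schema(example_schema):
    print(f"{name}: type={type_}, default={default}")
```

In the notebook, the same helper would be called as `summarize_schema(store_params.to_dict())`, provided that `to_dict()` is available in the installed xcube version.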

We initialize a zenodo [data store](https://xcube.readthedocs.io/en/latest/dataaccess.html#data-store-framework). Once installed, the `xcube-zenodo` plugin is selected by passing `"zenodo"` as the first argument to the `new_data_store` function. We can optionally specify the cache data store's ID and parameters using the `cache_store_id` and `cache_store_params` keyword arguments. By default, `cache_store_id` is set to `file`, and `cache_store_params` defaults to `dict(root="zenodo_cache", max_depth=3)`.

In [3]:
%%time
store = new_data_store("zenodo")

CPU times: user 8.2 ms, sys: 0 ns, total: 8.2 ms
Wall time: 8.12 ms
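The call above relies on the default cache settings. As a sketch, the same settings can be written out explicitly and overridden where needed. The helper name `make_zenodo_store` and the cache directory `my_cache` are hypothetical; the keyword arguments and their default values are the ones quoted in the text above. The import is placed inside the function so that the snippet can be read and loaded without xcube, but calling the function requires both `xcube` and `xcube-zenodo` to be installed.

```python
# Default cache settings as quoted in the text, written out explicitly.
store_kwargs = dict(
    cache_store_id="file",
    cache_store_params=dict(root="zenodo_cache", max_depth=3),
)


def make_zenodo_store(**overrides):
    """Create a zenodo data store; needs xcube and xcube-zenodo installed."""
    # Imported lazily so that defining this helper does not require xcube.
    from xcube.core.store import new_data_store

    return new_data_store("zenodo", **{**store_kwargs, **overrides})


# Example with a hypothetical cache directory:
# store = make_zenodo_store(cache_store_params=dict(root="my_cache", max_depth=3))
```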


Compressed files can be preloaded using the `preload_data` method. It downloads compressed files that cannot be lazily loaded and stores them so that they remain readily available for the duration of the project. The method accepts preload parameters, whose schema can be viewed in the next cell.

In [4]:
%%time
preload_params = store.get_preload_data_params()
preload_params

CPU times: user 60 μs, sys: 6 μs, total: 66 μs
Wall time: 71 μs


<xcube.util.jsonschema.JsonObjectSchema at 0x7fa82c483770>

The `preload_data` method returns a handler, which can be used to cancel the preload by running `handler.cancel()` in the next cell. Note that the `preload_data` method is new and highly experimental.

In [None]:
handler = store.preload_data("7108392/seasfire.zip")

VBox(children=(HTML(value='<table>\n<thead>\n<tr><th>Data ID             </th><th>Status  </th><th>Progress  <…
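Conceptually, the unpacking part of a preload amounts to extracting the archive into the cache directory under a name that ends in `.zarr`. The following is a minimal standard-library sketch of that idea only. It is not the implementation used by `xcube-zenodo`, the function name `unpack_to_cache` is hypothetical, and the tiny archive created here merely stands in for a downloaded file.

```python
import tempfile
import zipfile
from pathlib import Path


def unpack_to_cache(zip_path, cache_root):
    """Extract `zip_path` into `<cache_root>/<stem>.zarr` and return that path."""
    zip_path = Path(zip_path)
    target = Path(cache_root) / f"{zip_path.stem}.zarr"
    target.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target


# Demo with a tiny fake archive; a real download would take the place of this file.
with tempfile.TemporaryDirectory() as tmp:
    tmp = Path(tmp)
    archive_path = tmp / "seasfire.zip"
    with zipfile.ZipFile(archive_path, "w") as archive:
        archive.writestr(".zgroup", '{"zarr_format": 2}')
    unpacked = unpack_to_cache(archive_path, tmp / "zenodo_cache")
    unpacked_name = unpacked.name
    unpacked_files = sorted(p.name for p in unpacked.iterdir())
    print(unpacked_name, unpacked_files)
```

Real archives may wrap the Zarr content in a top-level folder or contain several datasets, which this sketch deliberately ignores.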

The data IDs available in the cache store can be listed with the following line. The new data ID is identical to the original, except that the `.zip` extension, which indicates the compressed format, is replaced by `.zarr`.

In [None]:
store.cache_store.list_data_ids()
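The renaming rule described above can be expressed as a small helper. This is only an illustration of the rule as stated in the text; the function name `preloaded_data_id` is hypothetical and not part of the xcube API.

```python
from pathlib import PurePosixPath


def preloaded_data_id(data_id):
    """Map a zipped data ID to the data ID expected after preloading."""
    path = PurePosixPath(data_id)
    if path.suffix != ".zip":
        raise ValueError(f"expected a '.zip' data ID, got {data_id!r}")
    # Swap the '.zip' extension for '.zarr', keeping the record prefix intact.
    return str(path.with_suffix(".zarr"))


print(preloaded_data_id("7108392/seasfire.zip"))  # → 7108392/seasfire.zarr
```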

Next, we want to open the preloaded dataset. We first view the available parameters for opening the data.

In [None]:
%%time
open_params = store.get_open_data_params_schema(data_id="7108392/seasfire.zarr")
open_params

In [None]:
%%time
ds = store.open_data("7108392/seasfire.zarr")
ds

As an example, we plot the variable `cams_co2fire` of the opened dataset at the last time step.

In [None]:
%%time
ds.cams_co2fire.isel(time=-1).plot()