# AVL Demo notebook 1: xcube data stores

This notebook demonstrates the usage of xcube data stores. A *data store* is a Python object generated by xcube, providing easy, uniform access to one of a range of data sources – for example, locally stored files, remote object storage, or an online data access API. You can create a data store using xcube’s `new_data_store` function; for convenience, several data stores are already predefined in AVL:

## Summary of predefined data stores

| store name | description |
|---|---|
| `lab_store` | File data in your JupyterLab environment (also visible in the file chooser on the left). |
| `user_store` | Your personal, private object storage. Only you can read and write data here. |
| `public_store_write` | Your publicly shared object storage. Only you can write to it, but all AVL users can read it. |
| `public_store_read` | Everyone’s publicly shared object storage. You can read both your own and other users’ publicly shared data here. |
| `scratch_store` | Insecure, temporary shared storage. All AVL users can read and write freely, and data are deleted automatically after two days. |
| `data_store` | Pre-processed, standard data sets made available for all users by the AVL project. |
| `staging_store` | A staging area for the `data_store` store. Data here are migrated to `data_store` once they have been thoroughly tested. |
| `test_store` | A pre-staging area for `staging_store` and `data_store`. Data here are migrated to `data_store` after some initial testing. |

Note that these predefined stores are only available from within the AVL environment.

## Preliminary set-up

First we set up some functions and variables which will be needed for the demonstration.

In [1]:
import os

from xcube.core.new import new_cube
from xcube.core.store import new_data_store

# JupyterHub sets this variable to the name of the logged-in user.
my_username = os.environ['JUPYTERHUB_USER']
print(f"Hello {my_username}!")

Hello pont-us!


We’ve set the variable `my_username` to the name of the currently logged-in user. This will be useful when working with the publicly shared object storage, where datasets are listed under their owner’s user name.
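Outside a JupyterHub session the `JUPYTERHUB_USER` variable may be unset, in which case the lookup in the cell above raises a `KeyError`. A minimal sketch of a more forgiving lookup follows. The helper name `current_username` and the fallback value `"anonymous"` are hypothetical and not part of AVL or xcube.

```python
import os


def current_username(default="anonymous"):
    """Return the JupyterHub user name, or `default` if the variable is unset."""
    # os.environ.get avoids the KeyError that os.environ[...] would raise.
    return os.environ.get("JUPYTERHUB_USER", default)


my_username = current_username()
print(f"Hello {my_username}!")
```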

## List the datasets in a store

You can list the identifiers of a store’s datasets with its `get_data_ids` method. For a remote store containing a lot of data, this may take a little while.

In [2]:
list(staging_store.get_data_ids())

['avl/l2a-s1-resampled/2019/bel/S1_L2_BCK_VH_31UFR.zarr',
 'avl/l2a-s1-resampled/2019/bel/S1_L2_BCK_VH_31UFS.zarr',
 'avl/l2a-s1-resampled/2019/bel/S1_L2_BCK_VV_31UFR.zarr',
 'avl/l2a-s1-resampled/2019/bel/S1_L2_BCK_VV_31UFS.zarr',
 'avl/l2a-s1-resampled/2019/bel/S1_L2_COH_VH_31UFR.zarr',
 'avl/l2a-s1-resampled/2019/bel/S1_L2_COH_VH_31UFS.zarr',
 'avl/l2a-s1-resampled/2019/bel/S1_L2_COH_VV_31UFR.zarr',
 'avl/l2a-s1-resampled/2019/bel/S1_L2_COH_VV_31UFS.zarr',
 'avl/l2a-s1-resampled/2019/fra/S1_L2_BCK_VH_30TWT.zarr',
 'avl/l2a-s1-resampled/2019/fra/S1_L2_BCK_VH_30TXT.zarr',
 'avl/l2a-s1-resampled/2019/fra/S1_L2_BCK_VV_30TWT.zarr',
 'avl/l2a-s1-resampled/2019/fra/S1_L2_BCK_VV_30TXT.zarr',
 'avl/l2a-s1-resampled/2019/fra/S1_L2_COH_VH_30TWT.zarr',
 'avl/l2a-s1-resampled/2019/fra/S1_L2_COH_VH_30TXT.zarr',
 'avl/l2a-s1-resampled/2019/fra/S1_L2_COH_VV_30TWT.zarr',
 'avl/l2a-s1-resampled/2019/fra/S1_L2_COH_VV_30TXT.zarr',
 'avl/l2a-s1-resampled/2020/bel/S1_L2_BCK_VH_31UFR.zarr',
 'avl/l2a-s1-r
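
Because the identifiers are plain strings with a path-like structure, ordinary Python is enough to narrow down a long listing. The sketch below filters by prefix and counts datasets per year. The list `data_ids` is a stand-in containing four identifiers copied from the output above, so that the code also runs outside AVL; inside AVL you would build it with `list(staging_store.get_data_ids())`. Reading the third path component as a year is an assumption based on the identifiers shown.

```python
from collections import Counter

# Stand-in for list(staging_store.get_data_ids()): four IDs copied from
# the listing above.
data_ids = [
    "avl/l2a-s1-resampled/2019/bel/S1_L2_BCK_VH_31UFR.zarr",
    "avl/l2a-s1-resampled/2019/bel/S1_L2_COH_VV_31UFS.zarr",
    "avl/l2a-s1-resampled/2019/fra/S1_L2_BCK_VH_30TWT.zarr",
    "avl/l2a-s1-resampled/2020/bel/S1_L2_BCK_VH_31UFR.zarr",
]

# Keep only the IDs that start with a given prefix.
prefix = "avl/l2a-s1-resampled/2019/bel/"
bel_2019 = [data_id for data_id in data_ids if data_id.startswith(prefix)]
print(bel_2019)

# Count datasets per year, taking the year from the third path component
# (an assumption based on the IDs shown above).
per_year = Counter(data_id.split("/")[2] for data_id in data_ids)
print(dict(per_year))
```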

## Write a dataset to a store

First we create and view a simple data cube containing some example data.

In [3]:
cube = new_cube(variables=dict(my_variable=1.0))
cube

Now we write this to our publicly shared data store. We specify `replace=True` to overwrite any existing dataset with the same name.

In [4]:
public_store_write.write_data(data=cube, data_id="my_test_data_1.zarr", replace=True)

'my_test_data_1.zarr'

The data are now readable by all AVL users through the `public_store_read` data store. If we list the IDs in `public_store_read`, we can see our dataset under our own user prefix.

In [5]:
list(public_store_read.get_data_ids())

['pont-us/my_test_data_1.zarr']
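In the output above, the identifier in `public_store_read` is the identifier used for writing, prefixed with the user name and a slash. A small sketch of that mapping follows, assuming the pattern seen in this one example holds generally. `public_read_id` is a hypothetical helper, not an xcube or AVL function.

```python
def public_read_id(username, data_id):
    """Return the ID expected in public_store_read for a dataset that
    `username` wrote to public_store_write under `data_id`.

    Assumes the "<username>/<data_id>" pattern seen in the listing above.
    """
    return f"{username}/{data_id}"


print(public_read_id("pont-us", "my_test_data_1.zarr"))
# prints: pont-us/my_test_data_1.zarr
```

Inside AVL, the result could then be passed to the store’s `open_data` method, for example `public_store_read.open_data(public_read_id(my_username, "my_test_data_1.zarr"))`, to read the dataset back.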

## Delete a dataset from a store

It’s simple to delete a dataset from a store by ID, and listing the store’s IDs afterwards confirms that it’s gone.

In [6]:
public_store_write.delete_data(data_id="my_test_data_1.zarr")

In [7]:
list(public_store_write.get_data_ids())

[]
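Deleting an identifier that is not present in the store may raise an error, so a script that cleans up after itself can check first. The following is a minimal sketch. `delete_if_present` is a hypothetical helper, and `store` is assumed to be any writable xcube data store, such as `public_store_write`. It relies on the store’s `has_data` and `delete_data` methods.

```python
def delete_if_present(store, data_id):
    """Delete `data_id` from `store` if it exists.

    Returns True if a dataset was deleted, False if there was nothing to delete.
    """
    if store.has_data(data_id):
        store.delete_data(data_id)
        return True
    return False
```

Inside AVL this could be called as `delete_if_present(public_store_write, "my_test_data_1.zarr")`.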

## Creating your own data stores

As well as using AVL’s predefined data stores, you can create your own using the `xcube.core.store.new_data_store` function. Look at the [xcube data store example notebook](https://github.com/dcs4cop/xcube/blob/master/examples/notebooks/datastores/1_getting_started.ipynb) for more details on the types of data stores available and how to create and use them.
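As a starting point, the sketch below wraps the creation of a store backed by a local directory. It assumes the `file` store type with its `root` parameter, as provided by recent xcube versions. `open_local_store` and the directory name in the usage comment are hypothetical. The import sits inside the function so that the definition loads even where xcube is not installed.

```python
def open_local_store(root):
    """Create an xcube data store that reads and writes under the directory `root`."""
    # Imported lazily so that defining this helper does not require xcube.
    from xcube.core.store import new_data_store

    # "file" selects xcube's local filesystem store; `root` is its base directory.
    return new_data_store("file", root=root)


# Example use (requires xcube; "my_data" is a hypothetical directory name):
# store = open_local_store("my_data")
# store.write_data(cube, "my_test_data_1.zarr", replace=True)
# list(store.get_data_ids())
```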