## xcube Data Store Framework - CCI Open Data Portal

To use the CCI ODP Data Store in a Jupyter Notebook, we need to execute the following lines:

In [1]:
%matplotlib inline
# The CCI data store uses asyncio. A notebook already runs an event loop, so we enable nested event loops:
import nest_asyncio
nest_asyncio.apply()

In [2]:
from xcube.core.store import find_data_store_extensions
from xcube.core.store import get_data_store_params_schema
from xcube.core.store import new_data_store
from IPython.display import JSON

Which data stores are available?

In [3]:
JSON({e.name: e.metadata for e in find_data_store_extensions()})

<IPython.core.display.JSON object>

Usually we need more information to get the actual data store object. Which data store parameters are available?

In [4]:
get_data_store_params_schema('cciodp')

<xcube.util.jsonschema.JsonObjectSchema at 0x1e28c9037c8>

Only mandatory parameters must be provided to instantiate the store class. The `cciodp` store can be created from its identifier alone:

In [5]:
store = new_data_store('cciodp')
store

<xcube_cci.dataaccess.CciOdpDataStore at 0x1e28c90c108>

Which datasets are provided? The list may contain both gridded and vector datasets.

In [7]:
JSON(list(store.get_data_ids()))

<IPython.core.display.JSON object>

Which in-memory data types are provided?

- `dataset` --> `xarray.Dataset` (gridded data)
- `mldataset` --> `xcube.core.mlds.MultiLevelDataset` (gridded data, multi-resolution pyramid)
- `geodataframe` --> `geopandas.GeoDataFrame` (vector data)


In [8]:
store.get_data_opener_ids()

('dataset:zarr:cciodp',)
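
The opener identifier `dataset:zarr:cciodp` consists of three colon-separated parts. The following is a minimal sketch of splitting it, assuming the parts denote the data type, the data format and the storage, in that order. The helper `split_opener_id` and the class `OpenerId` are hypothetical and not part of xcube.

```python
from typing import NamedTuple


class OpenerId(NamedTuple):
    """The three parts of an opener identifier (assumed meaning)."""
    data_type: str
    data_format: str
    storage: str


def split_opener_id(opener_id: str) -> OpenerId:
    """Split an identifier of the form '<type>:<format>:<storage>'."""
    parts = opener_id.split(":")
    if len(parts) != 3:
        raise ValueError(f"expected three ':'-separated parts, got {opener_id!r}")
    return OpenerId(*parts)


opener = split_opener_id("dataset:zarr:cciodp")
print(opener.data_type, opener.data_format, opener.storage)
```

Read this way, the store returns data of type `dataset`, which corresponds to `xarray.Dataset` in the list above.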

We can check whether a specific dataset is available ...

In [10]:
store.has_data('esacci.OC.5-days.L3S.CHLOR_A.multi-sensor.multi-platform.MERGED.3-1.geographic')

True

... but in many cases we want to query for certain criteria. How can we do that?

In [9]:
store.get_search_params_schema()

<xcube.util.jsonschema.JsonObjectSchema at 0x1e28c90cc08>

Now search for monthly ozone datasets:

In [10]:
iterator = store.search_data(ecv='OZONE', frequency='month')
JSON([item.to_dict() for item in iterator])

<IPython.core.display.JSON object>
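
The dataset identifiers used in this notebook are dot-separated strings. The sketch below filters such identifiers on the client side, assuming that the second field is the ECV and the third the frequency. These field meanings and the helper `filter_ids` are illustrative assumptions, not part of the xcube API. Note that the identifier carries the frequency as the raw token `mon`, whereas `search_data` is called with `frequency='month'`.

```python
from typing import Iterable, List, Optional


def filter_ids(data_ids: Iterable[str],
               ecv: Optional[str] = None,
               frequency: Optional[str] = None) -> List[str]:
    """Keep identifiers whose second and third fields match the criteria."""
    selected = []
    for data_id in data_ids:
        fields = data_id.split(".")
        if len(fields) < 3:
            continue  # not in the assumed 'esacci.<ecv>.<frequency>...' form
        id_ecv, id_frequency = fields[1], fields[2]
        if ecv is not None and id_ecv != ecv:
            continue
        if frequency is not None and id_frequency != frequency:
            continue
        selected.append(data_id)
    return selected


# The two identifiers that appear in this notebook.
data_ids = [
    "esacci.OC.5-days.L3S.CHLOR_A.multi-sensor.multi-platform.MERGED.3-1.geographic",
    "esacci.OZONE.mon.L3.NP.multi-sensor.multi-platform.MERGED.fv0002.r1",
]

ozone_ids = filter_ids(data_ids, ecv="OZONE", frequency="mon")
print(ozone_ids)
```

In practice `store.search_data()` is the better choice, because it queries the store's metadata rather than relying on the naming pattern.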

Which parameters are available for opening the dataset, and which of them are required?

In [11]:
store.get_open_data_params_schema('esacci.OZONE.mon.L3.NP.multi-sensor.multi-platform.MERGED.fv0002.r1')

<xcube.util.jsonschema.JsonObjectSchema at 0x1e28cd225c8>

The schema defines four parameters: `variable_names`, `time_range`, `bbox` and `crs`. Any other keyword is rejected with a `ValidationError` (for example, `var_names` instead of `variable_names`), so the names must match the schema exactly:

In [13]:
import xarray

dataset = store.open_data(
    'esacci.OZONE.mon.L3.NP.multi-sensor.multi-platform.MERGED.fv0002.r1',
    variable_names=['surface_pressure', 'O3_du', 'O3_du_tot'],
    time_range=['2008-01-01', '2008-12-10'])

# Render the dataset as plain text instead of HTML.
xarray.set_options(display_style="text")

dataset
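
The open parameters schema for this dataset does not allow additional properties, so a misspelled keyword such as `var_names` fails validation. The following sketch checks keyword arguments before they are passed to `open_data`. The parameter names and the bounding box limits are copied by hand from the schema, the rule that a bounding box has exactly four numbers is inferred from it, and the helper `check_open_params` is hypothetical rather than part of xcube.

```python
import difflib

# Parameter names and bbox limits copied from the open parameters schema
# of the ozone dataset; hard-coded here for illustration only.
ALLOWED_PARAMS = ("variable_names", "time_range", "bbox", "crs")
BBOX_LIMITS = ((-180, 180), (-90, 90), (-180, 180), (-90, 90))


def check_open_params(**params):
    """Return a list of human-readable problems; empty if none were found."""
    problems = []
    for name in params:
        if name not in ALLOWED_PARAMS:
            # Suggest the closest allowed name, if there is one.
            close = difflib.get_close_matches(name, ALLOWED_PARAMS, n=1)
            hint = f" (did you mean '{close[0]}'?)" if close else ""
            problems.append(f"unexpected parameter '{name}'{hint}")
    bbox = params.get("bbox")
    if bbox is not None:
        if len(bbox) != 4:
            problems.append("bbox must contain four numbers")
        else:
            for value, (low, high) in zip(bbox, BBOX_LIMITS):
                if not low <= value <= high:
                    problems.append(f"bbox value {value} is outside [{low}, {high}]")
    return problems


problems = check_open_params(
    var_names=["surface_pressure", "O3_du", "O3_du_tot"],
    time_range=["2008-01-01", "2008-12-10"],
)
print(problems)
```

A check like this only covers the names and ranges listed here. The store still validates the parameters against the full schema when `open_data` is called.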

In [None]:
dataset.surface_pressure.isel(time=1).plot.imshow(cmap='Greys', figsize=(16, 8))