# CEDA DataPoint Search/Discovery

We have recently created the `ceda-datapoint` package to make it easier for our users to access CEDA STAC records and their associated cloud-optimised data formats. DataPoint provides a single tool for searching and accessing data across the CCI collection at CEDA. You can read more about the motivation at https://cedadev.github.io/datapoint/inspiration.html

In [2]:
from ceda_datapoint import DataPointClient

client = DataPointClient()  # Uses the CEDA API URL by default

Using the DataPoint client, we can list the queryable terms of any collection. This example uses the CCI collection.

In [3]:
client.list_query_terms(collection='cci')

['datetime',
 'start_datetime',
 'end_datetime',
 'units',
 'project_id',
 'institution_id',
 'platform_id',
 'activity_id',
 'source_id',
 'table_id',
 'project',
 'product_version',
 'frequency',
 'variables',
 'doi',
 'uuid',
 'created',
 'updated']
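
Since searches are filtered with `term=value` strings, it can be handy to check a term against this list before running a search. The snippet below is a minimal sketch of that idea. `unknown_terms` is a hypothetical helper, not part of `ceda-datapoint`, and it assumes every query entry has the simple `term=value` form.

```python
# Terms copied from the list_query_terms output above.
CCI_QUERY_TERMS = [
    'datetime', 'start_datetime', 'end_datetime', 'units', 'project_id',
    'institution_id', 'platform_id', 'activity_id', 'source_id', 'table_id',
    'project', 'product_version', 'frequency', 'variables', 'doi', 'uuid',
    'created', 'updated',
]


def unknown_terms(query, known_terms):
    """Return the terms used in `query` that are not in `known_terms`.

    Hypothetical helper: assumes each entry looks like 'term=value'.
    """
    unknown = []
    for entry in query:
        # Everything before the first '=' is treated as the term name.
        term = entry.split('=', 1)[0].strip()
        if term not in known_terms:
            unknown.append(term)
    return unknown


print(unknown_terms(['platform_id=GOES16'], CCI_QUERY_TERMS))  # prints []
print(unknown_terms(['platform=GOES16'], CCI_QUERY_TERMS))     # prints ['platform']
```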

DataPoint uses the pystac client API syntax for making searches, with a few minor additions to enhance cloud-optimised use cases.

In [4]:
item_search = client.search(
    collections=['cci'],
    query=[
        'platform_id=GOES16',
    ],
    max_items=8)
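
To make the query syntax more concrete, the sketch below shows how a string such as `'platform_id=GOES16'` corresponds to the JSON form used by the STAC API query extension. It is an illustration only: the operator table reflects our understanding of the shorthand accepted by pystac-client, and `to_query_extension` is a hypothetical function, not how DataPoint or pystac-client is actually implemented.

```python
# Shorthand operators and their STAC query extension names.
# Two-character operators come first so they are matched before '=', '>' and '<'.
OPERATORS = {
    '>=': 'gte',
    '<=': 'lte',
    '<>': 'neq',
    '=': 'eq',
    '>': 'gt',
    '<': 'lt',
}


def to_query_extension(entries):
    """Convert ['term=value', ...] strings into a query-extension style dict."""
    query = {}
    for entry in entries:
        for symbol, name in OPERATORS.items():
            if symbol in entry:
                term, value = entry.split(symbol, 1)
                query.setdefault(term.strip(), {})[name] = value.strip()
                break
        else:
            raise ValueError(f'No comparison operator found in {entry!r}')
    return query


print(to_query_extension(['platform_id=GOES16']))
# prints {'platform_id': {'eq': 'GOES16'}}
```

Values are kept as strings here; a real client may convert some of them to numbers before sending the request.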

Using DataPoint, we can list all the assets for the items returned by our search, as well as the subset of those assets that are cloud-optimised:

In [5]:
item_search.display_assets()

<DataPointItem: ESACCI-LST-L3C-LST-GOES-0.05deg_1MONTHLY-20180101230000-20201201230000-fv1.00 (Collection: cci)>
 - reference_file
<DataPointItem: ESACCI-LST-L3C-LST-GOES-0.05deg-20180104230000-20201231230000-fv1.00 (Collection: cci)>
 - reference_file
<DataPointItem: ESACCI-LST-L3C-LST-GOES-0.05deg_1MONTHLY-20180101220000-20201201220000-fv1.00 (Collection: cci)>
 - reference_file
<DataPointItem: ESACCI-LST-L3C-LST-GOES-0.05deg-20180104220000-20201231220000-fv1.00 (Collection: cci)>
 - reference_file
<DataPointItem: ESACCI-LST-L3C-LST-GOES-0.05deg-20180104114000-20201231140000-fv1.00 (Collection: cci)>
 - reference_file
<DataPointItem: ESACCI-LST-L3C-LST-GOES-0.05deg_1MONTHLY-20180101130000-20201201130000-fv1.00 (Collection: cci)>
 - reference_file
<DataPointItem: ESACCI-LST-L3C-LST-GOES-0.05deg-20180104130000-20201231130000-fv1.00 (Collection: cci)>
 - reference_file
<DataPointItem: ESACCI-LST-L3C-LST-GOES-0.05deg_1MONTHLY-20180101114000-20201201140000-fv1.00 (Collection: cci)>
 - ref

In [6]:
item_search.display_cloud_assets()

<DataPointItem: ESACCI-LST-L3C-LST-GOES-0.05deg_1MONTHLY-20180101230000-20201201230000-fv1.00 (Collection: cci)>
 - kerchunk
<DataPointItem: ESACCI-LST-L3C-LST-GOES-0.05deg-20180104230000-20201231230000-fv1.00 (Collection: cci)>
 - kerchunk
<DataPointItem: ESACCI-LST-L3C-LST-GOES-0.05deg_1MONTHLY-20180101220000-20201201220000-fv1.00 (Collection: cci)>
 - kerchunk
<DataPointItem: ESACCI-LST-L3C-LST-GOES-0.05deg-20180104220000-20201231220000-fv1.00 (Collection: cci)>
 - kerchunk
<DataPointItem: ESACCI-LST-L3C-LST-GOES-0.05deg-20180104114000-20201231140000-fv1.00 (Collection: cci)>
 - kerchunk
<DataPointItem: ESACCI-LST-L3C-LST-GOES-0.05deg_1MONTHLY-20180101130000-20201201130000-fv1.00 (Collection: cci)>
 - kerchunk
<DataPointItem: ESACCI-LST-L3C-LST-GOES-0.05deg-20180104130000-20201231130000-fv1.00 (Collection: cci)>
 - kerchunk
<DataPointItem: ESACCI-LST-L3C-LST-GOES-0.05deg_1MONTHLY-20180101114000-20201201140000-fv1.00 (Collection: cci)>
 - kerchunk


We can now collect all the cloud-optimised assets from our search into a DataPoint `cluster`, which we can then use to easily access a particular dataset.

Note: In this case we have not downloaded specific data files. Instead, we access the dataset as a whole using Xarray, so we can make selections or run analyses on the whole dataset and then plot the results, all without having to download tens to thousands of gigabytes of data we didn't actually need!

In [9]:
cluster = item_search.collect_cloud_assets()
ds = cluster[0].open_dataset()
ds

Output (Xarray dataset view, summarised): the variables are lazily loaded Dask arrays with the following layouts.

| Data type | Shape | Chunk shape | Array size | Chunk size | Dask graph |
|---|---|---|---|---|---|
| timedelta64[ns] | (33, 3600, 7200) | (1, 1200, 2400) | 6.37 GiB | 21.97 MiB | 297 chunks in 2 graph layers |
| float64 | (33, 3600, 7200) | (1, 1200, 2400) | 6.37 GiB | 21.97 MiB | 297 chunks in 2 graph layers |
| float32 | (33, 3600, 7200) | (1, 1200, 2400) | 3.19 GiB | 10.99 MiB | 297 chunks in 2 graph layers |
| float64 | (1,) | (1,) | 8 B | 8 B | 1 chunk in 2 graph layers |
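
The array summary above also shows why nothing needs to be downloaded in full. A float64 variable of shape (33, 3600, 7200) is split into chunks of shape (1, 1200, 2400), and only the chunks that overlap a selection need to be read. The sketch below reproduces the reported numbers with plain arithmetic. `chunk_stats` and `chunks_touched` are hypothetical helpers written for this illustration, and the example selection is made up.

```python
import math


def chunk_stats(shape, chunks, itemsize):
    """Return (number of chunks, array size in bytes, chunk size in bytes)."""
    n_chunks = math.prod(math.ceil(s / c) for s, c in zip(shape, chunks))
    array_bytes = math.prod(shape) * itemsize
    chunk_bytes = math.prod(chunks) * itemsize
    return n_chunks, array_bytes, chunk_bytes


def chunks_touched(selection, chunks):
    """Count chunks overlapping `selection`.

    `selection` holds one (start, stop) index range per dimension,
    with `stop` exclusive, as in Python slicing.
    """
    return math.prod(
        (stop - 1) // c - start // c + 1
        for (start, stop), c in zip(selection, chunks)
    )


shape = (33, 3600, 7200)
chunks = (1, 1200, 2400)

n_chunks, array_bytes, chunk_bytes = chunk_stats(shape, chunks, itemsize=8)
print(n_chunks)                         # prints 297
print(round(array_bytes / 2**30, 2))    # prints 6.37 (GiB)
print(round(chunk_bytes / 2**20, 2))    # prints 21.97 (MiB)

# Made-up selection: the first time step and a 1000 x 1000 window of grid cells.
selection = [(0, 1), (0, 1000), (0, 1000)]
print(chunks_touched(selection, chunks))  # prints 1
```

In this made-up case a single chunk of about 22 MiB covers the selection, compared with 6.37 GiB for the whole variable.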
