In [1]:
!pip install --quiet climetlab

# Creating a shared dataset of GRIBs

In [2]:
import climetlab as cml

## Download data to the climetlab cache

In [3]:
for month in range(1, 13):  # This takes a few minutes.
    cml.load_source(
        "mars",
        param=["2t"],
        levtype="sfc",
        area=[50, -50, 20, 50],
        grid=[1, 1],
        date=f"2012-{month}",
    )

2022-12-06 14:12:29 ECMWF API python library 1.6.3
2022-12-06 14:12:29 ECMWF API at https://api.ecmwf.int/v1
2022-12-06 14:12:29 Welcome Florian Pinault
2022-12-06 14:12:30 In case of problems, please check https://confluence.ecmwf.int/display/WEBAPI/Web+API+FAQ or contact servicedesk@ecmwf.int
2022-12-06 14:12:30 Request submitted
2022-12-06 14:12:30 Request id: 638f3fbe457ff7b14dec36fc
2022-12-06 14:12:30 Request is submitted
2022-12-06 14:12:32 Request is active
2022-12-06 14:12:37 Calling 'nice mars /tmp/20221206-1310/23/tmp-_marsvHsjNS.req'
2022-12-06 14:12:37 mars - WARN -
2022-12-06 14:12:37 mars - WARN -
2022-12-06 14:12:37 MIR environment variables:
2022-12-06 14:12:37 MIR_CACHE_PATH=/data/ec_coeff
2022-12-06 14:12:37 Using MARS binary: /usr/local/apps/mars/versions/20220411095548/bin/mars.bin
2022-12-06 14:12:37 mars - INFO   - 20221206.131231 - Welcome to MARS
2022-12-06 14:12:37 mars - INFO   - 20221206.131231 - MARS Client build stamp: 20220411095548
2022-12-06 14:12:37 ma
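
If explicit date ranges are preferred over the `2012-{month}` shorthand used in the loop above, the first and last day of each month can be computed with the standard library. This is only a sketch: `month_ranges` is a hypothetical helper, not part of climetlab, and the `start/to/end` string it prints follows MARS range syntax. Whether `cml.load_source` accepts that string unchanged is an assumption that has not been checked here.

```python
import calendar
from datetime import date


def month_ranges(year):
    """Return (first_day, last_day) ISO strings for each month of `year`."""
    ranges = []
    for month in range(1, 13):
        # monthrange returns (weekday of the first day, number of days in the month)
        _, n_days = calendar.monthrange(year, month)
        first = date(year, month, 1)
        last = date(year, month, n_days)
        ranges.append((first.isoformat(), last.isoformat()))
    return ranges


for first, last in month_ranges(2012):
    # Assumed MARS-style range string, e.g. "2012-01-01/to/2012-01-31"
    print(f"{first}/to/{last}")
```

Computing the last day with `calendar.monthrange` handles leap years such as 2012 without special cases.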

In [4]:
cml.load_source(
    "mars",
    param="msl",
    levtype="sfc",
    area=[50, -50, 20, 50],
    grid=[1, 1],
    date="2012-12-01",
);

2022-12-06 14:14:24 ECMWF API python library 1.6.3
2022-12-06 14:14:24 ECMWF API at https://api.ecmwf.int/v1
2022-12-06 14:14:25 Welcome Florian Pinault
2022-12-06 14:14:25 In case of problems, please check https://confluence.ecmwf.int/display/WEBAPI/Web+API+FAQ or contact servicedesk@ecmwf.int
2022-12-06 14:14:26 Request submitted
2022-12-06 14:14:26 Request id: 638f40323677e48ef887c792
2022-12-06 14:14:26 Request is submitted
2022-12-06 14:14:27 Request is active
2022-12-06 14:14:32 Calling 'nice mars /tmp/20221206-1310/64/tmp-_marst1S1si.req'
2022-12-06 14:14:32 mars - WARN -
2022-12-06 14:14:32 mars - WARN -
2022-12-06 14:14:32 MIR environment variables:
2022-12-06 14:14:32 MIR_CACHE_PATH=/data/ec_coeff
2022-12-06 14:14:32 Using MARS binary: /usr/local/apps/mars/versions/20220411095548/bin/mars.bin
2022-12-06 14:14:32 mars - INFO   - 20221206.131426 - Welcome to MARS
2022-12-06 14:14:32 mars - INFO   - 20221206.131426 - MARS Client build stamp: 20220411095548
2022-12-06 14:14:32 ma

## Export the data to a shared directory

This step is optional: you can keep working from the cache if you are the only user of the data and do not mind downloading it again later.
Other people should not use your cache:
- The climetlab cache eventually fills up, and data may then be deleted automatically.
- You will need to deal with permission issues.
- It makes the data difficult to share with other people.

Let us export the data to a shared directory, `shared-data/temperature-for-analysis`:

In [5]:
# Some housekeeping
!rm -rf shared-data/temperature-for-analysis
!mkdir -p shared-data/temperature-for-analysis

In [6]:
# Export all data in my cache that comes from mars and is not older than 1 hour
!climetlab export_cache shared-data/temperature-for-analysis --newer 1h --match mars

Copying cache entries matching 'mars' and newer than '2022-12-06 13:14:36' to shared-data/temperature-for-analysis.
100%|██████████████████████████████████████████| 13/13 [00:00<00:00, 145.45it/s]
Copied 13 cache entries to shared-data/temperature-for-analysis.

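A quick sanity check after the export is to count the files that landed in the shared directory and compare the count with the number of cache entries reported by the command. The sketch below uses only the standard library. `summarize_directory` is a hypothetical helper, and it assumes each exported cache entry is a single regular file directly inside the directory, which may not hold for every kind of cache entry.

```python
from pathlib import Path


def summarize_directory(path):
    """Return (number of regular files, total size in bytes) directly under `path`."""
    files = [p for p in Path(path).iterdir() if p.is_file()]
    total_bytes = sum(p.stat().st_size for p in files)
    return len(files), total_bytes


shared = Path("shared-data/temperature-for-analysis")
if shared.is_dir():
    # Only report when the directory exists, so the cell is safe to re-run
    count, size = summarize_directory(shared)
    print(f"{count} files, {size / 1e6:.1f} MB in {shared}")
```
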

## Create indexes to speed up data access (optional)

In [7]:
!climetlab index_directory shared-data/temperature-for-analysis


In [8]:
!climetlab availability shared-data/temperature-for-analysis

                                                                                

## Using the data


In [9]:
DATA = "shared-data/temperature-for-analysis"

In [10]:
source = cml.load_source("directory", DATA)

In [11]:
source.availability

This is a good time to check the data: is everything there? Are any dates or parameters missing?
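
One way to check for missing dates is to compare the dates actually present with the full expected range. The sketch below is independent of climetlab: `missing_dates` is a hypothetical helper, and the list of dates it receives is assumed to have been extracted from the data beforehand (how to do that is not shown here).

```python
from datetime import date, timedelta


def missing_dates(present, start, end):
    """Return the dates in [start, end] (inclusive) that do not appear in `present`."""
    present = set(present)
    n_days = (end - start).days + 1
    expected = (start + timedelta(days=i) for i in range(n_days))
    return [d for d in expected if d not in present]


# Made-up example: pretend 2012-01-03 was not downloaded.
have = [date(2012, 1, 1), date(2012, 1, 2), date(2012, 1, 4)]
print(missing_dates(have, date(2012, 1, 1), date(2012, 1, 4)))
```

An empty list means every day in the range is covered.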

The data is now ready to be used as a NumPy array, a TensorFlow dataset, or an xarray object.

In [12]:
source.sel(param="msl").to_numpy().mean()

101725.47522756307

In [13]:
cml.load_source("directory", DATA, param="msl").to_numpy().mean()

101725.47522756307

In [14]:
temp = source.sel(param="2t").order_by("date")
temp.to_tfdataset()

Metal device set to: Apple M1 Pro

systemMemory: 16.00 GB
maxCacheSize: 5.33 GB



2022-12-06 14:14:54.269537: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:306] Could not identify NUMA node of platform GPU ID 0, defaulting to 0. Your kernel may not have been built with NUMA support.
2022-12-06 14:14:54.269952: I tensorflow/core/common_runtime/pluggable_device/pluggable_device_factory.cc:272] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 0 MB memory) -> physical PluggableDevice (device: 0, name: METAL, pci bus id: <undefined>)


<ZipDataset element_spec=(TensorSpec(shape=<unknown>, dtype=tf.float32, name=None), TensorSpec(shape=<unknown>, dtype=tf.float32, name=None))>

In [15]:
temp.to_xarray()

In [16]:
# Note: the result of this call is wrong (not implemented yet)
temp.availability