# Gribber class

In [8]:
from aqua.gribber import Gribber
from aqua.util import get_machine, username, get_aqua_path

The `Gribber` class integrates the `gribscan` tool within AQUA. It converts GRIB files into catalog entries so that experiments can be easily added to the catalog.


It checks the availability of grib files in a specified directory `datadir` for a specific experiment and creates `.index` files and a `.json` file.
It also automatically updates the catalog `yaml` file.

It takes the following inputs, which have to be specified in a configuration yaml file (please check `AQUA/cli/gribber/gribber_setup.tmpl`):
- `model`: model name
- `exp`: experiment name
- `source`: source name
- `dir`: dictionary with directories for disk operations:
    - `datadir`: data directory 
    - `tmpdir`: temporary directory 
    - `jsondir`: JSON directory 
    - `configdir`: configuration directory
- `verbose`: print help message (default: `False`)
- `nprocs`: number of processors (default: `1`)
- `replace`: replace existing files (default: `False`)
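
To illustrate the structure described above, the same settings can be written as a Python dictionary. This is only a sketch: the values are placeholders, and `check_gribber_config` is a hypothetical helper that is not part of AQUA. The template file remains the reference for the actual layout.

```python
# Placeholder values; only the key names follow the list above.
config = {
    "model": "IFS",
    "exp": "my-experiment",
    "source": "my-source",
    "dir": {
        "datadir": "/path/to/grib/files",
        "tmpdir": "/path/to/tmp",
        "jsondir": "/path/to/json",
        "configdir": "/path/to/AQUA/config",
    },
    "verbose": False,
    "nprocs": 1,
    "replace": False,
}

REQUIRED_KEYS = ("model", "exp", "source", "dir")
REQUIRED_DIRS = ("datadir", "tmpdir", "jsondir", "configdir")


def check_gribber_config(cfg):
    """Return the names of missing entries (hypothetical helper, not part of AQUA)."""
    missing = [key for key in REQUIRED_KEYS if key not in cfg]
    dirs = cfg.get("dir", {})
    missing += [f"dir.{name}" for name in REQUIRED_DIRS if name not in dirs]
    return missing


# An empty list means that every expected entry is present.
print(check_gribber_config(config))
```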

We start by defining the default directories to work on Levante or LUMI, and then apply the class to a default IFS experiment. Please change the output directories if you want to try this notebook!

Note that in this example the `datadir` is hard-coded to point to a small subset of GRIB files.
Note also that we assume you downloaded AQUA into your `home` directory; for simplicity, this notebook uses the `USER` environment variable. If that is not the case, please change the `configdir` variable to point to `AQUA/config`.

In [12]:
user = username()
aqua_path = get_aqua_path()
machine = get_machine()

# Default directories for the machines covered by this notebook
if machine == 'levante':
    default_dir = {'datadir': f'/work/bb1153/{user}/grib-example/tco1279-orca025',
                   'tmpdir': f'/scratch/b/{user}/gribscan',
                   'jsondir': f'/work/bb1153/{user}/gribscan-json',
                   'configdir': f'{aqua_path}/config'}
elif machine == 'lumi':
    default_dir = {'datadir': f'/users/{user}/work/grib-example/tco1279-orca025',
                   'tmpdir': f'/users/{user}/scratch/gribscan',
                   'jsondir': f'/users/{user}/work/gribscan-json',
                   'configdir': f'{aqua_path}/config'}
else:
    # No defaults are provided for other machines: define default_dir manually
    raise ValueError(f"No default directories defined for machine '{machine}'")


Initializing the class does not immediately scan the data directory; it only creates the `Gribber` object.
The `create_entry` method scans the data directory, creates the `.index` files and the `.json` file, and then updates the catalog `yaml` file.

In [24]:
mygrib = Gribber(model='OSI-SAF', exp='osi-450', source='sh-daily', dirdict=default_dir, loglevel='INFO')

2024-02-09 18:07:43 :: gribber :: INFO     -> JSON directory: /users/nazarova/work/gribscan-json/OSI-SAF/osi-450
2024-02-09 18:07:43 :: gribber :: INFO     -> Data directory: /users/nazarova/work/grib-example/tco1279-orca025
2024-02-09 18:07:43 :: gribber :: INFO     -> JSON directory: /users/nazarova/work/gribscan-json/OSI-SAF/osi-450
2024-02-09 18:07:43 :: gribber :: INFO     -> Catalog directory: /users/nazarova/work/AQUA//config
2024-02-09 18:07:43 :: gribber :: INFO     -> json file will be named as daily.
2024-02-09 18:07:43 :: gribber :: INFO     -> Gribfile wildcard: sh????+*
2024-02-09 18:07:43 :: gribber :: INFO     -> Checking if indices already exist...
2024-02-09 18:07:43 :: gribber :: INFO     -> Checking if JSON file /users/nazarova/work/gribscan-json/OSI-SAF/osi-450/daily.json already exists...

Then we can create the catalog entries by calling the `create_entry()` method, which scans the entire GRIB archive with `gribscan`:

In [25]:
mygrib.create_entry()

2024-02-09 18:07:46 :: gribber :: INFO     -> Creating symlinks...
2024-02-09 18:07:46 :: gribber :: INFO     -> Searching in /users/nazarova/work/grib-example/tco1279-orca025...
2024-02-09 18:07:46 :: gribber :: INFO     -> /users/nazarova/work/grib-example/tco1279-orca025/sh????+*
2024-02-09 18:07:46 :: gribber :: INFO     -> Creating GRIB indices...

usage: gribscan-index [-h] [-o [OUTDIR]] [-f] [-n [NPROCS]] GRIB [GRIB ...]
gribscan-index: error: the following arguments are required: GRIB
2024-02-09 18:07:47 :: gribber :: INFO     -> CompletedProcess(args=['gribscan-index', '-n', '1'], returncode=2)
2024-02-09 18:07:47 :: gribber :: INFO     -> Creating JSON file /users/nazarova/work/gribscan-json/OSI-SAF/osi-450/daily.json...
usage: gribscan-build [-h] [-g [GLOB.index]] [-o outdir]
                      [--prefix template_prefix] [-m magician]
                      [GRIB.index ...]
gribscan-build: error: You need to pass a glob pattern or a file list.
2024-02-09 18:07:49 :: gribber :: ERROR    -> Gribscan has created different json filename!
2024-02-09 18:07:49 :: gribber :: ERROR    -> Please check and modify the catalog files accordingly
2024-02-09 18:07:49 :: gribber :: INFO     -> Block to be added to catalog file:

Because this example uses only a subset of the original GRIB files, we set the `replace` option to `False` so that the existing entry in the catalog is not overwritten.
However, because the `.index` and `.json` files are stored in a different folder from the original GRIB files, they are created the first time the `Gribber` class is called.
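
The `gribscan-index` usage error in the output above (`the following arguments are required: GRIB`) suggests that no file in the data directory matched the wildcard `sh????+*`. A quick way to check this before calling `create_entry()` is sketched below. It uses only the standard library; `count_matching_grib` is a hypothetical helper that is not part of AQUA, and the wildcard is the one reported in the log.

```python
import glob
import os


def count_matching_grib(datadir, wildcard="sh????+*"):
    """Count the files in datadir matching the wildcard (hypothetical helper)."""
    # Escape the directory so that only the wildcard part is treated as a pattern
    pattern = os.path.join(glob.escape(str(datadir)), wildcard)
    return len(glob.glob(pattern))


# Placeholder path; in this notebook it would be default_dir['datadir']
datadir = "/path/to/grib/files"
n_files = count_matching_grib(datadir)
if n_files == 0:
    print(f"No files matching the wildcard found in {datadir}")
```

If the count is zero, either `datadir` or the source name (which presumably determines the wildcard) needs to be adjusted before scanning.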

In addition, the `Gribber` class can be used via a command line tool in `cli/gribber/cli_gribber.py`. This is an executable Python script that is configured via a configuration yaml file (`--config configfile.yaml`), which must include the dictionary for the directories as well as the experiment details. A template for this configuration file can be found in `AQUA/cli/gribber/gribber_setup.tmpl`.
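
As a minimal sketch of how the command line tool could be launched from Python, the snippet below only assembles the command, using the script path and the `--config` option mentioned above. The AQUA location and the configuration file name are placeholders, and `gribber_cli_command` is a hypothetical helper that is not part of AQUA.

```python
import sys
from pathlib import Path


def gribber_cli_command(aqua_root, config_file):
    """Build the command line for cli_gribber.py (hypothetical helper)."""
    script = Path(aqua_root) / "cli" / "gribber" / "cli_gribber.py"
    return [sys.executable, str(script), "--config", str(config_file)]


# Placeholder paths; adapt them to your AQUA installation
cmd = gribber_cli_command("/path/to/AQUA", "configfile.yaml")
print(" ".join(cmd))
```

The resulting list can then be passed to `subprocess.run(cmd, check=True)` to execute the script.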