# Gribber class

In [1]:
from aqua.gribber import Gribber

The `Gribber` class integrates the `gribscan` tool within AQUA. It can be used to convert GRIB files into catalog entries so that experiments can be easily added to the catalog.


It checks the availability of GRIB files in a specified directory `datadir` for a specific experiment and creates `.index` files and a `.json` file.
It also automatically updates the catalog `yaml` file.

It takes the following inputs, which have to be specified in a configuration yaml file (please check `AQUA/cli/gribber/gribber_setup.tmpl`):
- `model`: model name
- `exp`: experiment name
- `source`: source name
- `dir`: dictionary with directories for disk operations:
    - `datadir`: data directory 
    - `tmpdir`: temporary directory 
    - `jsondir`: JSON directory 
    - `configdir`: configuration directory
- `verbose`: print help message (default: `False`)
- `nprocs`: number of processors (default: `1`)
- `replace`: replace existing files (default: `False`)
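
As a rough illustration, the inputs listed above can be collected in a plain Python dictionary and checked before use. The helper `check_gribber_config` and the placeholder paths are hypothetical and not part of AQUA; only the key names and defaults mirror the list above.

```python
# Hypothetical helper, not part of AQUA: checks a configuration dictionary
# against the keys listed above and fills in the documented defaults.
REQUIRED_KEYS = ('model', 'exp', 'source', 'dir')
REQUIRED_DIRS = ('datadir', 'tmpdir', 'jsondir', 'configdir')
DEFAULTS = {'verbose': False, 'nprocs': 1, 'replace': False}


def check_gribber_config(config):
    """Return a copy of config with defaults applied.

    Raises KeyError if a required key or directory entry is missing.
    """
    missing = [key for key in REQUIRED_KEYS if key not in config]
    missing += [f'dir.{key}' for key in REQUIRED_DIRS
                if key not in config.get('dir', {})]
    if missing:
        raise KeyError(f"Missing configuration keys: {', '.join(missing)}")
    return {**DEFAULTS, **config}


# Placeholder paths, to be replaced with real directories
example_config = {
    'model': 'IFS',
    'exp': 'tco1279-orca025',
    'source': 'ICMGG_atm2d',
    'dir': {'datadir': '/path/to/grib', 'tmpdir': '/path/to/tmp',
            'jsondir': '/path/to/json', 'configdir': '/path/to/AQUA/config'},
}
checked = check_gribber_config(example_config)
print(checked['nprocs'], checked['replace'])
```

A check like this only catches missing keys early; it does not say anything about how `Gribber` itself validates its arguments.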

We start by defining the default directories to work on Levante and then apply the class to a default IFS experiment. Please change `outdir` if you want to try this notebook!

Note that in this example the `datadir` is hard-coded to point to a small subset of GRIB files.
Note also that we assume that you downloaded AQUA in your `home` directory; if not, please change the `configdir` variable to point to `AQUA/config`. For simplicity, this notebook uses the `USER` environment variable to build the paths.

In [4]:
import os
USER = os.getenv('USER')
outdir = USER
default_dir = {'datadir': '/work/bb1153/b382289/grib-example/tco1279-orca025',
               'tmpdir': '/scratch/b/' + outdir + '/gribscan',
               'jsondir': '/work/bb1153/' + outdir + '/gribscan-json',
               'configdir': '/home/b/' + outdir + '/AQUA/config'}
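
Before creating the `Gribber` object, it can help to see which of these directories already exist. The following is a minimal sketch using only the standard library; `missing_dirs` is a hypothetical helper, not part of AQUA, and whether `Gribber` creates missing output directories itself is not shown in this notebook, so the result is informational only.

```python
import os


def missing_dirs(dirdict):
    """Return the keys of dirdict whose paths are not existing directories."""
    return [key for key, path in dirdict.items() if not os.path.isdir(path)]


# Self-contained demonstration with placeholder paths; in this notebook one
# would call missing_dirs(default_dir) instead.
demo_dirs = {'datadir': os.getcwd(), 'tmpdir': '/this/path/should/not/exist'}
print(missing_dirs(demo_dirs))
```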


Initializing the `class` does not immediately scan the data directory, but only creates the `Gribber` object.
The method `create_entry` scans the data directory, creates the `.index` files and the `.json` file, and then updates the catalog `yaml` file.

In [5]:
mygrib = Gribber(model='IFS', exp='tco1279-orca025', source='ICMGG_atm2d',
                 dirdict=default_dir, loglevel='INFO')

2023-05-26 11:18:53 :: gribber :: INFO     -> Data directory: /work/bb1153/b382289/grib-example/tco1279-orca025
2023-05-26 11:18:53 :: gribber :: INFO     -> JSON directory: /work/bb1153/b382076/gribscan-json/tco1279-orca025
2023-05-26 11:18:53 :: gribber :: INFO     -> Catalog directory: /home/b/b382076/AQUA/config
2023-05-26 11:18:53 :: gribber :: INFO     -> Gribfile wildcard: ICMGG????+*
2023-05-26 11:18:53 :: gribber :: INFO     -> Checking if indices already exist...
2023-05-26 11:18:53 :: gribber :: INFO     -> Checking if JSON file already exists...
2023-05-26 11:18:53 :: gribber :: INFO     -> Checking if catalog file already exists...
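
The `Gribfile wildcard` reported in the log above is a shell-style pattern. As a small sketch of how such a pattern selects files, the snippet below uses Python's standard `fnmatch` module on a few file names; the first two follow the naming of the example data, while `ICMSHhvi1+000060` is an illustrative non-matching name. Whether `Gribber` uses `fnmatch`, `glob` or something else internally is not shown in this notebook.

```python
from fnmatch import fnmatch

pattern = 'ICMGG????+*'  # wildcard as reported in the log above

# '?' matches exactly one character, '*' matches any run of characters,
# and '+' is a literal character in shell-style patterns.
names = ['ICMGGhvi1+000060', 'ICMGGhvi1+000078', 'ICMSHhvi1+000060']
matches = [name for name in names if fnmatch(name, pattern)]
print(matches)
```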


Then we can create the catalog entries by calling the `create_entry()` method, which scans the entire GRIB archive with `gribscan`.

In [6]:
mygrib.create_entry()

2023-05-26 11:18:58 :: gribber :: INFO     -> Creating symlinks...
2023-05-26 11:18:58 :: gribber :: INFO     -> Searching in /work/bb1153/b382289/grib-example/tco1279-orca025...
2023-05-26 11:18:58 :: gribber :: INFO     -> /work/bb1153/b382289/grib-example/tco1279-orca025/ICMGG????+*
2023-05-26 11:18:58 :: gribber :: INFO     -> Creating GRIB indices...
2023-05-26 11:19:15 :: gribber :: INFO     -> CompletedProcess(args=['gribscan-index', '-n', '1', '/scratch/b/b382076/gribscan/tco1279-orca025/ICMGGhvi1+000060', '/scratch/b/b382076/gribscan/tco1279-orca025/ICMGGhvi1+000078', '/scratch/b/b382076/gribscan/tco1279-orca025/ICMGGhvi1+000090', '/scratch/b/b382076/gribscan/tco1279-orca025/ICMGGhvi1+000012', '/scratch/b/b382076/gribscan/tco1279-orca025/ICMGGhvi1+000036', '/scratch/b/b382076/gribscan/tco1279-orca025/ICMGGhvi1+000006', '/scratch/b/b382076/gribscan/tco1279-orca025/ICMGGhvi1+000072', '/scratch/b/b382076/gribscan/tco1279-orca025/ICMGGhvi1+000096', '/scratch/b/b382076/gribscan/tco

Since this example uses only a subset of the original GRIB files, we need to set the `replace` option to `False` so that the existing entry in the catalog is not overwritten.
However, since the folder in which the `.index` and `.json` files are stored is different from the one in which the original GRIB files are stored, these files are created the first time the `Gribber` class is called.

In addition, the `Gribber` class can be used via a command line tool in `cli/gribber/cli_gribber.py`. This is an executable Python script that is configured via a configuration yaml file (`--config configfile.yaml`), which must include the dictionary for the directories as well as the experiment details. A template for this configuration file can be found in `AQUA/cli/gribber/gribber_setup.tmpl`.
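
For completeness, here is a hedged sketch of how the command line call could be assembled from Python. Only the script path and the `--config` option come from the text above; the helper `gribber_command` is hypothetical, and the script path is assumed to be relative to the AQUA root directory.

```python
import sys


def gribber_command(configfile, script='cli/gribber/cli_gribber.py'):
    """Build the argument list for the command line tool (hypothetical helper)."""
    return [sys.executable, script, '--config', configfile]


cmd = gribber_command('configfile.yaml')
print(' '.join(cmd))
# To actually run it from the AQUA root directory, `cmd` could be passed
# to subprocess.run(cmd, check=True).
```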