# Gribber class

In [1]:
from aqua.gribber import Gribber
from aqua.util import load_yaml

The `Gribber` class from the `aqua` framework checks whether the GRIB files of a given experiment are available in the directory `datadir`, then creates the corresponding `.index` files and a `.json` file.
It also automatically updates the catalog `yaml` file.

It takes as input:
- `model`: model name
- `exp`: experiment name
- `source`: source name
- `nprocs`: number of processors (default: `1`)
- `dir`: dictionary of directories (the defaults are valid on Levante):
    - `datadir`: data directory (default: `/scratch/b/b382289/tco1279-orca025/nemo_deep/ICMGGc2`)
    - `tmpdir`: temporary directory (default: `/scratch/b/b382289/gribscan`)
    - `jsondir`: JSON directory (default: `/work/bb1153/b382289/gribscan-json`)
    - `catalogdir`: catalog directory (default: `/work/bb1153/b382289/AQUA/config/levante/catalog`)
- `verbose`: print additional messages while running (default: `False`)
- `replace`: replace existing files (default: `False`)
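
The way missing directories fall back to their defaults (visible in the verbose output of the next cell) can be pictured with a minimal sketch. The helper `fill_default_dirs` and the constant `DEFAULT_DIRS` are hypothetical names used for illustration only; they are not part of the `aqua` API, and the actual implementation may differ.

```python
# Hypothetical illustration, NOT the aqua implementation.
# The four keys and their default values are the ones listed above.
DEFAULT_DIRS = {
    "datadir": "/scratch/b/b382289/tco1279-orca025/nemo_deep/ICMGGc2",
    "tmpdir": "/scratch/b/b382289/gribscan",
    "jsondir": "/work/bb1153/b382289/gribscan-json",
    "catalogdir": "/work/bb1153/b382289/AQUA/config/levante/catalog",
}


def fill_default_dirs(dirs=None, defaults=DEFAULT_DIRS):
    """Return a copy of `dirs` in which missing or None entries use the defaults."""
    dirs = dict(dirs or {})
    for key, default in defaults.items():
        if dirs.get(key) is None:
            dirs[key] = default
    return dirs


# Only `datadir` is customised (placeholder path); the rest falls back to the defaults.
custom = fill_default_dirs({"datadir": "/path/to/my/grib/files", "tmpdir": None})
```

A dictionary with the same four keys is what the `dir` argument expects, so only the entries that differ from the defaults need to be thought about.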

In [2]:
mygrib = Gribber(model='IFS', exp='tco1279-orca025', source='ICMGG_atm2d', verbose=True)

Directory datadir is None. Using default directory:
/scratch/b/b382289/tco1279-orca025/nemo_deep/ICMGGc2
Directory tmpdir is None. Using default directory:
/scratch/b/b382289/gribscan
Directory jsondir is None. Using default directory:
/work/bb1153/b382289/gribscan-json
Directory catalogdir is None. Using default directory:
/work/bb1153/b382289/AQUA/config/levante/catalog
Data directory: /scratch/b/b382289/tco1279-orca025/nemo_deep/ICMGGc2
JSON directory: /work/bb1153/b382289/gribscan-json/tco1279-orca025
Catalog directory: /work/bb1153/b382289/AQUA/config/levante/catalog
Gribfile wildcard: ICMGG????+*
Catalog file: /work/bb1153/b382289/AQUA/config/levante/catalog/IFS/tco1279-orca025.yaml
JSON file: /work/bb1153/b382289/gribscan-json/tco1279-orca025/atm2d.json
Checking if indices already exist...
Indices already exist.
Checking if JSON file already exists...
JSON file already exists.
Checking if catalog file already exists...
Catalog file /work/bb1153/b382289/AQUA/config/levante/catalo

Instantiating the class does not scan the data directory; it only creates the `Gribber` object.
The method `create_entry` scans the data directory, creates the `.index` files and the `.json` file, and then updates the catalog `yaml` file.

In [3]:
mygrib.create_entry()

Folder /scratch/b/b382289/gribscan/tco1279-orca025 already exists
Folder /work/bb1153/b382289/gribscan-json/tco1279-orca025 already exists
Creating symlinks...
Searching in...
/scratch/b/b382289/tco1279-orca025/nemo_deep/ICMGGc2/ICMGG????+*
File /scratch/b/b382289/tco1279-orca025/nemo_deep/ICMGGc2/ICMGGhr2n+000064 already exists in /scratch/b/b382289/gribscan/tco1279-orca025
File /scratch/b/b382289/tco1279-orca025/nemo_deep/ICMGGc2/ICMGGhr2n+000040 already exists in /scratch/b/b382289/gribscan/tco1279-orca025
File /scratch/b/b382289/tco1279-orca025/nemo_deep/ICMGGc2/ICMGGhr2n+000096 already exists in /scratch/b/b382289/gribscan/tco1279-orca025
File /scratch/b/b382289/tco1279-orca025/nemo_deep/ICMGGc2/ICMGGhr2n+000024 already exists in /scratch/b/b382289/gribscan/tco1279-orca025
File /scratch/b/b382289/tco1279-orca025/nemo_deep/ICMGGc2/ICMGGhr2n+000048 already exists in /scratch/b/b382289/gribscan/tco1279-orca025
File /scratch/b/b382289/tco1279-orca025/nemo_deep/ICMGGc2/ICMGGhr2n+000056
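
The symlink step reported in the log above can be sketched as follows. This is only an illustration under stated assumptions: the function `link_grib_files` is a hypothetical name, not an `aqua` method, and the real code may handle existing files differently. The wildcard is the one printed as `Gribfile wildcard`.

```python
import glob
import os


def link_grib_files(datadir, tmpdir, wildcard="ICMGG????+*"):
    """Symlink every file in `datadir` matching `wildcard` into `tmpdir`.

    Hypothetical sketch, not the aqua implementation.
    Returns two lists of link paths: (created, skipped).
    """
    os.makedirs(tmpdir, exist_ok=True)
    created, skipped = [], []
    for src in sorted(glob.glob(os.path.join(datadir, wildcard))):
        dst = os.path.join(tmpdir, os.path.basename(src))
        if os.path.lexists(dst):
            # A link (or file) with this name is already there: leave it alone.
            skipped.append(dst)
        else:
            os.symlink(src, dst)
            created.append(dst)
    return created, skipped
```

Calling such a function a second time on the same folders creates nothing new and reports every file as skipped, which matches the "already exists" messages shown in the output.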

Because this example uses only a subset of the original GRIB files, the `replace` option is left at `False` so that the existing catalog entry is not overwritten.
The `.index` and `.json` files, however, are stored in a different folder from the original GRIB files, so they were created the first time the `Gribber` class was called.

The directories can also be specified in a `yaml` file: load the file with `load_yaml` and pass the resulting dictionary to the class through the `dir` argument.

In [5]:
# Use a name other than `dir` for the variable to avoid shadowing the Python builtin
dirs = load_yaml('gribber_setup_example.yaml')

mygrib_yaml = Gribber(model='IFS', exp='tco1279-orca025', source='ICMGG_atm2d', dir=dirs, verbose=True)

Data directory: /scratch/b/b382289/tco1279-orca025/nemo_deep/ICMGGc2
JSON directory: /work/bb1153/b382289/gribscan-json/tco1279-orca025
Catalog directory: /work/bb1153/b382289/AQUA/config/levante/catalog
Gribfile wildcard: ICMGG????+*
Catalog file: /work/bb1153/b382289/AQUA/config/levante/catalog/IFS/tco1279-orca025.yaml
JSON file: /work/bb1153/b382289/gribscan-json/tco1279-orca025/atm2d.json
Checking if indices already exist...
Indices already exist.
Checking if JSON file already exists...
JSON file already exists.
Checking if catalog file already exists...
Catalog file /work/bb1153/b382289/AQUA/config/levante/catalog/IFS/tco1279-orca025.yaml already exists.


The internal method `_check_steps` checks whether the indices and the JSON file still have to be created or whether each step has already been completed.
In addition, the method `_check_json` checks whether the catalog file has to be created.

In [4]:
mygrib._check_steps()

Checking if indices already exist...
Indices already exist.
Checking if JSON file already exists...
JSON file already exists.
Checking if catalog file already exists...
Catalog file /work/bb1153/b382289/AQUA/config/levante/catalog/IFS/tco1279-orca025.yaml already exists.
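
The kind of bookkeeping these checks perform can be sketched with plain file-existence tests. Everything in this block is an assumption made for illustration: `check_steps` is a hypothetical function, the idea that indices are recognised by the `.index` suffix in the temporary directory is inferred from the text, and treating `replace=True` as "redo every step" is one plausible reading of the `replace` option rather than confirmed behaviour.

```python
import glob
import os


def check_steps(tmpdir, jsonfile, catalogfile, replace=False):
    """Return a dict telling which steps still have to be run.

    Hypothetical sketch, not the aqua implementation.
    A value of True means "this step has to be (re)done".
    """
    has_indices = bool(glob.glob(os.path.join(tmpdir, "*.index")))
    return {
        "indices": replace or not has_indices,
        "json": replace or not os.path.isfile(jsonfile),
        "catalog": replace or not os.path.isfile(catalogfile),
    }
```

With all three kinds of file in place and `replace=False`, every entry is `False`, which corresponds to the three "already exists" messages printed above.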
