# Julia CFGRIB Development

Notebook containing notes on the development of a Julia equivalent of the Python [cfgrib](https://github.com/ecmwf/cfgrib) package.

## Deliverables

 - Deliverable 1 (by 30 December at the latest, preferably earlier, as we need a deliverable this year):
     - A plan for implementation
     - A feasibility study based on existing solutions in the Julia ecosystem. cfgrib for Python is successful because we started from an existing Python solution, xarray, and mapped GRIB onto it
     - A set of mockup Julia notebooks showing what future programs working with GRIB data will look like
 - Deliverable 2 (by July 2020?):
     - cfgrib v? Python version implemented in Julia
     - A use case (details to be decided in collaboration with ECMWF)

## Living Timeline

### 2019 - December: Feasibility Study

#### 2019-12-01: Python Functionality Exploration

Python notebook(s) exploring existing functionality in CFGRIB:

 - [x] Play with features mentioned in readme
 - [x] Look through [CFGRIB: EASY AND EFFICIENT GRIB FILE ACCESS IN XARRAY](https://www.ecmwf.int/sites/default/files/elibrary/2018/18727-cfgrib-easy-and-efficient-grib-file-access-xarray.pdf) presentation
 - [ ] Explore unit tests

#### 2019-12-08: Julia Existing Package Overview

Julia notebook(s) going through existing packages which may be useful in this project

Generic tools:

 - [ ] JuliaIO
 - [ ] JuliaDB
 - [ ] Images.jl vs. NamedArrays.jl vs. AxisArrays.jl
 - [ ] Unitful.jl

(Optionally) Look at existing GRIB/Climate specific tools:

 - [ ] ClimateTools.jl
 - [ ] GRIB.jl
 - [ ] GDAL.jl / ArchGDAL.jl
 - [ ] ICOADSDictionary.jl

#### 2019-12-15: Proposed Options for Julia Interface

 - [ ] Julia notebooks replicating Python functionality

#### 2019-12-22

 - [ ] Cleanup and deliver


## CFGRIB Features

 - Read GRIB as xarray
     - All standard xarray features work, which may be a problem if the Julia equivalent of xarray is missing crucial functionality
     - Integration with dask
     - "cfgrib reads a limited set of ecCodes recognised keys from the GRIB files"
 - Data model translation via cf2cdm
     - Dictionary of user-defined translations of CF compliant coordinates
     - Translations cover:
         - name: `depthBelowLand` -> `level`
         - units: `undefined` -> `m`
         - direction: `undefined` -> `increasing`
 - Filter heterogeneous GRIB files
     - If a file contains multiple `typeOfLevel` values, the error `ValueError: multiple values for unique key` is raised because the values are ambiguous
     - If different variables (`t` and `z`) share a coordinate but their coordinate values are not identical, the error `ValueError: key present and new value is different` is raised
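
The two errors listed above can be reproduced with a small, standard-library-only Python sketch. This is a simplified model, not cfgrib's implementation: messages are plain dicts, and `filter_messages`, `unique_value` and `add_coordinate` are hypothetical helper names. In cfgrib itself the usual remedy is to pass `filter_by_keys` through `backend_kwargs` when opening the file; the sketch only imitates that idea so the behaviour a Julia port needs to cover is easier to see.

```python
# Simplified model: each GRIB message is a dict of ecCodes-style keys.
# All helper names here are hypothetical and are not part of cfgrib.


def filter_messages(messages, filter_by_keys):
    """Keep only messages whose keys match every requested value."""
    return [
        m for m in messages
        if all(m.get(key) == value for key, value in filter_by_keys.items())
    ]


def unique_value(messages, key):
    """Return the single value of `key`, failing if messages disagree."""
    values = sorted({m[key] for m in messages})
    if len(values) > 1:
        raise ValueError(f"multiple values for unique key {key!r}: {values}")
    return values[0]


def add_coordinate(coords, name, values):
    """Register a shared coordinate, failing if it conflicts with an earlier one."""
    if name in coords and coords[name] != values:
        raise ValueError(f"key present and new value is different: {name!r}")
    coords[name] = values


messages = [
    {"shortName": "t", "typeOfLevel": "isobaricInhPa", "level": 500},
    {"shortName": "t", "typeOfLevel": "isobaricInhPa", "level": 850},
    {"shortName": "z", "typeOfLevel": "isobaricInhPa", "level": 500},
    {"shortName": "sp", "typeOfLevel": "surface", "level": 0},
]

# The unfiltered stream mixes level types, so the unique-key check fails.
try:
    unique_value(messages, "typeOfLevel")
    mixed_error = None
except ValueError as err:
    mixed_error = str(err)

# Filtering down to one level type removes the ambiguity.
pressure = filter_messages(messages, {"typeOfLevel": "isobaricInhPa"})
level_type = unique_value(pressure, "typeOfLevel")

# `t` has levels [500, 850] but `z` only has [500], so they cannot share `level`.
coords = {}
try:
    for name in ("t", "z"):
        levels = sorted(m["level"] for m in pressure if m["shortName"] == name)
        add_coordinate(coords, "level", levels)
    coord_error = None
except ValueError as err:
    coord_error = str(err)
```

A Julia interface would need an equivalent of both checks, plus some way for the user to narrow the selection before a dataset is built.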

## Notes

 - The xarray GRIB engine allows for Dask integration. If similar functionality is desired in Julia, then Dagger.jl / JuliaDB integration would be required.
 - Last I checked, AxisArrays.jl is the best xarray equivalent in Julia, but there are plans to [overhaul it in the future](https://github.com/JuliaCollections/AxisArraysFuture/issues/1). This is an inherent risk of using a young language: package stability is a long way off, as there are no established standard packages yet.
 - Small typo in cfgrib readme, 'Translate to a custom data model' says "cf2cfm" instead of "cf2cdm"
 - Translation via `cf2cdm` can sort coordinates in ascending/descending order
 - Some tests are performed via [cdsapi](https://github.com/ecmwf/cdsapi) calls - how should this be handled?

    - A GRIB stream (a file) is a list of GRIB messages
    - A GRIB message contains a single geographic field with latitude, longitude
    - Message metadata (keys) can be regarded as additional coordinates: time, level, etc.
    - MARS retrievals are typically nice hypercubes
    - Messages in a stream are completely independent, there's no guarantee
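
The message model in the list above can be sketched in a few lines of standard-library Python. This is an illustration under assumptions, not how cfgrib or ecCodes index a file: messages are plain dicts, the key names `time` and `level` stand in for real GRIB metadata keys, and `build_axes` and `missing_cells` are hypothetical helpers. The sketch treats metadata keys as coordinate axes and then checks whether the messages fill the whole cube, which a reader cannot take for granted when messages are independent.

```python
from itertools import product


def build_axes(messages, dims):
    """Collect the sorted distinct values of each metadata key used as an axis."""
    return {d: sorted({m[d] for m in messages}) for d in dims}


def missing_cells(messages, dims):
    """Return coordinate tuples with no message, i.e. holes in the hypercube."""
    axes = build_axes(messages, dims)
    present = {tuple(m[d] for d in dims) for m in messages}
    return [
        cell for cell in product(*(axes[d] for d in dims))
        if cell not in present
    ]


# Each message carries one field plus the metadata keys that locate it.
messages = [
    {"time": "2019-12-01T00", "level": 500, "values": [1.0, 2.0]},
    {"time": "2019-12-01T00", "level": 850, "values": [3.0, 4.0]},
    {"time": "2019-12-01T12", "level": 500, "values": [5.0, 6.0]},
]

dims = ("time", "level")
axes = build_axes(messages, dims)
holes = missing_cells(messages, dims)
```

Here `axes` holds two times and two levels, while `holes` lists the one combination that has no message. A Julia reader would have to decide whether such gaps are an error or are filled with missing values.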
