In [1]:
!test -f tuv_pl.grib || wget https://get.ecmwf.int/repository/test-data/emohawk/examples/tuv_pl.grib

## GRIB selection using metadata

We read a GRIB file containing 18 messages:

In [2]:
import emohawk

fs = emohawk.load_from("file", "tuv_pl.grib")

In [3]:
len(fs)

18

### Using sel

Calling sel() provides a "view":

In [4]:
a = fs.sel(level=500)
a.ls()

Unnamed: 0,centre,shortName,typeOfLevel,level,dataDate,dataTime,stepRange,dataType,number,gridType
0,ecmf,t,isobaricInhPa,500,20180801,1200,0,an,0,regular_ll
1,ecmf,u,isobaricInhPa,500,20180801,1200,0,an,0,regular_ll
2,ecmf,v,isobaricInhPa,500,20180801,1200,0,an,0,regular_ll


In [5]:
type(a)

emohawk.readers.grib.index.MaskFieldSet

We can use a dict instead of keyword arguments:

In [6]:
a = fs.sel({"level": 500, "shortName": "v"})
a.ls()

Unnamed: 0,centre,shortName,typeOfLevel,level,dataDate,dataTime,stepRange,dataType,number,gridType
0,ecmf,v,isobaricInhPa,500,20180801,1200,0,an,0,regular_ll


Lists are accepted:

In [7]:
a = fs.sel(level=[500, 850])
a.ls()

Unnamed: 0,centre,shortName,typeOfLevel,level,dataDate,dataTime,stepRange,dataType,number,gridType
0,ecmf,t,isobaricInhPa,850,20180801,1200,0,an,0,regular_ll
1,ecmf,u,isobaricInhPa,850,20180801,1200,0,an,0,regular_ll
2,ecmf,v,isobaricInhPa,850,20180801,1200,0,an,0,regular_ll
3,ecmf,t,isobaricInhPa,500,20180801,1200,0,an,0,regular_ll
4,ecmf,u,isobaricInhPa,500,20180801,1200,0,an,0,regular_ll
5,ecmf,v,isobaricInhPa,500,20180801,1200,0,an,0,regular_ll


Slices define closed intervals: both the start and the stop values are included, unlike in normal Python indexing:

In [8]:
a = fs.sel(param="t", level=slice(500, 850))
a.ls()

Unnamed: 0,centre,shortName,typeOfLevel,level,dataDate,dataTime,stepRange,dataType,number,gridType
0,ecmf,t,isobaricInhPa,850,20180801,1200,0,an,0,regular_ll
1,ecmf,t,isobaricInhPa,700,20180801,1200,0,an,0,regular_ll
2,ecmf,t,isobaricInhPa,500,20180801,1200,0,an,0,regular_ll
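
The matching rules shown so far (single value, list, closed-interval slice) can be sketched in plain Python. This is only an illustration of the semantics, not emohawk's implementation; the helper `matches` is hypothetical, and treating a `None` bound as open-ended is an assumption of the sketch.

```python
# Hypothetical illustration of sel()-style matching; not emohawk code.
def matches(value, selector):
    """Return True if a metadata value satisfies the selector."""
    if isinstance(selector, slice):
        # Closed interval: start and stop are both included.
        # Assumption of this sketch: a None bound means "unbounded".
        lower_ok = selector.start is None or value >= selector.start
        upper_ok = selector.stop is None or value <= selector.stop
        return lower_ok and upper_ok
    if isinstance(selector, (list, tuple)):
        # A list selects any of the listed values.
        return value in selector
    # A scalar selects by equality.
    return value == selector


levels = [1000, 850, 700, 500, 400, 300]
selected = [lev for lev in levels if matches(lev, slice(500, 850))]
print(selected)  # [850, 700, 500]
```

The result keeps the input order and contains both 500 and 850, in line with the output of the cell above.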


### Using isel

isel() works like sel() but takes indices instead of values. Note that isel() does not sort the coordinate values: it uses them in their order of appearance in the input data.

In [9]:
a = fs.isel(level=0)
a.ls()

Unnamed: 0,centre,shortName,typeOfLevel,level,dataDate,dataTime,stepRange,dataType,number,gridType
0,ecmf,t,isobaricInhPa,1000,20180801,1200,0,an,0,regular_ll
1,ecmf,u,isobaricInhPa,1000,20180801,1200,0,an,0,regular_ll
2,ecmf,v,isobaricInhPa,1000,20180801,1200,0,an,0,regular_ll


In [10]:
a = fs.isel({"level": 2, "shortName": 1})
a.ls()

Unnamed: 0,centre,shortName,typeOfLevel,level,dataDate,dataTime,stepRange,dataType,number,gridType
0,ecmf,u,isobaricInhPa,700,20180801,1200,0,an,0,regular_ll


In [11]:
a = fs.isel(level=[2,3], param=0)
a.ls()

Unnamed: 0,centre,shortName,typeOfLevel,level,dataDate,dataTime,stepRange,dataType,number,gridType
0,ecmf,t,isobaricInhPa,700,20180801,1200,0,an,0,regular_ll
1,ecmf,t,isobaricInhPa,500,20180801,1200,0,an,0,regular_ll


Slices behave as in normal Python indexing, so the stop index is excluded:

In [12]:
a = fs.isel(level=slice(2,5), param=0)
a.ls()

Unnamed: 0,centre,shortName,typeOfLevel,level,dataDate,dataTime,stepRange,dataType,number,gridType
0,ecmf,t,isobaricInhPa,700,20180801,1200,0,an,0,regular_ll
1,ecmf,t,isobaricInhPa,500,20180801,1200,0,an,0,regular_ll
2,ecmf,t,isobaricInhPa,400,20180801,1200,0,an,0,regular_ll
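
The index-based lookup can be pictured as indexing into the distinct coordinate values kept in their order of first appearance. The sketch below uses plain Python only; the helper `unique_in_order` is hypothetical, and the message order (three parameters per level, levels from 1000 down to 300) is an assumption based on the outputs shown in this notebook.

```python
# Hypothetical illustration of isel()-style indexing; not emohawk code.
def unique_in_order(values):
    """Distinct values in order of first appearance (no sorting)."""
    return list(dict.fromkeys(values))


# Assumed message order: for each level, three parameters (t, u, v).
message_levels = [
    lev for lev in (1000, 850, 700, 500, 400, 300) for _ in range(3)
]
level_coord = unique_in_order(message_levels)

print(level_coord[0])    # 1000
print(level_coord[2:5])  # [700, 500, 400]
```

Index 0 maps to level 1000 and the slice `2:5` maps to levels 700, 500 and 400, consistent with the isel() results above.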


### Using order_by

Calling order_by() provides a "view":

In [13]:
b = a.order_by()
type(b)

emohawk.readers.grib.index.MaskFieldSet

In [14]:
b.ls()

Unnamed: 0,centre,shortName,typeOfLevel,level,dataDate,dataTime,stepRange,dataType,number,gridType
0,ecmf,t,isobaricInhPa,700,20180801,1200,0,an,0,regular_ll


The sorting keys can be specified as a list:

In [15]:
b = a.order_by(["shortName"])
b.ls()

Unnamed: 0,centre,shortName,typeOfLevel,level,dataDate,dataTime,stepRange,dataType,number,gridType
0,ecmf,t,isobaricInhPa,700,20180801,1200,0,an,0,regular_ll


We can also prescribe the order of the values within a key. This only works when all the possible values of the key are listed:

In [16]:
a = a.order_by(shortName=["v", "t", "u"])
a.ls()

Unnamed: 0,centre,shortName,typeOfLevel,level,dataDate,dataTime,stepRange,dataType,number,gridType
0,ecmf,t,isobaricInhPa,700,20180801,1200,0,an,0,regular_ll
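
Sorting by a prescribed value order can be sketched by ranking each value by its position in the given list. This is a plain-Python illustration on hypothetical metadata dicts, not emohawk's implementation.

```python
# Hypothetical illustration of ordering by a prescribed value list.
fields = [
    {"shortName": "t", "level": 500},
    {"shortName": "u", "level": 500},
    {"shortName": "v", "level": 500},
]

order = ["v", "t", "u"]
# Map each value to its position in the prescribed order.
rank = {name: i for i, name in enumerate(order)}

# The lookup rank[...] fails for a value that is not listed in `order`.
ordered = sorted(fields, key=lambda f: rank[f["shortName"]])
print([f["shortName"] for f in ordered])  # ['v', 't', 'u']
```

In this sketch a value missing from `order` raises a `KeyError`, which mirrors the requirement that all possible values be listed; emohawk may handle that case differently.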


### Combining sel and order_by

In [17]:
a = fs.sel(level=[500, 850]).order_by(["shortName"])
a.ls()

Unnamed: 0,centre,shortName,typeOfLevel,level,dataDate,dataTime,stepRange,dataType,number,gridType
0,ecmf,t,isobaricInhPa,500,20180801,1200,0,an,0,regular_ll
1,ecmf,t,isobaricInhPa,850,20180801,1200,0,an,0,regular_ll
2,ecmf,u,isobaricInhPa,500,20180801,1200,0,an,0,regular_ll
3,ecmf,u,isobaricInhPa,850,20180801,1200,0,an,0,regular_ll
4,ecmf,v,isobaricInhPa,500,20180801,1200,0,an,0,regular_ll
5,ecmf,v,isobaricInhPa,850,20180801,1200,0,an,0,regular_ll


### Using coords

Coordinates are generated from almost all of the ecCodes MARS keys.

The coords that contain more than one value are available through the coords property:

In [18]:
c = fs.coords
c

{'levelist': (1000, 850, 700, 500, 400, 300), 'param': ('t', 'u', 'v')}
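
As a rough picture of what this property reports, the sketch below collects the distinct values of each key over a list of hypothetical metadata dicts and keeps only the keys with more than one value. The function `multi_valued_coords` and the message layout are assumptions for illustration, not emohawk's implementation.

```python
# Hypothetical illustration of collecting multi-valued coords; not emohawk code.
def multi_valued_coords(messages, keys):
    """Map each key to its distinct values, keeping keys with more than one."""
    result = {}
    for key in keys:
        # dict.fromkeys keeps the order of first appearance.
        values = tuple(dict.fromkeys(m[key] for m in messages))
        if len(values) > 1:
            result[key] = values
    return result


# Assumed layout of the example file: t, u, v on six pressure levels.
messages = [
    {"param": p, "levelist": lev, "date": 20180801}
    for lev in (1000, 850, 700, 500, 400, 300)
    for p in ("t", "u", "v")
]

coords = multi_valued_coords(messages, ["levelist", "param", "date"])
print(coords)
# {'levelist': (1000, 850, 700, 500, 400, 300), 'param': ('t', 'u', 'v')}
```

The `date` key is left out because it has a single value in this data.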

The values of a given coord can be queried with coord():

In [19]:
fs.coord("param")

('t', 'u', 'v')

In [20]:
fs.coord("date")

(20180801,)

Aliases can be used, e.g. level instead of levelist:

In [21]:
fs.coord("level")

(1000, 850, 700, 500, 400, 300)

Count the number of fields for each available level:

In [22]:
for level in fs.coord("level"):
    print(f"level={level} len={len(fs.sel(level=level))}")

level=1000 len=3
level=850 len=3
level=700 len=3
level=500 len=3
level=400 len=3
level=300 len=3
