## Reading GRIB from FDB stream

In [1]:
import pyfdb 
import earthkit.data

<div class="alert alert-block alert-warning">
To run this script, FDB access is needed and the <b>FDB_HOME</b> environment variable has to be set correctly.
</div>
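A small guard can catch a missing variable before pyfdb is initialised. The sketch below uses only the standard library; the helper name `require_fdb_home` is hypothetical and not part of pyfdb. It only checks that the variable is set and non-empty, not that it points to a working FDB installation.

```python
import os


def require_fdb_home(env=None):
    """Return the FDB_HOME value, or raise if it is unset or empty.

    Hypothetical helper for illustration; not part of pyfdb.
    """
    # Use the real process environment unless a mapping is passed in
    env = os.environ if env is None else env
    value = env.get("FDB_HOME", "")
    if not value:
        raise RuntimeError("FDB_HOME is not set; FDB access will not work")
    return value
```

Calling `require_fdb_home()` before `pyfdb.FDB()` would then fail early with a readable message.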

In [2]:
# date must be adjusted since FDB only stores recent dates
request = {
    'class': 'od',
    'expver': '0001',
    'stream': 'oper',
    'date': '20230404',
    'time': [0, 12],
    'domain': 'g',
    'type': 'an',
    'levtype': 'sfc',
    'step': 0,
    'param': [151, 167]
}

# Must be set correctly
%env FDB_HOME=/home/fdbprod

fdb = pyfdb.FDB()

env: FDB_HOME=/home/fdbprod
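Since the date in the request has to be recent, it may be convenient to compute it instead of hard-coding it. The following is a minimal sketch using the standard library; the helper name `recent_date` and the default of one day back are assumptions, and how far back data is actually available depends on the FDB installation.

```python
from datetime import date, timedelta


def recent_date(days_back=1, today=None):
    """Return a date `days_back` days before `today` as a YYYYMMDD string.

    Hypothetical helper; `today` defaults to the current local date.
    """
    today = date.today() if today is None else today
    return (today - timedelta(days=days_back)).strftime("%Y%m%d")
```

The result could then be assigned with `request['date'] = recent_date()` before calling `fdb.retrieve(request)`.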


#### Iteration with one field at a time in memory

In [3]:
r = fdb.retrieve(request)

fs = earthkit.data.from_source("stream", r)
type(fs)

earthkit.data.sources.stream.StreamSource

Nothing is read at this moment.

We can only use *fs* for iteration. Fields created in the iteration get deleted when they go out of scope:

In [4]:
for f in fs:
    print(f"  param={f['param']} shape={f.values.shape} mean={f.values.mean()}")

  param=msl shape=(6599680,) mean=101165.30143351648
  param=2t shape=(6599680,) mean=288.471588476875
  param=msl shape=(6599680,) mean=101158.70017015976
  param=2t shape=(6599680,) mean=289.29876940588883


Once the iteration is completed, there is nothing left in *fs*, so a second loop yields no fields:

In [5]:
for f in fs:
    print(f"type(f)={type(f)}")
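The one-shot behaviour is the same as that of any Python generator. The sketch below is only an analogy in plain Python, with a hypothetical `field_stream` function standing in for the stream; it does not use earthkit and says nothing about how earthkit implements its stream source.

```python
def field_stream(names):
    """Stand-in for a stream: yields each item exactly once."""
    for name in names:
        yield name


stream = field_stream(["msl", "2t", "msl", "2t"])

first_pass = [name for name in stream]
# The generator is exhausted now, so a second loop finds nothing
second_pass = [name for name in stream]

print(first_pass)   # ['msl', '2t', 'msl', '2t']
print(second_pass)  # []
```

To iterate again, a new stream has to be created, which in the notebook means calling `fdb.retrieve(request)` once more.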

#### Using group_by

To read several fields from the stream into memory at a time, pass **group_by** to *from_source()*:

In [6]:
r = fdb.retrieve(request)
fs = earthkit.data.from_source("stream", r, group_by=2)
type(fs)

earthkit.data.sources.stream.StreamSource

In [7]:
for f in fs:
    # f is a FieldList containing 2 fields. It gets deleted when going out of scope
    print(len(f))
    print(f.metadata("param"))

2
['msl', '2t']
2
['msl', '2t']
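The idea of consuming a stream in fixed-size groups can be sketched in plain Python with `itertools.islice`. This is an illustration of the batching pattern only; the helper `in_groups` is hypothetical, and the handling of a trailing incomplete group here is a choice made for this sketch, not a statement about earthkit.

```python
from itertools import islice


def in_groups(iterable, size):
    """Yield successive lists of up to `size` items from a one-shot iterable."""
    it = iter(iterable)
    while True:
        # Pull at most `size` items; only these are held in memory at once
        group = list(islice(it, size))
        if not group:
            return
        yield group


groups = list(in_groups(["msl", "2t", "msl", "2t"], 2))
print(groups)  # [['msl', '2t'], ['msl', '2t']]
```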


#### Storing all the fields in memory

In [8]:
r = fdb.retrieve(request)

fs = earthkit.data.from_source("stream", r, group_by=0)
type(fs)

earthkit.data.sources.stream.StreamMemorySource

Nothing is read at this moment:

In [9]:
# _reader._fields is a private attribute, inspected here only to show what is held in memory
print(f"stored fields count={len(fs._reader._fields)}")

stored fields count=0


Calling any function on the fieldlist reads the messages into memory:

In [10]:
len(fs)

4

In [11]:
print(f"stored fields count={len(fs._reader._fields)}")

stored fields count=4
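The load-on-first-use pattern seen above can be sketched with a toy container. The class `LazyFields` is hypothetical and written in plain Python for illustration; it is not earthkit's implementation and only mimics the observable behaviour (empty storage until the first call that needs the data).

```python
class LazyFields:
    """Toy container that consumes its stream only on first access."""

    def __init__(self, stream):
        self._stream = stream
        self._fields = []      # nothing is read at construction time
        self._loaded = False

    def _load(self):
        # Read the whole stream once and keep the items in memory
        if not self._loaded:
            self._fields = list(self._stream)
            self._loaded = True

    def __len__(self):
        self._load()
        return len(self._fields)


lazy = LazyFields(iter(["msl", "2t", "msl", "2t"]))
count_before = len(lazy._fields)  # storage is still empty
total = len(lazy)                 # triggers the read
count_after = len(lazy._fields)   # all items are now stored
```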


In [12]:
fs.ls()

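|   | centre | shortName | typeOfLevel | level | dataDate | dataTime | stepRange | dataType | number | gridType |
|---|--------|-----------|-------------|-------|----------|----------|-----------|----------|--------|----------|
| 0 | ecmf | msl | surface | 0 | 20230404 | 0 | 0 | an | 0 | reduced_gg |
| 1 | ecmf | 2t | surface | 0 | 20230404 | 0 | 0 | an | 0 | reduced_gg |
| 2 | ecmf | msl | surface | 0 | 20230404 | 1200 | 0 | an | 0 | reduced_gg |
| 3 | ecmf | 2t | surface | 0 | 20230404 | 1200 | 0 | an | 0 | reduced_gg |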


In [13]:
a = fs.sel(param="2t")
print(type(a))

<class 'earthkit.data.readers.grib.index.MaskFieldSet'>


In [14]:
a.ls()

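|   | centre | shortName | typeOfLevel | level | dataDate | dataTime | stepRange | dataType | number | gridType |
|---|--------|-----------|-------------|-------|----------|----------|-----------|----------|--------|----------|
| 0 | ecmf | 2t | surface | 0 | 20230404 | 0 | 0 | an | 0 | reduced_gg |
| 1 | ecmf | 2t | surface | 0 | 20230404 | 1200 | 0 | an | 0 | reduced_gg |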


In [15]:
fs.to_xarray()