## Retrieving data from FDB

In [1]:
import earthkit.data

FDB (Fields DataBase) is a domain-specific object store developed at ECMWF for storing, indexing and retrieving GRIB data. For more information on FDB please consult the following pages:

- [FDB](https://fields-database.readthedocs.io/en/latest/)
- [pyfdb](https://pyfdb.readthedocs.io/en/latest/)

This example requires access to an FDB, and the <b>FDB_HOME</b> environment variable must be set correctly.
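
Before running the cells below, it can help to confirm that the variable is actually defined. The following is a minimal sketch using only the standard library; the helper name `fdb_home_is_set` is hypothetical and not part of earthkit-data or pyfdb, and it only checks that the variable is non-empty, not that it points to a valid FDB.

```python
import os


def fdb_home_is_set(environ=os.environ):
    """Return True if FDB_HOME is defined and not blank (hypothetical helper)."""
    return bool(environ.get("FDB_HOME", "").strip())


if not fdb_home_is_set():
    # This only reports the problem; it does not try to guess a location.
    print("FDB_HOME is not set; this example expects it to be defined.")
```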

The following request was written to retrieve data from the operational FDB at ECMWF. Note that the **date** must be adjusted, since the FDB at ECMWF only stores the most recent dates.

In [2]:
request = {
    'class': 'od',
    'expver': '0001',
    'stream': 'oper',
    'date': '20240421',
    'time': [0, 12],
    'domain': 'g',
    'type': 'an',
    'levtype': 'sfc',
    'step': 0,
    'param': [151, 167, 168]
}
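
Since the date in the request has to be recent, one option is to compute it rather than hard-code it. The sketch below builds a `YYYYMMDD` string for a given number of days in the past. The helper `recent_date`, the `base` dictionary (a cut-down stand-in for the request above) and the choice of "yesterday" are assumptions made for illustration; which dates are actually available depends on the FDB being queried.

```python
from datetime import datetime, timedelta, timezone


def recent_date(days_back=1, now=None):
    """Return the UTC date `days_back` days ago as a YYYYMMDD string."""
    now = now or datetime.now(timezone.utc)
    return (now - timedelta(days=days_back)).strftime("%Y%m%d")


# Stand-in for the request above, reduced to two keys for brevity.
base = {"date": "20240421", "time": [0, 12]}

# Build a copy with the new date so the original dict stays unchanged.
updated = dict(base, date=recent_date())
print(updated["date"])
```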

### Reading as a stream

#### Iteration with one field at a time in memory

In [3]:
ds = earthkit.data.from_source("fdb", request)
for f in ds:
    print(f)

GribField(msl,None,20240421,0,0,0)
GribField(2t,None,20240421,0,0,0)
GribField(2d,None,20240421,0,0,0)
GribField(msl,None,20240421,1200,0,0)
GribField(2t,None,20240421,1200,0,0)
GribField(2d,None,20240421,1200,0,0)


The stream can only be consumed once: when the iteration is completed, there is nothing left in *ds*, so a second pass yields no fields.

In [4]:
# Count the fields still available in the stream
sum(1 for _ in ds)

0
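
This single-pass behaviour is the same as that of an ordinary Python generator. The sketch below is only an analogy: `fake_stream` is a hypothetical stand-in that yields parameter names instead of GRIB fields and does not use earthkit-data at all.

```python
def fake_stream():
    """Hypothetical stand-in for a stream of fields."""
    yield from ["msl", "2t", "2d"]


stream = fake_stream()
first_pass = list(stream)   # consumes every item
second_pass = list(stream)  # the generator is already exhausted

print(first_pass)   # ['msl', '2t', '2d']
print(second_pass)  # []
```

To iterate again over real data, the source has to be recreated with `from_source`, as the following cells do.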

#### Iteration with group_by

In [5]:
ds = earthkit.data.from_source("fdb", request)
for f in ds.group_by("time"):
    print(f"len={len(f)} {f.metadata(('param', 'level'))}")

len=3 [('msl', 0), ('2t', 0), ('2d', 0)]
len=3 [('msl', 0), ('2t', 0), ('2d', 0)]
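
The output above is consistent with grouping consecutive fields that share the same `time` value. As a rough analogy only, and not a description of how earthkit-data implements `group_by`, the same grouping can be sketched with `itertools.groupby` on plain `(param, time)` tuples that stand in for the fields.

```python
from itertools import groupby

# (param, time) tuples in the order the fields arrive; stand-ins for GRIB fields.
fields = [
    ("msl", 0), ("2t", 0), ("2d", 0),
    ("msl", 1200), ("2t", 1200), ("2d", 1200),
]

# groupby starts a new group each time the key (here: time) changes.
groups = [
    [param for param, _ in group]
    for _, group in groupby(fields, key=lambda f: f[1])
]

for g in groups:
    print(f"len={len(g)} {g}")
```

Note that `itertools.groupby` only merges adjacent items, so fields arriving in a different order would produce different groups in this sketch.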


#### Iteration with batched

In [6]:
ds = earthkit.data.from_source("fdb", request)
for f in ds.batched(2):
    print(f"len={len(f)} {f.metadata(('param', 'level'))}")

len=2 [('msl', 0), ('2t', 0)]
len=2 [('2d', 0), ('msl', 0)]
len=2 [('2t', 0), ('2d', 0)]
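
Batching ignores the metadata and simply cuts the stream into chunks of a fixed size, which is why the second batch above mixes the two times. A minimal pure-Python sketch of the same idea follows; the helper `in_batches` is hypothetical and operates on parameter names rather than on GRIB fields.

```python
from itertools import islice


def in_batches(iterable, n):
    """Yield successive lists of at most n items from iterable."""
    it = iter(iterable)
    # islice takes up to n items each time; an empty list ends the loop.
    while batch := list(islice(it, n)):
        yield batch


# Parameter names in the order the fields arrive in the example above.
params = ["msl", "2t", "2d", "msl", "2t", "2d"]
batches = list(in_batches(params, 2))

for b in batches:
    print(f"len={len(b)} {b}")
```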


#### Storing all the fields in memory

In [7]:
ds = earthkit.data.from_source("fdb", request, read_all=True)

In [8]:
len(ds)

6

In [9]:
ds.ls()

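| | centre | shortName | typeOfLevel | level | dataDate | dataTime | stepRange | dataType | number | gridType |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ecmf | msl | surface | 0 | 20240421 | 0 | 0 | an | 0 | reduced_gg |
| 1 | ecmf | 2t | surface | 0 | 20240421 | 0 | 0 | an | 0 | reduced_gg |
| 2 | ecmf | 2d | surface | 0 | 20240421 | 0 | 0 | an | 0 | reduced_gg |
| 3 | ecmf | msl | surface | 0 | 20240421 | 1200 | 0 | an | 0 | reduced_gg |
| 4 | ecmf | 2t | surface | 0 | 20240421 | 1200 | 0 | an | 0 | reduced_gg |
| 5 | ecmf | 2d | surface | 0 | 20240421 | 1200 | 0 | an | 0 | reduced_gg |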


In [10]:
ds.sel(param="2t").ls()

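| | centre | shortName | typeOfLevel | level | dataDate | dataTime | stepRange | dataType | number | gridType |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ecmf | 2t | surface | 0 | 20240421 | 0 | 0 | an | 0 | reduced_gg |
| 1 | ecmf | 2t | surface | 0 | 20240421 | 1200 | 0 | an | 0 | reduced_gg |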


In [11]:
ds.to_xarray()

### Reading into a file

In [12]:
ds = earthkit.data.from_source("fdb", request, stream=False)

In [13]:
ds.ls()

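| | centre | shortName | typeOfLevel | level | dataDate | dataTime | stepRange | dataType | number | gridType |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ecmf | msl | surface | 0 | 20240421 | 0 | 0 | an | 0 | reduced_gg |
| 1 | ecmf | 2t | surface | 0 | 20240421 | 0 | 0 | an | 0 | reduced_gg |
| 2 | ecmf | 2d | surface | 0 | 20240421 | 0 | 0 | an | 0 | reduced_gg |
| 3 | ecmf | msl | surface | 0 | 20240421 | 1200 | 0 | an | 0 | reduced_gg |
| 4 | ecmf | 2t | surface | 0 | 20240421 | 1200 | 0 | an | 0 | reduced_gg |
| 5 | ecmf | 2d | surface | 0 | 20240421 | 1200 | 0 | an | 0 | reduced_gg |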
