In [1]:
!test -f test6.grib || wget https://get.ecmwf.int/repository/test-data/earthkit-data/examples/test6.grib

## Reading GRIB from a stream

In [2]:
import earthkit.data

earthkit-data can load GRIB data from a **stream**, which can be an FDB stream, a standard Python IO stream or any object implementing the necessary stream methods. 

For simplicity, in this notebook we will use a **file stream** to demonstrate the usage of streams.

### Getting single items from the stream

We create a stream from a file containing 6 GRIB fields by simply calling *open()*. It returns an io.BufferedReader object (a file stream).

In [3]:
stream = open("test6.grib", "rb")

We load it into earthkit-data with the default **batch_size=1** option. With this setting, iterating through *fs* consumes one message from the stream at a time:

In [4]:
fs = earthkit.data.from_source("stream", stream)

At this point nothing has been read from the stream. As the iteration progresses, GribField objects are created and then deleted when they go out of scope. As a result, only one GRIB message is kept in memory at a time.

In [5]:
for f in fs:
    # f is a GribField object. It gets deleted when going out of scope
    print(f)

GribField(t,1000,20180801,1200,0,0)
GribField(u,1000,20180801,1200,0,0)
GribField(v,1000,20180801,1200,0,0)
GribField(t,850,20180801,1200,0,0)
GribField(u,850,20180801,1200,0,0)
GribField(v,850,20180801,1200,0,0)


Once the iteration has finished, no data is available in *fs* any longer. We can close the stream:

In [6]:
stream.close()
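
The one-pass behaviour described above is a general property of streams rather than something specific to GRIB. The following is a minimal sketch using only the standard library, with an in-memory `io.BytesIO` standing in for the file stream. The byte content is placeholder data, not GRIB, and the chunked reads only imitate message-by-message consumption. It also shows that a `with` block closes a stream automatically, as an alternative to calling *close()* by hand.

```python
import io

# Placeholder bytes standing in for a stream of messages (not real GRIB data)
stream = io.BytesIO(b"aabbcc")

chunks = []
while True:
    chunk = stream.read(2)  # consume two bytes at a time
    if not chunk:
        break  # an empty result means the stream is exhausted
    chunks.append(chunk)

leftover = stream.read()  # nothing is left after a full pass
stream.close()

# A with block closes the stream automatically on exit
with io.BytesIO(b"aabbcc") as s:
    first = s.read(2)
is_closed = s.closed
```

After the loop, a further `read()` returns an empty bytes object, so reading the data again requires a fresh stream.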

### Using batch_size

This time we create a stream and read 2 fields from it at a time by using **batch_size=2** in *from_source()*:

In [7]:
stream = open("test6.grib", "rb")
fs = earthkit.data.from_source("stream", stream, batch_size=2)

In [8]:
for f in fs:
    # f is a FieldList containing 2 fields. It gets deleted when going out of scope
    print(len(f))
    print(f.metadata("param"))

2
['t', 'u']
2
['v', 't']
2
['u', 'v']
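
The grouping seen in this output can be reproduced with a small standard-library sketch. The helper `batched_items` is hypothetical and is not how earthkit-data implements **batch_size**; it only illustrates how six items consumed in order fall into three batches of two. In this sketch a final batch may be shorter than the others when the item count is not a multiple of the batch size; whether earthkit-data behaves the same way is not covered in this notebook.

```python
from itertools import islice


def batched_items(items, batch_size):
    """Yield successive lists of up to batch_size items from any iterable."""
    it = iter(items)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return  # the underlying iterator is exhausted
        yield batch


# Parameter names in the order they appear in test6.grib
params = ["t", "u", "v", "t", "u", "v"]

batches = list(batched_items(params, 2))
uneven = list(batched_items(params, 4))  # last batch holds the remainder
```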


Once the iteration has finished, no data is available in *fs* any longer. We can close the stream:

In [9]:
stream.close()

### Storing each GRIB message in memory

We can also set **batch_size=0** in *from_source()* to load every message from the stream into memory:

In [10]:
stream = open("test6.grib", "rb")
fs = earthkit.data.from_source("stream", stream, batch_size=0)

The resulting earthkit-data object is empty at this point. However, as soon as we call any method on it, it consumes the whole stream and loads all the GRIB messages into memory. They stay in memory for as long as *fs* exists.
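
The deferred loading described here can be pictured with a small standard-library sketch. The class `LazyItems` is hypothetical and does not reflect the earthkit-data internals; it assumes newline-separated placeholder items instead of GRIB messages. It reads nothing when created, consumes the whole stream on first use, and keeps the result in memory afterwards.

```python
import io


class LazyItems:
    """Hypothetical container that reads its whole stream on first use."""

    def __init__(self, stream):
        self._stream = stream
        self._items = None  # nothing has been read yet

    def _load(self):
        if self._items is None:
            # Consume the entire stream once and keep the result in memory
            self._items = self._stream.read().split(b"\n")
        return self._items

    def __len__(self):
        return len(self._load())


stream = io.BytesIO(b"t\nu\nv")  # placeholder content, not GRIB
data = LazyItems(stream)

loaded_before = data._items is not None  # no read has happened yet
count = len(data)  # the first call triggers the full read
stream.close()
count_after_close = len(data)  # served from memory, not from the stream
```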

We can call all the standard earthkit-data methods on *fs*:

In [11]:
len(fs)

6

In [12]:
fs.ls()

| | centre | shortName | typeOfLevel | level | dataDate | dataTime | stepRange | dataType | number | gridType |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ecmf | t | isobaricInhPa | 1000 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
| 1 | ecmf | u | isobaricInhPa | 1000 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
| 2 | ecmf | v | isobaricInhPa | 1000 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
| 3 | ecmf | t | isobaricInhPa | 850 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
| 4 | ecmf | u | isobaricInhPa | 850 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
| 5 | ecmf | v | isobaricInhPa | 850 | 20180801 | 1200 | 0 | an | 0 | regular_ll |


In [13]:
a = fs.sel(param="t")
a.ls()

| | centre | shortName | typeOfLevel | level | dataDate | dataTime | stepRange | dataType | number | gridType |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | ecmf | t | isobaricInhPa | 1000 | 20180801 | 1200 | 0 | an | 0 | regular_ll |
| 1 | ecmf | t | isobaricInhPa | 850 | 20180801 | 1200 | 0 | an | 0 | regular_ll |


In [14]:
a = a.to_xarray()
a

We close the stream:

In [15]:
stream.close()