## Reading file parts

In [1]:
import earthkit.data as ekd

In [2]:
ekd.download_example_file(["test.grib", "test6.grib", "tuv_pl.grib"])

Listing the extra `offset` key shows the byte position at which each message starts in the file.

In [3]:
ds = ekd.from_source("file", "test6.grib")
ds.ls(extra_keys="offset")

Unnamed: 0,centre,shortName,typeOfLevel,level,dataDate,dataTime,stepRange,dataType,number,gridType,offset
0,ecmf,t,isobaricInhPa,1000,20180801,1200,0,an,0,regular_ll,0.0
1,ecmf,u,isobaricInhPa,1000,20180801,1200,0,an,0,regular_ll,240.0
2,ecmf,v,isobaricInhPa,1000,20180801,1200,0,an,0,regular_ll,480.0
3,ecmf,t,isobaricInhPa,850,20180801,1200,0,an,0,regular_ll,720.0
4,ecmf,u,isobaricInhPa,850,20180801,1200,0,an,0,regular_ll,960.0
5,ecmf,v,isobaricInhPa,850,20180801,1200,0,an,0,regular_ll,1200.0
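
The offsets in this listing are enough to work out the byte range of every message. The following is a minimal sketch, assuming the messages are stored back to back and the file size is known (for example from `os.path.getsize`). `parts_from_offsets` is a hypothetical helper, not part of earthkit-data, and the 1440-byte file size is an assumption based on six messages of 240 bytes each.

```python
def parts_from_offsets(offsets, file_size):
    """Turn sorted message offsets into (offset, length) pairs.

    Assumes the messages are contiguous, so each one ends where the
    next begins and the last one ends at the end of the file.
    """
    ends = list(offsets[1:]) + [file_size]
    return [(int(start), int(end - start)) for start, end in zip(offsets, ends)]


# Offsets copied from the listing; 1440 is an assumed file size.
offsets = [0.0, 240.0, 480.0, 720.0, 960.0, 1200.0]
message_parts = parts_from_offsets(offsets, 1440)
print(message_parts)
```

Each resulting pair has the `(offset, length)` shape used by the `parts` examples in this notebook.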


### Single files

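A part is given as an `(offset, length)` pair in bytes. Here the part covers exactly the first message.

In [4]: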
ds = ekd.from_source("file", "test6.grib", parts=(0, 240))
ds.ls()

Unnamed: 0,centre,shortName,typeOfLevel,level,dataDate,dataTime,stepRange,dataType,number,gridType
0,ecmf,t,isobaricInhPa,1000,20180801,1200,0,an,0,regular_ll


The same part can also be passed as a one-element list:

In [5]:
ds = ekd.from_source("file", "test6.grib", parts=[(0, 240)])
ds.ls()

Unnamed: 0,centre,shortName,typeOfLevel,level,dataDate,dataTime,stepRange,dataType,number,gridType
0,ecmf,t,isobaricInhPa,1000,20180801,1200,0,an,0,regular_ll


A part can extend past a message boundary. Here the part covers bytes 0-244, so bytes 240-244 belong to the second message. That message is not read because the part does not contain all of its bytes.

In [6]:
ds = ekd.from_source("file", "test6.grib", parts=[(0, 245)])
ds.ls()

Unnamed: 0,centre,shortName,typeOfLevel,level,dataDate,dataTime,stepRange,dataType,number,gridType
0,ecmf,t,isobaricInhPa,1000,20180801,1200,0,an,0,regular_ll
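
The rule described above, that a message is read only when a part contains all of its bytes, can be modelled in plain Python. This is a sketch of the observed behaviour, not the library's implementation, and `messages_in_part` is a hypothetical helper.

```python
def messages_in_part(messages, part):
    """Return indices of the messages that lie entirely inside a part.

    messages: list of (offset, length) pairs, one per message.
    part: a single (offset, length) pair.
    """
    part_start, part_length = part
    part_end = part_start + part_length
    return [
        index
        for index, (offset, length) in enumerate(messages)
        if offset >= part_start and offset + length <= part_end
    ]


# Six messages of 240 bytes each, as in test6.grib.
messages = [(i * 240, 240) for i in range(6)]
print(messages_in_part(messages, (0, 245)))
```

With the part `(0, 245)` only the first message qualifies, matching the listing above.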


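Several parts can be given as a list, and a single part can span more than one message. Here the second part covers the third and fourth messages.

In [7]: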
ds = ekd.from_source("file", "test6.grib", parts=[(0, 240), (480, 480)])
ds.ls()

Unnamed: 0,centre,shortName,typeOfLevel,level,dataDate,dataTime,stepRange,dataType,number,gridType
0,ecmf,t,isobaricInhPa,1000,20180801,1200,0,an,0,regular_ll
1,ecmf,v,isobaricInhPa,1000,20180801,1200,0,an,0,regular_ll
2,ecmf,t,isobaricInhPa,850,20180801,1200,0,an,0,regular_ll


Parts must be listed in ascending offset order and must not overlap.

In [8]:
try:
    ds = ekd.from_source("file", "test6.grib", parts=[(0, 240), (220, 240)])
except Exception as e:
    print(e)

Offsets and lengths must be in order, and not overlapping: offset=220, end of previous part=240
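
A list of parts can be checked against this constraint before it is passed to `from_source`. The sketch below is a hypothetical helper (`check_parts` is not part of earthkit-data) that applies the ordering and non-overlap rule stated in the error message. It assumes each part is an `(offset, length)` pair.

```python
def check_parts(parts):
    """Raise ValueError unless parts are in offset order and disjoint."""
    previous_end = 0
    for offset, length in parts:
        # A part may start exactly where the previous one ended, but not earlier.
        if offset < previous_end:
            raise ValueError(
                f"part at offset={offset} starts before "
                f"end of previous part={previous_end}"
            )
        previous_end = offset + length
    return parts


check_parts([(0, 240), (480, 480)])  # valid: ordered and disjoint

try:
    check_parts([(0, 240), (220, 240)])
except ValueError as e:
    print(e)
```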


### Multiple files

For multiple files, each path can be paired with its own parts.

In [9]:
ds = ekd.from_source(
    "file",
    [
        ["test.grib", (0, 526)],
        ["test6.grib", [(0, 240), (480, 240)]],
    ],
)
ds.ls()

Unnamed: 0,centre,shortName,typeOfLevel,level,dataDate,dataTime,stepRange,dataType,number,gridType
0,ecmf,2t,surface,0,20200513,1200,0,an,0,regular_ll
1,ecmf,t,isobaricInhPa,1000,20180801,1200,0,an,0,regular_ll
2,ecmf,v,isobaricInhPa,1000,20180801,1200,0,an,0,regular_ll


When the parts for a given file are set to `None`, the whole file is read.

In [10]:
ds = ekd.from_source(
    "file",
    [
        ["test.grib", None],
        ["test6.grib", [(0, 240), (480, 240)]],
    ],
)
ds.ls()

Unnamed: 0,centre,shortName,typeOfLevel,level,dataDate,dataTime,stepRange,dataType,number,gridType
0,ecmf,2t,surface,0,20200513,1200,0,an,0,regular_ll
1,ecmf,msl,surface,0,20200513,1200,0,an,0,regular_ll
2,ecmf,t,isobaricInhPa,1000,20180801,1200,0,an,0,regular_ll
3,ecmf,v,isobaricInhPa,1000,20180801,1200,0,an,0,regular_ll


The **parts** keyword argument can also be used with multiple files. In this case the same parts are applied to each file in turn.

In [11]:
ds = ekd.from_source("file", ["test6.grib", "tuv_pl.grib"], parts=(0, 240))
ds.ls()

Unnamed: 0,centre,shortName,typeOfLevel,level,dataDate,dataTime,stepRange,dataType,number,gridType
0,ecmf,t,isobaricInhPa,1000,20180801,1200,0,an,0,regular_ll
1,ecmf,t,isobaricInhPa,1000,20180801,1200,0,an,0,regular_ll
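
Sharing one `parts` value across files can also be spelled out in the per-file form used earlier in this section. The sketch below builds that form with a hypothetical helper, `pair_with_parts`. That passing the result to `from_source` gives the same fields as the `parts` keyword argument is an assumption based on the description above, not something verified here.

```python
def pair_with_parts(paths, parts):
    """Build the per-file [path, parts] form from one shared parts value."""
    return [[path, parts] for path in paths]


file_specs = pair_with_parts(["test6.grib", "tuv_pl.grib"], (0, 240))
print(file_specs)

# Assumed to be equivalent to the parts=(0, 240) call above:
# ds = ekd.from_source("file", file_specs)
```

The explicit form is useful once the files need different parts, since individual entries can then be edited independently.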
