# 📃 Inventory

Ideally, each GRIB2 file has a companion "index" file: a plain ASCII text file that describes the contents of the GRIB2 file (i.e., what each GRIB message contains). An index file tells you the variable represented in each GRIB message, its level, its forecast lead time, and its starting byte in the file.

There are two "flavors" of index files, wgrib-style and eccodes-style.

NCEP models provide the wgrib-style index files while ECMWF models provide the eccodes-style index file.
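
To make the wgrib-style layout concrete, here is a minimal sketch of reading such an index by hand. It assumes each line follows the pattern `message:start_byte:d=reference_time:variable:level:forecast_time:`. The sample lines are reconstructed from the HRRR 500 mb messages in the inventory shown later on this page, and `parse_wgrib_index` is a hypothetical helper for illustration, not Herbie's own parser.

```python
# Sample wgrib-style index lines (assumed layout, values taken from the
# HRRR inventory on this page).
SAMPLE_IDX = """\
14:6299332:d=2024010100:HGT:500 mb:anl:
15:7003498:d=2024010100:TMP:500 mb:anl:
16:7550669:d=2024010100:DPT:500 mb:anl:
"""


def parse_wgrib_index(text):
    """Parse wgrib-style index text into a list of dicts (hypothetical helper)."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        # The first six colon-separated fields are the ones used here.
        message, start, ref_time, variable, level, forecast = line.split(":")[:6]
        rows.append(
            {
                "grib_message": int(message),
                "start_byte": int(start),
                "end_byte": None,  # filled in below when the next message is known
                "reference_time": ref_time.removeprefix("d="),
                "variable": variable,
                "level": level,
                "forecast_time": forecast,
                "search_this": f":{variable}:{level}:{forecast}",
            }
        )
    # The index only lists start bytes, so each message is taken to end
    # one byte before the next message starts.
    for current, following in zip(rows, rows[1:]):
        current["end_byte"] = following["start_byte"] - 1
    return rows


rows = parse_wgrib_index(SAMPLE_IDX)
for row in rows:
    print(row["grib_message"], row["start_byte"], row["end_byte"], row["search_this"])
```

The last message in the sample has no end byte because the index does not say where the file ends; a real parser has to handle that case separately.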

Herbie provides a parser to read the index file into a Pandas DataFrame and calls it the file's **inventory**.

Let's start by looking at the inventory for a HRRR file.


In [1]:
from herbie import Herbie

In [2]:
H = Herbie("2024-01-01", model="hrrr")
H

✅ Found ┊ model=hrrr ┊ product=sfc ┊ 2024-Jan-01 00:00 UTC F00 ┊ GRIB2 @ aws ┊ IDX @ aws


▌▌Herbie HRRR model sfc product initialized 2024-Jan-01 00:00 UTC F00 ┊ source=aws

The path of the relevant index file is given by `H.idx`. You can go to that URL and see what the raw index file looks like.


In [4]:
H.idx

'https://noaa-hrrr-bdp-pds.s3.amazonaws.com/hrrr.20240101/conus/hrrr.t00z.wrfsfcf00.grib2.idx'

Herbie parses the raw index file into a Pandas DataFrame when you call `H.inventory()`.


In [3]:
H.inventory()

| | grib_message | start_byte | end_byte | range | reference_time | valid_time | variable | level | forecast_time | search_this |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 202809.0 | 0-202809 | 2024-01-01 | 2024-01-01 | REFC | entire atmosphere | anl | :REFC:entire atmosphere:anl |
| 1 | 2 | 202810 | 246792.0 | 202810-246792 | 2024-01-01 | 2024-01-01 | RETOP | cloud top | anl | :RETOP:cloud top:anl |
| 2 | 3 | 246793 | 496145.0 | 246793-496145 | 2024-01-01 | 2024-01-01 | var discipline=0 center=7 local_table=1 parmca... | entire atmosphere | anl | :var discipline=0 center=7 local_table=1 parmc... |
| 3 | 4 | 496146 | 649032.0 | 496146-649032 | 2024-01-01 | 2024-01-01 | VIL | entire atmosphere | anl | :VIL:entire atmosphere:anl |
| 4 | 5 | 649033 | 2038336.0 | 649033-2038336 | 2024-01-01 | 2024-01-01 | VIS | surface | anl | :VIS:surface:anl |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 165 | 166 | 126776108 | 126785469.0 | 126776108-126785469 | 2024-01-01 | 2024-01-01 | ICEC | surface | anl | :ICEC:surface:anl |
| 166 | 167 | 126785470 | 128189723.0 | 126785470-128189723 | 2024-01-01 | 2024-01-01 | SBT123 | top of atmosphere | anl | :SBT123:top of atmosphere:anl |
| 167 | 168 | 128189724 | 130514441.0 | 128189724-130514441 | 2024-01-01 | 2024-01-01 | SBT124 | top of atmosphere | anl | :SBT124:top of atmosphere:anl |
| 168 | 169 | 130514442 | 131785130.0 | 130514442-131785130 | 2024-01-01 | 2024-01-01 | SBT113 | top of atmosphere | anl | :SBT113:top of atmosphere:anl |


Notice the `search_this` column. Herbie runs regular expression searches against this column to filter for the GRIB messages you want. For example, if you want all the variables at 500 mb...


In [9]:
H.inventory(":500 mb")

| | grib_message | start_byte | end_byte | range | reference_time | valid_time | variable | level | forecast_time | search_this |
|---|---|---|---|---|---|---|---|---|---|---|
| 13 | 14 | 6299332 | 7003497.0 | 6299332-7003497 | 2024-01-01 | 2024-01-01 | HGT | 500 mb | anl | :HGT:500 mb:anl |
| 14 | 15 | 7003498 | 7550668.0 | 7003498-7550668 | 2024-01-01 | 2024-01-01 | TMP | 500 mb | anl | :TMP:500 mb:anl |
| 15 | 16 | 7550669 | 8417238.0 | 7550669-8417238 | 2024-01-01 | 2024-01-01 | DPT | 500 mb | anl | :DPT:500 mb:anl |
| 16 | 17 | 8417239 | 8997799.0 | 8417239-8997799 | 2024-01-01 | 2024-01-01 | UGRD | 500 mb | anl | :UGRD:500 mb:anl |
| 17 | 18 | 8997800 | 9584981.0 | 8997800-9584981 | 2024-01-01 | 2024-01-01 | VGRD | 500 mb | anl | :VGRD:500 mb:anl |
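
The filtering shown above can be approximated with Python's standard `re` module. This is only a sketch of the idea: the `search_this` strings are copied from the inventory on this page, `filter_search_strings` is a hypothetical helper, and Herbie's internal implementation may differ.

```python
import re

# A handful of `search_this` values copied from the HRRR inventory above.
search_strings = [
    ":REFC:entire atmosphere:anl",
    ":VIS:surface:anl",
    ":HGT:500 mb:anl",
    ":TMP:500 mb:anl",
    ":DPT:500 mb:anl",
    ":UGRD:500 mb:anl",
    ":VGRD:500 mb:anl",
    ":ICEC:surface:anl",
]


def filter_search_strings(strings, pattern):
    """Keep the strings in which the regex matches anywhere (hypothetical helper)."""
    regex = re.compile(pattern)
    return [s for s in strings if regex.search(s)]


# Everything at 500 mb.
at_500mb = filter_search_strings(search_strings, r":500 mb")

# Only the U and V wind components at 500 mb.
winds_500mb = filter_search_strings(search_strings, r":[UV]GRD:500 mb")

print(at_500mb)
print(winds_500mb)
```

Because the pattern may match anywhere in the string, the leading colon matters: `:500 mb` anchors the match to the start of the level field, so it would not also match a level such as `1500 mb`.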


Notice that only the rows that contain `500 mb` are selected. This is useful when you want to download a subset of variables from the GRIB file. The `range` column gives the byte range of each GRIB message in the file. Herbie uses these byte ranges when you download only the selected variables or open them in xarray.


In [11]:
H.download(":500 mb", verbose=True, overwrite=True)

📇 Download subset: ▌▌Herbie HRRR model sfc product initialized 2024-Jan-01 00:00 UTC F00 ┊ source=aws
 cURL from https://noaa-hrrr-bdp-pds.s3.amazonaws.com/hrrr.20240101/conus/hrrr.t00z.wrfsfcf00.grib2
Found 5 grib messages.
Download subset group 1
  14  :HGT:500 mb:anl
  15  :TMP:500 mb:anl
  16  :DPT:500 mb:anl
  17  :UGRD:500 mb:anl
  18  :VGRD:500 mb:anl
curl -s --range 6299332-9584981 "https://noaa-hrrr-bdp-pds.s3.amazonaws.com/hrrr.20240101/conus/hrrr.t00z.wrfsfcf00.grib2" > "/home/blaylock/data/hrrr/20240101/subset_6befbe61__hrrr.t00z.wrfsfcf00.grib2"
💾 Saved the subset to /home/blaylock/data/hrrr/20240101/subset_6befbe61__hrrr.t00z.wrfsfc

PosixPath('/home/blaylock/data/hrrr/20240101/subset_6befbe61__hrrr.t00z.wrfsfcf00.grib2')
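
The log above shows a single `curl --range 6299332-9584981` call covering all five messages, which suggests that adjacent byte ranges are merged into one request. The sketch below illustrates that idea under this assumption; `group_contiguous` is a hypothetical helper, and the byte ranges are copied from the inventory on this page.

```python
def group_contiguous(ranges):
    """Merge (start, end) byte ranges that sit directly next to each other."""
    groups = []
    for start, end in sorted(ranges):
        if groups and start == groups[-1][1] + 1:
            # This message begins right after the previous one ends,
            # so extend the current group instead of starting a new one.
            groups[-1][1] = end
        else:
            groups.append([start, end])
    return [tuple(group) for group in groups]


# Byte ranges of the five 500 mb messages (inventory rows 13 to 17).
ranges_500mb = [
    (6299332, 7003497),
    (7003498, 7550668),
    (7550669, 8417238),
    (8417239, 8997799),
    (8997800, 9584981),
]

groups = group_contiguous(ranges_500mb)

# Adding the REFC message (bytes 0-202809) produces a second, separate group.
groups_with_refc = group_contiguous(ranges_500mb + [(0, 202809)])

# Each group maps onto one HTTP Range header value (or one `curl --range`).
range_headers = [f"bytes={start}-{end}" for start, end in groups_with_refc]

print(groups)
print(range_headers)
```

Merging adjacent ranges keeps the number of requests small when the selected messages are stored next to each other in the file.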

In [13]:
H.xarray(":500 mb")



More examples of valid regular expressions are found in the [Herbie Docs: Subset with search](https://herbie.readthedocs.io/en/stable/user_guide/search.html). Using `H.inventory(search)` is an effective way to test different regex patterns until they select the variables you are interested in downloading.


The eccodes-style index files work the same way, except the regex for selecting variable names and levels will be different. Here is the inventory for an ECMWF forecast file.


In [14]:
H = Herbie("2024-01-01", model="ecmwf")
H.idx

✅ Found ┊ model=ecmwf ┊ product=oper ┊ 2024-Jan-01 00:00 UTC F00 ┊ GRIB2 @ azure ┊ IDX @ azure


'https://ai4edataeuwest.blob.core.windows.net/ecmwf/20240101/00z/0p4-beta/oper/20240101000000-0h-oper-fc.index'

In [15]:
H.inventory()

| | grib_message | start_byte | end_byte | range | reference_time | valid_time | step | param | levelist | levtype | number | domain | expver | class | type | stream | search_this |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 205483 | 0-205483 | 2024-01-01 | 2024-01-01 | 0 days | gh | 250 | pl | | g | 0001 | od | fc | oper | :gh:250:pl:g:0001:od:fc:oper |
| 1 | 2 | 205483 | 427603 | 205483-427603 | 2024-01-01 | 2024-01-01 | 0 days | gh | 925 | pl | | g | 0001 | od | fc | oper | :gh:925:pl:g:0001:od:fc:oper |
| 2 | 3 | 427603 | 427827 | 427603-427827 | 2024-01-01 | 2024-01-01 | 0 days | tp | | sfc | | g | 0001 | od | fc | oper | :tp:sfc:g:0001:od:fc:oper |
| 3 | 4 | 427827 | 640309 | 427827-640309 | 2024-01-01 | 2024-01-01 | 0 days | gh | 700 | pl | | g | 0001 | od | fc | oper | :gh:700:pl:g:0001:od:fc:oper |
| 4 | 5 | 640309 | 878511 | 640309-878511 | 2024-01-01 | 2024-01-01 | 0 days | r | 850 | pl | | g | 0001 | od | fc | oper | :r:850:pl:g:0001:od:fc:oper |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 78 | 79 | 23834649 | 24380346 | 23834649-24380346 | 2024-01-01 | 2024-01-01 | 0 days | vo | 700 | pl | | g | 0001 | od | fc | oper | :vo:700:pl:g:0001:od:fc:oper |
| 79 | 80 | 24380346 | 24958955 | 24380346-24958955 | 2024-01-01 | 2024-01-01 | 0 days | vo | 250 | pl | | g | 0001 | od | fc | oper | :vo:250:pl:g:0001:od:fc:oper |
| 80 | 81 | 24958955 | 25515164 | 24958955-25515164 | 2024-01-01 | 2024-01-01 | 0 days | vo | 200 | pl | | g | 0001 | od | fc | oper | :vo:200:pl:g:0001:od:fc:oper |
| 81 | 82 | 25515164 | 26090217 | 25515164-26090217 | 2024-01-01 | 2024-01-01 | 0 days | d | 50 | pl | | g | 0001 | od | fc | oper | :d:50:pl:g:0001:od:fc:oper |


In [17]:
H.inventory(":500:pl")

| | grib_message | start_byte | end_byte | range | reference_time | valid_time | step | param | levelist | levtype | number | domain | expver | class | type | stream | search_this |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 8 | 9 | 1562823 | 1799100 | 1562823-1799100 | 2024-01-01 | 2024-01-01 | 0 days | r | 500 | pl | | g | 1 | od | fc | oper | :r:500:pl:g:0001:od:fc:oper |
| 23 | 24 | 5128477 | 5391755 | 5128477-5391755 | 2024-01-01 | 2024-01-01 | 0 days | t | 500 | pl | | g | 1 | od | fc | oper | :t:500:pl:g:0001:od:fc:oper |
| 34 | 35 | 7931538 | 8114292 | 7931538-8114292 | 2024-01-01 | 2024-01-01 | 0 days | gh | 500 | pl | | g | 1 | od | fc | oper | :gh:500:pl:g:0001:od:fc:oper |
| 51 | 52 | 12740037 | 13041153 | 12740037-13041153 | 2024-01-01 | 2024-01-01 | 0 days | u | 500 | pl | | g | 1 | od | fc | oper | :u:500:pl:g:0001:od:fc:oper |
| 52 | 53 | 13041153 | 13355478 | 13041153-13355478 | 2024-01-01 | 2024-01-01 | 0 days | v | 500 | pl | | g | 1 | od | fc | oper | :v:500:pl:g:0001:od:fc:oper |
| 56 | 57 | 14281363 | 14611005 | 14281363-14611005 | 2024-01-01 | 2024-01-01 | 0 days | q | 500 | pl | | g | 1 | od | fc | oper | :q:500:pl:g:0001:od:fc:oper |
| 68 | 69 | 18693700 | 19269114 | 18693700-19269114 | 2024-01-01 | 2024-01-01 | 0 days | d | 500 | pl | | g | 1 | od | fc | oper | :d:500:pl:g:0001:od:fc:oper |
| 75 | 76 | 22142964 | 22686453 | 22142964-22686453 | 2024-01-01 | 2024-01-01 | 0 days | vo | 500 | pl | | g | 1 | od | fc | oper | :vo:500:pl:g:0001:od:fc:oper |
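
To show where the eccodes-style search strings come from, here is a rough sketch that builds them from index records. It assumes the raw index holds one JSON object per line with `_offset` and `_length` keys alongside the metadata fields; that layout, the `SEARCH_KEYS` order, and the helper `parse_eccodes_index` are assumptions for illustration, with values reconstructed from the inventory above.

```python
import json
import re

# Metadata shared by every record in this sample (taken from the table above).
COMMON = {"domain": "g", "expver": "0001", "class": "od", "type": "fc",
          "stream": "oper", "step": "0"}

# Build sample index text: one JSON object per line (assumed layout).
records = [
    {**COMMON, "param": "gh", "levelist": "250", "levtype": "pl",
     "_offset": 0, "_length": 205483},
    {**COMMON, "param": "tp", "levtype": "sfc",
     "_offset": 427603, "_length": 224},
    {**COMMON, "param": "gh", "levelist": "500", "levtype": "pl",
     "_offset": 7931538, "_length": 182754},
]
sample_index = "\n".join(json.dumps(record) for record in records)

# Field order inferred from the `search_this` strings in the table above.
SEARCH_KEYS = ["param", "levelist", "levtype", "number", "domain",
               "expver", "class", "type", "stream"]


def parse_eccodes_index(text):
    """Parse JSON-lines index text into inventory-like dicts (hypothetical)."""
    rows = []
    for line in text.splitlines():
        entry = json.loads(line)
        start = entry["_offset"]
        end = start + entry["_length"]
        # Fields missing from a record (e.g. `levelist` for surface
        # variables) are simply left out of the search string.
        parts = [entry[key] for key in SEARCH_KEYS if key in entry]
        rows.append({"range": f"{start}-{end}",
                     "search_this": ":" + ":".join(parts)})
    return rows


inventory = parse_eccodes_index(sample_index)
matches = [row for row in inventory if re.search(r":500:pl", row["search_this"])]
print(matches)
```

Compared with the wgrib-style pattern `:500 mb`, the level and level type are separate fields here, which is why the equivalent search is `:500:pl`.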
