Add support to write a `xarray.Dataset` to a GRIB file. #18

alexamici · 2018-09-16T22:32:13Z

Because the NetCDF data model is much freer and more extensible than the GRIB one, it is not possible to write a generic xarray.Dataset to a GRIB file. The aim for cfgrib is to implement write support for a subset of carefully crafted datasets that fit the GRIB data model.

In particular, the only coordinates that we target at the moment are the ones returned by opening a GRIB file with the cfgrib flavour of cfgrib.open_dataset, namely:

number, time, step, a vertical coordinate (isobaricInhPa, heightAboveGround, surface, etc), and the horizontal coordinates (for example latitude and longitude for a regular_ll grid type).

Note that all passed GRIB_ attributes are used to set keys in the output file; it is left to the user to ensure coherence among them.

Some of the keys are autodetected from the coordinates, namely:

Horizontal coordinates gridTypes:

regular: regular_ll and regular_gg
not target: projected: lambert, etc (can be controlled with GRIB_ attributes)
not target: reduced: reduced_ll and reduced_gg (can be controlled with GRIB_ attributes)

Vertical coordinates typeOfLevel:

single level: surface, meanSea, etc.
pressure: isobaricInhPa and isobaricInPa
other: hybrid

GRIB edition:

GRIB2
GRIB1

iainrussell · 2018-09-20T13:40:07Z

I get a problem with this GRIB file "User_Guide_Example_Data.grib". It contains 3 vertical levels, each at 2 forecast steps, and when I read it, convert to xarray, then write back to GRIB, all the values are NaN. If I filter just a single step, then it works perfectly - the only differences are those you would expect when converting from GRIB 1 to GRIB 2, e.g. bounding box coords in microdegrees instead of millidegrees, etc.
Here's the code:

import xarray as xr
from cfgrib import xarray_store
from cfgrib import xarray_to_grib
import cfgrib

dst = xarray_store.GribDataStore.from_path("User_Guide_Example_Data.grib")
ds = xr.open_dataset(dst)
print(ds)

cfgrib.to_grib(ds, 'User_Guide_Example_Data_from_cfgrib.grib')#, grib_keys={'centre': 'ecmf'})

And the data is attached.
User_Guide_Example_Data.grib.tar.gz

alexamici · 2018-09-21T07:39:24Z

@iainrussell that's an interesting GRIB file:

$ grib_ls User_Guide_Example_Data.grib 
edition      centre       typeOfLevel  level        dataDate     stepRange    dataType     shortName    packingType  gridType     
1            ecmf         isobaricInhPa  1000         19960428     0            an           t            grid_simple  regular_ll  
1            ecmf         isobaricInhPa  500          19960428     0            an           t            grid_simple  regular_ll  
1            ecmf         isobaricInhPa  100          19960428     0            an           t            grid_simple  regular_ll  
1            ecmf         isobaricInhPa  1000         19960426     48           fc           t            grid_simple  regular_ll  
1            ecmf         isobaricInhPa  500          19960426     48           fc           t            grid_simple  regular_ll  
1            ecmf         isobaricInhPa  100          19960426     48           fc           t            grid_simple  regular_ll  
6 of 6 grib messages in User_Guide_Example_Data.grib

It contains a 48h forecast and the analysis for the date 1996-04-28T12:00:00.
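The valid-time arithmetic behind that statement can be checked by hand. This is only a minimal sketch using the standard library; the 12:00 reference hour is taken from the `time` coordinate in the dataset printout in this comment, and the variable names are illustrative.

```python
from datetime import datetime, timedelta

# Analysis messages: dataDate 19960428, stepRange 0
analysis_valid = datetime(1996, 4, 28, 12) + timedelta(hours=0)

# Forecast messages: dataDate 19960426, stepRange 48
forecast_valid = datetime(1996, 4, 26, 12) + timedelta(hours=48)

# Both groups of messages describe the same valid time.
print(analysis_valid, forecast_valid, analysis_valid == forecast_valid)
```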

The GRIB file is not a complete hypercube so cfgrib correctly fills the missing fields with np.nan:

>>> import cfgrib
>>> ds = cfgrib.open_dataset('User_Guide_Example_Data.grib/User_Guide_Example_Data.grib')
>>> ds
<xarray.Dataset>
Dimensions:       (air_pressure: 3, latitude: 121, longitude: 240, step: 2, time: 2)
Coordinates:
    number        int64 ...
  * time          (time) datetime64[ns] 1996-04-26T12:00:00 1996-04-28T12:00:00
  * step          (step) timedelta64[ns] 0 days 2 days
  * air_pressure  (air_pressure) float64 1e+03 500.0 100.0
  * latitude      (latitude) float64 90.0 88.5 87.0 85.5 84.0 82.5 81.0 79.5 ...
  * longitude     (longitude) float64 0.0 1.5 3.0 4.5 6.0 7.5 9.0 10.5 12.0 ...
    valid_time    (time, step) datetime64[ns] ...
Data variables:
    t             (time, step, air_pressure, latitude, longitude) float32 ...
Attributes:
    GRIB_edition:            1
    GRIB_centre:             ecmf
    GRIB_centreDescription:  European Centre for Medium-Range Weather Forecasts
    GRIB_subCentre:          0
    history:                 GRIB to CDM+CF via cfgrib-0.8.5.1.dev0/ecCodes-2...
>>> ds.t.sel(time='1996-04-26T12:00:00', air_pressure=1000, step=0)
<xarray.DataArray 't' (latitude: 121, longitude: 240)>
array([[nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       ...,
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan],
       [nan, nan, nan, ..., nan, nan, nan]], dtype=float32)
Coordinates:
    number        int64 0
    time          datetime64[ns] 1996-04-26T12:00:00
    step          timedelta64[ns] 00:00:00
    air_pressure  float64 1e+03
  * latitude      (latitude) float64 90.0 88.5 87.0 85.5 84.0 82.5 81.0 79.5 ...
  * longitude     (longitude) float64 0.0 1.5 3.0 4.5 6.0 7.5 9.0 10.5 12.0 ...
    valid_time    datetime64[ns] ...
Attributes:
    GRIB_paramId:                             130
    GRIB_shortName:                           t
    ...
    standard_name:                            air_temperature
    long_name:                                Temperature
    units:                                    K
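To see why half of the cells are empty, the six messages from the grib_ls listing above can be compared with the full (time, step, level) product. This is only an illustrative sketch in plain Python, not how cfgrib builds its index; the tuples are copied from the listing and the variable names are made up for the example.

```python
from itertools import product

# (dataDate, stepRange, level) for the six messages in the grib_ls listing
messages = {
    ("19960428", 0, 1000), ("19960428", 0, 500), ("19960428", 0, 100),
    ("19960426", 48, 1000), ("19960426", 48, 500), ("19960426", 48, 100),
}

# Unique values along each dimension, as they end up in the coordinates
dates = sorted({m[0] for m in messages})
steps = sorted({m[1] for m in messages})
levels = sorted({m[2] for m in messages})

# A complete hypercube would hold every combination of the three coordinates
hypercube = set(product(dates, steps, levels))
missing = sorted(hypercube - messages)

print(len(hypercube), len(messages), len(missing))
for cell in missing:
    print("no message for", cell)
```

The cell selected in the session above (time 1996-04-26, step 0, level 1000) is one of the combinations with no message, which is why it comes back as all np.nan.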

The problem is that np.nan values were not handled in cfgrib, and ecCodes doesn't support them.

In master, invalid values are now set to missingValue before being sent to ecCodes and, as an optimisation, fields consisting entirely of np.nan are not written to disk at all.

Now the saved GRIB looks like this:

$ grib_ls out.grib 
edition      centre       date         dataType     gridType     stepRange    typeOfLevel  level        shortName    packingType  
2            consensus    19960426     af           regular_ll   48           isobaricInhPa  1000         t            grid_simple 
2            consensus    19960428     af           regular_ll   0            isobaricInhPa  1000         t            grid_simple 
2            consensus    19960426     af           regular_ll   48           isobaricInhPa  500          t            grid_simple 
2            consensus    19960428     af           regular_ll   0            isobaricInhPa  500          t            grid_simple 
2            consensus    19960426     af           regular_ll   48           isobaricInhPa  100          t            grid_simple 
2            consensus    19960428     af           regular_ll   0            isobaricInhPa  100          t            grid_simple 
6 of 6 grib messages in out.grib
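The NaN handling described above can be sketched roughly as follows. This is not the actual cfgrib code: `prepare_field` and the `MISSING_VALUE` sentinel are hypothetical names and values chosen for illustration, and the real implementation passes the data on to ecCodes rather than returning it.

```python
import numpy as np

# Hypothetical sentinel; the value actually used by cfgrib may differ.
MISSING_VALUE = 9999.0


def prepare_field(values, missing_value=MISSING_VALUE):
    """Replace NaN with a sentinel, or return None for an all-NaN field.

    Returning None signals that the field should not be written at all.
    """
    values = np.asarray(values, dtype="float64")
    invalid = np.isnan(values)
    if invalid.all():
        return None  # nothing valid in this field, so skip it
    return np.where(invalid, missing_value, values)


print(prepare_field([np.nan, np.nan]))       # all-NaN field is skipped
print(prepare_field([271.5, np.nan, 280.0]))  # NaN replaced by the sentinel
```

Under these assumptions, the six all-NaN fields of the incomplete hypercube are skipped and only the six fields with data are written, which matches the six messages in the listing above.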

iainrussell · 2018-09-21T08:05:26Z

@alexamici - thanks for the very quick response! I can confirm that the code in the master branch works perfectly for this file now, and I plotted the fields with metview to check that they look identical (they do)!

iainrussell · 2018-09-24T12:27:00Z

Still looking good, I'm now checking more 'awkward' subarea definitions like this one, which plots correctly:

I'm using this Metview script to do it quite easily:
compare_grib_plots.py.tar.gz

alexamici · 2018-11-12T17:23:52Z

Note that right now master can re-write most opened GRIB files to disk, because we reuse the GRIB_ attributes read from the input file.

Doing an operation on the resulting xr.Dataset while keeping the attributes may make them incoherent. We apply some checks and autodetect the values only in the limited set of cases described in the issue text.

We consider write support to be in alpha.

alexamici · 2019-01-01T20:49:20Z

Note that with 90c8106 very generic GRIB files can be written when all the GRIB_ keys are set correctly. See #39.

AhmedMIssawi · 2021-06-04T11:38:46Z

Hi, can you help me with this code?

import cfgrib
import xarray as xr
import pandas as pd
import sys
import glob
import os

def Grib2files():

    folder      = input('Folder PAth:')
    val         = input('Variable Name:')
    #latitude    = input('latitude:')
    longitude   = input('longitude:')

    data_conc=[]

    files = glob.glob(rf'{folder}/*.grib2')

    for file in files:

        data = xr.open_dataset(
            file,
            engine='cfgrib',
            backend_kwargs={'filter_by_keys':{'typeOfLevel': 'hybrid'}})


        lon  = data.variables['longitude'].values
        lat  = data.variables['latitude'].values
        hyb  = data.variables['hybrid'].values
        time = pd.Timestamp(data.time.values) + pd.to_timedelta(data.step.values,'H')

        X = xr.Dataset(
            {'data':data[val].values,
            'lon': lon,

            },
            coords = {
                    'hyb':hyb,
                    'lat':lat,
                },
            )

        data_conc.append(X)
    return xr.concat(data_conc, dim=['time'])

data = Grib2files()

I have been trying many ways to turn this data into 2-D data: hybrid levels against all latitudes at a fixed altitude, sliced over time. The data comes as one file per hour and I need to run this for a whole year.

xvxiuwen · 2022-07-08T06:52:28Z

I get a problem with this GRIB file "User_Guide_Example_Data.grib". It contains 3 vertical levels, each at 2 forecast steps, and when I read it, convert to xarray, then write back to GRIB, all the values are NaN. If I filter just a single step, then it works perfectly - the only differences are those you would expect when converting from GRIB 1 to GRIB 2, e.g. bounding box coords in microdegrees instead of millidegrees, etc. Here's the code:
import xarray as xr
from cfgrib import xarray_store
from cfgrib import xarray_to_grib
import cfgrib

dst = xarray_store.GribDataStore.from_path("User_Guide_Example_Data.grib")
ds = xr.open_dataset(dst)
print(ds)

cfgrib.to_grib(ds, 'User_Guide_Example_Data_from_cfgrib.grib')#, grib_keys={'centre': 'ecmf'})
And the data is attached. User_Guide_Example_Data.grib.tar.gz

Sorry to bother you, but when I run the line 'dst = xarray_store.GribDataStore.from_path("User_Guide_Example_Data.grib")', Python reports that there is no attribute 'GribDataStore'. I am on Windows and installed xarray and cfgrib through Anaconda.

xvxiuwen · 2022-07-08T07:04:39Z

zxdawn · 2022-12-08T12:43:56Z

Another error:

AttributeError: 'Dataset' object has no attribute 'to_grib'

Update

We should import to_grib using the newest name.

from cfgrib.xarray_to_grib import to_grib
to_grib(ds, 'out.grib')

dasarkisov · 2024-06-04T11:19:31Z

Hi! I have an issue with writing grib files. Here I open two grib datasets, sum them up, and write the resulting dataset to grib file; then I open the new grib as dataset.

ds00 = xr.open_dataset('gfs.t00z.pgrb2.0p25.f000', engine='cfgrib')
ds06 = xr.open_dataset('gfs.t06z.pgrb2.0p25.f000', engine='cfgrib')
ds_sum = xr.concat([ds00, ds06], dim='time').sum('time')
to_grib(ds_sum, "ouf.grib")
ds_new = xr.open_dataset('ouf.grib', engine='cfgrib')

During the writing I get the following warning:
FutureWarning: GRIB write support is experimental, DO NOT RELY ON IT!
  warnings.warn("GRIB write support is experimental, DO NOT RELY ON IT!", FutureWarning)
failed to set key 'endStep:int' to 0
failed to set key 'stepUnits:int' to 1

Here are the three datasets, from left to right: the original one, the resulting one, and the resulting one after writing to grib

Here you can see that after the writing:

the original coordinate heightAboveGround turns into surface, and its value 2.0 turns into 0.0.
the name of t2m data variable turns into t.

Why do these changes take place during writing? When I open the newly created GRIB file with the viewer, the temperature is shown as the surface temperature (0 altitude), while originally it is the 2 m temperature.
Thanks!

UPDATE
Okay, what I did was add a "heightAboveGround" option in https://github.com/ecmwf/cfgrib/blob/master/cfgrib/xarray_to_grib.py:

Now to_grib() detects the altitude coordinate heightAboveGround and writes it to the GRIB file properly, preserving its value, which in my case is 2.0:

The data variable name still gets changed (t2m to t), but that's no big deal. The written GRIB file is now read properly by the GRIB viewer, with the t data variable shown as the 2 m temperature!

alexamici added the enhancement New feature or request label Sep 16, 2018

alexamici self-assigned this Sep 16, 2018

alexamici assigned iainrussell Sep 20, 2018

alexamici mentioned this issue Oct 25, 2019

GribInternalError: ('Passed array is too small (-6).', -6) #96

Closed

albertotb mentioned this issue Jul 24, 2021

Download historical GRIB files instead of CSV's albertotb/get-gfs#9

Open
