Losing data when reading/converting GRIB2 files to netCDF using open_dataset/to_netcdf methods #7700

@mmgamboa

What is your issue?

Hi all,

I have data in a GRIB2 file that I want to convert to netCDF. The original dataset (confirmed with the pygrib package) has 12 messages: 6 isobaric levels, each with 2 variables (average and maximum). When I convert the file with xarray, 6 of the 12 messages are lost.

The messages in the original file, as listed by pygrib.open('filename.grib2').read(), are:

[1:Relative clear air turbulence (RCAT):% (instant):regular_ll:isobaricInhPa:level 15000 Pa:fcst time 6 hrs:from 202001080600,
 2:Relative clear air turbulence (RCAT):% (instant):regular_ll:isobaricInhPa:level 15000 Pa:fcst time 6 hrs:from 202001080600,
 3:Relative clear air turbulence (RCAT):% (instant):regular_ll:isobaricInhPa:level 20000 Pa:fcst time 6 hrs:from 202001080600,
 4:Relative clear air turbulence (RCAT):% (instant):regular_ll:isobaricInhPa:level 20000 Pa:fcst time 6 hrs:from 202001080600,
 5:Relative clear air turbulence (RCAT):% (instant):regular_ll:isobaricInhPa:level 25000 Pa:fcst time 6 hrs:from 202001080600,
 6:Relative clear air turbulence (RCAT):% (instant):regular_ll:isobaricInhPa:level 25000 Pa:fcst time 6 hrs:from 202001080600,
 7:Relative clear air turbulence (RCAT):% (instant):regular_ll:isobaricInhPa:level 30000 Pa:fcst time 6 hrs:from 202001080600,
 8:Relative clear air turbulence (RCAT):% (instant):regular_ll:isobaricInhPa:level 30000 Pa:fcst time 6 hrs:from 202001080600,
 9:Relative clear air turbulence (RCAT):% (instant):regular_ll:isobaricInhPa:level 35000 Pa:fcst time 6 hrs:from 202001080600,
 10:Relative clear air turbulence (RCAT):% (instant):regular_ll:isobaricInhPa:level 35000 Pa:fcst time 6 hrs:from 202001080600,
 11:Relative clear air turbulence (RCAT):% (instant):regular_ll:isobaricInhPa:level 40000 Pa:fcst time 6 hrs:from 202001080600,
 12:Relative clear air turbulence (RCAT):% (instant):regular_ll:isobaricInhPa:level 40000 Pa:fcst time 6 hrs:from 202001080600]
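
In the listing above, the two messages at each level print identically, so any difference between them must sit in a GRIB key that the summary line does not show. A minimal sketch for finding such a key follows. The helper names (`scalar_items`, `differing_keys`, `compare_messages`) are hypothetical, and only scalar keys are compared because array-valued keys such as the data values would need a different comparison.

```python
import numbers


def scalar_items(msg, keys):
    """Collect the scalar-valued entries of `msg` for the given keys."""
    items = {}
    for key in keys:
        try:
            value = msg[key]
        except Exception:
            # Some GRIB keys are listed but cannot be read; skip them.
            continue
        if isinstance(value, (numbers.Number, str)):
            items[key] = value
    return items


def differing_keys(items_a, items_b):
    """Return {key: (a, b)} for keys present in both dicts with different values."""
    return {
        key: (items_a[key], items_b[key])
        for key in items_a.keys() & items_b.keys()
        if items_a[key] != items_b[key]
    }


def compare_messages(path, first=1, second=2):
    """Diff two messages of a GRIB file by message number (needs pygrib)."""
    import pygrib  # imported here so the helpers above work without it

    grbs = pygrib.open(path)
    try:
        msg_a, msg_b = grbs.message(first), grbs.message(second)
        return differing_keys(
            scalar_items(msg_a, msg_a.keys()),
            scalar_items(msg_b, msg_b.keys()),
        )
    finally:
        grbs.close()
```

Calling `compare_messages('filename.grib2')` would return the keys in which messages 1 and 2 differ. An empty result would suggest that the file itself does not encode the average/maximum distinction in its scalar keys.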

To make the conversion I am running the following commands:

import xarray

data = xarray.open_dataset('filename.grib2', engine='cfgrib')
data.to_netcdf('netcdf_file.nc')

Then, to read the result back in a separate script, I run:

import netCDF4 as nc
ds = nc.Dataset('netcdf_file.nc')  # 'engine' is an xarray option, not a netCDF4.Dataset parameter

Either way, both the data and ds objects hold only 6 fields instead of 12. Here is a screenshot of the data object:

[Screenshot: issue1]

Is xarray losing data when it reads the GRIB2 file? Could the problem be that the two messages for a given isobaric level look identical? If so, can I rewrite the messages with a flag on the CAT parameter that marks one as the average (ave) and the other as the maximum (max)?
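
The suspicion about identical messages can be illustrated without xarray. The toy sketch below uses plain dicts in place of GRIB messages and indexes them by variable name and level only, as the listing suggests. It is an assumption that the reader indexes messages in a comparable way; the sketch does not reproduce cfgrib's actual logic, and `find_collisions` is a hypothetical helper.

```python
from collections import defaultdict


def find_collisions(messages, index_keys):
    """Map each index tuple shared by several messages to their message numbers."""
    groups = defaultdict(list)
    for number, msg in enumerate(messages, start=1):
        groups[tuple(msg[key] for key in index_keys)].append(number)
    return {index: numbers for index, numbers in groups.items() if len(numbers) > 1}


# Stand-ins for the 12 messages: two per isobaric level, same name.
levels_pa = [15000, 20000, 25000, 30000, 35000, 40000]
messages = [
    {"name": "Relative clear air turbulence", "level": level}
    for level in levels_pa
    for _ in range(2)
]

collisions = find_collisions(messages, ("name", "level"))
print(len(collisions))  # 6
```

With only name and level as the index, the 12 messages fall onto 6 distinct entries, each claimed by two messages. That matches the count of 6 fields seen after conversion, but it does not by itself confirm the cause.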

Thanks in advance,
Martín Gamboa
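
If the paired messages do differ in some GRIB key, one possible route that avoids rewriting the file is cfgrib's `filter_by_keys` backend option, which restricts the dataset to messages matching given key values. The sketch below assumes such a distinguishing key exists; the post does not establish which key that would be, so `key` and `values` are left as parameters. The names `cfgrib_filter_kwargs`, `convert_subsets` and `out_pattern` are hypothetical.

```python
def cfgrib_filter_kwargs(key, value):
    """Build open_dataset keyword arguments that keep only matching messages."""
    return {
        "engine": "cfgrib",
        "backend_kwargs": {"filter_by_keys": {key: value}},
    }


def convert_subsets(grib_path, key, values, out_pattern="subset_{value}.nc"):
    """Write one netCDF file per value of a distinguishing GRIB key."""
    import xarray  # needs xarray with the cfgrib engine installed

    written = []
    for value in values:
        # Each pass loads only the messages whose `key` equals `value`.
        ds = xarray.open_dataset(grib_path, **cfgrib_filter_kwargs(key, value))
        out_path = out_pattern.format(value=value)
        ds.to_netcdf(out_path)
        ds.close()
        written.append(out_path)
    return written
```

Writing each subset to its own file keeps the average and maximum fields separate, so they cannot collide on the same variable name and level.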
