FieldSet.from_netcdf(): Uninformative error message for incorrect dimension order #1180

@VeckoTheGecko

Parcels version: 2.3.0
Python version: 3.10.2

Obligatory thanks for developing this package :D. I have found the notebook tutorials really handy. As a heads up, I'm pretty new to this package.

I was trying to load a NetCDF file as a FieldSet using FieldSet.from_netcdf(), but received an uninformative error message that didn't tell me what was wrong.

On a hunch, I tried FieldSet.from_xarray_dataset(), which gave me a better error message telling me that my dimensions were in the wrong order. Transposing the dimensions in xarray resolved my problem (see the minimal example at the end).

Discussion

Proposed solutions:

  • Update error message of .from_netcdf() to match .from_xarray_dataset()
    • Mention in documentation that functions are sensitive to dimension ordering (I couldn't find a mention for this)
  • Make both methods insensitive to dimension ordering by resolving on read-in
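
To make the last option more concrete, here is a rough sketch of how the target order could be worked out from the `dimensions` mapping before reading. It is not Parcels code: `target_dim_order` and `CANONICAL_AXES` are hypothetical names, and the (time, depth, lat, lon) order is an assumption based on the `[tdim, ydim, xdim]` wording and the `(12,1,11,10)` shape in the error messages quoted further down.

```python
# Hypothetical helper, not part of Parcels.
# Assumed axis order, inferred from the error messages in this issue.
CANONICAL_AXES = ("time", "depth", "lat", "lon")


def target_dim_order(field_dimensions, data_dims):
    """Return the data's dimension names in (time, depth, lat, lon) order.

    field_dimensions: mapping such as {'lat': 'y', 'lon': 'x', 'time': 'time'}
    data_dims: dimension names in the order they are stored in the file
    """
    # Translate each canonical axis to the name used in the file,
    # skipping axes the field does not define (e.g. no depth).
    wanted = [field_dimensions[axis] for axis in CANONICAL_AXES
              if axis in field_dimensions]
    missing = [name for name in wanted if name not in data_dims]
    if missing:
        raise ValueError(
            f"dimensions {missing} not found in data dims {list(data_dims)}"
        )
    return wanted
```

With the dataset from the example below, `ds_min.transpose(*target_dim_order(dimensions['U'], ds_min['UCUR'].dims))` should then give the same result as the manual `transpose("time", "y", "x")`, provided the dataset has no dimensions beyond those named in the mapping.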

Minimal reproducible example

NOTE: This example will save a small NetCDF file to your working directory

import xarray as xr
from parcels import FieldSet
import numpy as np

variables = {'U': 'UCUR', 'V': 'VCUR'}
dimensions = {'U': {'lat': 'y', 'lon': 'x', 'time': 'time'},
            'V': {'lat': 'y', 'lon': 'x', 'time': 'time'}}

def create_xarray():
    """Creates the test dataset
    """
    x = np.linspace(0, 100, 10, endpoint=True)
    y = np.linspace(0, 110, 11, endpoint=True)
    time = np.linspace(0, 120, 12, endpoint=True)
    UCUR = np.random.randn(x.shape[0], y.shape[0], time.shape[0])
    VCUR = np.random.randn(x.shape[0], y.shape[0], time.shape[0])

    ds = xr.Dataset(
        data_vars=dict(
            UCUR=(["x", "y", "time"], UCUR),
            VCUR=(["x", "y", "time"], VCUR),
        ),
        coords=dict(
            x=x,
            y=y,
            time=time,
        )
    )
    return ds

# LOADING MINIMAL EXAMPLE
fname_min = "minimal.nc"
filenames_min = {"U": fname_min, "V": fname_min}
ds_min = create_xarray()
ds_min.to_netcdf(fname_min)

# ===================
# TESTING USING from_netcdf AND from_xarray_dataset
# ===================
# Each call below raises on its own, so run them one at a time.
fieldset = FieldSet.from_netcdf(filenames_min, variables, dimensions, deferred_load=False)  # raises an uninformative ValueError
fieldset = FieldSet.from_xarray_dataset(ds_min, variables, dimensions)  # raises an informative AssertionError

The .from_netcdf() function results in this uninformative error message:

ValueError: cannot reshape array of size 1100 into shape (12,1,11,10)

whereas .from_xarray_dataset() gives:

AssertionError: Field U expecting a data shape of [tdim, ydim, xdim]. Flag transpose=True could help to reorder the data.
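
For comparison, a check along the following lines could produce a similarly informative message before any data is reshaped. This is only a sketch under my own assumptions: `check_dim_order` is a hypothetical function, not something Parcels provides, and the expected (time, depth, lat, lon) order is inferred from the two messages above.

```python
# Hypothetical pre-load check, not part of Parcels.
def check_dim_order(field_name, field_dimensions, data_dims):
    """Raise a descriptive ValueError if data_dims is not in the expected order.

    field_dimensions: mapping such as {'lat': 'y', 'lon': 'x', 'time': 'time'}
    data_dims: dimension names in the order they are stored in the file
    """
    # Expected order, assumed from the "[tdim, ydim, xdim]" message above.
    expected = [field_dimensions[axis]
                for axis in ("time", "depth", "lat", "lon")
                if axis in field_dimensions]
    # Only compare the dimensions that the mapping actually names.
    found = [name for name in data_dims if name in expected]
    if found != expected:
        raise ValueError(
            f"Field {field_name}: data dimensions are ordered {found}, "
            f"but {expected} is expected. Transpose the data before loading."
        )
```

Calling `check_dim_order('U', dimensions['U'], ('x', 'y', 'time'))` raises a `ValueError` that names both the found and the expected order, while `('time', 'y', 'x')` passes silently.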

Fixing the problem:

## Correction required
ds_min = ds_min.transpose("time", "y", "x")
ds_min.to_netcdf(fname_min)

# These work :)
fieldset = FieldSet.from_netcdf(filenames_min, variables, dimensions, deferred_load=False)
fieldset = FieldSet.from_xarray_dataset(ds_min, variables, dimensions)
