New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
WIP BUG don't neglect grid attrs saved as dims w/out coords #140
Conversation
MAINT switch name of NV_STR to BOUNDS_STR COMPAT xr.DataArray.to_dataset call for 0.9.1
Hmm this is causing some calc tests to fail. I'm out of steam for tonight but will return to these tomorrow. Example:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@spencerahill I think this fix looks good to me. I agree renaming NV_STR
to BOUNDS_STR
makes things much more readable.
aospy/data_loader.py
Outdated
if data_coord_name: | ||
data = data.rename({data_coord_name.pop(): name_int}) | ||
# Force all dimensions to have coordinates. | ||
if not data[name_int].coords: | ||
data = data.assign_coords(**{name_int: data[name_int].values}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tests that fail are for the case where time is a scalar coordinate. Therefore it gets caught in this if statement as well. I think there are two options here that would fix this:
- Change the if statement to also check if the key is a dimension in the Dataset; this will prevent a scalar time coordinate from being caught in this if statement (since it is not a dimension in the Dataset in that case):
if not data[name_int].coords and name_int in data.dims:
data = data.assign_coords(**{name_int: data[name_int].values})
- Don't change the if statement (allow a scalar time to be caught), but be sure to copy over the
attrs
dictionary to the new coordinate (this will carry over the'units'
attribute and prevent theKeyError
downstream):
if not data[name_int].coords:
attrs = data[name_int].attrs
data = data.assign_coords(**{name_int: data[name_int].values})
data[name_int].attrs = attrs
Take your pick out of those solutions; perhaps the safest would actually be to combine them. I don't see any harm copying over the attrs
dictionary, and it's probably best not to modify scalar coordinates that we don't explicitly need to.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can implement option (2) in an even simpler manner (rather than using .values
use the full object):
if not data[name_int].coords:
data = data.assign_coords(**{name_int: data[name_int]})
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Super useful. I agree, we'll do both and use your simple version of (2).
aospy/test/test_data_loader.py
Outdated
@@ -66,6 +66,20 @@ def test_rename_grid_attrs_ds(self): | |||
ds = rename_grid_attrs(self.ds) | |||
assert LAT_STR in ds | |||
|
|||
def test_rename_grid_attrs_dim_no_coord(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Whatever you decide to do in rename_grid_attrs
just make sure to update the test to reflect the logic change (e.g. make sure attributes are copied over and / or scalar grid attribute DataArrays that are not associated with dimensions are not modified).
@spencerkclark ready for another review |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@spencerahill a few very minor things, but this looks pretty much ready to go!
aospy/data_loader.py
Outdated
add missing coordinates from Model objects. | ||
Search all of the dataset's coords and dims looking for matches to known | ||
grid attribute names; any that are found subsequently get renamed to the | ||
aospy name as specified in aospy.internal_names.GRID_ATTRS. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What are your thoughts on formatting? Should we use double back-ticks on aospy.internal_names.GRID_ATTRS?
aospy/test/test_data_loader.py
Outdated
ds[phalf_dim] = 4 | ||
ds = ds.set_coords(phalf_dim) | ||
result = rename_grid_attrs(ds) | ||
assert result[phalf_dim] == ds[phalf_dim] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we use xr.testing.assert_identical
here?
aospy/test/test_data_loader.py
Outdated
ds_orig = self.ds.copy() | ||
ds_orig[self.ALT_LAT_STR].attrs = orig_attrs | ||
ds = rename_grid_attrs(ds_orig) | ||
assert ds[LAT_STR].attrs == orig_attrs |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we use self.assertEqual
here?
@spencerkclark thanks, those are all good. I also renamed rename_grid_attrs to be more descriptive. Will merge when test pass (besides AppVeyor; will fix that later). |
Closes #137.
@spencerkclark this was a little thornier than we originally thought. Ultimately for this data
bnds
was a dim without a coord, and this was missed. In addition, it was a dim without a coord, and this caused the subsequent isel/sel logic to crash. I don't fully understand that, but a solution I'm happy with is to just create a coord for any dims that don't have them.Before I go any further, would you mind assessing if this seems like an OK approach?
I also decided to switch from NV_STR to BOUNDS_STR for the sake of readability and snuck in one xarray.to_dataset compat update for 0.9.1 (it gave a warning).
Poking at similar places as #139...will merge that first and then rebase as necessary for this.
Still to do: