Fix ancillary variable confusion after resampling #2336
Added a unit test reproducing the failure described in GH2329.
In replacing ancillary variables, use default ID keys config if no id keys are defined as metadata in the datasets in question.
Codecov Report
@@ Coverage Diff @@
## main #2336 +/- ##
=======================================
Coverage 94.67% 94.67%
=======================================
Files 328 328
Lines 48544 48554 +10
=======================================
+ Hits 45960 45970 +10
Misses 2584 2584
The unstable test failures seem to be pointing here: zarr-developers/zarr-python#1304 (merged yesterday).
LGTM, but I am not familiar enough with this part of Satpy, so I only left a comment on improving readability a bit. I hope someone else can comment on the functionality.
And I suppose you have noticed the comments from CodeScene. The
Small rewrite in test_resample_multi_ancillary to make the code easier to read.
I agree. More than 2000 LOC and 118 test methods/functions. Not sure what the best way would be to split this up, but it needs to be done. But maybe a dedicated PR on its own?
Yes and I think we could do something similar for the
I had one inline comment about your test. If that can't be done then I'll consider this review as an approval. I do think @mraspaud needs to review this before merge as he was the one who added the ancillary walking logic and the DataID.
def test_resample_multi_ancillary(self):
    """Test that multiple ancillary variables are retained after resampling.

    This test corresponds to GH#2329
    """
Are there other tests for resampling with ancillary variables? Or maybe just the data walker? I'm wondering if an existing test could be updated with the checks you include at the bottom of this test that would serve the same purpose as this test.
I don't understand the data walker, but I don't think the tests involving it do anything with ancillary variables.
Well that's unfortunate then because that was the whole point of that function/generator iirc.
There doesn't seem to be any test specifically for `dataset_walker`. The only mention in the test suite is in a mock:
$ find . -name '*.py' -exec grep dataset_walker {} +
./test_scene.py: mock.patch('satpy.dataset.dataset_walker') as ds_walker:
LGTM. FWIW I think dataarrays loaded with satpy will always have `_satpy_id_keys` set, but it's good to have a fallback.
Fix ancillary variable confusion after resampling.

Previously, `replace_anc()` was only looking at `_satpy_id_keys` in the variable attributes. When this was not set, it would use `None` for constructing the `DataID`s used for comparison. When we pass `None` as ID keys to `DataID`, the `DataID` becomes empty: `DataID()`, such that a comparison always evaluates to True. As a consequence, instead of replacing the correct one of the N ancillary variables, it would replace the first one N times and the other ones not at all.

In this PR, I am taking the `default_id_keys_config` if no `_satpy_id_keys` are set in the variable attributes.

I'm not sure if `_satpy_id_keys` is supposed to be always set (possibly to `default_id_keys_config`), or if it is expected that it is not always set.
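The failure mode and the fallback described in this message can be illustrated with a small, self-contained sketch. It does not import Satpy: `make_id`, `replace_anc_sketch`, `resample_all` and `DEFAULT_ID_KEYS` are hypothetical stand-ins for `DataID`, `replace_anc()`, the resampling loop and `default_id_keys_config`, and plain dicts stand in for the data arrays and their attributes. It only assumes what is stated above, namely that missing ID keys yield an empty ID that compares equal to every other empty ID.

```python
# Hypothetical stand-in for default_id_keys_config (not Satpy's real config).
DEFAULT_ID_KEYS = ("name",)


def make_id(id_keys, attrs):
    """Toy stand-in for DataID: keep only the attributes named in id_keys."""
    if id_keys is None:
        return ()  # mirrors the empty DataID() obtained when ID keys are None
    return tuple((key, attrs.get(key)) for key in id_keys)


def replace_anc_sketch(new, ancillaries, fallback=None):
    """Replace, in place, the first ancillary entry whose ID matches new."""
    id_keys = new.get("_satpy_id_keys", fallback)
    new_id = make_id(id_keys, new)
    for idx, old in enumerate(ancillaries):
        if new_id == make_id(id_keys, old):
            ancillaries[idx] = new
            return


def resample_all(ancillaries, fallback=None):
    """Pretend to resample each ancillary variable and put it back."""
    result = list(ancillaries)
    for anc in ancillaries:
        resampled = dict(anc, resampled=True)  # marks the "resampled" copy
        replace_anc_sketch(resampled, result, fallback)
    return result


originals = [{"name": "a"}, {"name": "b"}, {"name": "c"}]
buggy = resample_all(originals)                   # no fallback: all IDs are empty
fixed = resample_all(originals, DEFAULT_ID_KEYS)  # fall back to default ID keys

print([(d["name"], d.get("resampled", False)) for d in buggy])
print([(d["name"], d.get("resampled", False)) for d in fixed])
```

In this sketch, the run without a fallback overwrites the first slot three times and leaves the second and third variables untouched, while the run with the fallback replaces each variable with its own resampled copy.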