-
Notifications
You must be signed in to change notification settings - Fork 11
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow data to be omitted from netCDF files during cfdm.write
#222
Conversation
OK - I've finished fiddling - good to review, now, thanks. |
Codecov Report
@@ Coverage Diff @@
## master #222 +/- ##
==========================================
+ Coverage 87.66% 87.67% +0.01%
==========================================
Files 123 123
Lines 12631 12641 +10
==========================================
+ Hits 11072 11082 +10
Misses 1559 1559
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This generally looks great, implementing the feature request at hand in an elegant way, and apart from one suggestion relating to testing (see in-line) I can't pick out anything to say in fault or suggestion, but I have noticed a few things in reviewing that I wanted to check were expected, in case they are issues that mean some tweaking is required.
Namely, to investigate the new keyword I wrote out all of the example fields with omit_data="all"
set, and eyeballed the dump
on each field when read back in to compare with the original.
Everything looked good, with the re-read fields showing masked data e.g. [[--, ..., --]]
instead of [[1.0, ..., 8.0]]
but the metadata reported otherwise being reported as identical, with some exceptions, as follows. All of these differences were noticed in the 6th and 7th fields:
-
example field 6: the
Auxiliary coordinate: ncvar%z
construct gets re-labelled asAuxiliary coordinate: altitude
, which corresponds to the standard name of the bounds, but I don't think there should be a change in the label like that? Explicitly I see:- Auxiliary coordinate: ncvar%z + Auxiliary coordinate: altitude Geometry: polygon Bounds:axis = 'Z' Bounds:standard_name = 'altitude' Bounds:units = 'm' Bounds:Data(cf_role=timeseries_id(2), 3, 4) = [[[1.0, ..., --]]] m Interior Ring:Data(cf_role=timeseries_id(2), 3) = [[0, ..., --]]
where I am also wondering why the first few Data points remain for the bounds and interior ring, i.e. why they don't change and show as
[[--, ..., --]]
like the other data that gets omitted? -
example field 7: the
_FillValue = -1073741824.0
changes slightly to_FillValue = -1073741800.0
, is that expected and/or harmless? -
generally is it expected that datetimes are reported (i.e. does this indicate an under-the-hood issue, not just a representation quirk) after
omit_data
and re-reading as e.g.Data(time(1)) = [0]
? And for multiple datetimes, they sometimes have a0
entry e.g. (from example field 7 constructDimension coordinate: time
),Data(time(3)) = [--, 0, --] gregorian
, though sometimes not e.g.Bounds:Data(time(3), 2) = [[--, ..., --]] gregorian
from the same construct?
e0d61cb passes on the omission to the interior ring |
This one's an unrelated bug. If we look at the actual array we get: >>> t = g.construct('time')
>>> t.data
<Data(3): [--, 0, --] gregorian> # Wrong
>>> t.data.array
masked_array(data=[--, --, --], # Right
mask=[ True, True, True],
fill_value=1e+20,
dtype=float64) So the underlying data is correct, but the |
Unexpected, but harmless for now! It happens through a write/read cycle at v1.10.0.0, too, so isn't related to |
Again, unrelated (thankfully). If you inspect I'll have a look in the example field definition to see if it's obvious why "altitude" doesn't appear in the first place ... |
Thanks David for drilling down to investigate all of the facets. Always good to catch a bug even if it isn't immediately related 😄 So as far as I can tell from your recent comments, #222 (comment) is the only issue directly related to this PR, and you have put in a fix, is that right? In which case I just need to re-review with the new commit and then we're good to go? |
This turns out to be a feature! It's to do with the unusual case where a coordinate construct has bounds but no data. In this case we shouldn't be assigning an |
I've added to the test: # Check that a dump works
g.dump(display=False) Is that what you meant? |
Yes, precisely, just test that it doesn't fail miserably due to some facet of missing data. |
Great - are we good to go on this? (notwithstanding the two new issues that have been spawned :)) |
(sorry - I missed #222 (comment), but I think we've ended up in the same place!) |
No worries! I also missed your comments afterwards so my apologies too. I'm just taking a final look and will let you know in a moment. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
All questions I raised in first review have been answered and problems emerging from those have either been addressed by new commits that I've since reviewed, or delegated to (intended) new issues (or identified as a feature! 😆):
... notwithstanding the two new issues that have been spawned :) ...
so we just need to open these two (#222 (comment) and #222 (comment)), possibly just by linking to your comments, since I can't see any issues referenced or on the Issue Trackers so assume they haven;t been created yet. Shall I do that or would you like to?
All good, though! Thanks. Please merge.
Thanks, Sadie - Happy for you to raise new issues :) (but just ping me otherwise) |
Fixes #221