-
Notifications
You must be signed in to change notification settings - Fork 295
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix HDF4 support in geocat reader with hardcoded engine #2507
Conversation
…dler that using xarray_kwargs.
satpy/readers/geocat.py
Outdated
scene = satpy.Scene(filenames, | ||
reader='geocat', | ||
reader_kwargs={'decode_times': True}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct me if I'm wrong but this isn't how this would work. The user would be doing reader_kwargs={"xarray_kwargs": {"decode_times": True}}
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct fixed in 510c24f
satpy/readers/geocat.py
Outdated
kwargs.setdefault('xarray_kwargs', {}).setdefault( | ||
'decode_times', False) | ||
kwargs.setdefault('xarray_kwargs', {}).setdefault( | ||
'engine', "netcdf4") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thinking about it, these can just be hardcoded can't they? So kwargs.setdefault("xarray_kwargs", {"decode_times": False, "engine": "netcdf4"})
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this point, I can't think of a use case where they could be anything else.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If the geocat code changes, and the unit metadata becomes compliant, I think it would be nice to be able to interpret the scan line times when reading them. I might be mistaken, but 75-78 as written would allow the user to set the decode_times back to True if it is possible to decode them?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, it looks like it would work. My suggestion for doing the entire xarray_kwargs
dictionary would also work, but would require users to make sure they include engine and decode_times if they were specifying xarray_kwargs at all. This is such a small thing and not sure it matters too much.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay. I will work on it. However, if I set as you outline above, and the user enters:
a = Scene([fn], reader="geocat", reader_kwargs={"xarray_kwargs": {"decode_times": True}})
That will strip the default engine so that what is passed to xarray.open_dataset is decode_times=True
, but no engine. As I am understanding what you are saying, and I think it is necessary, the reader should still pass engine="netcdf4"
regardless of what the call from Scene contains. Correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Circling back to this again, sorry.... if the kwargs are set as in a previous commit
Doesn't that leave the engine as "netcdf4" if the user only enters a value for decode_times? If no xarray_kwargs are provided, the values would be {"engine": "netcdf4", "decode_times": False}
?
I think I am not understanding your point fully on this requested change to
kwargs.setdefault("xarray_kwargs", {"decode_times": False, "engine": "netcdf4"})
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are correct. Yours is more flexible. That's why I said it doesn't matter much. My main reason for my suggestion was that it treated xarray_kwargs as an all-or-nothing keyword argument rather than individual things. Sorry for all the confusion for something so dumb (on my part). Let me know if you want to put it back the way you had it or leave it as is. Either way I'm fine with it and we can merge regardless of the pre-commit issue...which I think is a hiccup on pre-commit's side of things.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is fine. I think either way works, so to avoid errors that can be introduced with too much more fiddling, I think I will leave it as is.
Codecov Report
@@ Coverage Diff @@
## main #2507 +/- ##
=======================================
Coverage 94.85% 94.85%
=======================================
Files 337 337
Lines 49624 49635 +11
=======================================
+ Hits 47069 47080 +11
Misses 2555 2555
Flags with carried forward coverage won't be shown. Click here to find out more.
|
Pull Request Test Coverage Report for Build 5315827401
💛 - Coveralls |
satpy/readers/geocat.py
Outdated
"""Open and perform initial investigation of NetCDF file.""" | ||
super(GEOCATFileHandler, self).__init__( | ||
filename, filename_info, filetype_info, | ||
xarray_kwargs={"engine": "netcdf4", "decode_times": False}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess the only change I'd like to consider is allowing the user to pass xarray_kwargs
in a **kwargs
. I think we had done something like kwargs.setdefault("xarray_kwargs, {...})
.
78ed846
to
b8a42de
Compare
super(GEOCATFileHandler, self).__init__( | ||
filename, filename_info, filetype_info, | ||
xarray_kwargs=kwargs["xarray_kwargs"]) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should just be **kwargs
in case the user is passing other kwargs down to the utility file handler.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I understand where you want kwargs, in the __init__
correct? The user would still send the xarray specific kwargs with the key "xarray_kwargs" so the documentation should reflect that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct. This call to __init__
should use kwargs to pass everything to the base class. That doesn't change what the user uses.
satpy/readers/geocat.py
Outdated
|
||
**Loading data with decode_times=True** | ||
|
||
By default, this reader will use xarray_kwargs={"engine": "netcdf4", "decode_times": False}. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By default, this reader will use xarray_kwargs={"engine": "netcdf4", "decode_times": False}. | |
By default, this reader will use ``xarray_kwargs={"engine": "netcdf4", "decode_times": False}``. |
Fix the documentation reference for the Scene class. Co-authored-by: David Hoese <david.hoese@ssec.wisc.edu>
satpy/readers/geocat.py
Outdated
scene = satpy.Scene(filenames, | ||
reader='geocat', | ||
reader_kwargs={'xarray_kwargs': {'decode_times': True}}) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do you want to update this to include engine="netcdf"
in this example since without it and with my proposed code that you implemented the engine will be the default where xarray tries to match/find one.
Co-authored-by: David Hoese <david.hoese@ssec.wisc.edu>
pre-commit.ci autofix |
Closing and reopening to hopefully pick up any commits from pre-commit. |
pre-commit.ci autofix |
@joleenf I checked the pre-commit.ci log and it looks like it is having some trouble merging/pushing fixes to your branch. It says to pull/merge with upstream main to fix it. You could also do the fixes yourself (after merging with main) by running |
Merge branch 'main' of github.com:pytroll/satpy into geocat_xr_kwargs
user does not have to enter both the engine and decode_times when changing one of the defaults.
Adds an
__init__
method to the GEOCATFileHandler to allow the addition of xarray_kwargs to the reader. This has been done so that the engine to the xarray_opendataset can be set as well as setting decode_times=False since the current metadata does not allow xarray to decode the times. (Even when the engine is set, decode_times=True will throw an error because xarray cannot interpret the units metadata as written by GEOCAT).Relates to the discussion on pydata/xarray#7905
I wanted to isolate this pull request from later work which will rely on being able to read these GEOCAT files with SatPy (i.e. adding aliases to facilitate RGB composites with existing recipes).