New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add reader for AMSR2 Level 2 data produced by GAASP software (amsr2_l2_gaasp) #1421
Conversation
DeepCode's analysis on #8ea80f found:
Top issues
👉 View analysis in DeepCode’s Dashboard | Configure the bot |
Codecov Report
@@ Coverage Diff @@
## master #1421 +/- ##
==========================================
+ Coverage 90.58% 90.62% +0.04%
==========================================
Files 239 241 +2
Lines 34313 34586 +273
==========================================
+ Hits 31081 31344 +263
- Misses 3232 3242 +10
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for adding the reader! Lots of interesting data in there.
I have some inline comments, but more generally, it would be nice if you could recheck the tests to make sure everything is tested. Codecov is showing some misses, but more seriously the preservation of float types doesn't seem to be tested for example. Are all the features of reader actually tested?
satpy/readers/amsr2_l2_gaasp.py
Outdated
def start_time(self): | ||
"""Get start time of observation.""" | ||
time_str = self.nc.attrs['time_coverage_start'] | ||
return datetime.strptime(time_str, "%Y-%m-%dT%H:%M:%S.%fZ") | ||
|
||
@property | ||
def end_time(self): | ||
"""Get end time of observation.""" | ||
time_str = self.nc.attrs['time_coverage_end'] | ||
return datetime.strptime(time_str, "%Y-%m-%dT%H:%M:%S.%fZ") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like the start time and end time is provided in the filename. Could these be checked first, and go into reading the nc values if the filename times aren't available?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In my experience the start/end time in the filenames doesn't always match the internal data and it is best to use the file content first. What would be the benefit of using the filename times?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That would be the time the user expects based on the filename. I think the internal time could be confusing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would assume the internal time is more accurate though. I have also had cases where users expect the same time as in the filename but that doesn't make them right.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I recently noticed that reading the value inside the file takes a significant amount of time when having the files on a shared filesystem over a DSL Internet connection. That happened in a case where the files were provided to satpy but ended not being loaded because the data was elsewhere (viirs sdr reader). With the advance of remote file systems like s3 buckets, I suspect this will be all the more common situation. So, if the times are similar in 99% of the cases, I'd rather use the filename time.
This maybe hints for having two kinds of times in the future, the filename ones (rough estimate) for basic filtering of the data, and a more precise one for applications that need it. But that is another issue.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this depends on the format being read. I'd have to check, but I have always assumed that xarray.open_dataset with a remote file will always load at least the global attributes. I suppose in this case since start and end are both in the filename I'd be ok doing this though.
Fallback to file content if needed
@mraspaud This is ready for re-review. Deepcode is wrong about the 1 issue it found. Otherwise, assuming the tests pass then this should be good to go. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just a few questions, otherwise looks good. Also some lines aren't covered.
The CSPP team at SSEC recently released a version of the GAASP processing software which produces level 2 products for the AMSR2 instrument. This PR adds a reader to read those GAASP produced files.
flake8 satpy