Add ability to create complex tiled AWIPS NetCDF files (formerly SCMI writer) #1402
Codecov Report
@@ Coverage Diff @@
## master #1402 +/- ##
==========================================
+ Coverage 90.56% 91.12% +0.56%
==========================================
Files 228 241 +13
Lines 33406 35438 +2032
==========================================
+ Hits 30254 32294 +2040
+ Misses 3152 3144 -8
Congratulations 🎉. DeepCode analyzed your code in 7.531 seconds and we found no issues. Enjoy a moment of no bugs ☀️.
So with this last commit I think I'm as close as I can be without having example files of the tiled files we plan on reproducing. The main question mark is what the actual filename should be. There are a couple of important optimizations that should be done, and I need to see what needs to be fixed to get category products to work, but I'll leave that for now so I can get back to other projects.
self.assertEqual(len(all_files), 20)

# these tiles should be the right-most edge of the first image
first_left_edge_files = [x for x in first_files if 'O01' in x or 'O03' in x or 'U01' in x or 'U01' in x]
F841 local variable 'first_left_edge_files' is assigned to but never used
ds = xr.open_dataset(fn, mask_and_scale=False)
check_required_common_attributes(ds)
assert ds.attrs['time_coverage_end'] == end_time.strftime('%Y-%m-%dT%H:%M:%S.%fZ')
W391 blank line at end of file
Includes better handling of flag/category products
This is interpreted as the sectorID in AWIPS
Includes hack for hanging test that is yet to be solved
I hit a very annoying roadblock that I've spent multiple days on with no progress, so I'm putting a band-aid on it and leaving it. The functionality for the SCMI writer to update existing tiles on disk with new data seems to hang if you use multiple threads (multiple dask workers), and I've only noticed it in the last couple of days. I've tried and tried to make non-Satpy scripts to reproduce it or to narrow down exactly what is happening, but haven't come up with anything. It all seems to be related to xarray's caching of file objects and multiple threads accessing them. I suppose, if I had to guess, the file caching might not be thread safe.
Well, this seems like a lot of refactoring and testing could be done here, but maybe that belongs in a future PR as long as the new functionality is correctly tested.
Just a couple of comments.
LGTM
This is going to become a large collaborative pull request to add a large set of features to the current scmi writer. Additionally, this work should really require a lot of code refactoring because the current implementation was cruft on cruft on cruft over years of "oh that shouldn't be that hard to add". That said, let's see if I can document this as we go.
The Main Task
The CSPP Geo team at the Space Science and Engineering Center (SSEC) has been tasked with reproducing some tiled AWIPS NetCDF files originally produced by another group with non-Satpy and non-CSPP Geo software. So although we could produce the same result to the client in AWIPS (images on the screen) with the current implementation of the scmi writer, we aren't allowed to modify the configuration files on the AWIPS server that would give us this flexibility.
Therefore, the scmi writer needs to be updated to match the files provided by this other group. There are some complications to make this possible in the Satpy writer. These are outlined below.
Renaming the writer
The name scmi came from the original use case for these files (sectorized cloud moisture imagery), but doesn't make sense any more. Additionally, some people associate SCMI with the ABI L1b file produced for the NWS, even though there are other files (sensors and datasets) using this same format to get their data into AWIPS.
I propose renaming this writer to one of:
Thoughts?
Expected Output Overview
Here is an excerpt from the config on the AWIPS server and what it is looking for (we hope to get an actual sample NetCDF file soon). Further on I'll explain why we have some complications with what this expects.
Global Attributes
These files expect either new global attributes for data we have access to or they expect the same global attributes we already produce but with different names. The simplest solution is to provide keyword arguments to specify the name for some of these. Luckily there are only a few that are needed that AWIPS understands. In these files those are:
start_time).
platform_name of the metadata. One issue is that Satpy uses names like GOES-16 but this wants them as G-16. We'll need an easy way to make that work. Perhaps a hacky dictionary to map things that is hard-coded for now?
Multiple Variables
The biggest challenge will be that we currently generate a single set of tile files for each product, but these files expect multiple variables in each NetCDF file. There are a couple of ways I see of solving this (that may also help with the above):
save_datasets method, so this is just a different way of handling that.
awips_netcdf writer, we could use the same Python class but allow the YAML files (satpy/etc/writers/scmi.yaml currently) to configure the template for the file and whether or not they should be multi-variable. This configuration could include all of the attribute layouts and maybe even things like a platform_name renaming map.
Ancillary Variables
The target files expect a DQF variable to also exist. This is a little strange for the current design, but shouldn't really be a problem. The current glm_l2 reader doesn't read this DQF variable but that's fine. It could show up as its own variable or as an ancillary variable, which should show up in my_data_arr.attrs['ancillary_variables'] when the other datasets are loaded in the Scene.
TODO
available_datasets so all variables from the input NetCDF files can be loaded
flake8 satpy
AUTHORS.md if not there already (Nick and Ray will probably need to do this)
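To make the Global Attributes idea more concrete, here is a minimal sketch of the hard-coded platform_name map and of output attribute names supplied by the caller. It is only an illustration under assumptions: `PLATFORM_NAME_MAP`, `awips_platform_name`, `build_global_attrs`, the output attribute names in the usage example, and the time format are all hypothetical. Only the GOES-16 to G-16 pair comes from the description above.

```python
from datetime import datetime

# Hard-coded rename map. Only GOES-16 -> G-16 is taken from the discussion;
# further entries would have to be confirmed against the AWIPS config.
PLATFORM_NAME_MAP = {"GOES-16": "G-16"}


def awips_platform_name(platform_name, rename_map=PLATFORM_NAME_MAP):
    """Return the AWIPS-style name, or the input unchanged if unmapped."""
    return rename_map.get(platform_name, platform_name)


def build_global_attrs(metadata, attr_names):
    """Build global attributes from dataset metadata.

    ``attr_names`` maps a metadata key to the attribute name the output
    file should use, so the same value can be written under another name.
    """
    attrs = {}
    for meta_key, out_name in attr_names.items():
        value = metadata[meta_key]
        if meta_key == "platform_name":
            value = awips_platform_name(value)
        elif isinstance(value, datetime):
            # Assumed time format, for illustration only.
            value = value.strftime("%Y-%m-%dT%H:%M:%S")
        attrs[out_name] = value
    return attrs


# Usage with made-up output attribute names:
metadata = {"start_time": datetime(2020, 1, 1, 12, 0, 0),
            "platform_name": "GOES-16"}
attr_names = {"start_time": "start_date_time",
              "platform_name": "satellite_id"}
global_attrs = build_global_attrs(metadata, attr_names)
```

The same two mappings could just as well live in the writer's YAML file instead of in Python, which is the direction the second option under Multiple Variables points at.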