Define the Bukovsky masking regions for use in MET. #1940
Comments
Feedback from Logan:
After some feedback from the team, it sounds like these region files are best stored in the MET repository, along with the other region files. There were two outstanding questions at the last time this issue was discussed:
From the feedback, it sounds like the answer to Q1 is yes, the 1 km finer resolution files that Melissa provided will do the trick. |
@j-opatz given the size requirements, let's go with storing the masks in one file. As you mention, that will change the way the masks are accessed, but I think that may be okay for us at EMC. Based on some of our recent EMC verification team discussions, I believe we will need to do some preprocessing and create separate mask files that are defined on the various grids we use for verification. This will allow us to save compute resources in operations. Therefore, the way the Bukovsky region masks are stored in the MET repository is less of a concern for us. Hopefully that explanation makes sense. Please let me know if it doesn't.
From the METplus telecon today (3/7/22), this issue was elevated to a requirement for official release. It's also been affirmed that we'll use one file to store all of the regional files, which will create a list-style access in MET. When we get to the testing/acceptability phase, we need to loop in Logan (@LoganDawson-NOAA) and Mallory (@malloryprow). |
A tar file with the 1 km masking regions is available in a temporary location on the web here: https://dtcenter.ucar.edu/dfiles/code/METplus/METplus_Data/Bukovsky_1km.tgz
I've created an ensemble netCDF file of the regions: it's located on Seneca, under /d1/personal/jopatz/workbench/Bukovsky/Bukovsky_regions.nc. @JohnHalleyGotway take a look and see if this format is acceptable for MET. If not, I can tweak whatever we need.
@j-opatz finally taking a look at this. I see that this data is stored as a 3-dimensional array of doubles: The basin and rsmc variables are floats (should be integers though)
And The upside of doing it this way would be a much smaller file. And we know that each grid point belongs to 1, and only 1, masking region. The downside is that we'd need code changes in MET to actually read those masking region names and write them to the VX_MASK column of the output. |
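The trade-off described above (one integer-coded field instead of a stack of 0/1 masks) can be sketched in plain Python. This is only an illustration under stated assumptions: the function names `collapse_masks` and `extract_mask`, the toy grids, and the region names `RegionA` and `RegionB` are hypothetical and are not part of MET or of the Bukovsky files.

```python
def collapse_masks(named_masks):
    """Collapse {name: 2D 0/1 grid} into one integer grid plus a name list.

    A value of k (1-based) in the result means the point belongs to names[k - 1];
    0 means the point belongs to no region. Raises ValueError if two masks
    overlap, since the scheme relies on each point having at most one region.
    """
    names = list(named_masks)
    first = named_masks[names[0]]
    ny, nx = len(first), len(first[0])
    index_grid = [[0] * nx for _ in range(ny)]
    for k, name in enumerate(names, start=1):
        mask = named_masks[name]
        for j in range(ny):
            for i in range(nx):
                if mask[j][i]:
                    if index_grid[j][i] != 0:
                        raise ValueError(f"point ({j}, {i}) is in more than one region")
                    index_grid[j][i] = k
    return index_grid, names


def extract_mask(index_grid, names, name):
    """Recover the 0/1 mask for a single named region from the integer grid."""
    k = names.index(name) + 1
    return [[1 if value == k else 0 for value in row] for row in index_grid]


# Toy 2 x 3 example with hypothetical region names.
toy_masks = {
    "RegionA": [[1, 1, 0], [0, 0, 0]],
    "RegionB": [[0, 0, 1], [0, 1, 1]],
}
index_grid, names = collapse_masks(toy_masks)
region_b = extract_mask(index_grid, names, "RegionB")
print(index_grid)
print(region_b)
```

The integer grid stores one value per point no matter how many regions exist, which is where the size saving comes from. The cost is the extra lookup step in `extract_mask`, which corresponds to the code change in MET mentioned above.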
@j-opatz and @LoganDawson-NOAA, I have a question about the dimensions.
This seems awfully suspicious and a total missed opportunity for simplicity. Shouldn't we have made it a 6001 x 13001 grid with delta lat = delta lon = 0.01 degree spacing? Any idea if this result was on purpose or accidental? I could try using "regrid_data_plane" to shift it over to a simpler 0.01 degree grid but would like to make sure you agree on that approach. |
@j-opatz and @LoganDawson-NOAA, here's some further developments.
Once the MET tools are happy with this file, I'll turn to how we can most easily use it to define masks in MET. I'll plan to proceed with this approach unless I hear direction otherwise. |
@JohnHalleyGotway thanks for doing all this digging into the issue! I haven't used the high-resolution file myself at all, so I wasn't aware of these issues with the lat/lon grid definitions. Your proposed solution sounds reasonable to me!
Documenting feedback we received from Melissa via email on this issue. Please see below. This explanation makes the data much easier to work with. There is no need to process them as point data. I plan to use Python embedding to manually define the grid, accounting for the shift that Melissa describes, and then regrid it to a 0.01 degree grid.

From Melissa:

Sorry these are causing problems for you. I've gone back and looked at what was done. Something just a bit odd seems to happen when the lat/lon arrays are attached to the netcdf file that I can't explain, though the arrays are still not exactly uniform before being written to the file either. The lat/lon arrays are produced via: lats = fspan(15.25,75.25,6059)

This gives a fairly uniform spacing of 0.00990 followed by additional inconsistent decimal places. The inconsistent part is an artifact from the change to the higher resolution up until this point. When added to the files, the lat/lon values are rounded or truncated, adding to the inconsistency.

You can regrid these to a uniform grid. I see no problems with that - I've regridded the 0.5 deg resolution version to a variety of different model grids over the years. Or, you can try the shapefiles (the regions were regridded onto a uniform lat/lon WRF grid to facilitate the creation of the shapefiles).

I'm going to re-emphasize this caveat though... All of the regions need to be shifted East by 0.25 or 0.50 degrees in the shapefiles (not sure which one, as I haven't used these yet, but I'm pretty confident this is related to the original resolution of the regions). The southeast region is the best example of this... there's a png of the shapefile plotted on the region in the shapefile folder (and attached), and the region mask should be better centered over FL, not offset to the left. When you take a look at these, keep that in mind -- it's an easy additive fix. And, I think it applies to the netcdf files too, not just the shapefiles.
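The arithmetic behind the spacing can be checked with a short plain-Python sketch. It assumes that fspan(15.25,75.25,6059) returns 6059 evenly spaced values including both endpoints. The helper names and the 0.25 degree value in the shift example are illustrative only, since the email leaves open whether the shift is 0.25 or 0.50 degrees.

```python
def uniform_spacing(start, stop, num_points):
    """Spacing between num_points evenly spaced values from start to stop inclusive."""
    return (stop - start) / (num_points - 1)


def points_needed(start, stop, spacing):
    """Number of points required to cover start..stop inclusive at a fixed spacing."""
    return round((stop - start) / spacing) + 1


def shift_east(lons, shift_deg):
    """Apply an additive eastward shift (in degrees) to a list of longitudes."""
    return [lon + shift_deg for lon in lons]


LAT_START, LAT_STOP, LAT_POINTS = 15.25, 75.25, 6059

# Spacing implied by the original 6059-point latitude array.
native_dlat = uniform_spacing(LAT_START, LAT_STOP, LAT_POINTS)

# Points needed to span the same latitude range at exactly 0.01 degrees.
target_nlat = points_needed(LAT_START, LAT_STOP, 0.01)

# Illustrative shift value only; the correct value (0.25 or 0.50) is not settled above.
shifted = shift_east([-100.0, -99.5], 0.25)

print(f"native spacing: {native_dlat:.8f} degrees")
print(f"points for a 0.01 degree grid: {target_nlat}")
print(shifted)
```

Spanning 60 degrees with 6059 points gives 60/6058 degrees per step, which is why the spacing comes out just under 0.01. The same range at exactly 0.01 degrees needs 6001 points, which matches the 6001 x 13001 grid suggested earlier in the thread.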
Feedback from @LoganDawson-NOAA on 5/2/2022. In general we should INCLUDE inland lakes inside the Bukovsky masking regions. Many model runs would not resolve those lakes anyway so they should be included. |
Based on discussion on 5/11/22, recommend the following order of operations:
Discussed at 6/13/2022 METplus NOAA telecon. Target to have this completed and mature by the end of July 2022. Recommend providing an initial version no later than July 1st. |
@LoganDawson-NOAA I'm looking for some direction on next steps. The question at hand is whether we want to use the 50m or 110m resolution of the Natural Earth shapefiles. Let me give you some background to illustrate.
The top image shows the 110m resolution while the bottom image shows the 50m resolution: You can see that the 110m version exactly corresponds to MET's map data because that's how we defined the map data.
While these are defined on a 0.01 degree grid, you'll actually run regrid_data_plane to regrid each mask to the grid being evaluated. Do you want to regrid FROM the coarser 110m version of the masks or from the higher resolution 50m version? FYI, once we decide on the desired CONUS resolution, the next steps are:
… is only for development purposes and not actually intended to be merged back into the develop branch.
Here's a tarfile with the two versions to consider where the boundaries are defined by the 110m vs 50m natural earth shapefiles. The remaining tasks are:
Or another alternative is enhancing the logic of the MET library code to handle these composite masks and make it easy for the user to select which region should be extracted.
@JohnHalleyGotway this looks fantastic! I brought up the resolution question during our EVS meeting this morning, and we'd like to move forward with the 50 m resolution version. Providing a simple script to interpolate these high-resolution masks to a specific grid would be preferable to enhancing the MET logic (at least in the short term). Since masking regions defined on each different verification grid can be treated as fixed files, it will be most efficient for us to do all of that regridding during our development phase before EVS code delivery as opposed to doing the regridding with each run of a MET command. The final addition I see that's needed is defining the four aggregated regions that are shown in the map in the first comment on this issue. CONUS_East, CONUS_West, CONUS_South, and CONUS_Central might be good names to use for those. Do you need confirmation of which subregions belong to each regional aggregate? |
@LoganDawson-NOAA I believe this work is complete. Please see these 3 images showing the basic Bukovsky regions, the Bukovsky region groups, and the full CONUS region: You can find the corresponding data in this tar file: It contains:
Generates 22 output files, one for each of the basic regions, region groups, and the full CONUS region. For example, Note that it has been updated to require exactly 2 arguments (in case you want to define the target grid as something other than a named grid):
@LoganDawson-NOAA please review and let me know if this is what you're looking for. Once you confirm, I'll close this issue. |
@JohnHalleyGotway this current format is exactly what we were looking for! The script easily generates the mask files that are needed for different verification grids, and we can confirm that the MET output includes the mask I have one last ask that was an oversight on my part. Will you also include a CONUS region that is the union of CONUS_East, CONUS_West, CONUS_South, and CONUS_Central? Verification over the entire CONUS without any regional breakdowns is all that's required for some model/field combinations. Having a full CONUS region available would prevent us from having to generate stats over the four region groups before aggregating when generating graphics. This |
@LoganDawson-NOAA, sure no problem. Adding in CONUS was easy enough. Please note that I updated the contents of the comment above and reposted a new tar file to include the CONUS region. Also note that I modified the gen_bukovsky.sh script to take 2 arguments instead of 1. Please re-review and let me know. |
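The requested CONUS region is a point-by-point union of the four region groups. A minimal sketch of that operation on toy 0/1 grids follows. The function `union_masks` and the 2 x 4 grids are hypothetical stand-ins and do not reflect how gen_bukovsky.sh or MET actually builds the mask.

```python
def union_masks(masks):
    """Return a 0/1 grid that is 1 wherever any of the input 0/1 grids is 1."""
    masks = list(masks)
    ny, nx = len(masks[0]), len(masks[0][0])
    for mask in masks:
        if len(mask) != ny or any(len(row) != nx for row in mask):
            raise ValueError("all masks must be defined on the same grid")
    return [
        [1 if any(mask[j][i] for mask in masks) else 0 for i in range(nx)]
        for j in range(ny)
    ]


# Toy 2 x 4 grids standing in for the four region groups (values are made up).
region_groups = {
    "CONUS_West": [[1, 0, 0, 0], [1, 0, 0, 0]],
    "CONUS_Central": [[0, 1, 0, 0], [0, 0, 0, 0]],
    "CONUS_South": [[0, 0, 0, 0], [0, 1, 1, 0]],
    "CONUS_East": [[0, 0, 1, 0], [0, 0, 0, 0]],
}
conus = union_masks(region_groups.values())
print(conus)
```

Computing the union once and storing it as its own mask avoids generating statistics over the four groups and aggregating them afterwards, which is the motivation given in the request.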
Marking this issue as completed. I provided a NetCDF containing the Bukovsky regions to NOAA/EMC about 2 weeks ago, and they have been using them with no complaint. Note that this NetCDF file was NOT added to the MET repository itself but may be stored in the NOAA EVS repository. |
Describe the New Feature
On 9/21/2021, NOAA/EMC decided to start computing verification statistics using the Bukovsky regions. In particular, there were discussions about new regions being added (a southern region if I'm not mistaken), as well as defining CONUS with the East/West/Central/South Bukovsky definitions. This coincides with the EVSv1 and was already approved by the UFS V&V group.
Some of these areas needed to be cleaned up, specifically including the Eastern region near Maine, since part of Canada is included.
Here's the website: https://www.narccap.ucar.edu/contrib/bukovsky/
The regions are defined on a pretty coarse 1/2 degree grid, which is appropriate for climate simulations. However, applying them to high resolution regional domains will produce jagged results. Recommend coordinating with Dr. Bukovsky to define these regions as a set of lat/lon points or shapefiles. That would enable a nice application of them to model domains of any resolution.
The tasks for this issue are:
There are 27 individual regions defined along with 13 region groups. However those 27 regions do not consider international borders. NOAA/EMC is specifically interested in 4 region groups: West, Central, East, and South. But they want those intersected with CONUS. Ideally, we'd provide definitions for each of the Bukovsky regions, along with their CONUS intersections. This may or may not be possible.
Acceptance Testing
List input data types and sources.
Describe tests required for new functionality.
Time Estimate
Estimate the amount of work required here.
Issues should represent approximately 1 to 3 days of work.
Sub-Issues
Consider breaking the new feature down into sub-issues.
Relevant Deadlines
List relevant project deadlines here or state NONE.
Funding Source
2793541
Define the Metadata
Assignee
Labels
Projects and Milestone
Define Related Issue(s)
Consider the impact to the other METplus components.
No known issues needed at this time. But consider adding one for METplus wrappers and/or METplus-Training to demonstrate their use.
New Feature Checklist
See the METplus Workflow for details.
Branch name:
feature_<Issue Number>_<Description>
Pull request:
feature <Issue Number> <Description>
Select: Reviewer(s) and Linked issues
Select: Repository level development cycle Project for the next official release
Select: Milestone as the next official version