
Use inline post with cubed sphere history output #1803

Closed
CoryMartin-NOAA opened this issue Jun 16, 2023 · 13 comments · Fixed by NOAA-EMC/fv3atm#680 or #1831
@CoryMartin-NOAA

Description

The model now has the capability to write out cubed sphere history files in a format similar to the Gaussian grid history files. We plan to use these as the input to JEDI rather than the FMS restart files (we need the cubed sphere grid for JEDI-based DA, and restart files will be unsustainable at operational resolution, particularly for 80 ensemble members x 7 valid times).

Currently, if you try to run the UFS with cubed sphere history and the inline post together, the model errors out because the UPP does not accept cubed sphere grids. This configuration is likely the one we would need for any version of the global UFS that uses JEDI for its atmospheric data assimilation.

One other thing to note: if 'netcdf' is the file type, the model writes out six separate tile files; only 'netcdf_parallel' writes a single atmf006.nc file with a tile dimension. I am not sure whether this is by design. Ideally we would use the single file with a tile dimension, if possible, to conform to the current workflow/data structure as much as possible.
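To make the difference concrete, here is a minimal sketch of how the two layouts would be read with xarray (the per-tile file names and the name of the tile dimension are assumptions for illustration; only atmf006.nc is confirmed above):

import xarray as xr

# 'netcdf' file type: six separate tile files (per-tile file names assumed here)
tiles = [xr.open_dataset(f"atmf006.tile{t}.nc") for t in range(1, 7)]
ds_tiles = xr.concat(tiles, dim="tile")  # stitch the tiles along an assumed "tile" dimension

# 'netcdf_parallel' file type: one file that already carries a tile dimension
ds_single = xr.open_dataset("atmf006.nc")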

Solution

When writing out cubed sphere history, have the model (ESMF?) interpolate the fields to an appropriate Gaussian grid immediately before passing them to the inline post.

Alternatives

  • Allow the model to write out both Gaussian and cubed sphere history files (but writing 2x the data is inefficient...)
  • Have the UPP/inline post natively handle cubed sphere grids

Related to

Cycling with JEDI for hybrid 4DEnVar configuration as in the operational GDAS

CoryMartin-NOAA added the enhancement (New feature or request) label on Jun 16, 2023
@CoryMartin-NOAA
Author

Let me know if you need any further information/clarification or how I can help, thanks!

@junwang-noaa
Collaborator

I think the solution of interpolating from the native grid to the Gaussian grid is doable, but it requires additional updates in both the write grid component and the POST, since the offline POST won't work after this change.

@DusanJovic-NOAA
Collaborator

Do you need both cubed sphere history files and post files created at the same time during the DA cycling?

@CoryMartin-NOAA
Author

@DusanJovic-NOAA good question. I'm not sure whether any of the "gdas" post files are actually used except the analysis (which is produced offline anyway). For DA cycling we don't need any post files, but perhaps they are needed for verification purposes?

@junwang-noaa
Collaborator

@WenMeng-NOAA I remember we do output post products for gdas, right?

@WenMeng-NOAA
Contributor

@junwang-noaa That's right. The UPP processes gdas output to generate master and flux files at the analysis time and f00 to f09. These data files are disseminated to the public. Please see https://nomads.ncep.noaa.gov/pub/data/nccf/com/gfs/prod/gdas.20230623/00/atmos/

@DusanJovic-NOAA
Collaborator

I made some preliminary changes to the model to allow writing cubed sphere history output (for the top-level domain only) in addition to using another grid type for the standard history output and inline post.

See: https://github.com/DusanJovic-NOAA/ufs-weather-model/tree/cubed_sphere_history_output

If you add the following to model_configure:

top_parent_history_file_is_on_cubed_sphere_grid: .true.

and leave output_grid set to 'gaussian_grid' and set write_dopost to .true., you will get the standard history output files and post grib files on the Gaussian grid, plus an additional set of history files with the prefix 'cubed_sphere_grid_'.

For example:

output_history:          .true.
write_dopost:            .true.
top_parent_history_file_is_on_cubed_sphere_grid: .true.
num_files:               2
filename_base:           'atm' 'sfc'
output_grid:             gaussian_grid
output_file:             'netcdf_parallel'

will produce:

GFSFLX.GrbF00
GFSFLX.GrbF06
GFSFLX.GrbF12
GFSFLX.GrbF24
GFSPRS.GrbF00
GFSPRS.GrbF06
GFSPRS.GrbF12
GFSPRS.GrbF24
atmf000.nc
atmf006.nc
atmf012.nc
atmf024.nc
cubed_sphere_grid_atmf000.nc
cubed_sphere_grid_atmf006.nc
cubed_sphere_grid_atmf012.nc
cubed_sphere_grid_atmf024.nc
cubed_sphere_grid_sfcf000.nc
cubed_sphere_grid_sfcf006.nc
cubed_sphere_grid_sfcf012.nc
cubed_sphere_grid_sfcf024.nc
sfcf000.nc
sfcf006.nc
sfcf012.nc
sfcf024.nc

GFS*.Grb*, atmf*.nc and sfcf*.nc will be on the Gaussian grid, while cubed_sphere_grid_*.nc will be on the cubed sphere grid.

Suggestions on how we should name the new model_configure option and which file prefix we should use are welcome.

@junwang-noaa
Collaborator

I am wondering whether we should add a post_grid option instead; by default (if not set) it would be the same as "output_grid", and we would not output atmf$fh.nc and sfcf$fh.nc in this case:

output_history:          .true.
write_dopost:            .true.
post_grid:               gaussian_grid
num_files:               2
filename_base:           'atm' 'sfc'
output_grid:             cubed_sphere_grid
output_file:             'netcdf_parallel'

@DusanJovic-NOAA
Collaborator

But according to the link Wen posted above with the nomads output from gdas, we need both the grib files and the atm and sfc history files, both on the Gaussian grid. That means one option should control both the grib and history files, which is currently the 'output_grid' option.

Your suggestion with:

post_grid:               gaussian_grid
output_grid:             cubed_sphere_grid

implies that post_grid will be used to define the grid type of both the grib (inline post) files and the output (history) files, while output_grid will define the grid of the additional output files for JEDI DA, which I find confusing, because in other applications output_grid is the option used to define the grid of the history files.

Since these additional JEDI DA files will always be on the cubed sphere grid, a logical switch should be enough: if turned on, the user will get these additional files (always on the cubed sphere grid), while output_grid will control the grid for the post grib files and history files.

@aerorahul, any suggestions?

@junwang-noaa
Collaborator

My understanding is that we only need the Gaussian grid for post and the cubed sphere grid for history. Maybe Rahul and @CoryMartin-NOAA can confirm. I think it wastes resources to output history files on both the Gaussian and cubed sphere grids.

@CoryMartin-NOAA
Author

I'm not sure who uses the history files for things like boundary conditions (the RRFS, for example). However, I suspect an offline utility that converts cubed sphere history to Gaussian history would satisfy those requirements, or those users would need to modify their codes to read the native grid. From a "GFS/GDAS" perspective, I think Jun is correct, but we may need to continue to disseminate Gaussian grid model data in netCDF for other users (or perhaps a public notice stating the change would be sufficient?).
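As a conceptual illustration of what such an offline cubed-sphere-to-Gaussian utility could look like (a minimal sketch only, not the ESMF regridding the write component would use; the file name, variable names and grid sizes below are assumptions):

import numpy as np
import xarray as xr
from scipy.interpolate import griddata

# Read one 2-D field from a cubed sphere history file (names assumed for illustration).
ds = xr.open_dataset("cubed_sphere_grid_atmf006.nc")
src_lon = ds["lon"].values.ravel()    # source cell-center longitudes, all tiles flattened
src_lat = ds["lat"].values.ravel()    # source cell-center latitudes
src_val = ds["tmp2m"].values.ravel()  # the field to interpolate

# Build the target Gaussian grid: equally spaced longitudes, and latitudes taken
# from the Gauss-Legendre quadrature nodes (sin(lat) = node).
nlon, nlat = 384, 192
glon = np.linspace(0.0, 360.0, nlon, endpoint=False)
nodes, _ = np.polynomial.legendre.leggauss(nlat)
glat = np.degrees(np.arcsin(nodes))
glon2d, glat2d = np.meshgrid(glon, glat)

# Simple scattered-data interpolation onto the Gaussian grid (no longitude wrapping
# or pole handling, which a real utility would need).
field_gauss = griddata((src_lon, src_lat), src_val, (glon2d, glat2d), method="linear")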

@DusanJovic-NOAA
Collaborator

If we decide to add a completely new post_grid option, which will be used to specify the grid for the inline post independently from output_grid and which can be different from the output grid, that means we will have two separate sets of field bundles: one for writing history files and one for sending data to the inline post. And if we allow that for a single global domain, then we must allow it for configurations with nests (both global with nests and regional with nests, i.e. HAFS), because the same code must be used for grid and bundle creation.

post_grid: gaussian_grid
output_grid: cubed_sphere

<post_grid_02>
post_grid:             lambert
....
<post_grid_02>

<output_grid_02>
output_grid:             rotated_latlon
....
<output_grid_02>

<post_grid_03>
post_grid:             regional_latlon
....
<post_grid_03>

<output_grid_03>
output_grid:             rotated_latlon
....
<output_grid_03>

Should a hypothetical configuration like this be allowed?

Even if we say that post_grid and output_grid can be different only for the parent global domain (no nests), should this be allowed:

post_grid: global_latlon
output_grid: gaussian_grid

If it is allowed, then we'll need something like:

imo_post: 360
jmo_post: 181
imo: 384
jmo: 190

All these cases will complicate model_configure and the code substantially, and I'm not sure we really need them. How likely is it that any configuration other than the JEDI DA configuration will need different grids for the history and grib files?

And if JEDI DA is the only configuration that needs grib and history files on different grids, and if in that case the history files will always be on the cubed sphere grid, then that looks to me like a special case, and for that case a simple logical switch should be sufficient. The grid type is basically assumed to be cubed_sphere, and no other configuration is needed.

If the concern is that we unnecessarily create both Gaussian and cubed sphere history files (we still need to see whether we can actually get rid of the Gaussian history files), then maybe we can hardcode something to simply skip writing the Gaussian history files in this specific case only.

@junwang-noaa
Collaborator

@DusanJovic-NOAA I had some discussion with Rahul. I think at this time it might be easier to just have a new variable in model_configure:

top_parent_history_file_is_on_cubed_sphere_grid: .true.

or:

history_file_on_native_grid: .true.
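For example, reusing the settings from Dusan's example above, a model_configure sketch with the second proposed name would be:

output_history:          .true.
write_dopost:            .true.
history_file_on_native_grid: .true.
num_files:               2
filename_base:           'atm' 'sfc'
output_grid:             gaussian_grid
output_file:             'netcdf_parallel'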
