Add COMPOSE to EAM #2896
ir.cpp implements a locally discrete mass conservative, multi-moment, incremental remap transport algorithm. qlt.cpp implements the QLT algorithm to achieve tracer consistency, cell mean boundedness for shape preservation, and (if needed, but not for our incremental remap algorithm) discrete mass conservation. We use it for tracer consistency and shape preservation. sl_advection.F90 integrates these two components. Other changes are to the namelist, and some CMakeLists.txt changes.
This new rule is from Taylor, Mark A., Beth A. Wingate, and Len P. Bos. "A cardinal function algorithm for computing multivariate quadrature points." SIAM Journal on Numerical Analysis 45.1 (2007): 193-205. Tighten tolerance on calc_sphere_to_ref. Both of these are to tighten discrete mass conservation to 1e-14 to 1e-15. Previously, 1e-13 or sometimes even 1e-12 were possible.
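The conservation figures quoted above can be read as a relative change in total tracer mass. The following is a minimal sketch under that assumption; `relative_mass_change` and its arguments are hypothetical names for illustration, not routines from this PR.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not part of HOMME or COMPOSE): relative change in
// total mass, |M1 - M0| / |M0|, where M = sum_i w[i] * q[i].
// w[i] is assumed to be the quadrature weight (times density) at point i.
double relative_mass_change(const std::vector<double>& w,
                            const std::vector<double>& q_before,
                            const std::vector<double>& q_after) {
  assert(w.size() == q_before.size() && w.size() == q_after.size());
  double m0 = 0, m1 = 0;
  for (std::size_t i = 0; i < w.size(); ++i) {
    m0 += w[i] * q_before[i];
    m1 += w[i] * q_after[i];
  }
  assert(m0 != 0);  // a relative measure needs nonzero initial mass
  return std::abs(m1 - m0) / std::abs(m0);
}
```

With a measure like this, a value near 1e-15 is at the level of double-precision rounding, which is why the quadrature rule and the calc_sphere_to_ref tolerance both matter.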
Conflicts: components/homme/CMakeLists.txt components/homme/src/preqx/CMakeLists.txt components/homme/src/preqx/sl_advection.F90 components/homme/src/share/cube_mod.F90 components/homme/src/share/prim_driver_base.F90
Add CAAS to correspond to limiter_option = 9. CEDR is a library that includes QLT. For these early Homme integration tests, I have a script to generate a single-file version of QLT. Update qlt.cpp with the latest modifications to CEDR.
This commit propagates the HORIZ and COLUMN threading schemes to sl_advection.F90, ir.cpp, and qlt.cpp.
1. Rename the monolithic files as follows:
    ir.cpp -> slmm.cpp
    qlt.cpp -> cedr.cpp
Modify the compose_mod routine prefixes similarly. (Eventually we'll use the COMPOSE library installation, but for now, integration is much easier with monolithic files I create from the COMPOSE code base.)
2. Rework semi-Lagrangian transport options:
    ! Tracer transport algorithm type:
    !     0  spectral-element Eulerian [default]
    !     1  classical semi-Lagrangian (SL)
    !     2  cell-integrated remap SL
    ! Constrained density reconstructor for SL property preservation; not used if
    ! transport_alg = 0:
    !     0  none
    !     1  Cobra
    !     2  QLT [default]
    !     3  CAAS
    ! If true, check mass conservation and shape preservation. The second
    ! implicitly checks tracer consistency.
    logical, public :: semi_lagrange_cdr_check = .false.
3. Add global CAAS as an option. Global CAAS is as simple as a CDR gets, so it's good to have as an MPI benchmark if nothing else.
4. Implement super levels. The idea is to run a CDR over 8 adjacent vertical levels, a super level. (I could expose this number as an option, but for now it's hardcoded in an enum to 8.) This has the immediate effect of reducing comm volume by 8x. Later, it may also let adaptive QLT do no comm at all besides the adaptation handshake of one scalar per tree edge.
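To illustrate why item 3 calls global CAAS about as simple as a CDR gets, here is a minimal serial sketch of a clip-and-assured-sum step. It is written from the general idea only, under stated assumptions: the function name `caas`, its signature, and the use of plain vectors are hypothetical and are not the CEDR interface.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch, not the CEDR API. Adjusts x in place so that
//   lo[i] <= x[i] <= hi[i]   and   sum_i w[i] * x[i] == mass.
// w: positive cell weights; lo/hi: per-cell bounds; mass: target total.
// Returns false if the bounds cannot accommodate the requested mass.
bool caas(const std::vector<double>& w, const std::vector<double>& lo,
          const std::vector<double>& hi, double mass, std::vector<double>& x) {
  const std::size_t n = x.size();
  assert(w.size() == n && lo.size() == n && hi.size() == n);
  // Step 1: clip to bounds and accumulate three sums. In a distributed
  // setting these sums are what would need a global reduction.
  double clipped_mass = 0, min_mass = 0, max_mass = 0;
  for (std::size_t i = 0; i < n; ++i) {
    x[i] = std::min(hi[i], std::max(lo[i], x[i]));
    clipped_mass += w[i] * x[i];
    min_mass += w[i] * lo[i];
    max_mass += w[i] * hi[i];
  }
  if (mass < min_mass || mass > max_mass) return false;  // infeasible
  // Step 2: spread the mass discrepancy in proportion to each cell's
  // remaining capacity, so that no cell leaves its bounds.
  const double dm = mass - clipped_mass;
  if (dm == 0) return true;
  const double capacity = dm > 0 ? max_mass - clipped_mass
                                 : clipped_mass - min_mass;
  const double frac = dm / capacity;  // magnitude at most 1 when feasible
  for (std::size_t i = 0; i < n; ++i)
    x[i] += frac * (dm > 0 ? hi[i] - x[i] : x[i] - lo[i]);
  return true;
}
```

Under this sketch, a super level would simply enlarge the set of cells passed to one call, so the same three sums cover 8 levels at once.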
This is possible because the integral is over the reference Eulerian target element. There is still more optimization to do on the remap part of the code, but this was the major one to do.
Pure p=3 GLL interpolation leads to an unstable classical semi-Lagrangian (CSL) method. Introduce stabilized variants. Horizontal and vertical threading work for all CSL methods: transport_alg = 1, 10, 11, 12. 1 is the original Fortran CSL method. 10 mimics this method in C++. 11 and 12 are two stabilized methods. 12 is more accurate than 11. Also fix an over-eager exception in CEDR/QLT.
1. Make get_src_cell robust to FP. This routine infers topology from geometry, which is fragile in floating point. I've added a backup procedure if the first search fails.
2. Limiter 8 was being used even when limiter 9 was requested when the selected transport algorithm was one of the new interpolated semi-Lagrangian schemes. Fix this bug.
3. Improve vectorization in slmm_csl.
4. Start comprehensive standalone testing, but nothing other than an additional unit test is available yet.
The standalone tester runs the nondivergent flow problem with various tracer initial conditions and reports the L2 error.
This is the first of several phases for taking advantage of SL transport in MPI communication. In this phase, the halo is still limited to size 1, but for the interpolation SL (ISL) algorithm only, only the data determined to be necessary based on the flow field are communicated.
- #pragma ivdep helps in calc_q
- need waitall in comm_lid_on_rank in initialization to be sure send buffer remains valid
- remove some diagnostic code
ALE_departure_from_gll does not normalize the output point to the sphere. This is mathematically OK for intersection and departure-cell-determination calculations, and even for ref <-> sphere calculations. But it's an efficiency loss for the final one because calc_sphere_to_ref in slmm.cpp was iterating too many times because of the norm(q,2) != 1. This commit speeds up that calculation by almost 2x. Thanks to Oksana for asking about this. Oksana also pointed out the Kokkos warnings issue; warnings get printed on every rank. Fixed in this commit.
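The normalization discussed above amounts to scaling a 3D point so that its 2-norm is 1 before it is handed to the sphere-to-reference solve. A minimal sketch follows; `normalize_to_unit_sphere` is a hypothetical name, and where the actual fix places the normalization is not shown in this conversation.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical helper (not the slmm.cpp routine): scale p in place so that
// norm(p, 2) == 1 up to rounding. p must not be the zero vector.
void normalize_to_unit_sphere(double p[3]) {
  const double norm = std::sqrt(p[0] * p[0] + p[1] * p[1] + p[2] * p[2]);
  assert(norm > 0);
  for (int i = 0; i < 3; ++i) p[i] /= norm;
}
```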
Conflicts: components/homme/src/preqx/CMakeLists.txt components/homme/src/preqx/sl_advection.F90 components/homme/src/share/prim_driver_base.F90 components/homme/src/test_src/baroclinic_inst_mod.F90
Although SL tracer advection very likely does not need hyperviscosity, HV's application in Eulerian tracer advection has positive effects beyond just controlling the advection itself. It follows that any other advection algorithm must permit HV. It's possible we'll figure out how we can avoid it, but for now permit it.
The DCMIP tests suggest that HV is needed for qv only. This makes sense, as only qv couples to the dynamics at every dynamical time step; the other tracers couple directly to physics at the physics time step, and indirectly to dynamics again only at the physics time step. Applying HV to just qv is much faster than applying it to all tracers. It also retains the high resolution SL tracer advection offers in any tracer not dissipated by HV. By default, semi_lagrange_hv_q_all = .false.
CEDR is the last module to see q, qdp, and dp3d before control leaves SL tracers. Add NaN and Inf checks on q, qdp, and dp3d to the current property-preservation check routine.
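A NaN and Inf check of this kind can be as small as a scan with `std::isfinite`. The sketch below assumes a flat array of doubles; `first_nonfinite` is a hypothetical name, not the routine added in this commit.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Hypothetical helper: return the index of the first NaN or Inf in
// a[0..n-1], or -1 if every entry is finite.
std::ptrdiff_t first_nonfinite(const double* a, std::size_t n) {
  assert(a != nullptr || n == 0);
  for (std::size_t i = 0; i < n; ++i)
    if (!std::isfinite(a[i])) return static_cast<std::ptrdiff_t>(i);
  return -1;
}
```

Returning the index rather than a boolean makes it possible to report which entry of q, qdp, or dp3d went bad.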
Add local mesh, departure point, and max v data to output.
If semi_lagrange_nearest_point_lev is set to >= 1, then in any lev <= this parameter, the SL advection method is permitted to use an (approximate) nearest point to the target cell's halo if the departure point is outside of the halo. Fix a bug in the get_src_cell trial > 1 backup calculation. Fix output bugs in the get_src_cell failure message. Write unit tests for the nearest-point calculations. Update prim_main to hand off to the SL convergence tester for all available convergence-tested SL algorithms. Fix MPI type in the SL-related control-mod values. This bug hasn't affected us, but the type was clearly wrong, logical rather than integer.
Prints rss = min, max, mean resident memory.
The SL MPI pattern for ISL was leaking memory via MPI requests from an isend that were neither explicitly freed nor tested/waited on. Also refactor some of the old Cobra code into its own routine, isolating two large arrays to just that routine. That should reduce the memory footprint of SL without changing its functionality; the caveat is that, since that memory was never touched, those arrays may never have contributed to the actual memory footprint.
Default: False
</entry>

<entry id="semi_lagrange_nearest_point_lev" type="integer" category="se"
It took me some time to process why from the top, but then I remembered the fastest winds are at the top.
components/homme/CMakeLists.txt
Outdated
add_definitions(-DHOMME_ENABLE_COMPOSE)
set (COMPOSE_LIBRARY "compose")
add_subdirectory(src/share/compose)
set (USE_KOKKOS_KERNELS ON)
Do we now use HOMME_USE_KOKKOS and USE_KOKKOS_KERNELS? Do not remember which one is for what.
HOMMEXX uses USE_KOKKOS_KERNELS. I want to change that to HOMME_ENABLE_KOKKOS b/c KokkosKernels is itself a thing, so we shouldn't have a definition suggesting that KokkosKernels is being used. But that's a separate issue, one to discuss with the HOMMEXX team. In this PR, I just follow what HOMMEXX does to enable Kokkos in HOMME. In summary, HOMME_USE_KOKKOS and HOMME_ENABLE_KOKKOS are not definitions in master, nor are they introduced here.
@@ -14,8 +14,30 @@ module control_mod
character(len=MAX_STRING_LEN) , public :: integration ! time integration (explicit, or full imp)

! experimental option for preqx model:
logical, public :: use_semi_lagrange_transport = .false.
logical, public :: use_semi_lagrange_transport_local_conservation = .false.
! Tracer transport algorithm type:
awesome comments
@rljacob I am practicing merging into next. The body description is long here, do you want it anyway in the merge message? My understanding is that is required https://acme-climate.atlassian.net/wiki/spaces/ED/pages/16253965/Commit+message+template . thanks!
Yes that is pretty long. You could make it shorter by removing the lists of files that are one-per-line and replacing it with a comma separated list of filenames without the full path or the line change counts.
@rljacob could i just keep first 2 sentences and then say 'for description see the webpage'?
No. Just summarize it. You could just remove all the source code pointers.
Like this: Bring COMPOSE semi-Lagrangian tracer transport code into HOMME and E3SM. [non-BFB] for HOMME suite for 1 test, baroCamMoistSL and [NML] in the same test. Includes the following:
Miscellaneous changes:
[BFB] for E3SM
…ose-theta-l Conflicts: components/homme/CMakeLists.txt components/homme/src/preqx_kokkos/CMakeLists.txt
I merged master again to account for PR #2831, which I knew would create a few somewhat nontrivial merge conflicts in some HOMME CMake code.
Bring COMPOSE semi-Lagrangian tracer transport code into HOMME and E3SM. [non-BFB] for HOMME suite for 1 test, baroCamMoistSL and [NML] in the same test.
Includes the following:
Changes to E3SM model build system to permit SL in theta-l (only).
Nontrivial changes to the HOMME standalone build system to use E3SM Kokkos submodule automatically. A cosmetic issue was changing things to move away from associating Kokkos specifically with the HOMMEXX work. If HOMME_ENABLE_COMPOSE=OFF and BUILD_HOMME_PREQX_KOKKOS=OFF, then all the Kokkos-related code is skipped.
A trivial new target for testing. Will eventually expand to make a convergence test in the standalone test suite.
Removal of old sl_advection.F90.
Add the new libcompose target. It builds COMPOSE C++ code, which is expensive relative to Fortran, just once. Then any target that wants SL links against this library. If HOMME_ENABLE_COMPOSE=OFF, this is not built. Since this is a separate lib, the files are in share/compose rather than just share.
Add Fortran interfaces to libcompose. These files must be included in the list of sources for a target and so are in the share directory. They are fast to compile, like all short Fortran files. If HOMME_ENABLE_COMPOSE=OFF, code is appropriately ifdefed out, so it is innocuous to include these in the target's file list.
Add a hook for a nice COMPOSE testing capability and one test.
Miscellaneous changes:
Remove some old code for the previous version of SL transport.
Little things to support the new SL code.
dcmip16_mu_s -> dcmip16_mu_s for scalars in the dynamics and dcmip16_mu_q. This is an orthogonal change we discussed a while ago. It permits SL not to do dissipation on tracers in a dcmip test as an option.
Namelist and control changes for SL.
[BFB] for E3SM
[BFB] for all HOMME standalone tests except [non-BFB] baroCamMoistSL.
Tested on skybridge with e3sm_dev against baselines made of next from commit f543616.
The fail is in homme, as expected. The 2 fails in
FAIL ERS.f09_g16_g.MALISIA.sandiatoss3_intel TPUTCOMP Error: Computation time increase > 10 pct from baseline
FAIL ERS.f19_f19.ICLM45.sandiatoss3_intel TPUTCOMP Error: Computation time increase > 10 pct from baseline
are not related to this PR.
Tested with e3sm_dev on skybridge against baselines made of next commit f543616. Got an overall fail in homme as expected, in baroCamMoistSL (checked that it ran; the fails are from cprnc), and partial fails
FAIL ERS.f09_g16_g.MALISIA.sandiatoss3_intel
FAIL ERS.f19_f19.ICLM45.sandiatoss3_intel
that are not related to this PR.
Merged to next.
Az confirmed that efficiency loss messages are ok in this case.
Bring COMPOSE semi-Lagrangian tracer transport code into HOMME and E3SM. [non-BFB] for HOMME suite for 1 test, baroCamMoistSL and [NML] in the same test.
The following groups file changes and explains at a high level what has
changed.
Changes to E3SM model build system to permit SL in theta-l (only).
Nontrivial changes to the standalone build system to use E3SM Kokkos submodule
automatically. A cosmetic issue was changing things to move away from
associating Kokkos specifically with the HOMMEXX work. If
HOMME_ENABLE_COMPOSE=OFF and BUILD_HOMME_PREQX_KOKKOS=OFF, then all the
Kokkos-related code is skipped.
A trivial new target for my own testing. I'll eventually expand on this to make
a convergence test in the standalone test suite.
Trivial code removal:
The new libcompose target. It builds COMPOSE C++ code, which is expensive
relative to Fortran, just once. Then any target that wants SL links against this
library. If HOMME_ENABLE_COMPOSE=OFF, this is not built. Since this is a
separate lib, the files are in share/compose rather than just share.
Fortran interfaces to libcompose. These files must be included in the list of
sources for a target and so are in the share directory. They are fast to
compile, like all short Fortran files. If HOMME_ENABLE_COMPOSE=OFF, code is
appropriately ifdefed out, so it is innocuous to include these in the target's
file list.
Add a hook for a nice COMPOSE testing capability.
For now, use just this test. I plan more tests for future PRs.
Miscellaneous changes:
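Remove some old code for the previous version of SL transport.
Little things to support the new SL code.
dcmip16_mu_s -> dcmip16_mu_s for scalars in the dynamics and
dcmip16_mu_q. This is an orthogonal change we discussed a while ago. It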
permits SL not to do dissipation on tracers in a dcmip test as an option.
This PR should be BFB for E3SM and BFB for all HOMME standalone tests except for
baroCamMoistSL.