Add COMPOSE to EAM #2896

Merged
merged 78 commits into from
May 15, 2019

@ambrad ambrad commented May 1, 2019

Bring COMPOSE semi-Lagrangian tracer transport code into HOMME and E3SM. [non-BFB] for HOMME suite for 1 test, baroCamMoistSL and [NML] in the same test.

The following groups file changes and explains at a high level what has
changed.

    41 files changed, 11911 insertions(+), 1904 deletions(-)

Changes to E3SM model build system to permit SL in theta-l (only).

    cime/scripts/lib/CIME/build.py                              |    4 +-
    components/cam/bld/configure                                |    3 +-
    components/cam/bld/namelist_files/namelist_defaults_cam.xml |    5 +
    components/cam/bld/namelist_files/namelist_definition.xml   |   32 +-

Nontrivial changes to the standalone build system to use the E3SM Kokkos
submodule automatically. A cosmetic change moves away from associating Kokkos
specifically with the HOMMEXX work. If HOMME_ENABLE_COMPOSE=OFF and
BUILD_HOMME_PREQX_KOKKOS=OFF, then all the Kokkos-related code is skipped.

    components/homme/CMakeLists.txt                             |   26 +-
    components/homme/cmake/HommeMacros.cmake                    |    4 +-
    components/homme/cmake/Kokkos.cmake                         |   40 +-
    components/homme/test/unit_tests/CMakeLists.txt             |    2 -

A trivial new target for my own testing. I'll eventually expand on this to make
a convergence test in the standalone test suite.

    components/homme/test_execs/CMakeLists.txt                  |    1 +
    components/homme/test_execs/stt/CMakeLists.txt              |    5 +

Trivial code removal:

    components/homme/src/preqx/sl_advection.F90                 |  912 -----
    components/homme/src/preqx_kokkos/sl_advection.F90          |  912 -----

The new libcompose target. It builds COMPOSE C++ code, which is expensive
relative to Fortran, just once. Then any target that wants SL links against this
library. If HOMME_ENABLE_COMPOSE=OFF, this is not built. Since this is a
separate lib, the files are in share/compose rather than just share.

    components/homme/src/share/compose/CMakeLists.txt           |    9 +
    components/homme/src/share/compose/compose_cedr.cpp         | 5842 ++++++++++++++++++++++++++++++++
    components/homme/src/share/compose/compose_slmm.cpp         | 3911 +++++++++++++++++++++
    components/homme/src/share/compose/compose_test.cpp         |  609 ++++
    components/homme/src/share/compose/compose_test.hpp         |    6 +

Fortran interfaces to libcompose. These files must be included in the list of
sources for a target and so are in the share directory. They are fast to
compile, like all short Fortran files. If HOMME_ENABLE_COMPOSE=OFF, code is
appropriately ifdefed out, so it is innocuous to include these in the target's
file list.

    components/homme/src/share/sl_advection.F90                 |  728 ++++
    components/homme/src/share/compose_mod.F90                  |  278 ++
    components/homme/src/share/compose_test_mod.F90             |  230 ++

Add a hook for a nice COMPOSE testing capability.

    components/homme/src/prim_main.F90                          |    2 +

For now, use just this test. I plan more tests for future PRs.

    components/homme/test/reg_test/namelists/baroCamMoist-SL.nl |    4 +-

Miscellaneous changes:

  • Remove some old code for the previous version of SL transport. Most of the (-)
    changes are of this sort.
  • Little things to support the new SL code. The (+) changes are mostly of this
    sort.
  • dcmip16_mu_s -> dcmip16_mu_s for scalars in the dynamics and
    dcmip16_mu_q for tracers. This is an orthogonal change we discussed a while
    ago. It permits SL not to do dissipation on tracers in a dcmip test as an
    option.
  • Namelist and control changes for SL.
    components/homme/src/pese/prim_driver_mod.F90               |    1 -
    components/homme/src/preqx/prim_advection_mod.F90           |    8 +-
    components/homme/src/preqx/share/prim_state_mod.F90         |   24 +-
    components/homme/src/preqx_kokkos/CMakeLists.txt            |    5 +-
    components/homme/src/preqx_kokkos/prim_advection_mod.F90    |    6 +-
    components/homme/src/preqx_kokkos/prim_driver_mod.F90       |    4 +-
    components/homme/src/share/bndry_mod_base.F90               |    2 +
    components/homme/src/share/coordinate_systems_mod.F90       |    4 +-
    components/homme/src/share/cube_mod.F90                     |    2 -
    components/homme/src/share/edgetype_mod.F90                 |    1 +
    components/homme/src/share/global_norms_mod.F90             |    4 +-
    components/homme/src/share/namelist_mod.F90                 |   44 +-
    components/homme/src/share/prim_advection_base.F90          |   10 +-
    components/homme/src/share/prim_driver_base.F90             |   55 +-
    components/homme/src/share/scalable_grid_init_mod.F90       |    1 +
    components/homme/src/test_src/asp_tests.F90                 |    3 +-
    components/homme/src/theta-l/prim_advection_mod.F90         |   45 +-
    components/homme/src/theta/prim_advection_mod.F90           |    2 +-
    components/homme/src/share/control_mod.F90                  |   29 +-

This PR should be BFB for E3SM and BFB for all HOMME standalone tests except for
baroCamMoistSL.

ambrad and others added 30 commits August 18, 2017 12:37
ir.cpp implements a locally discrete mass conservative,
multi-moment, incremental remap transport algorithm.

qlt.cpp implements the QLT algorithm to achieve tracer
consistency, cell mean boundedness for shape preservation,
and (if needed, but not for our incremental remap algorithm)
discrete mass conservation. We use it for tracer consistency and
shape preservation.

sl_advection.F90 integrates these two components.

Other changes are to the namelist, and some CMakeLists.txt changes.
This new rule is from
     Taylor, Mark A., Beth A. Wingate, and Len P. Bos. "A cardinal function
     algorithm for computing multivariate quadrature points." SIAM Journal on
     Numerical Analysis 45.1 (2007): 193-205.

Tighten tolerance on calc_sphere_to_ref.

Both of these are to tighten discrete mass conservation to 1e-14 to
1e-15. Previously, 1e-13 or sometimes even 1e-12 were possible.
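
As an illustration of what a conservation figure such as 1e-14 measures, here is a minimal sketch of a relative global mass change. The function name `relative_mass_change` and the use of plain per-cell masses are assumptions made for this example; this is not the check implemented in HOMME or COMPOSE.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the PR): relative change in total mass
// between two states, |sum(after) - sum(before)| / |sum(before)|.
double relative_mass_change(const std::vector<double>& mass_before,
                            const std::vector<double>& mass_after) {
  double total_before = 0.0, total_after = 0.0;
  for (std::size_t i = 0; i < mass_before.size(); ++i) total_before += mass_before[i];
  for (std::size_t i = 0; i < mass_after.size(); ++i) total_after += mass_after[i];
  assert(total_before != 0.0);  // a relative measure needs a nonzero reference
  return std::fabs(total_after - total_before) / std::fabs(total_before);
}
```

A transport step that only moves mass between cells should return a value near machine precision; a larger value indicates mass was created or lost.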
Conflicts:
	components/homme/CMakeLists.txt
	components/homme/src/preqx/CMakeLists.txt
	components/homme/src/preqx/sl_advection.F90
	components/homme/src/share/cube_mod.F90
	components/homme/src/share/prim_driver_base.F90
Add CAAS to correspond to limiter_option = 9.
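
For readers unfamiliar with CAAS (clip and assured sum), the following is a minimal sketch of the idea on a single array of cell masses: clip each value to its bounds, then spread the remaining mass discrepancy over cells in proportion to the room each has left. The name `caas_sketch` and the mass-only formulation are assumptions for illustration; this is not the CEDR implementation.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch, not COMPOSE/CEDR code. On success, every x[i] lies in
// [lo[i], hi[i]] and sum(x) equals target_mass up to roundoff. Returns false,
// leaving x untouched, if the bounds make target_mass unreachable.
bool caas_sketch(std::vector<double>& x, const std::vector<double>& lo,
                 const std::vector<double>& hi, double target_mass) {
  assert(x.size() == lo.size() && x.size() == hi.size());
  double sum_lo = 0.0, sum_hi = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    sum_lo += lo[i];
    sum_hi += hi[i];
  }
  if (target_mass < sum_lo || target_mass > sum_hi) return false;

  // Step 1: clip each value to its bounds.
  double sum_x = 0.0;
  for (std::size_t i = 0; i < x.size(); ++i) {
    x[i] = std::min(std::max(x[i], lo[i]), hi[i]);
    sum_x += x[i];
  }

  // Step 2: distribute the discrepancy in proportion to remaining capacity.
  const double d = target_mass - sum_x;
  if (d > 0.0) {
    const double capacity = sum_hi - sum_x;  // >= d > 0 by feasibility
    for (std::size_t i = 0; i < x.size(); ++i) x[i] += d * (hi[i] - x[i]) / capacity;
  } else if (d < 0.0) {
    const double capacity = sum_x - sum_lo;  // >= -d > 0 by feasibility
    for (std::size_t i = 0; i < x.size(); ++i) x[i] += d * (x[i] - lo[i]) / capacity;
  }
  return true;
}
```

In a parallel setting the sums would be global reductions, which is consistent with the later remark that global CAAS is useful as an MPI benchmark.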

CEDR is a library that includes QLT. For these early Homme integration tests, I
have a script to generate a single-file version of QLT. Update qlt.cpp with the
latest modifications to CEDR.
This commit propagates the HORIZ and COLUMN threading schemes to
sl_advection.F90, ir.cpp, and qlt.cpp.
1. Rename the monolithic files as follows:
    ir.cpp -> slmm.cpp
    qlt.cpp -> cedr.cpp
Modify the compose_mod routine prefixes similarly. (Eventually we'll use the
COMPOSE library installation, but for now, integration is much easier with
monolithic files I create from the COMPOSE code base.)

2. Rework semi-Lagrangian transport options:

  ! Tracer transport algorithm type:
  !     0  spectral-element Eulerian      [default]
  !     1  classical semi-Lagrangian (SL)
  !     2  cell-integrated remap SL

  ! Constrained density reconstructor for SL property preservation; not used if
  ! transport_alg = 0:
  !     0  none
  !     1  Cobra
  !     2  QLT    [default]
  !     3  CAAS

  ! If true, check mass conservation and shape preservation. The second
  ! implicitly checks tracer consistency.
  logical, public  :: semi_lagrange_cdr_check = .false.

3. Add global CAAS as an option. Global CAAS is as simple as a CDR gets, so it's
good to have as an MPI benchmark if nothing else.

4. Implement super levels. The idea is to run a CDR over 8 adjacent vertical
levels, a super level. (I could expose this number as an option, but for now
it's hardcoded in an enum to 8.) This has the immediate effect of reducing comm
volume by 8x. Later, it will also make adaptive QLT potentially have to do no
comm at all besides the adaptation handshake of one scalar per tree edge.
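
The data reduction described in item 4 can be pictured with a small sketch that sums per-level masses into groups of 8 adjacent levels. The function name `aggregate_super_levels` and the treatment of a short final group are assumptions for this example; the commit message does not say how COMPOSE forms or uses the aggregated quantities.

```cpp
#include <cstddef>
#include <vector>

// The commit message says the super-level size is hardcoded to 8.
constexpr std::size_t kSuperLevelSize = 8;

// Hypothetical sketch: sum per-level values into super levels of
// kSuperLevelSize adjacent levels. A final, shorter group is kept as is.
std::vector<double> aggregate_super_levels(const std::vector<double>& level_mass) {
  const std::size_t nsuper =
      (level_mass.size() + kSuperLevelSize - 1) / kSuperLevelSize;
  std::vector<double> super_mass(nsuper, 0.0);
  for (std::size_t k = 0; k < level_mass.size(); ++k)
    super_mass[k / kSuperLevelSize] += level_mass[k];
  return super_mass;
}
```

With 72 levels, for instance, a quantity exchanged per super level rather than per level shrinks from 72 values to 9, which matches the 8x reduction in communication volume mentioned above.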
This is possible because the integral is over the reference Eulerian target
element. There is still more optimization to do on the remap part of the code,
but this was the major one to do.
Pure p=3 GLL interpolation leads to an unstable classical semi-Lagrangian (CSL)
method. Introduce stabilized variants.

Horizontal and vertical threading work for all CSL methods: transport_alg = 1,
10, 11, 12. 1 is the original Fortran CSL method. 10 mimics this method in
C++. 11 and 12 are two stabilized methods. 12 is more accurate than 11.

Also fix an over-eager exception in CEDR/QLT.
1. Make get_src_cell robust to FP. This routine infers topology from geometry,
which is fragile in floating point. I've added a backup procedure if the first
search fails.

2. Limiter 8 was being used even when limiter 9 was requested when the selected
transport algorithm was one of the new interpolated semi-Lagrangian schemes. Fix
this bug.

3. Improve vectorization in slmm_csl.

4. Start comprehensive standalone testing, but nothing other than an additional
unit test is available yet.
The standalone tester runs the nondivergent flow problem with various tracer
initial conditions and reports the L2 error.
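
A minimal sketch of one common way to compute such an error, assuming a relative, quadrature-weighted L2 norm. The function name `relative_l2_error` and the choice of a relative, weighted norm are assumptions; the commit message does not specify the exact definition the tester uses.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: sqrt(sum_i w_i (computed_i - exact_i)^2 /
//                           sum_i w_i exact_i^2),
// where w_i are quadrature (area) weights.
double relative_l2_error(const std::vector<double>& computed,
                         const std::vector<double>& exact,
                         const std::vector<double>& weights) {
  assert(computed.size() == exact.size() && exact.size() == weights.size());
  double num = 0.0, den = 0.0;
  for (std::size_t i = 0; i < exact.size(); ++i) {
    const double diff = computed[i] - exact[i];
    num += weights[i] * diff * diff;
    den += weights[i] * exact[i] * exact[i];
  }
  assert(den > 0.0);  // the reference solution must not be identically zero
  return std::sqrt(num / den);
}
```

For a convergence test, this value would be recorded at several resolutions and the rate of decrease compared with the expected order of accuracy.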
This is the first of several phases for taking advantage of SL transport in MPI
communication. In this phase, the halo is still limited to size 1, but for the
interpolation SL (ISL) algorithm only, only the data determined to be necessary
based on the flow field are communicated.
- #pragma ivdep helps in calc_q
- need waitall in comm_lid_on_rank in initialization to be sure send buffer
  remains valid
- remove some diagnostic code
ALE_departure_from_gll does not normalize the output point to the sphere. This
is mathematically OK for intersection and departure-cell-determination
calculations, and even for ref <-> sphere calculations. But it's an efficiency
loss for the final one because calc_sphere_to_ref in slmm.cpp was iterating too
many times because of the norm(q,2) != 1. This commit speeds up that calculation
by almost 2x.
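
The fix described above amounts to scaling the point to unit 2-norm before the sphere-to-reference calculation. A minimal sketch of that normalization follows; the function name `normalize_to_unit_sphere` is hypothetical and this is not the code in slmm.cpp.

```cpp
#include <array>
#include <cassert>
#include <cmath>

// Hypothetical helper: scale a 3D point so that its 2-norm is 1, placing it
// on the unit sphere.
std::array<double, 3> normalize_to_unit_sphere(const std::array<double, 3>& p) {
  const double norm = std::sqrt(p[0] * p[0] + p[1] * p[1] + p[2] * p[2]);
  assert(norm > 0.0);  // the origin has no direction to project along
  return {p[0] / norm, p[1] / norm, p[2] / norm};
}
```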

Thanks to Oksana for asking about this.

Oksana also pointed out the Kokkos warnings issue; warnings get printed on every
rank. Fixed in this commit.
Conflicts:
	components/homme/src/preqx/CMakeLists.txt
	components/homme/src/preqx/sl_advection.F90
	components/homme/src/share/prim_driver_base.F90
	components/homme/src/test_src/baroclinic_inst_mod.F90
Although SL tracer advection very likely does not need hyperviscosity, HV's
application in Eulerian tracer advection has positive effects beyond just
controlling the advection itself. It follows that any other advection algorithm
must permit HV. It's possible we'll figure out how we can avoid it, but for now
permit it.
The DCMIP tests suggest that HV is needed for qv only. This makes sense, as only
qv couples to the dynamics at every dynamical time step; the other tracers
couple directly to physics at the physics time step, and indirectly to dynamics
again only at the physics time step.

Applying HV to just qv is much faster than applying it to all tracers. It also
retains the high resolution SL tracer advection offers in any tracer not
dissipated by HV.

By default, semi_lagrange_hv_q_all = .false.
CEDR is the last module to see q, qdp, and dp3d before control leaves SL
tracers. Add NaN and Inf checks on q, qdp, and dp3d to the current
property-preservation check routine.
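
A minimal sketch of a NaN/Inf scan over one field, of the kind that could be applied to q, qdp, and dp3d in turn. The function name `first_nonfinite` and the flat-array layout are assumptions for illustration; this is not the check routine added in CEDR.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: return the index of the first NaN or Inf entry in a
// field, or -1 if every entry is finite.
std::ptrdiff_t first_nonfinite(const std::vector<double>& field) {
  for (std::size_t i = 0; i < field.size(); ++i)
    if (!std::isfinite(field[i])) return static_cast<std::ptrdiff_t>(i);
  return -1;
}
```

Returning the index rather than a boolean makes it possible to report where the bad value sits, which helps when tracking a failure back to a particular element or level.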
Add local mesh, departure point, and max v data to output.
If semi_lagrange_nearest_point_lev is set to >= 1, then in any lev <= this
parameter, the SL advection method is permitted to use an (approximate) nearest
point to the target cell's halo if the departure point is outside of the halo.

Fix a bug in the get_src_cell trial > 1 backup calculation.

Fix output bugs in the get_src_cell failure message.

Write unit tests for the nearest-point calculations.

Update prim_main to hand off to the SL convergence tester for all available
convergence-tested SL algorithms.

Fix MPI type in the SL-related control-mod values. This bug hasn't affected us,
but the type was clearly wrong, logical rather than integer.
Prints rss = min, max, mean resident memory.
The SL MPI pattern for ISL was leaking memory via MPI requests from an isend
that were neither explicitly freed nor tested/waited on.

Also refactor some of the old Cobra code into its own routine, isolating two
large arrays to just that routine. That should reduce the memory footprint of SL
w/o changing its functionality; the caveat is that perhaps those arrays, since
the memory wasn't touched, never really impacted the actual memory footprint.
Default: False
</entry>

<entry id="semi_lagrange_nearest_point_lev" type="integer" category="se"
Contributor

Took me some time to process why from the top, but then I remembered the fastest winds are at the top.

add_definitions(-DHOMME_ENABLE_COMPOSE)
set (COMPOSE_LIBRARY "compose")
add_subdirectory(src/share/compose)
set (USE_KOKKOS_KERNELS ON)
Contributor

Do we now use HOMME_USE_KOKKOS and USE_KOKKOS_KERNELS? Do not remember which one is for what.

Member Author

HOMMEXX uses USE_KOKKOS_KERNELS. I want to change that to HOMME_ENABLE_KOKKOS b/c KokkosKernels is itself a thing, so we shouldn't have a definition suggesting that KokkosKernels is being used. But that's a separate issue, one to discuss with the HOMMEXX team. In this PR, I just follow what HOMMEXX does to enable Kokkos in HOMME. In summary, HOMME_USE_KOKKOS and HOMME_ENABLE_KOKKOS are not definitions in master, nor are they introduced here.

@@ -14,8 +14,30 @@ module control_mod
character(len=MAX_STRING_LEN) , public :: integration ! time integration (explicit, or full imp)

! experimental option for preqx model:
logical, public :: use_semi_lagrange_transport = .false.
logical, public :: use_semi_lagrange_transport_local_conservation = .false.
! Tracer transport algorithm type:
Contributor

awesome comments

@oksanaguba oksanaguba added the non-BFB PR makes roundoff changes to answers. label May 7, 2019
@rljacob rljacob assigned mt5555 and oksanaguba and unassigned mt5555 May 7, 2019
@oksanaguba oksanaguba changed the title COMPOSE semi-Lagrangian tracer transport Add COMPOSE semi-Lagrangian tracer transport May 8, 2019
@oksanaguba
Contributor

@rljacob I am practicing merging into next. The body description is long here, do you want it anyway in the merge message? My understanding is that is required https://acme-climate.atlassian.net/wiki/spaces/ED/pages/16253965/Commit+message+template . thanks!

@rljacob
Member

rljacob commented May 9, 2019

Yes that is pretty long. You could make it shorter by removing the lists of files that are one-per-line and replacing it with a comma separated list of filenames without the full path or the line change counts.

@oksanaguba
Contributor

@rljacob could i just keep first 2 sentences and then say 'for description see the webpage'?

@rljacob
Member

rljacob commented May 9, 2019

No. Just summarize it. You could just remove all the source code pointers.

@rljacob
Member

rljacob commented May 9, 2019

Like this:

Bring COMPOSE semi-Lagrangian tracer transport code into HOMME and E3SM. [non-BFB] for HOMME suite for 1 test, baroCamMoistSL and [NML] in the same test. Includes the following:

  • Changes to E3SM model build system to permit SL in theta-l (only).
  • Nontrivial changes to the HOMME standalone build system to use E3SM Kokkos submodule
    automatically. A cosmetic issue was changing things to move away from
    associating Kokkos specifically with the HOMMEXX work. If
    HOMME_ENABLE_COMPOSE=OFF and BUILD_HOMME_PREQX_KOKKOS=OFF, then all the
    Kokkos-related code is skipped.
  • A trivial new target for testing. Will eventually expand to make
    a convergence test in the standalone test suite.
  • removal of old sl_advection.F90
  • Add the new libcompose target. It builds COMPOSE C++ code, which is expensive
    relative to Fortran, just once. Then any target that wants SL links against this
    library. If HOMME_ENABLE_COMPOSE=OFF, this is not built. Since this is a
    separate lib, the files are in share/compose rather than just share.
  • Add Fortran interfaces to libcompose. These files must be included in the list of
    sources for a target and so are in the share directory. They are fast to
    compile, like all short Fortran files. If HOMME_ENABLE_COMPOSE=OFF, code is
    appropriately ifdefed out, so it is innocuous to include these in the target's
    file list.
  • Add a hook for a nice COMPOSE testing capability and one test.

Miscellaneous changes:

  • Remove some old code for the previous version of SL transport.
  • Little things to support the new SL code.
  • dcmip16_mu_s -> dcmip16_mu_s for scalars in the dynamics and
    dcmip16_mu_q. This is an orthogonal change we discussed a while ago. It
    permits SL not to do dissipation on tracers in a dcmip test as an option.
  • Namelist and control changes for SL.

[BFB] for E3SM
[BFB] for all HOMME standalone tests except
[non-BFB] baroCamMoistSL.

…ose-theta-l

Conflicts:
	components/homme/CMakeLists.txt
	components/homme/src/preqx_kokkos/CMakeLists.txt
@ambrad
Member Author

ambrad commented May 9, 2019

I merged master again to account for PR #2831, which I knew would create a few somewhat nontrivial merge conflicts in some HOMME CMake code.

oksanaguba added a commit that referenced this pull request May 10, 2019
Bring COMPOSE semi-Lagrangian tracer transport code into HOMME and E3SM.
[non-BFB] for HOMME suite for 1 test, baroCamMoistSL and [NML] in the same test.
Includes the following:

Changes to E3SM model build system to permit SL in theta-l (only).
Nontrivial changes to the HOMME standalone build system to use E3SM Kokkos submodule
automatically. A cosmetic issue was changing things to move away from
associating Kokkos specifically with the HOMMEXX work. If
HOMME_ENABLE_COMPOSE=OFF and BUILD_HOMME_PREQX_KOKKOS=OFF, then all the
Kokkos-related code is skipped.
A trivial new target for testing. Will eventually expand to make
a convergence test in the standalone test suite.
removal of old sl_advection.F90
Add the new libcompose target. It builds COMPOSE C++ code, which is expensive
relative to Fortran, just once. Then any target that wants SL links against this
library. If HOMME_ENABLE_COMPOSE=OFF, this is not built. Since this is a
separate lib, the files are in share/compose rather than just share.
Add Fortran interfaces to libcompose. These files must be included in the list of
sources for a target and so are in the share directory. They are fast to
compile, like all short Fortran files. If HOMME_ENABLE_COMPOSE=OFF, code is
appropriately ifdefed out, so it is innocuous to include these in the target's
file list.
Add a hook for a nice COMPOSE testing capability and one test.
Miscellaneous changes:

Remove some old code for the previous version of SL transport.
Little things to support the new SL code.
dcmip16_mu_s -> dcmip16_mu_s for scalars in the dynamics and
dcmip16_mu_q. This is an orthogonal change we discussed a while ago. It
permits SL not to do dissipation on tracers in a dcmip test as an option.
Namelist and control changes for SL.
[BFB] for E3SM
[BFB] for all HOMME standalone tests except
[non-BFB] baroCamMoistSL.

Tested on skybridge with e3sm_dev against baselines made of next from commit
f543616 . fail is in homme as expected, 2 fails in

FAIL ERS.f09_g16_g.MALISIA.sandiatoss3_intel
TPUTCOMP Error: Computation time increase > 10 pct from baseline

FAIL ERS.f19_f19.ICLM45.sandiatoss3_intel
TPUTCOMP Error: Computation time increase > 10 pct from baseline

are not related to this PR.
@oksanaguba
Contributor

Tested with e3sm_dev on skybridge against baselines made of next commit f543616 , got overall fail in homme as expected, in baroCamMoistSL (checked that it ran, fails are from cprnc) and partial fails:

FAIL ERS.f09_g16_g.MALISIA.sandiatoss3_intel
TPUTCOMP Error: Computation time increase > 10 pct from baseline

FAIL ERS.f19_f19.ICLM45.sandiatoss3_intel
TPUTCOMP Error: Computation time increase > 10 pct from baseline

that are not related to this PR.

@oksanaguba
Contributor

Merged to next.

@oksanaguba
Contributor

Az confirmed that efficiency loss messages are ok in this case.

@oksanaguba oksanaguba changed the title Add COMPOSE semi-Lagrangian tracer transport Merge branch 'ambrad/ambrad/homme/compose-theta-l' into master (PR #2896) May 15, 2019
@oksanaguba oksanaguba changed the title Merge branch 'ambrad/ambrad/homme/compose-theta-l' into master (PR #2896) Add COMPOSE to EAM May 15, 2019
@oksanaguba oksanaguba merged commit a282f13 into E3SM-Project:master May 15, 2019
oksanaguba added a commit that referenced this pull request May 15, 2019
@ambrad
Member Author

ambrad commented Jun 17, 2019

Labels
Atmosphere HOMME non-BFB PR makes roundoff changes to answers.

6 participants