EAM physgrid: high-order, property preserving remap #3141
Conversation
Provide a module, gllfvremap_mod, whose use in dynamics/se is optionally enabled with atmosphere namelist option se_fv_phys_remap_alg=1 (default 0 uses the original remap algorithm), to remap between physics and GLL grids. The remap
* is mass conserving;
* produces tracers that do not violate extrema established before the remap;
* remaps tendencies only, so that the background state is not affected by remap;
* has order of accuracy N, where N is the parameter in the physgrid specification pgN.
[BFB]
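For illustration only, enabling the new remap would typically amount to a single setting in user_nl_cam (a sketch; the namelist group that build-namelist writes it into is not shown here):

  se_fv_phys_remap_alg = 1   ! default 0 selects the original physgrid remap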
…id-ho-pp-remap Conflicts: components/cam/src/dynamics/se/dp_coupling.F90
This branch will not run with the updated, better domain files until PR #3113 is merged.
PR #3113 has been merged.
…id-ho-pp-remap Merge master to bring in PR 3113, which has better domain files for ne30pg2.
@@ -209,6 +211,7 @@ subroutine dyn_init1(fh, NLFileName, dyn_in, dyn_out)
    ! Initialize FV physics grid variables
    if (fv_nphys > 0) then
       call fv_physgrid_init()
       if (se_fv_phys_remap_alg == 1) call gfr_init(par, elem, fv_nphys)
would it make more sense to have this inside fv_physgrid_init()?
@@ -598,6 +600,7 @@ subroutine stepon_final(dyn_in, dyn_out)
    ! Deallocate variables needed for the FV physics grid
    if (fv_nphys > 0) then
       call fv_physgrid_final()
       if (se_fv_phys_remap_alg == 1) call gfr_finish()
Similar to the comment above, would it make sense to move this into fv_physgrid_final()?
Thanks; I'll do both.
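A minimal sketch of the relocation agreed to here (illustrative only; it assumes par, elem, and fv_nphys are accessible inside these routines, and it elides their existing bodies):

subroutine fv_physgrid_init()
  ! ... existing physgrid initialization ...
  ! Initialize the high-order remap only when it is selected.
  if (se_fv_phys_remap_alg == 1) call gfr_init(par, elem, fv_nphys)
end subroutine fv_physgrid_init

subroutine fv_physgrid_final()
  ! ... existing physgrid finalization ...
  ! Tear down the high-order remap if it was initialized.
  if (se_fv_phys_remap_alg == 1) call gfr_finish()
end subroutine fv_physgrid_final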
I'm running this branch on cori and seeing substantial climo differences relative to before I merged master; I'm looking into the cause.
Edit: I strongly suspect the differences are due to the recent update of CLUBB. Here is master now vs master ~1.5 months ago, PRECT ANN, although so far averaged just over year 2. I'll run this out through 6 years, then provide updated data for pg2 vs np4 at this PR's SHA1.
Later: Yes, the differences are due to the CLUBB update. I've updated the physgrid results page with consistent model-vs-model results.
             T_tmp(1:nphys_sq,:,:), uv_tmp(1:nphys_sq,:,:,:), &
             om_tmp(1:nphys_sq,:,:), q_tmp(1:nphys_sq,:,:,:))
     else
        call gfr_dyn_to_fv_phys(par, dom_mt, tl_f, hvcoord, elem, ps_tmp, zs_tmp, &
Should we try to think of a naming scheme that makes it clearer what the two options are? Maybe something like dyn_to_fv_phys_low-order() and dyn_to_fv_phys_high-order()?
Maybe it's moot since there might not be a good reason to keep both in the long run.
There are 5 public API functions that dynamics/se uses. Please feel free to suggest names for each:
  gfr_init, &
  gfr_finish, &
  gfr_dyn_to_fv_phys, &
  gfr_fv_phys_to_dyn, &
  gfr_fv_phys_to_dyn_topo
If you would like me to rename the original functions, too, let me know what names you would like.
Initially, I thought it makes sense to use some indication of "high order" in the names, whether it's "high_order" or "HO", but that also seems a bit too general if a higher order scheme gets implemented later.
Would it be accurate to just call this "2nd order" and use "O2" in the front (or back) of the names instead of "gfr"?
I'll be sure to mention in the added documentation (but see also the latex doc) that the OOA is N for pgN for u,v,T and max(2,N) for Q because of the strict limiter. So this current method has arbitrarily high order in N. But it's true: other methods might come in later. For example, I'm working on OOA > 1 for pg1 right now, and I might propose a boosted OOA for N < 3 later that would require a new halo exchange pattern to be implemented. On the other hand, I think we can expect any new method to be called through this interface, so I think it's safe to use the names you'd like without concern another interface will be introduced.
oh ok, I misunderstood the OOA then. I guess the current name convention is fine then. Maybe it would be better to rename my scheme with something indicating "low order". In any case, let's just keep the current name convention.
what is gfr_ for here?
(G)LL (F)V (r)emap. I tend to give public functions in a module a common prefix like one does in C; see, e.g., the scalable space-filling curve module, sfcmap, and scalable grid initialization, sgi.
              fld(i,k,2) = D(2,1)*v1 + D(2,2)*v2
           end do
        end do
What's going on here? Does this affect anything outside the physgrid mapping routines?
Good eye. This is inert code (i.e., it runs, but it has no effect) that I noticed while working on physgrid. I proposed to Mark and Oksana that I should remove this code chunk in this PR.
DCMIP1 experiments show that pg1 is too coarse, but perhaps this code will be of use at some point. In the meantime, it's unit tested.
* Move gfr_init/finish to fv_physgrid_init/final.
* Reorganize file to group routines more logically; add delimiter between groups.
* Add more comments.
Also remove some 'optional' annotations that are no longer relevant.
     end do

     call dcmip2016_append_measurements(max_w, max_precl, min_ps, tl, hybrid)
  end subroutine dcmip2016_test1_pg_forcing
These tests don't seem important for the main changes of the PR, but I noticed the DSS of precipitation and I'm curious why you would do that? I realize these tests are for stand-alone HOMME, so I'm guessing this has something to do with the output being on the GLL grid?
Right. Homme produces GLL-based output for these DCMIP tests, and precl is one output. So I need to remap it to the GLL grid (which includes DSSing it) for output.
        do i = 1,np
           p = change_coordinates(elem(ie)%spherep(i,j))
           elem(ie)%state%ps_v(i,j,timeidx) = &
                1.0d3*(1 + 0.05*sin(2*p%x+0.5)*sin(p%y+1.5)*sin(3*p%z+2.5))
Might be good to put a comment about where these magic numbers come from. I assume they are from the DCMIP protocol?
I will. They are not. It's just a bunch of C^inf functions with which to do convergence testing.
     do j = 1,np
        do i = 1,np
           p = change_coordinates(elem(ie)%spherep(i,j))
           tend(i,j,:) = 0.25_real_kind*sin(3.2*p%x)*sin(4.2*p%y)*sin(2.3*p%z)
Same as comment above about noting where these parameter values come from.
  !! Test 4.
  ! Expensive test to determine mass conservation quality over many
  ! iterations.
#if 0
Are you going to leave this in? If so, it might be good to change this to a more descriptive CPP variable. Or is this used in other parts of HOMME?
I'll just take it out.
  ! Top-level data type and functions for high-order, shape-preserving FV <->
  ! GLL remap.
  type, private :: GllFvRemap_t
     integer :: nphys, npi
Might be good to explicitly state what "npi" is for, since it's not obvious.
Will do.
The comments are still sparse and overly terse for my taste, but otherwise the code looks good to me. It's nice to see such thorough testing.
  ! Interfaces to support calling inside or outside a horizontally
  ! threaded region.
  interface gfr_dyn_to_fv_phys
     module procedure gfr_dyn_to_fv_phys_hybrid
is _hybrid for threaded? what is _dom_mt for? _mt probably stands for multithreaded?
dom_mt is the name of the structure that contains threading information. _hybrid is the impl that is called when in a threaded region. _dom_mt is the impl that is called when not, and it then sets up a threaded region (using dom_mt) and calls _hybrid.
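A rough sketch of the wrapper pattern just described (argument lists elided and local names illustrative; only the structure matters):

subroutine gfr_dyn_to_fv_phys_dom_mt(par, dom_mt, ...)
  ! Called outside a threaded region: open one, build the per-thread hybrid
  ! object from dom_mt, then delegate to the threaded implementation.
  !$omp parallel private(ithr, nets, nete, hybrid)
  ithr = omp_get_thread_num()
  nets = dom_mt(ithr+1)%start
  nete = dom_mt(ithr+1)%end
  hybrid = hybrid_create(par, ithr, hthreads)
  call gfr_dyn_to_fv_phys_hybrid(hybrid, nets, nete, ...)
  !$omp end parallel
end subroutine gfr_dyn_to_fv_phys_dom_mt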
     call abortmp('gllfvremap_mod: nphys must be <= np')
  end if
  if (qsize == 0) then
     call abortmp('gllfvremap_mod: qsize must be >= 1')
Why is qsize>0 needed?
Well, I suppose it isn't. But a use case for physgrid without any tracers seems contrived. Since I don't have a test for qsize=0, I prefer to abort on this b/c I think it's more useful to clearly say that an anomalous situation has arisen.
     call get_temperature(elem(ie), wr2, hvcoord, nt)
     call calc_p(hvcoord, elem(ie)%state%ps_v(:,:,nt), p)
     call gfr_g2f_scalar(ie, elem(ie)%metdet, p, p_fv)
     wr2 = wr2*(p/p0)**kappa
Why is the Exner pressure used here? Is this code only for theta-l?
No, it works for all dycores. The idea is to remap theta so that remap is adiabatic. theta_v would probably be even better, but I don't want to bring in qv.
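To make the idea concrete, here is a tiny self-contained sketch (not the gllfvremap_mod code; the values and the trivial weighted-average "remap" are made up) of remapping potential temperature instead of temperature so the operation is adiabatic:

program adiabatic_remap_sketch
  implicit none
  integer, parameter :: rk = selected_real_kind(12)
  real(rk), parameter :: p0 = 1.0e5_rk, kappa = 0.2854_rk
  real(rk) :: T_src(2), p_src(2), w(2), theta_src(2), theta_tgt, p_tgt, T_tgt

  T_src = (/ 287.0_rk, 293.0_rk /)   ! source temperatures (K)
  p_src = (/ 9.2e4_rk, 9.6e4_rk /)   ! source pressures (Pa)
  w     = (/ 0.5_rk, 0.5_rk /)       ! remap weights (sum to 1)
  p_tgt = sum(w*p_src)               ! target pressure

  theta_src = T_src*(p0/p_src)**kappa   ! convert T -> theta
  theta_tgt = sum(w*theta_src)          ! remap theta, not T
  T_tgt = theta_tgt*(p_tgt/p0)**kappa   ! convert back: same form as wr2*(p/p0)**kappa above

  print *, 'remapped T =', T_tgt
end program adiabatic_remap_sketch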
  if (nf > 1 .or. .not. gfr%boost_pg1) then
     do ie = nets,nete
        if (gfr%check) wr(:,:,1) = elem(ie)%state%phis
        call limiter_clip_and_sum(np, elem(ie)%spheremp, gfr%qmin(1,1,ie), &
you limit topo?
Yes. This is up for discussion, along with everything else we're thinking about re: topo. But for now I feel it's safest not to introduce new topo extrema. So the topo remap problem is: remap such that the average height is conserved and so that no new extrema are created.
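For intuition, a self-contained sketch of the clip-and-sum idea (illustrative only; not the limiter_clip_and_sum implementation): clip to the pre-remap extrema, then restore the mass change within the remaining headroom so no bound is re-violated.

program clip_and_sum_sketch
  implicit none
  integer, parameter :: rk = selected_real_kind(12), n = 4
  real(rk) :: q(n), w(n), cap(n), qmin, qmax, mass0, dm

  w = (/ 1.0_rk, 2.0_rk, 2.0_rk, 1.0_rk /)   ! quadrature/area weights
  q = (/ 0.9_rk, 1.4_rk, 0.2_rk, 1.1_rk /)   ! field values after remap
  qmin = 0.3_rk; qmax = 1.2_rk               ! extrema established before the remap
  mass0 = sum(w*q)                           ! mass to be conserved

  q = min(qmax, max(qmin, q))                ! 1. clip to the extrema
  dm = mass0 - sum(w*q)                      ! 2. mass change caused by clipping
  if (dm > 0._rk) then
     cap = (qmax - q)*w                      !    headroom below the upper bound
  else
     cap = (q - qmin)*w                      !    headroom above the lower bound
  end if
  if (sum(cap) /= 0._rk) q = q + dm*(cap/w)/sum(cap)   ! 3. restore mass within bounds

  print *, 'mass error =', sum(w*q) - mass0
end program clip_and_sum_sketch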
     end do
  end subroutine run

  subroutine gfr_check_api(hybrid, nets, nete, hvcoord, elem)
What does this routine do?
It runs a sequence of tests that are permitted to use gllfvremap_mod
only through its public API, i.e., the same routines called in the
full atmosphere. This is in contrast to the unit tests in
gllfvremap_mod.F90, which access internal functions. The API tests are
in three parts, as follows (copy-pasted from comments):
!! Test 1.
! Test physgrid API.
! Test that if tendency is 0, then the original field is
! recovered with error eps ~ machine precision.
! OOA for pgN should be min(N, 2). The OOA is limited to 2 because
! of the way the tendencies are created. Instead of setting the FV
! value to the average over the subcell, the sampled value at the
! cell midpoint (ref midpoint mapped to sphere) is used. This
! creates a 2nd-order error. Tests 2 and 3 test whether the remaps
! achieve design OOA, in particular OOA 3 for pg3 u, v, T, phis
! fields.
!! Test 2.
! Test topo routines. For pgN, phis should have OOA min(N,2)
! because the field is limited.
!! Test 3.
! Test FV fields that were not covered in the previous tests. This
! is done by copying them to look like tendencies.
! For pg4, u,v,T should have l2 errors that are eps because pg4
! reconstructs the fields exactly, even with DSS.
! For pg4, q should have l2 errors that converge at OOA >= 2. 2
! is the formal OOA b/c of the limiter. The limiter acts on the
! tendency, which in this case is exactly what is being examined.
! For pgN, N=1,2,3, u,v,T should have OOA N.
! For pgN, q should have OOA min(N,2).
I've been following the conversation a bit and in response to Oksana's (4) above, I'm not sure if this would be a good time to mention issue #2680 about how Exner pressure is calculated in dp_coupling and how it is used elsewhere.
For a test, we can meet, but it is rather simple: see the file cime/config/e3sm/tests.py for a list of tests. _Ln5 in a test name means running 5 time steps, to make it shorter.
@AaronDonahue, I use p0 as defined in CAM (or in standalone Homme) in the calculation of Exner pressure. Does this answer the question?
check_p reports in atm.log:
  gfr> ERROR dp p 0.000000000000000E+000 2.651938505852153E-011
in an ne30pg2 FC5AV1C-L run, meaning dp agrees exactly with dp3d, but these two defs of p agree only to ~10 digits:
  do k = 1,nlev
     p(:,:,k) = hvcoord%hyam(k)*hvcoord%ps0 + hvcoord%hybm(k)*ps
  end do
and
  dp_accum(:,:,1) = hvcoord%hyai(1)*hvcoord%ps0
  do k = 1,nlev
     dp_accum(:,:,2) = dp_accum(:,:,1) + dp3d(:,:,k)
     p(:,:,k) = half*(dp_accum(:,:,1) + dp_accum(:,:,2))
     dp_accum(:,:,1) = dp_accum(:,:,2)
  end do
This is because hybm or hyam do not give exactly the middle of the element. The first def is consistent with what the EAM driver currently does, so I'm switching back to it.
@ambrad, I'm guessing that there is an internal "Exner" calculation used in Homme and that as you stated it is defined correctly based on p0. The issue I referred to is related to how "Exner" is calculated for use by the physics, which is (we believe) incorrect and different. The issue points out that this can lead to confusion when someone attempts to use
But retain omega_p itself, with instructions in comments for what to do when derived%omega_p is removed in favor of omega.
@oksanaguba, I believe I've addressed all your comments:
Thanks for the initial review round.
@ambrad, can you summarize what you guys discussed about mass weighting the momentum fields? I'm curious about what your conclusions were.
The choices are to remap
A nonmandatory constraint is that I think we should do the same thing for each
I ran e3sm_developer on sandiatoss3.
running merged next with developer suite on skybridge now.
Tests are ok ./cs.status.20190909_171401_5e8tyq, tested against
…3141) Adds high-order, property-preserving remap between physics and GLL grids.
Provide a module, gllfvremap_mod, whose use in dynamics/se is optionally enabled with atmosphere namelist option se_fv_phys_remap_alg=1 (default 0 uses the original remap algorithm), to remap between physics and GLL grids. The remap
-- is mass conserving;
-- produces tracers that do not violate extrema established before the remap;
-- remaps tendencies only, so that the background state is not affected by remap;
-- has order of accuracy N, where N is the parameter in the physgrid specification pgN.
[BFB]
merged to next, will need bless for a new test tomorrow.
  integer :: ithr

  ithr = omp_get_thread_num()
  nets = dom_mt(ithr+1)%start
  nete = dom_mt(ithr+1)%end
  hybrid = hybrid_create(par, ithr, hthreads)
Should the access to dom_mt be with ithr and not ithr+1 for zero-length arrays? A seg-fault shows up with SMS_Ln5.ne4pg2_ne4pg2.FC5AV1C-L.anvil_intel.cam-thetahy_pg2 (MPI-only run with 108 tasks for ATM) for tasks 96 and up, e.g.:
106: forrtl: severe (174): SIGSEGV, segmentation fault occurred
106: Image PC Routine Line Source
106: e3sm.exe 0000000002981B51 Unknown Unknown Unknown
106: e3sm.exe 000000000297FC8B Unknown Unknown Unknown
106: e3sm.exe 000000000290F104 Unknown Unknown Unknown
106: e3sm.exe 000000000290EF16 Unknown Unknown Unknown
106: e3sm.exe 000000000288B147 Unknown Unknown Unknown
106: e3sm.exe 0000000002896D40 Unknown Unknown Unknown
106: libpthread-2.12.s 00002B87996807E0 Unknown Unknown Unknown
106: e3sm.exe 000000000127A829 gllfvremap_mod_mp 1769 gllfvremap_mod.F90
106: e3sm.exe 00000000016897FA inidat_mp_read_in 466 inidat.F90
106: e3sm.exe 00000000013CF77D startup_initialco 54 startup_initialconds.F90
106: e3sm.exe 00000000011EB75F inital_mp_cam_ini 42 inital.F90
106: e3sm.exe 0000000000500F40 cam_comp_mp_cam_i 159 cam_comp.F90
106: e3sm.exe 00000000004F7E39 atm_comp_mct_mp_a 312 atm_comp_mct.F90
106: e3sm.exe 000000000043B808 component_mod_mp_ 257 component_mod.F90
106: e3sm.exe 000000000042A861 cime_comp_mod_mp_ 1337 cime_comp_mod.F90
106: e3sm.exe 00000000004388B9 MAIN__ 122 cime_driver.F90
It should definitely be (ithr+1). I actually ran into a very similar issue, but I'm having trouble remembering what it was. I'm gonna look back through some slack conversations with Andrew to try and recall the details.
No. There are two separate issues here. I will need to fix the second.
The first is whether ithr or ithr+1 should be used for indexing in general, given that ithr starts at 0. This depends on how dom_mt is declared. If it's declared as a pointer, then it inherits the index space from the allocation of the pointer. If it's declared as an assumed-shape array, then it does not inherit the index space. This issue comes up enough that I've created a single-file reproducer of the various cases:
module ptr
contains
subroutine ptr1(a)
real, pointer, intent(in) :: a(:) ! pointer attribute means a has index space of allocation
print *,'ptr1',a(0)
end subroutine ptr1
subroutine ptr2(a)
real, intent(in) :: a(:) ! non-pointer attribute means a has default index space
print *,'ptr2',a(1)
end subroutine ptr2
subroutine ptr3(a)
real, intent(in) :: a(4:6) ! non-pointer attribute also allows reindexing
print *,'ptr3',a(4)
end subroutine ptr3
#if 0
subroutine ptr4(a)
real, pointer, intent(in) :: a(4:6) ! not allowed
print *,'ptr4',a(4)
end subroutine ptr4
#endif
end module ptr
program main
use ptr
real, pointer :: a(:)
allocate(a(0:3))
a(0) = 42
a(1) = 2
call ptr1(a)
call ptr2(a)
call ptr3(a)
deallocate(a)
end program main
Thus, ithr+1 is correct in this case.
However, the second issue is a genuine bug. Here, for ne=4, there are 96 elements. Thus, if more than 96 procs are used, then this code is indexing a dom_mt that has 0 length. No index value is correct in this case. I need to fix this code so that if an MPI rank has 0 elements, it doesn't do anything.
I've opened #3241 and will create a PR to fix this in the next few days.
Just curious, wouldn't this declaration work with dom_mt(ithr) access?
type (domain1d_t), pointer, intent(in) :: dom_mt(:)
To be clear, the issue you found relates to having more processes than elements. Declaration of dom_mt is orthogonal to that problem.
Re: declaring dom_mt: As a matter of personal taste, I don't like to declare something as a pointer unless it needs to be, as doing so narrows the scope of application of the subroutine. In this case, taking an assumed-shape array as input provides the greatest generality.
Yes, that works, but another seg-fault occurs at hybrid_create: need the par%dynproc check.
Right, exactly. Unless I'm mistaken, PR #3242 takes care of exactly the issue of including a dynproc check at all necessary spots. As I report in that PR, the 108-proc run for the 96-element case now completes. Thanks again for reporting the issue.
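For reference, a minimal sketch of the guard discussed above (illustrative only; the actual fix is in PR #3242 and its exact placement may differ):

  if (.not. par%dynproc) return      ! rank owns no elements; skip dyn-grid work
  ithr = omp_get_thread_num()
  nets = dom_mt(ithr+1)%start
  nete = dom_mt(ithr+1)%end
  hybrid = hybrid_create(par, ithr, hthreads)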
Adds high-order, property-preserving remap between physics and GLL grids.
Provide a module, gllfvremap_mod, whose use in dynamics/se is optionally enabled
with atmosphere namelist option se_fv_phys_remap_alg=1 (default 0 uses the
original remap algorithm), to remap between physics and GLL grids. The remap
-- is mass conserving;
-- produces tracers that do not violate extrema established before the remap;
-- remaps tendencies only, so that the background state is not affected by remap;
-- has order of accuracy N, where N is the parameter in the physgrid specification pgN.
[BFB]