Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MAM4xx calcsize port #72

Merged
merged 91 commits into from
Dec 14, 2022
Merged

MAM4xx calcsize port #72

merged 91 commits into from
Dec 14, 2022

Conversation

odiazib
Copy link
Contributor

@odiazib odiazib commented Dec 5, 2022

This is the MAM4xx port of the calcsize process from e3sm.

In e3sm, this subroutine calculates the aerosol distribution parameters (dgncur_i, dgncur_c, v2ncur_i, v2ncur_c) and also updates the tendencies for the mass mixing ratio (q_aero_i and q_aero_c ) and number mixing ratios (n_mode_i, n_mode_c) for all four modes (accumulation, aitken, coarse, and primary carbon). To port calcsize, we follow the nucleation and gasaerexch implementations in mam4xx. Therefore, calcsize has a top-level compute_tendencies() function that provides the above prognostics and diagnostics quantities to the host model.
For reference, we used the refactored MAM4 code and the ported Fortran code in Haero, as recommend by Balwinder. MAM4xx calcsize builds on both CPUs and NVIDIA GPUs and passes all implemented unit and validation tests, though, vacuously so in some cases that require more data for validation.

We carry out initial high-level validation testing for MAM4xx calcsize (src/validation/calcsize/compute_tendencies.cpp) using Skywalker output from e3sm. We are able to match most of the outputs, except for the particulate diameter of the aitken mode. We will need additional output from e3sm (and validation output for other functions) to find and fix this issue.

Main points to consider in this mam4xx calcsize port:

  • Because of the data structures of tendencies and prognostics, we did not implement the same array structure as is present in MAM4.
    • To be specific, we employ 1d ColumnViews of 4 modes and each mode has 7 species.
    • Each ColumnView has a length of number of vertical levels in an atmospheric column, in contrast to MAM4's 2-rank array with dimensions of number of levels and total number of species, grouped by mode.
  • For transferring aerosols between aitken and accumulation modes, we define four variables:
    1. _n_common_species_ait_accum: number of common species between aitken and accumulation modes.
    2. _noxf_acc2ait: species in accumulation mode that are NOT part of the aitken mode.
    3. _ait_spec_in_acc: index of aitken species that are also in accumulation mode.
    4. _acc_spec_in_ait: index of accumulation species also in aitken mode.
    • These variables replace the mapping used in E3SM, and this change is necessary because of the difference in data structure between MAM4xx and MAM4-E3SM.
  • We added particle diameter (dgncur_i, dgncur_c) and number to volume ratio (v2ncur_c, v2ncur_i) diagnostic variables in aero_conf.hpp. These variables are outputs from calcsize and are needed for verification.
    • Note that, for clarity, interstitial species have been given the new subscript of i, in contrast to MAM4's a subscript (for "advected" species).
  • We moved the computation of the reciprocal of the density (inv_density) to the initialization of calcsize. We use a 4x7 data struct to match the shape of tendencies and prognostics.
  • We added Skywalker validation tests (validation/calcsize) for the following functions: get_relaxed_v2n_limits, compute_dry_volume_k, adjust_num_sizes, update_diameter_and_vol2num, and aitken_accum_exchange. We created the associated yaml inputs and python outputs for these tests in the mam_x_validation repository (currently on our own branch).
    • Currently, these tests only run for one cell (level), and we will update them when we obtain validation outputs from e3sm. The code for these tests is a first draft and we will adapt them accordingly when we obtain the validation data.
    • We plan to submit a PR to mam_x_validation once this PR in mam4xx is finalized.

Fixes #75

src/mam4xx/calcsize.hpp Outdated Show resolved Hide resolved
Copy link
Collaborator

@jeff-cohere jeff-cohere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guys, this looks pretty great so far. I see a lot of FIXMEs that will lead to useful discussion.

@pbosler
Copy link
Contributor

pbosler commented Dec 5, 2022

Great write-up, thanks.

We added particle diameter (dgncur_i, dgncur_c) and number to volume ratio (v2ncur_c, v2ncur_i) diagnostic variables in aero_conf.hpp

Do we have to keep variable names like dgncur_i? Within a process impl, I'm less concerned with this kind of thing; but if we're going to put them into our Prognostics and Diagnostics classes, I'd prefer a more readable descriptor.

@pbosler
Copy link
Contributor

pbosler commented Dec 5, 2022

@jeff-cohere , @singhbalwinder : We see in this PR some common things that have shown up in others already, and will likely continue to show up (below). Thanks, @odiazib and @mjs271 for this work and for making the following discussion much easier with good code and good code comments.

  1. Conversion between diameter and volume. In some places in the fortran, like calcsize, this is done with knowledge of the modal setting, i.e., the conversion uses the standard deviation of the PDF. In other places, like water uptake, the spherical geometry formulas are used directly (without the modal standard deviation). We've got functions in conversions.hpp to hopefully make this consistent in mam4xx, using the modal standard deviation in each case.
  2. Bad constants. I know we're doing a straight refactor because of a deadline, but when bad constants exist for bad purposes, such as dubious handling of floating point arithmetic, I vote for getting rid of them. Bad constants that have physical (rather than numeric) implications likely have to stay.
  3. Variable names. In our main data structures, AeroConfig, Prognostics, and Diagnostics, I'd prefer to keep variable names readable. Within a process impl, it's less important and we can stay closer to the original fortran. This means that we all know/agree on the translation of variables from fortran to mam4xx and can keep abreast of changes to our main data structures as more processes are added... Is that realistic?
  4. Redundant data storage. Within the Atmosphere class, we've made the decision to record only one measure of water vapor content (its dry mixing ratio); when other measures (specific humidity, relative humidity) are needed, we compute them from the dry mixing ratio. We don't store multiple sources of the same information; we use conversions. Should we do the same within mam4xx, for example, with particle size (store either diameter or volume, but not both)? I vote yes. GPUs are supposed to make the computations required by conversions "free" and, more importantly, then we don't have to worry about keeping the two values consistent with each other.

@jeff-cohere
Copy link
Collaborator

I opened #74 so we can discuss names of quantities there. Unfortunately, we don't have enough collective momentum yet to let these kinds of issues get in the way of pull requests.

@jeff-cohere
Copy link
Collaborator

Regarding item 2 (bad constants): I agree that poor handling of floating point arithmetic is lamentable, but our scientist friends are not trained in numerical analysis and we might have trouble explaining departures from MAM4 implementations to them. We can address this on a case-by-case basis, but I'm not personally willing to spend a lot of energy in this direction because we haven't made a lot of progress in the past when we've attempted to do so.

@jeff-cohere
Copy link
Collaborator

@odiazib and @mjs271 , is this PR still a draft, or are you finished with development and ready for a proper review?

@mjs271
Copy link
Contributor

mjs271 commented Dec 8, 2022

I think it's in good shape to go ahead!

@odiazib odiazib marked this pull request as ready for review December 8, 2022 15:55
src/mam4xx/calcsize.hpp Outdated Show resolved Hide resolved
src/mam4xx/calcsize.hpp Outdated Show resolved Hide resolved
Copy link
Collaborator

@jeff-cohere jeff-cohere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is more or less ready to go--nice work, you two!

@pbosler raised several good points in his review, and I think we should actually discuss these things in a group setting because they pertain to several parameterizations. I have created some GitHub Issues to track these conversations. Everyone, please feel free to modify these issues and add new ones to better reflect any points you feel need to be discussed, clarified, etc.

Other reviewers: please take a look at this PR and weigh in with your thoughts.

Copy link
Contributor

@pbosler pbosler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great PR, thanks for all the work, in particular, for being so thorough with validation and testing. Most of my previous comments (on the draft PR) can be deferred to Issues and discussions with the larger group and should not delay this PR. Some things that I would like to be resolved as part of this PR are:

  1. Double-check that dgncur_i and dgncur_c need to be added to Diagnostics; I think they're already there as Diagnostics.dry_geometric_mean_diameter. We may need to update Diagnostics to account for the c/I separation for these diagnostics. If so, let's make another issue and do that in a separate PR since this one is already large.
  2. There is some if logic implemented by a multiplication by one million that I'd like to take out and replace with an actual if or bool-dependent statement.
  3. There are redundant arrays (e.g., volume and diameter) and functions (mode averaged-size) that should be replaced with already-implemented functions from conversions and mode_wet/dry_particle_size.
  4. There are some code comments that refer to the fortran mam4 constant indexing convention, which we've replaced in mam4xx by enums. Please double-check that we're using the enums in the code and not the old fortran indices, and update the comments if necessary.

src/mam4xx/calcsize.hpp Outdated Show resolved Hide resolved
src/mam4xx/calcsize.hpp Show resolved Hide resolved
src/mam4xx/calcsize.hpp Show resolved Hide resolved
src/mam4xx/calcsize.hpp Show resolved Hide resolved
src/mam4xx/calcsize.hpp Show resolved Hide resolved
src/tests/mam4_calcsize_unit_tests.cpp Show resolved Hide resolved
src/mam4xx/aero_config.hpp Outdated Show resolved Hide resolved
src/mam4xx/aero_config.hpp Outdated Show resolved Hide resolved
src/mam4xx/calcsize.hpp Show resolved Hide resolved
src/mam4xx/calcsize.hpp Outdated Show resolved Hide resolved
@odiazib
Copy link
Contributor Author

odiazib commented Dec 10, 2022

I rebase collab/ccalcsize branch against main branch because of merging issues.

@jeff-cohere
Copy link
Collaborator

Thanks for the comments, @pbosler and @singhbalwinder . I have left comments that indicate what I think should be changed and what we should defer. @mjs271 , if you can address these comments, we can do a final review and get this PR merged.

I think it's crucial for us to keep our momentum. Down the line when we have more code in main, we can polish as needed.

Copy link
Contributor

@pbosler pbosler left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It sounds like @mjs271 will create several Issues to track the outstanding comments. It's ok to merge once we have those.

src/mam4xx/calcsize.hpp Show resolved Hide resolved
src/mam4xx/calcsize.hpp Show resolved Hide resolved
const Real &init_num_i, const Real &init_num_c,
const Real &dt, const Real &v2nmin, const Real &v2nmax,
const Real &v2nminrl, const Real &v2nmaxrl,
const Real &adj_tscale_inv, const Real &close_to_one,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Saving this for a separate issue/PR is ok with me.

src/mam4xx/calcsize.hpp Show resolved Hide resolved
Copy link
Contributor

@pressel pressel left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This all looks good to me too. Thanks for all this work @mjs271 and @odiazib, as well as the great reviews from @jeff-cohere and @pbosler.

@mjs271
Copy link
Contributor

mjs271 commented Dec 13, 2022

@jeff-cohere @pbosler Thanks marking those issue candidates for me. I'm at AGU this week, but I'll knock them down as I snag some free moments. Feel free to merge, though, and I can refer to the closed PR.

@jeff-cohere
Copy link
Collaborator

I've created issues for everything we should revisit for this PR, so I'm merging it.

@jeff-cohere jeff-cohere merged commit dd8da81 into main Dec 14, 2022
@jeff-cohere jeff-cohere deleted the collab/calcsize branch December 14, 2022 16:33
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Single Precision Overflow in calcsize Unit Test
6 participants