Documentation updates; bug fixes for RFMIP clear sky #185
Merged
* Missed one file in last commit
* tweaks for CPU compilation
* In OpenACC: allocating types as well as data components in ACC copyins, deletes
* Ignoring things for Luis...
* Missing statement
* Aligning array sizes in kernels with arguments
* Refined argument intents for some kernels
* Workaround for PGI compiler problem with logicals
* Input sanitizing gets its own module
* Yeah, we'll need the sanitizing module too.
* OpenACC-compatible checking for max and min values
* OpenACC value checking working; OLCF makefile doesn't use managed memory
* Logical kind chosen with pre-preprocessor flags
-- Several small bug fixes (argument intent, maximum interpolation indices, thanks to Sebastian Rast)
-- Parameterized checking for out-of-range values (works also on GPU)
-- Continuous integration with Travis (thanks to Valentin Clement)
-- Logical type defaults to Fortran; can be set to use c_bool with -DUSE_CBOOL
-- Internal build system can use environment variables instead of specified files (Makefile.conf etc.) to define compilers, flags, choose kernel directory
-- Python scripts to automate running and testing of RFMIP examples
-- Update RFMIP examples to use version 1.2 of atmospheres file
-- End-to-end RFMIP examples on GPU are broken; fixes pending
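The parameterized out-of-range check mentioned above can be illustrated with a minimal Python sketch. The function names `any_below` and `any_outside_range` are hypothetical and chosen only for this example; the actual implementation lives in the Fortran sources and is not reproduced here. The idea is simply that the bounds are arguments rather than hard-coded constants, so one routine serves every variable that needs validating.

```python
# Illustrative sketch only: names and structure are assumptions,
# not the Fortran routines changed in this pull request.

def any_below(values, lower):
    """Return True if any element of `values` is less than `lower`."""
    return any(v < lower for v in values)


def any_outside_range(values, lower, upper):
    """Return True if any element of `values` lies outside [lower, upper]."""
    return any(v < lower or v > upper for v in values)


if __name__ == "__main__":
    emissivity = [0.95, 0.98, 1.0]
    # Bounds are passed in, so the same check works for any quantity.
    if any_outside_range(emissivity, 0.0, 1.0):
        print("emissivity has values outside [0, 1]")
    else:
        print("emissivity is within bounds")
```

Returning a single logical value rather than the offending indices keeps the check a simple reduction, which is the kind of operation that maps naturally onto a GPU.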
Remove nullify statements on declaration of pointers in subroutines to ensure thread safety for mo_gas_optics_rrtmgp. When pointers are assigned in their declarations, they implicitly get the `save` attribute and are treated as static. This is a problem when it occurs in a threaded region, so this code was NOT thread-safe before. Removing the `=> NULL()` does not change the behavior of the code for non-threaded applications, but does ensure thread safety.
…ript to comparison
* Shortwave RFMIP running end-to-end on GPU. Boundary conditions still on CPU.
* Upper boundary condition lives on GPU in LW no-scattering calculation.
* Moved optical props validation in rte_lw(); simplified data movement in gas optics. Source function still sloshing back and forth between host and device.
* Surface emissivity computed on GPU in LW RFMIP example
* Moved transposition of surface Planck source onto GPU, clumsily; LW RFMIP cases now running end-to-end on device.
* RFMIP boundary conditions on GPU; removing async (may add back later)
* Reorder kernels use a single source
* Moving array-zeroing routines into mo_util_array
* Single-source for array utilities
* rte_sw uses array utilities to check validity of boundary conditions
* Adding 1D array-zeroing routine
* Some SW RFMIP boundary conditions on GPU.
* Single-source for fluxes_broadband_kernels
* Removing an unneeded OpenACC data transfer
* Array value checking uses functions in mo_rte_lw; syntactic cleanup
* Correcting malformed Makefile
* Refined copying of one array in SW examples.
* Ben Hillman spots a GPU array being initialized on the CPU. Fixed that.
Contributions from Dmitry Alexeev from Nvidia. Dramatically improves performance of clear-sky RFMIP standalone cases on GPUs. Fixes compiler bug in PGI related to private local variables.
* tiling reorder, could still be improved by about 40%
* using kernels in the mo_util_array, faster and more compact
* constant shouldn't be explicitly allocated
* tile combine_and_reorder, almost full bandwidth now
* process several elements per thread in expensive kernels
* fix interpolation, use gang vector
* add another kernel for optical_depths_minor, only working in a special case (hopefully typical case). about 3x faster.
* avoid atomics in sum_broadband, 2.5x faster
* explicitly make private arrays stay in global memory, fixed possible errors and 8x faster due to collapsed loop
* fuse lw_source_noscat into lw_solver_noscat, faster by about 1ms on Piz Daint
* fixed subscript typo in denom
* fix out-of-bounds issue
* added comments about how the reorder kernels work
* introduce point-wise lw_source_noscat_stencil to express concepts separation while keeping fused loops
* added comments explaining bug workaround for adding routine
* fixed unrolled loop bounds and added comments
* removed unused variable
* added missing data deletion
* remove the default kernel in gas_optical_depths_minor, the optimized kernel can deal with variable number of g-points per band now
… otherwise no changes.
…PUs during calculations (#53) Should be more tightly integrated with host models already running on the GPU.
…options and doesn't download reference files automatically.
…es are larger than 1.e-5 (user-configurable)
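A threshold test with a user-configurable tolerance defaulting to 1.e-5 might look like the following Python sketch. What is being compared is not stated in the truncated message above, so the sketch assumes two equal-length sequences of numbers; the function names `max_abs_difference` and `differences_exceed` are hypothetical.

```python
# Illustrative sketch only: the compared quantities and all names
# here are assumptions, not taken from the project's scripts.

def max_abs_difference(test, reference):
    """Largest absolute element-wise difference between two sequences."""
    if len(test) != len(reference):
        raise ValueError("sequences must have the same length")
    return max((abs(t - r) for t, r in zip(test, reference)), default=0.0)


def differences_exceed(test, reference, threshold=1.0e-5):
    """Return True if any difference is larger than `threshold`."""
    return max_abs_difference(test, reference) > threshold


if __name__ == "__main__":
    computed = [240.0, 239.5, 101.25]
    expected = [240.0, 239.5, 101.26]
    # The caller can loosen or tighten the tolerance as needed.
    print(differences_exceed(computed, expected))
    print(differences_exceed(computed, expected, threshold=0.1))
```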
…ection was a squash-merge.
…ty more realistic benchmark.
Use environment.yml file to detail Python packages needed for testing and validation/verification, to be obtained from conda.
Changes a single array from assumed-size to assumed-shape.
Move CI file from Owncloud to LDEO; resolution-independent RCEMIP profiles.
More substantive content in the Jekyll pages, RTE kernels in-line documentation complete
… and no2 aren't included in the list of gases in these forcing scenarios but they're needed to initialize the gas concentrations. Other approaches might be more robust.
* fix(doc): upgrade CI environment, use newer version of ruby
* Jekyll pages following Diataxis framework
* Streamlined FORD pages, updates to FORD boilerplate

Co-authored-by: Brad Richardson <everythingfunctional@protonmail.com>