Documentation updates; bug fixes for RFMIP clear sky #185

Merged: 204 commits merged on Aug 4, 2022

Conversation

RobertPincus
Member

No description provided.

RobertPincus and others added 30 commits May 12, 2019 14:23
* Missed one file in last commit

* tweaks for CPU compilation

* In OpenACC: allocating types as well as data components in ACC copyins, deletes

* Ignoring things for Luis...

* Missing statement

* Aligning array sizes in kernels with arguments

* Refined argument intents for some kernels

* Workaround for PGI compiler problem with logicals

* Input sanitizing gets its own module

* Yeah, we'll need the sanitizing module too.

* OpenACC-compatible checking for max and min values

* OpenACC value checking working; OLCF makefile doesn't use managed memory

* Logical kind chosen with preprocessor flags
-- Several small bug fixes (argument intent, maximum interpolation indices, thanks to Sebastian Rast)
-- Parameterized checking for out-of-range values (works also on GPU)
-- Continuous integration with Travis (thanks to Valentin Clement)
-- Logical type defaults to Fortran; can be set to use c_bool with -DUSE_CBOOL
-- Internal build system can use environment variables instead of specified files (Makefile.conf etc.) to define compilers and flags and to choose the kernel directory
-- Python scripts to automate running and testing of RFMIP examples
-- Update RFMIP examples to use version 1.2 of atmospheres file
-- End-to-end RFMIP examples on GPU are broken; fixes pending
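
The parameterized out-of-range check listed above can be pictured with a minimal Python sketch. The function name `any_vals_outside` and the bounds are illustrative assumptions, not taken from the library source, which implements its checks in Fortran with OpenACC.

```python
from typing import Iterable


def any_vals_outside(values: Iterable[float], lower: float, upper: float) -> bool:
    """Return True if any value lies outside the closed interval [lower, upper].

    Hypothetical helper: the bounds are parameters, so one routine can
    validate different quantities by passing different limits.
    """
    return any(v < lower or v > upper for v in values)
```

A caller would pass whatever limits make sense for the quantity being validated, for example `any_vals_outside(emissivity, 0.0, 1.0)`.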
Remove nullify statements on declaration of pointers in subroutines to ensure
thread safety for mo_gas_optics_rrtmgp. When pointers are initialized in their
declarations, they implicitly get the save attribute and are treated as static. This
is a problem when it occurs in a threaded region, so this code was NOT
thread-safe before. Removing the `=> NULL()` does not change the behavior of the
code for non-threaded applications, but does ensure thread safety.
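
The hazard described above is state that is initialized once and then persists between calls. Python has no equivalent of Fortran's implicit save attribute, so the following is only an analogy, using a mutable default argument to stand in for a pointer initialized in its declaration. Both function names are hypothetical.

```python
def unsafe_scratch(value, buffer=[]):
    # Analogy only: the default list is created once, when the function is
    # defined, and is shared by every later call (and every thread),
    # much like a variable that silently became static.
    buffer.append(value)
    return list(buffer)


def safe_scratch(value):
    # The state is created fresh on each call, so concurrent callers
    # cannot see or overwrite each other's data.
    buffer = []
    buffer.append(value)
    return list(buffer)
```

Calling `unsafe_scratch` twice shows the second call seeing data left behind by the first, which is the behavior that becomes a race when the calls run in different threads.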
Open coefficient files with read-only rather than read-write access,
because we do NOT want to accidentally write to these files, nor do we
want to require users to have write permission to load these files.

Closes #31, #32.
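
The read-only access described in this commit can be sketched in Python as follows. This is an illustration with an ordinary file and a hypothetical helper name, not the library's own I/O code.

```python
def open_read_only(path):
    """Open a file for reading only (hypothetical helper).

    Mode "rb" never requests write access, so the call succeeds on files
    the user may read but not modify, and the handle cannot be used to
    change the file by accident.
    """
    return open(path, "rb")
```

The returned handle reports `writable()` as false, so an accidental write fails immediately instead of altering the data.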
* Shortwave RFMIP running end-to-end on GPU. Boundary conditions still on CPU.

* Upper boundary condition lives on GPU in LW no-scattering calculation.

* Moved optical props validation in rte_lw(); simplified data movement in gas optics. Source function still sloshing back and forth between host and device.

* Surface emissivity computed on GPU in LW RFMIP example

* Moved transposition of surface Planck source onto GPU, clumsily; LW RFMIP cases now running end-to-end on device.

* RFMIP boundary conditions on GPU; removing async (may add back later)

* Reorder kernels use a single source

* Moving array-zeroing routines into mo_util_array

* Single-source for array utilities

* rte_sw uses array utilities to check validity of boundary conditions

* Adding 1D array-zeroing routine

* Some SW RFMIP boundary conditions on GPU.

* Single-source for fluxes_broadband_kernels

* Removing an unneeded OpenACC data transfer

* Array value checking uses functions in mo_rte_lw; syntactic cleanup

* Correcting malformed Makefile

* Refined copying of one array in SW examples.

* Ben Hillman spots a GPU array being initialized on the CPU. Fixed that.
Contributions from Dmitry Alexeev of Nvidia. Dramatically improves performance of clear-sky RFMIP standalone cases on GPUs. Fixes a PGI compiler bug related to private local variables.

* tiling reorder, could still be improved by about 40%

* using kernels in the mo_util_array, faster and more compact

* constant shouldn't be explicitly allocated

* tile combine_and_reorder, almost full bandwidth now

* process several elements per thread in expensive kernels

* fix interpolation, use gang vector

* add another kernel for optical_depths_minor, only working in a special case (hopefully typical case). about 3x faster.

* avoid atomics in sum_broadband, 2.5x faster

* explicitly make private arrays stay in global memory, fixed possible errors and 8x faster due to collapsed loop

* fuse lw_source_noscat into lw_solver_noscat, faster by about 1ms on Piz Daint

* fixed subscript typo in denom

* fix out-of-bounds issue

* added comments about how the reorder kernels work

* introduce point-wise lw_source_noscat_stencil to express concepts separation while keeping fused loops

* added comments explaining bug workaround for adding routine

* fixed unrolled loop bounds and added comments

* removed unused variable

* added missing data deletion

* remove the default kernel in gas_optical_depths_minor, the optimized kernel can deal with variable number of g-points per band now
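
Several of the commits above mention tiling the reorder kernels. The general idea of a tiled transpose can be sketched as below, assuming a plain two-dimensional array. The function name and tile size are assumptions for illustration, and this is not the implementation contributed in these commits. Pure Python shows only the access pattern; the speedup on a GPU comes from each tile fitting in fast memory.

```python
def transpose_tiled(matrix, tile=4):
    """Transpose a list-of-lists matrix by visiting it in tile x tile blocks."""
    n_rows = len(matrix)
    n_cols = len(matrix[0]) if n_rows else 0
    out = [[None] * n_rows for _ in range(n_cols)]
    # Outer loops pick a tile; inner loops copy the elements inside it.
    for row0 in range(0, n_rows, tile):
        for col0 in range(0, n_cols, tile):
            for i in range(row0, min(row0 + tile, n_rows)):
                for j in range(col0, min(col0 + tile, n_cols)):
                    out[j][i] = matrix[i][j]
    return out
```

The `min(...)` bounds handle array sizes that are not a multiple of the tile size, so edge tiles are simply smaller.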
…PUs during calculations (#53)

Should be more tightly integrated with host models already running on the GPU.
…options and doesn't download reference files automatically.
…es are larger than 1.e-5 (user-configurable)
RobertPincus and others added 28 commits April 10, 2022 21:37
Use environment.yml file to detail Python packages needed for testing and validation/verification, to be obtained from conda.
Changes a single array from assumed-size to assumed-shape.
Move CI file from Owncloud to LDEO; resolution-independent RCEMIP profiles.
More substantive content in the Jekyll pages, RTE kernels in-line documentation complete
… and no2 aren't included in the list of gases in these forcing scenarios but they're needed to initialize the gas concentrations. Other approaches might be more robust.
* fix(doc): upgrade CI environment, use newer version of ruby
* Jekyll pages following Diataxis framework

* Streamlined FORD pages, updates to FORD boilerplate

Co-authored-by: Brad Richardson <everythingfunctional@protonmail.com>