Releases: GenASiS/GenASiS_Basics
GenASiS_Basics-v4.0
Version 4 of GenASiS Basics includes a name change and additions to functionality, including the facilitation of direct communication between GPUs.
A compute capsule illustrating a build environment with a reproducible run may be found on codeocean.com:
https://codeocean.com/capsule/0242195/tree (DOI: 10.24433/CO.3985995.v1)
A New Version Announcement describing the changes in this release in more detail has been published in Computer Physics Communications:
https://www.sciencedirect.com/science/article/abs/pii/S0010465522002247 (DOI: 10.1016/j.cpc.2022.108505)
GenASiS_Basics-v3.1
This revision updates the previous release (v3.0) and is the version of the code described in the published paper (also available as arXiv:1812.07977).
GenASiS_Basics-v3.0
This revision, Version 3 of Basics, includes a significant name change, some minor additions to functionality, and one major addition: infrastructure facilitating the offloading of computational kernels to devices such as GPUs.
This version includes bug fixes for the RiemannProblem example problem in release v2.1.
The following two papers describe this release:
arXiv:1812.07977
arXiv:1507.02506v3
GenASiS_Basics-v2.1
This version includes work with OpenMP directives to target hardware accelerators (GPUs) on Summit, a newly deployed supercomputer at the Oak Ridge Leadership Computing Facility (OLCF), demonstrating simplified access to GPU devices and useful speedup on a sample fluid dynamics problem, RiemannProblem.
At a lower level, we use the capabilities of Fortran 2003 for C interoperability to provide wrappers to the OpenMP device memory runtime library routines (currently available only in C). At a higher level, we use C interoperability and Fortran 2003 type-bound procedures to modify our workhorse class for data storage to include members and methods that significantly streamline the persistent allocation of and on-demand association to GPU memory. Where the rubber meets the road, users offload computational kernels with OpenMP target directives that are rather similar to constructs already familiar from multi-core parallelization.
In this initial example we demonstrate total wall time speedups of ~12X in ‘proportional resource tests’ that compare runs using a given percentage of the nodes' GPUs with runs using the same percentage of the nodes' CPU cores, and reasonable weak scaling up to 8000 GPUs vs. 56,000 CPU cores (1333 1/3 Summit nodes, each with 6 GPUs and 42 usable CPU cores).
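The offloading pattern described for this release (data kept resident on the device, kernels launched with target directives that resemble multi-core ones) can be sketched roughly as follows. This is a minimal illustration in C, not code from GenASiS: the names `axpy_host`, `axpy_device`, and `run_demo` are hypothetical, and the sketch uses the `target enter data` directive for persistent allocation instead of the device memory runtime routines that the release wraps from Fortran. Without OpenMP support, or without an available device, the pragmas are ignored or fall back to the host, so the two results still agree.

```c
#include <stdio.h>

#define N 1000

/* Hypothetical kernel y = a*x + y, parallelized across CPU cores. */
static void axpy_host(int n, double a, const double *x, double *y)
{
    #pragma omp parallel for
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* The same loop offloaded to a device. Only the directive changes.
   If x and y are already present on the device, the map clauses
   reuse the existing device copies and trigger no transfer. */
static void axpy_device(int n, double a, const double *x, double *y)
{
    #pragma omp target teams distribute parallel for map(to: x[0:n]) map(tofrom: y[0:n])
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];
}

/* Runs both versions and returns the number of mismatched elements. */
int run_demo(void)
{
    static double x[N], y_host[N], y_dev[N];

    for (int i = 0; i < N; i++) {
        x[i] = (double)i;
        y_host[i] = 1.0;
        y_dev[i] = 1.0;
    }

    /* Persistent allocation: device copies stay resident across kernels. */
    #pragma omp target enter data map(to: x[0:N], y_dev[0:N])

    axpy_host(N, 2.0, x, y_host);
    axpy_device(N, 2.0, x, y_dev);

    /* Explicitly copy the result back, then release the device copies. */
    #pragma omp target update from(y_dev[0:N])
    #pragma omp target exit data map(delete: x[0:N], y_dev[0:N])

    int mismatches = 0;
    for (int i = 0; i < N; i++)
        if (y_host[i] != y_dev[i])
            mismatches++;

    printf("mismatches: %d\n", mismatches);
    return mismatches;
}
```

Calling `run_demo()` from a `main` function and building with an offload-capable OpenMP compiler exercises the device path. The point of keeping the data resident is that repeated kernel launches avoid a host-to-device transfer on every call.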
GenASiS_Basics-v2.1-beta.1
This version includes a first iteration of work with OpenMP target directives for GPUs for the fluid dynamics example RiemannProblem. A paper and reproducibility artifact for this release were submitted to the Fifth Workshop on Accelerator Programming Using Directives (WACCPD).
GenASiS_Basics-2.0
The new version announcement accompanying this release is available from Computer Physics Communications, with doi:10.1016/j.cpc.2016.12.019
The full method paper accompanying the original 1.0 version is available at doi:10.1016/j.cpc.2015.06.001 or from arXiv.org
GenASiS_Basics-1.0
Belated packaging of version 1.0.
The method paper accompanying this release is available at http://dx.doi.org/10.1016/j.cpc.2015.06.001 or as the arXiv preprint http://arxiv.org/abs/1507.02506