Hypre, outerloop & CUDA changes #2397
…ject/BOUT-dev into next-hypre-outerloop-cuda-merged
Duplicate versions of various field files, VIM temporary files committed to the repo.
Small changes to spacing adding noise, backup files not needed
Otherwise Coordinates is an incomplete type
Whitespace changes and some comments
dz has now changed to being a Coordinates::FieldMetric type, 2D or 3D depending on compile-time option.
In FieldAccessor was only set if the field had parallel slices. In Field2DAccessor never set.
Never initialised
Needs a field to construct a valid object, so don't allow default construction.
Coordinates fields can now be 2D or 3D, depending on compile-time switch. f2d prefix would just be confusing if it could then be Field3D.
These definitions are repeated in several places, so just move to one location.
Coordinate fields can now be 2D or 3D, depending on compile-time flag BOUT_USE_METRIC_3D. This puts the logic for indexing these fields in one place.
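A minimal sketch of what "logic for indexing in one place" could look like, not the code in this PR. `MetricAccessor` and its members are hypothetical names; only the `BOUT_USE_METRIC_3D` flag comes from the commit message. The 2D branch assumes z is the fastest-varying index, so the flat 3D index divided by `nz` gives the 2D (x, y) index.

```cpp
#include <cassert>
#include <vector>

using BoutReal = double; // assumed floating-point type for this sketch

// Flag name taken from the commit message; defaulted to 0 (2D metrics) here
#ifndef BOUT_USE_METRIC_3D
#define BOUT_USE_METRIC_3D 0
#endif

// Hypothetical accessor: callers always pass a 3D flat index, and the
// accessor decides how to look up the metric value.
struct MetricAccessor {
  const BoutReal* data; // metric values, 2D or 3D depending on the flag
  int nz;               // number of points in z

  BoutReal operator[](int i3d) const {
    assert(nz > 0);
#if BOUT_USE_METRIC_3D
    return data[i3d]; // 3D metrics: one value per (x, y, z) point
#else
    return data[i3d / nz]; // 2D metrics: drop the z part of the index
#endif
  }
};
```

Calling code then indexes a metric such as `dz` the same way in both builds, and only the accessor knows which layout is in use.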
The Field2DAccessor code was almost the same as FieldAccessor. Rather than having two implementations, use the same implementation and specialise on Field2D types.
`f_data` was a duplicate of `data`, and other `f_` prefixes aren't needed.
Moved to field_accessor.hxx rather than single_index_ops.hxx. These changes make the API more similar to the Fields.
Uses the build flags, hopefully will work for all cases on the GPU
Just indexes data array, so `fa[i]` is equivalent to `fa.data[i]`
Compiler errors, include Mesh header
Bugs in `b0xGrad_dot_Grad`, `D2DY2` etc. fixed. Renamed, removing `_g` postfix. Functions can take either an `int` or an `Ind3D`. The `int` argument is assumed to be a 3D index in all cases.
Compiles with 2D metrics and no RAJA (CPU only).
Document FieldAccessor and CoordinateFieldAccessor
A simple wrapper around `BoutReal*`, which has an indexing (subscript) operator for `int` and `Ind3D` types (const and non-const).
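A rough sketch of such a wrapper, for illustration only. `RealArrayView` and `sum_first` are hypothetical names, and the `Ind3D` here is a stand-in that models only a flat index rather than the real BOUT++ type.

```cpp
#include <cassert>
#include <vector>

using BoutReal = double; // assumed floating-point type for this sketch

// Stand-in for BOUT++'s Ind3D: only the flat index is modelled
struct Ind3D {
  int ind;
};

// Hypothetical non-owning view over a contiguous block of BoutReal values
struct RealArrayView {
  BoutReal* data;

  explicit RealArrayView(BoutReal* ptr) : data(ptr) { assert(ptr != nullptr); }

  // Subscript by plain int, non-const and const
  BoutReal& operator[](int i) { return data[i]; }
  const BoutReal& operator[](int i) const { return data[i]; }

  // Subscript by index object, non-const and const
  BoutReal& operator[](const Ind3D& i) { return data[i.ind]; }
  const BoutReal& operator[](const Ind3D& i) const { return data[i.ind]; }
};

// Reads through the const overload; sums the first n entries
inline BoutReal sum_first(const RealArrayView& a, int n) {
  BoutReal total = 0.0;
  for (int i = 0; i < n; ++i) {
    total += a[i];
  }
  return total;
}
```

Because the view holds only a raw pointer, it is cheap to copy by value, which is presumably the point for passing it into device kernels.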
Code can be the same, but the surrounding loop is currently different.
Now compiles again without RAJA & CUDA. Previously also had braces in the wrong location.
Can be used by both LaplaceXY2 and LaplaceXY2Hypre, so put into a common header file
f2dinit member variable no longer needed for initialisation
The PetscMatrix API (constructor args) had changed, but this LaplaceXY2 implementation had not been updated. Now compiles again, though not tested that it runs or produces correct result.
Outerloop suggestions - resolve merge conflicts
Previously nvcc would fail claiming that symbols were defined multiple times. Now needs these symbols to be defined or linker errors result.
HYPRE_SOLVER_TYPE can't be directly formatted with nvcc compiler, but needs to be first converted to a string.
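The workaround described above might look roughly like the following. `SolverType`, `toString` and `describeSolver` are hypothetical stand-ins, not the definitions in this PR; the idea is only that the enum is turned into a `std::string` before it reaches any formatting call.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for the solver-type enum
enum class SolverType { gmres, bicgstab, pcg };

// Explicit conversion, so no formatter specialisation for the enum is needed
inline std::string toString(SolverType t) {
  switch (t) {
  case SolverType::gmres:
    return "gmres";
  case SolverType::bicgstab:
    return "bicgstab";
  case SolverType::pcg:
    return "pcg";
  }
  throw std::invalid_argument("unknown SolverType");
}

// Builds the message from the string form rather than from the enum itself
inline std::string describeSolver(SolverType t) {
  const std::string name = toString(t);
  assert(!name.empty());
  return "Unsupported solver type: " + name;
}
```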
More cori nvcc fixes
Outerloop suggestions
@ZedThree @johnomotani @jonesholger @dschwoerer I think this is now finally ready to be merged!
#if 0 // disable temporarily until reconcile iteration space for parallel_forall under
// nvcc
Any update on this?
// Don't currently know why this test fails, and also causes segfault when unwinding the
// tests
#if 1
Is this #2406 ?
I've got most of the way through this, will finish it when I get back from holiday! But I have a feeling it's "good enough". There's a few bits and pieces that could be tidied up -- we should probably make issues for them and try and clean them up later perhaps.
Thanks @ZedThree, have a good holiday!
LGTM. Obviously this is a really big PR, but I think the general shape is good.
There's a few things that I'd really like to fix at some point, so noting here perhaps for future issues:
- Use of `std::shared_ptr` where it might not be needed
- Use of `malloc`/`new`
- Timer tests have some `#if 1/0` bits (might be related to "TimerTest.ListAllInfo only works once" #2406 ?)
- Added to CMakeLists; previously wasn't compiled with CMake builds
- Removed shared_ptr around PetscMatrix
- Moved initialisation of indexConverter and matrix to initializer list
Laplacexy2 modifications
Only makes sense with 2D metrics, fails to compile if 3D metrics are enabled.
ok @ZedThree I think this is now finally ready to go in
We can update that elsewhere.
Ready for review & merge
Quite a large PR, adding support for GPUs in BOUT++. The two main parts are `FieldAccessor` and `CoordinatesAccessor`, for efficient access to field and coordinates data respectively. Includes manual and tests.