Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixes and testing for Tomboulides class in gpu builds #267

Merged
merged 4 commits into from
Jul 2, 2024

Conversation

trevilo
Copy link
Contributor

@trevilo trevilo commented Apr 3, 2024

This PR adds some bug fixes and testing for the low Mach flow solver in gpu builds. Three notes:

  1. Much of the work of the flow solver is not offloaded to the gpu yet, so there is development to do to get better performance. This is left for a separate PR, but this PR establishes some regression tests that can be used during that effort.
  2. Some tolerances had to be relaxed (namely in lomach-lequere.test), which is expected b/c the environment is different from that where the regression tests were established.
  3. Non-deterministic behavior (at levels below the tolerances) has been observed in gpu builds (specifically cuda; hip has not been examined yet). This is attributed to non-determinism in linear algebra, specifically cusparse_SpMV using CUSPARSE_SPMV_CSR_ALG1 (see documentation here) but this should be examined more carefully.

…vice

By default for device-enabled runs, all ParGridFunction objects set
UseDevice(true) for their underlying Vector.  This can break the curl
calculations (ComputeCurl{2D,3D,Axi}) because they *assume* all host
is valid.  This commit is the minimal necessary for these fcns to
protect themselves and return valid results (on the host of course).
The u_next_rad_comp_gf_ ParGridFunction seems to be causing problems
on some gpu systems.  This is intended to alias the first
(i.e. radial) component of the velocity grid function.  It is
constructed via

u_next_rad_comp_gf_ = new ParGridFunction(pfes_, *u_next_gf_);

But, this leads to failures like the following:

Verification failed: (IsHostMemory(h_mt)) is false

in mfem::MFEM_VERIFY_TYPES, which is called (eventually) from
Vector::HostRead.  Initially I thought this had to do with a failure
to properly call Vector::SyncMemory, since the base memory (in
u_next_gf_) is being modified.  But, an attempt to use this had no
impact on the problem.

So, in this workaround, I eliminate the alias and simply copy the
first component into a new grid function.  This seems to work, but it
would be nice to understand the problem with the original approach,
b/c clearly we'd rather not copy when it shouldn't be necessary.
@trevilo trevilo marked this pull request as ready for review July 2, 2024 20:40
@trevilo trevilo merged commit 0286e96 into main Jul 2, 2024
18 checks passed
@trevilo trevilo deleted the fix-tomboulides-gpu branch July 19, 2024 17:06
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant