UVM dependency removal #937

ldh4 · 2022-02-23T18:29:55Z

[WIP] Modifications to make Nalu-wind buildable without Cuda UVM

Prerequisite work for making Nalu-wind compatible with HIP: these changes let Nalu-wind build without Cuda UVM.

@tasmith4 @alanw0 @jhux2 @ddement

tasmith4 · 2022-02-24T01:06:29Z

After further investigation, @ldh4 and I learned from @alanw0 that tests without NGP in the name are not expected to pass in any build. After running the unit tests with that filter, they all pass on ascicgpu both with and without UVM.

alanw0 · 2022-02-24T02:19:19Z

> After further investigation, @ldh4 and I learned from @alanw0 that tests without NGP in the name are not expected to pass in any build. After running the unit tests with that filter, they all pass on ascicgpu both with and without UVM.

@tasmith4 @ldh4 More precisely, tests without NGP in the name are not expected to pass in a Cuda build. All unit tests are expected to pass in a CPU build. However, I am surprised to hear that all of the NGP unit-tests pass on Cuda without UVM. For example, the *.NGP* filter would include at least 3 unit tests that have tpetra usage (MixtureFractionKernelHex8Mesh.NGP_adv_diff_edge_tpetra*). I guess it's possible those tests don't use the graph assembly code, but I would be surprised. (And the one thing I think we're pretty certain about is that the graph assembly code is not using the correct types for at least some of its Views, and fixing those would then require one or more deep_copy calls.)

In any case, I'll work with you and Dong Hun tomorrow to help establish where we're at with all this.

tasmith4 · 2022-02-24T14:21:55Z

You're right @alanw0, I should have been more precise; thanks for clarifying. I was also very surprised that all the NGP tests pass without UVM (and am willing to be skeptical of the result, although I double-checked my build). We'll have to discuss this more today, as you say.

alanw0 · 2022-02-24T14:30:32Z

@tasmith4 In particular we're passing those views (that this commit changes to host views) into the setAllIndices method on the tpetra crsgraph. I guess the explanation could be that we're getting those host views from a dual view, and inside setAllIndices tpetra is doing the sync for us and then using the device views. If that's what is happening then great!
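
The sync-on-use behavior hypothesized here can be pictured with a toy model. This is a minimal sketch, not Kokkos or Tpetra code: `ToyDualView` and `sum_on_device` are hypothetical names, the two copies are plain `std::vector`s, and the method names only echo those of `Kokkos::DualView`. It shows how a consumer that syncs before reading the device copy would hide the fact that the caller only wrote to the host copy.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-in for a dual view: a host copy, a device copy, and a flag that
// records whether the host copy has changes the device copy lacks.
// (Hypothetical type for illustration; not the Kokkos class.)
struct ToyDualView {
  std::vector<int> host;
  std::vector<int> device;
  bool host_modified = false;

  explicit ToyDualView(std::size_t n) : host(n, 0), device(n, 0) {}

  void modify_host() { host_modified = true; }
  bool need_sync_device() const { return host_modified; }

  // Copy host data to the device copy only if the host copy is newer.
  void sync_device() {
    if (host_modified) {
      device = host;
      host_modified = false;
    }
  }
};

// Hypothetical consumer that syncs before reading the device copy, so callers
// that filled only the host copy still get correct results.
int sum_on_device(ToyDualView& dv) {
  dv.sync_device();
  assert(!dv.need_sync_device());
  int total = 0;
  for (int v : dv.device) {
    total += v;
  }
  return total;
}
```

If the consumer skipped the sync, the sum would be computed from stale device data, which is the failure mode one would expect when host views are passed where device views are required.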

tasmith4 · 2022-02-24T14:39:51Z

I don't think that's what's happening, so I'm now even more confused about why these tests pass.

alanw0 · 2022-02-24T14:51:12Z

I guess I didn't look closely... Now that I look at this commit again, it doesn't change the types of the views we're passing into setAllIndices. It is changing the types of other stuff. So we'll just have to work through this today and see what is going on. We'll probably want to put in a (temporary) check in the code, in TpetraLinearSystem.C, for whether uvm is actually on or not.
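
A temporary check of this kind could look roughly like the sketch below. It assumes that the Kokkos configuration header defines the macro `KOKKOS_ENABLE_CUDA_UVM` when UVM is enabled (believed true for Kokkos 3.x builds, but worth confirming against the Kokkos version in use). The function names `uvm_is_enabled` and `uvm_status_message` are hypothetical, and the Kokkos include is left out so the snippet compiles on its own, in which case it reports OFF.

```cpp
#include <string>

// In a real build, Kokkos_Core.hpp would be included first so that the
// configuration macros are visible. Assumption: KOKKOS_ENABLE_CUDA_UVM is
// defined only when Kokkos was configured with UVM as the default Cuda space.
constexpr bool uvm_is_enabled() {
#if defined(KOKKOS_ENABLE_CUDA_UVM)
  return true;
#else
  return false;
#endif
}

// Message that could be printed once at startup while debugging the build.
inline std::string uvm_status_message() {
  return std::string("Cuda UVM is ") + (uvm_is_enabled() ? "ON" : "OFF");
}
```

Printing this message once from the linear-system setup would confirm which configuration a given test run actually exercised.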

PaulMullowney · 2022-06-03T19:26:09Z

src/FixPressureAtNodeAlgorithm.C

@@ -113,6 +113,8 @@ FixPressureAtNodeAlgorithm::execute()
      }
    });
  });
+
+  eqSystem_->linsys_->free_coeff_applier(deviceCoeffApplier);


Why is this necessary? Are we leaking memory?

ldh4 · 2022-06-06

In the TpetraLinearSystem class, we had to stop holding the host/device coeff appliers as member variables, so a new coeff applier is now constructed on each request. Because of this, we had to introduce a way to properly deallocate coeff appliers when they are no longer needed. This is one of the changes that we will likely revisit to find a better way to handle it.
Just as a note, free_coeff_applier is an empty function in LinearSystem.h, so this line will not affect HypreLinearSystem.
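
The ownership change described in this reply can be sketched in plain C++. This is a simplified illustration under stated assumptions, not the Nalu-wind implementation: the types `CoeffApplier`, `MemberOwningSystem`, and `PerRequestSystem`, the method `get_coeff_applier`, and the helper `execute_once` are hypothetical stand-ins, and `new`/`delete` replace whatever host or device allocation the real code uses. Only the name `free_coeff_applier` and its empty default in the base class come from the discussion.

```cpp
#include <cassert>

// Hypothetical stand-in for a coefficient applier. The counter exists only so
// this sketch can show whether every applier handed out was released.
struct CoeffApplier {
  static int live;
  CoeffApplier() { ++live; }
  ~CoeffApplier() { --live; }
};
int CoeffApplier::live = 0;

// Base interface: the default free_coeff_applier does nothing, so a linear
// system that keeps ownership of its applier is unaffected by the extra call.
struct LinearSystem {
  virtual ~LinearSystem() = default;
  virtual CoeffApplier* get_coeff_applier() = 0;
  virtual void free_coeff_applier(CoeffApplier*) {}
};

// Variant that owns a single applier as a member (the older arrangement).
struct MemberOwningSystem : LinearSystem {
  CoeffApplier applier_;
  CoeffApplier* get_coeff_applier() override { return &applier_; }
};

// Variant that builds a new applier on every request; the caller must hand
// it back, otherwise each request leaks one object.
struct PerRequestSystem : LinearSystem {
  CoeffApplier* get_coeff_applier() override { return new CoeffApplier(); }
  void free_coeff_applier(CoeffApplier* applier) override {
    if (applier != nullptr) {  // guard against a null pointer before freeing
      delete applier;
    }
  }
};

// Mimics an algorithm's execute(): request an applier, use it, release it.
// Returns the number of appliers still alive after the call.
int execute_once(LinearSystem& linsys) {
  CoeffApplier* applier = linsys.get_coeff_applier();
  assert(applier != nullptr);
  // ... assembly work using the applier would go here ...
  linsys.free_coeff_applier(applier);
  return CoeffApplier::live;
}
```

With the per-request variant, `execute_once` leaves no appliers alive, while dropping the `free_coeff_applier` call would leave one behind per execution. With the member-owning variant the call is a harmless no-op, which is the same reasoning given above for HypreLinearSystem.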

PaulMullowney · 2022-06-03T19:29:14Z

src/overset/AssembleOversetDecoupledAlgorithm.C

@@ -40,6 +40,8 @@ void AssembleOversetDecoupledAlgorithm::execute()
    fringeNodes.size(), KOKKOS_LAMBDA(const size_t& i) {
      coeffApplier->resetRows(1, &fringeNodes(i), 0, numDof, 1.0, 0.0);
    });
+
+  eqSystem_->linsys_->free_coeff_applier(coeffApplier);


We're doing this again. We must have been leaking memory?

ldh4 · 2022-06-06

Same reason as above.

ldh4 · 2022-06-21T22:54:02Z

A couple of notes on this commit:

This builds and runs with and without unified memory support (tested only with the Cuda backend for now).
Building without unified memory currently shows deficiencies in both memory usage and runtime performance. We are still investigating the source and will have follow-on tasks to resolve it.
Building with unified memory does not have this problem and maintains the previous performance. When tested with the nrel5mw_refined_rcb problem (~700M nodes) on 90 nodes of Summit, the build with this patch ran to completion, whereas the build from Nalu-wind master timed out.

ldh4 · 2022-06-21T22:55:38Z

I am actually running the nrel5mw_refined_rcb problem again with a smaller termination count, just to see how the runtimes of the two builds compare.

ldh4 · 2022-06-22T17:48:53Z

Checking the numbers from the nrel5mw_refined_rcb runs again, with the termination count reduced by half, showed no significant difference in runtime or memory usage between this patched version (uvm turned on) and master, as expected. Runtime showed a ~5% speedup.

ldh4 added 8 commits April 7, 2022 11:51

initial

75584b2

Up to coeff applier static instance removal

f096109

MatrixFreeLowMachEquationSystem modification

a1e2f6b

WIP segfaults on a unit test

cc7ab24

Segfault reverted, also reverted some of UVMSpace to MemSpace usage

75b7773

Fixed the last memspace incompatibility problem with UVM space

80e51f7

Now need to try with uvm off

NGP unit tests passing with UVM off

774d9e8

Took out deprecated Tpetra function usages: getLocalViewDevice/Host

0ac0661

ldh4 force-pushed the uvm_removal branch from c040128 to 0ac0661 April 14, 2022 21:06

PaulMullowney reviewed Jun 3, 2022

include/SolverAlgorithm.h (outdated, resolved)

PaulMullowney reviewed Jun 3, 2022

src/SolverAlgorithm.C (outdated, resolved)

PaulMullowney reviewed Jun 3, 2022

include/LinearSystem.h (outdated, resolved)

PaulMullowney reviewed Jun 3, 2022

Reverted NGPApplyCoeff changes in SolverAlgorithm

c0ee4ee

ldh4 changed the title ~~[WIP] Start of UVM dependency removal~~ [WIP] UVM dependency removal Jun 6, 2022

ldh4 added 2 commits June 6, 2022 15:09

Removed commented lines

e0e9abf

Adding in a nullptr check in the CoeffApplierDestructor

6b678fb

ldh4 changed the title ~~[WIP] UVM dependency removal~~ UVM dependency removal Jun 21, 2022

ldh4 marked this pull request as ready for review June 21, 2022 22:30

ldh4 requested review from alanw0 and ddement June 21, 2022 22:56

ldh4 requested review from jhux2 and tasmith4 June 21, 2022 22:56

tasmith4 approved these changes Jun 22, 2022

ldh4 merged commit 673de80 into Exawind:master Jun 22, 2022

ldh4 mentioned this pull request Jun 28, 2022

Fix memory leak #987

Merged

psakievich mentioned this pull request Sep 12, 2022

Memory leak in in SolverAlgorithm #1037

Open
