Integrate Quotient Graph into miniEM #6

Closed
JacobDomagala opened this issue Nov 21, 2022 · 10 comments

@JacobDomagala
Collaborator

No description provided.

@JacobDomagala JacobDomagala changed the title Integrate Quotient Graph into MiniEM Integrate Quotient Graph into miniEM Nov 21, 2022
@JacobDomagala JacobDomagala added this to the Task 2 milestone Nov 21, 2022
@egboman
Collaborator

egboman commented Dec 6, 2022

Quotient graph is already an option in MueLu, so we only need to change the XML file in miniEM. Note that the rebalancing method is specified two or three times (for different subproblems).
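
Since the rebalancing method appears in more than one place, a minimal sketch for switching every occurrence at once might look like the following. It assumes the setting is an `algorithm` parameter whose current value is `multijagged` and whose new value is `quotient`; those strings are assumptions and should be checked against the actual miniEM XML. `switch_partitioner` is a hypothetical helper, not part of Trilinos.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite every rebalancing-algorithm entry in a MueLu XML file.
# Assumption: the entries look like
#   <Parameter name="algorithm" type="string" value="multijagged"/>
switch_partitioner() {
  local xml_file="$1"
  local old_alg="${2:-multijagged}"
  local new_alg="${3:-quotient}"

  # Only touch lines that define the "algorithm" parameter; keep a .bak copy.
  sed -i.bak "/name=\"algorithm\"/ s/value=\"${old_alg}\"/value=\"${new_alg}\"/" "$xml_file"

  # Print how many entries now use the new algorithm, so a missed subproblem is easy to spot.
  grep -c "name=\"algorithm\".*value=\"${new_alg}\"" "$xml_file"
}
```

Calling `switch_partitioner maxwell-large.xml` would then print the number of entries that were switched, which should match the two or three subproblems mentioned above.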

@JacobDomagala
Collaborator Author

Changes needed to switch from MJ to Quotient:
e21cf85

@JacobDomagala
Collaborator Author

Also, we'll probably want to update miniEM to read the HIP config file (this will need some changes to the source code).

@JacobDomagala
Collaborator Author

I've created a new branch with the changes needed:
https://github.com/NexGenAnalytics/Trilinos/tree/zoltan2-test-quotient-with-miniem

@JacobDomagala
Collaborator Author

JacobDomagala commented Dec 15, 2022

OK, we will probably want to enable ParMetis as well (since QuotientAlg uses ParmetisAlg). The issue is that I don't see a parmetis module; hopefully we won't have to build it from source.
UPDATE:

  1. I was able to update miniEM to read HIP config files (the changes needed are on my branch).
  2. With the default values in maxwell-large.xml, the Zoltan2 algorithm is not used. I had to change the values to 80 (for each dimension); only then is partitioning actually used (and it fails, see more info below).

Command I used:
srun -t 00:10:00 -A ${PROJECT_NAME} -N 16 PanzerMiniEM_BlockPrec.exe --stacked-timer --solver=MueLu-RefMaxwell --numTimeSteps=3 --linAlgebra=Tpetra --inputFile=maxwell-large.xml

Now for the failing part. As I mentioned above, we will need to enable ParMetis in our build script. Right now it's missing, and an exception is thrown when QuotientAlg tries to build its internal ParMetis algorithm.

@egboman
Collaborator

egboman commented Dec 15, 2022

Yes, you will need ParMetis. Sorry, I forgot about that. I believe it's in a module on Crusher?

@JacobDomagala
Collaborator Author

I don't see it (running module spider parmetis doesn't show any results). I've seen module load parmetis/4.0.3 in old build scripts, but this module is no longer present.

@egboman
Collaborator

egboman commented Dec 19, 2022

A couple thoughts on how to proceed:

  1. I will ask olcf-help what happened to the parmetis module. I also can't find it, but I think superlu-dist needs it, so it must be somewhere?
  2. I'll check if I can remove the dependency on ParMetis in the Quotient algorithm. In principle, we could use Zoltan/PHG instead, but this requires changing the Zoltan2 code and might not be straightforward in practice.

@egboman
Collaborator

egboman commented Dec 19, 2022

The OLCF folks promptly installed Parmetis for us:

module load parmetis/4.0.3

We will need to update the Trilinos build script to use $OLCF_PARMETIS_ROOT
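
As a rough sketch of that build script update, the ParMETIS location could be turned into CMake flags as below. `TPL_ENABLE_ParMETIS`, `ParMETIS_INCLUDE_DIRS` and `ParMETIS_LIBRARY_DIRS` are the usual Trilinos TPL options; the `include`/`lib` layout under `$OLCF_PARMETIS_ROOT` is an assumption that should be checked on the machine, and `parmetis_cmake_args` is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the CMake flags that point Trilinos at ParMETIS.
# Assumption: headers live in <root>/include and libraries in <root>/lib.
parmetis_cmake_args() {
  local root="${1:?ParMETIS root not set (did you run: module load parmetis/4.0.3?)}"
  printf '%s\n' \
    "-DTPL_ENABLE_ParMETIS=ON" \
    "-DParMETIS_INCLUDE_DIRS=${root}/include" \
    "-DParMETIS_LIBRARY_DIRS=${root}/lib"
}
```

After `module load parmetis/4.0.3`, the output of `parmetis_cmake_args "$OLCF_PARMETIS_ROOT"` could be appended to the existing configure line in the build script.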

@JacobDomagala
Collaborator Author

JacobDomagala commented Jan 3, 2023

OK, the parmetis package is there, but I think we also need metis.

EDIT: While metis is not available as a module, the library is present on the machine (${OLCF_PARMETIS_ROOT}/../metis-5.1.0-ialj45dt3hh66bnl3vslxlduihz7i5dy)
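
Because the METIS directory name carries an install hash, a build script could locate it with a glob instead of hard-coding the suffix. The sketch below assumes METIS is installed as a sibling directory of the ParMETIS root whose name starts with `metis-`, as in the path above; `find_metis_root` is a hypothetical helper and it simply returns the first match.

```shell
#!/usr/bin/env bash
# Hypothetical helper: find the METIS install that sits next to the ParMETIS root.
# Assumption: it is a sibling directory whose name starts with "metis-".
find_metis_root() {
  local parmetis_root="$1"
  local candidate
  for candidate in "${parmetis_root}"/../metis-*; do
    if [ -d "$candidate" ]; then
      # Resolve the ".." so the printed path is clean.
      (cd "$candidate" && pwd)
      return 0
    fi
  done
  echo "no metis-* directory found next to ${parmetis_root}" >&2
  return 1
}
```

The printed path could then feed the corresponding Trilinos options (`TPL_ENABLE_METIS`, `METIS_INCLUDE_DIRS`, `METIS_LIBRARY_DIRS`), again assuming an `include`/`lib` layout inside that directory.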
