Dockerfile PETSc version bump #2552

Merged
merged 6 commits into from Feb 26, 2023

Conversation

garth-wells
Member

Also add PETSc garbage collection to tests.

@garth-wells added the labels "testing: Test system issues" and "ci: Continuous Integration" on Feb 25, 2023
@jorgensd
Sponsor Member

jorgensd commented Feb 25, 2023

Are you sure about this? We downgraded due to the various issues listed in #2483.
Are all of them resolved?

@garth-wells
Member Author

Are you sure about this? We downgraded due to the various issues listed in #2483. Are all of them resolved?

PETSc.garbage_cleanup() has been added. See https://gitlab.com/petsc/petsc/-/issues/1309.
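
For context, a minimal sketch of how that cleanup call can be used from petsc4py in a test suite; the fixture name and the choice of communicator below are illustrative, not the exact code added in this PR:

    import pytest
    from mpi4py import MPI
    from petsc4py import PETSc

    @pytest.fixture(autouse=True)
    def petsc_garbage_cleanup():
        # Run the test first, then release PETSc objects that petsc4py keeps
        # alive internally, so repeated tests do not exhaust MPI communicators.
        yield
        PETSc.garbage_cleanup(MPI.COMM_WORLD)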

@jorgensd
Sponsor Member

Are you sure about this? We downgraded due to the various issues listed in #2483. Are all of them resolved?

PETSc.garbage_cleanup() has been added. See https://gitlab.com/petsc/petsc/-/issues/1309.

But has the fix for https://gitlab.com/petsc/petsc/-/issues/1288 been merged into a release?

@garth-wells
Member Author

Tested inside Docker locally. Seems OK with the latest MPI releases.

@garth-wells garth-wells merged commit 15642f3 into main Feb 26, 2023
@garth-wells garth-wells deleted the docker-petsc-bump branch February 26, 2023 11:17
@francesco-ballarin
Member

Are we sure that all issues are solved?

I have heavily parametrized tests in multiphenicsx (https://github.com/multiphenics/multiphenicsx/blob/update/tests/unit/fem/test_assembler_restriction.py) and I am still getting failures like:

internal_Dist_graph_create_adjacent(125).: MPI_Dist_graph_create_adjacent(comm=0xc40013e0, indegree=0, sources=(nil), sourceweights=0x7f5eeba42af4, outdegree=0, destinations=(nil), destweights=0x7f5eeba42af4, MPI_INFO_NULL, reorder=0, comm_dist_graph=0x7ffde86a6b08) failed
MPIR_Dist_graph_create_adjacent_impl(319): 
MPII_Comm_copy(913)......................: 
MPIR_Get_contextid_sparse_group(591).....: Too many communicators (0/2048 free on this process; ignore_id=0)
Abort(71919887) on node 0 (rank 0 in comm 368): application called MPI_Abort(comm=0xC40013E0, 71919887) - process 0

I am quite confident the error is not with my code, since:

  • after installing pytest-repeat, the failing test moves to a different parametrization value depending on the value of --count;
  • the tests run fine on my local machine (with OpenMPI rather than MPICH).

@garth-wells
Member Author

The CI works with the latest OpenMPI and MPICH (4.1).

We can't sit on PETSc 3.17 forever, and there is little indication that any PETSc changes are forthcoming.

You can try destroying some PETSc objects when you're finished with them. This got me further with the test suite. Another possibility is running pytest in batches.
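
For illustration, a hedged sketch of what destroying objects explicitly can look like with plain petsc4py; the object names and sizes here are made up and are not taken from the multiphenicsx tests:

    from mpi4py import MPI
    from petsc4py import PETSc

    def test_small_assembly():
        A = PETSc.Mat().createAIJ([8, 8], comm=MPI.COMM_WORLD)
        A.setUp()
        b = PETSc.Vec().createMPI(8, comm=MPI.COMM_WORLD)
        try:
            ...  # assemble into A and b, solve, assert on the result
        finally:
            # Free the underlying MPI resources immediately rather than
            # waiting for the Python garbage collector.
            A.destroy()
            b.destroy()
        PETSc.garbage_cleanup(MPI.COMM_WORLD)

Running pytest in separate batches helps for the same reason: each invocation starts a fresh process with an empty pool of MPI communicators.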
