Attempting to use an MPI routine (internal_Wtime) before initializing or after finalizing MPICH #312

Closed
sagitter opened this issue Jul 31, 2023 · 4 comments

@sagitter

Hi all.

This error breaks many SUNDIALS 6.6.0 tests on Fedora 39 i686 with MPICH 4.1.2 and GCC 13.1.1:

83: Test command: /builddir/build/BUILD/sundials-6.6.0/buildmpich_dir/build/examples/sunnonlinsol/fixedpoint/test_sunnonlinsol_fixedpoint "2" "0.5"
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:695 83: Working Directory: /builddir/build/BUILD/sundials-6.6.0/buildmpich_dir/build/examples/sunnonlinsol/fixedpoint
/builddir/build/BUILD/cmake-3.27.1/Source/cmCTest.cxx:309    Current_Time: Jul 30 20:16 UTC
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:806 83: Test timeout computed to be: 1500
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:39 83: Attempting to use an MPI routine (internal_Wtime) before initializing or after finalizing MPICH
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:39 83: Solve the nonlinear system:
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:39 83:     3x - cos((y-1)z) - 1/2 = 0
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:39 83:     x^2 - 81(y-0.9)^2 + sin(z) + 1.06 = 0
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:39 83:     exp(-x(y-1)) + 20z + (10 pi - 3)/3 = 0
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:39 83: Analytic solution:
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:39 83:     x = 0.5
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:39 83:     y = 1
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:39 83:     z = -0.523599
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:39 83: Solution method: Anderson accelerated fixed point iteration.
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:39 83:     tolerance = 1.49012e-06
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:39 83:     max iters = 20
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:39 83:     accel vec = 2
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:39 83:     damping   = 0.5
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:985 83/84 Test #83: test_sunnonlinsol_fixedpoint_2_0.5 ..................
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:988 Testing test_sunnonlinsol_fixedpoint_2_0.5 ... 
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:256 ***Failed    0.00 sec
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:260 Attempting to use an MPI routine (internal_Wtime) before initializing or after finalizing MPICH
Solve the nonlinear system:
    3x - cos((y-1)z) - 1/2 = 0
    x^2 - 81(y-0.9)^2 + sin(z) + 1.06 = 0
    exp(-x(y-1)) + 20z + (10 pi - 3)/3 = 0
Analytic solution:
    x = 0.5
    y = 1
    z = -0.523599
Solution method: Anderson accelerated fixed point iteration.
    tolerance = 1.49012e-06
    max iters = 20
    accel vec = 2
    damping   = 0.5
/builddir/build/BUILD/cmake-3.27.1/Source/cmCTest.cxx:309    Current_Time: Jul 30 20:16 UTC
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestMultiProcessHandler.cxx:168 test 84
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:526       Start 84: test_sunnonlinsol_petscsnes
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:687 
84: Test command: /builddir/build/BUILD/sundials-6.6.0/buildmpich_dir/build/examples/sunnonlinsol/petsc/test_sunnonlinsol_petscsnes
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:695 84: Working Directory: /builddir/build/BUILD/sundials-6.6.0/buildmpich_dir/build/examples/sunnonlinsol/petsc
/builddir/build/BUILD/cmake-3.27.1/Source/cmCTest.cxx:309    Current_Time: Jul 30 20:16 UTC
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:806 84: Test timeout computed to be: 1500
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:39 84: Solution:
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:39 84: y1 = 0.78521
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:39 84: y2 = 0.496611
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:39 84: y3 = 0.369923
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:39 84: Solution Error:
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:39 84: e1 = 1.35104e-05
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:39 84: e2 = 6.26116e-11
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:39 84: e3 = 4.13929e-11
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:39 84: Number of nonlinear iterations: 3
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:39 84: SUCCESS
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:985 84/84 Test #84: test_sunnonlinsol_petscsnes .........................
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:988 Testing test_sunnonlinsol_petscsnes ... 
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestRunTest.cxx:256    Passed    0.27 sec
/builddir/build/BUILD/cmake-3.27.1/Source/cmCTest.cxx:309    Current_Time: Jul 30 20:16 UTC
/builddir/build/BUILD/cmake-3.27.1/Source/cmCTest.cxx:309    Current_Time: Jul 30 20:16 UTC
/builddir/build/BUILD/cmake-3.27.1/Source/cmCTest.cxx:309    Current_Time: Jul 30 20:16 UTC
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:613 
25% tests passed, 63 tests failed out of 84
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:629 
Total Test time (real) =   6.28 sec
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:668 
The following tests FAILED:
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	  1 - ark_analytic (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	  2 - cvRoberts_dns (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	  3 - cvsRoberts_dns (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	  4 - idaRoberts_dns (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	  5 - idasRoberts_dns (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	  6 - kinAnalytic_fp (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	  7 - kinAnalytic_fp_--m_aa_2 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	  8 - kinAnalytic_fp_--m_aa_2_--delay_aa_2 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	  9 - kinAnalytic_fp_--m_aa_2_--orth_aa_1 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 10 - kinAnalytic_fp_--m_aa_2_--orth_aa_2 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 11 - kinAnalytic_fp_--m_aa_2_--orth_aa_3 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 12 - test_nvector_serial_1000_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 13 - test_nvector_serial_10000_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 19 - test_nvector_manyvector_1000_100_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 20 - test_nvector_manyvector_100_1000_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 23 - test_nvector_openmp_1000_1_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 24 - test_nvector_openmp_1000_2_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 25 - test_nvector_openmp_1000_4_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 26 - test_nvector_openmp_10000_1_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 27 - test_nvector_openmp_10000_2_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 28 - test_nvector_openmp_10000_4_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 31 - test_sunmatrix_dense_100_100_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 32 - test_sunmatrix_dense_200_1000_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 33 - test_sunmatrix_dense_2000_100_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 34 - test_sunmatrix_band_10_2_3_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 35 - test_sunmatrix_band_300_7_4_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 36 - test_sunmatrix_band_1000_8_8_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 37 - test_sunmatrix_band_5000_3_20_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 38 - test_sunmatrix_sparse_400_400_0_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 39 - test_sunmatrix_sparse_450_450_1_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 40 - test_sunmatrix_sparse_200_1000_0_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 41 - test_sunmatrix_sparse_6000_350_0_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 42 - test_sunmatrix_sparse_500_5000_1_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 43 - test_sunmatrix_sparse_4000_800_1_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 44 - test_sunlinsol_band_10_2_3_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 45 - test_sunlinsol_band_300_7_4_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 46 - test_sunlinsol_band_1000_8_8_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 47 - test_sunlinsol_band_5000_3_100_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 48 - test_sunlinsol_dense_10_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 49 - test_sunlinsol_dense_100_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 50 - test_sunlinsol_dense_500_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 51 - test_sunlinsol_dense_1000_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 52 - test_sunlinsol_spgmr_serial_100_1_1_100_1e-13_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 53 - test_sunlinsol_spgmr_serial_100_2_1_100_1e-13_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 54 - test_sunlinsol_spgmr_serial_100_1_2_100_1e-13_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 55 - test_sunlinsol_spgmr_serial_100_2_2_100_1e-13_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 56 - test_sunlinsol_spfgmr_serial_100_1_100_1e-13_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 57 - test_sunlinsol_spfgmr_serial_100_2_100_1e-13_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 58 - test_sunlinsol_spbcgs_serial_100_1_100_1e-13_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 59 - test_sunlinsol_spbcgs_serial_100_2_100_1e-13_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 60 - test_sunlinsol_sptfqmr_serial_100_1_100_1e-13_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 61 - test_sunlinsol_sptfqmr_serial_100_2_100_1e-13_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 62 - test_sunlinsol_pcg_serial_100_500_1e-13_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 74 - test_sunlinsol_klu_300_0_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 75 - test_sunlinsol_klu_300_1_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 76 - test_sunlinsol_klu_1000_0_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 77 - test_sunlinsol_klu_1000_1_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 78 - test_sunlinsol_superlumt_300_0_1_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 79 - test_sunlinsol_superlumt_300_1_1_0 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 80 - test_sunnonlinsol_newton (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 81 - test_sunnonlinsol_fixedpoint (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 82 - test_sunnonlinsol_fixedpoint_2 (Failed)
/builddir/build/BUILD/cmake-3.27.1/Source/CTest/cmCTestTestHandler.cxx:682 	 83 - test_sunnonlinsol_fixedpoint_2_0.5 (Failed)
@balos1
Member

balos1 commented Jul 31, 2023

Can you provide the CMakeCache.txt?

@sagitter
Author

If you need the CMake configuration, this is the full build log:
https://kojipkgs.fedoraproject.org//work/tasks/1020/104151020/build.log

@balos1
Member

balos1 commented Jul 31, 2023

The cause is that the SUNDIALS profiler calls MPI_Wtime when SUNDIALS is built with MPI, but our serial examples never initialize MPI. The error will go away if you configure with SUNDIALS_BUILD_WITH_PROFILING=OFF. We will work on a fix.
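
For illustration only, here is a minimal sketch of a wall-clock timer that needs no MPI initialization, which is one way a profiler could avoid this class of error. It is not the SUNDIALS implementation: the helper names `wall_seconds` and `elapsed_since` are hypothetical, and it assumes a C11 standard library that provides `timespec_get`.

```c
#include <time.h>

/* Hypothetical helper (not part of SUNDIALS): current wall-clock time in
   seconds, taken from the C11 standard library instead of MPI_Wtime, so it
   is safe to call before MPI_Init or after MPI_Finalize. */
double wall_seconds(void)
{
  struct timespec ts;

  /* timespec_get returns the requested base on success and 0 on failure. */
  if (timespec_get(&ts, TIME_UTC) != TIME_UTC) { return -1.0; }

  return (double)ts.tv_sec + (double)ts.tv_nsec * 1.0e-9;
}

/* Hypothetical helper: seconds elapsed since an earlier wall_seconds() value. */
double elapsed_since(double start)
{
  return wall_seconds() - start;
}
```

A profiled region would record `wall_seconds()` on entry and call `elapsed_since()` on exit. Note that `TIME_UTC` follows the system clock, which can be adjusted while the program runs; a monotonic clock may be preferable where one is available.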

@balos1 balos1 added bug and removed triage labels Jul 31, 2023
@balos1 balos1 added this to the SUNDIALS Next milestone Jul 31, 2023
@balos1 balos1 mentioned this issue Aug 11, 2023
1 task
@balos1 balos1 modified the milestones: SUNDIALS 6.6.1, SUNDIALS Next Sep 13, 2023
gardner48 added a commit that referenced this issue Nov 2, 2023
Updated `SUNProfiler` to not rely on `MPI_Wtime`. Fixes #312

---------

Co-authored-by: David Gardner <gardner48@llnl.gov>
@balos1
Member

balos1 commented Nov 2, 2023

Fixed by #317.

@balos1 balos1 closed this as completed Nov 2, 2023
gardner48 added a commit that referenced this issue Dec 18, 2023
Updated `SUNProfiler` to not rely on `MPI_Wtime`. Fixes #312

---------

Co-authored-by: David Gardner <gardner48@llnl.gov>
balos1 added a commit that referenced this issue Dec 18, 2023
Updated `SUNProfiler` to not rely on `MPI_Wtime`. Fixes #312

---------

Co-authored-by: David Gardner <gardner48@llnl.gov>