Test Ifpack2_unit_tests_MPI_4 unit tests randomly failing in many ATDM and PR builds since at least 2021-08-30 #10016
Labels
impacting: tests
The defect (bug) is primarily a test failure (vs. a build failure)
PA: Linear Solvers
Issues that fall under the Trilinos Linear Solvers Product Area
pkg: Ifpack2
Primary Build
Added by triager to mark failures affecting primary builds
Secondary Build
Added by triager to mark failures affecting secondary builds
type: bug
The primary issue is a bug in Trilinos code or tests
CC: @trilinos/ifpack2, @<triage-contact> (Trilinos <product-area-name> Triage Contact (or "Current ATDM contact"))
Next Action Status
Description
As shown in this query (click "Show Matching Output" in the upper right), the tests:
Ifpack2_unit_tests_MPI_4
in the builds:
PR-9483-test-Trilinos_pullrequest_clang_10.0.0-3559
PR-9483-test-Trilinos_pullrequest_gcc_7.2.0_debug-3527
PR-9483-test-Trilinos_pullrequest_gcc_7.2.0_debug-3591
PR-9627-test-Trilinos_pullrequest_cuda_10.1.105-2132
PR-9627-test-Trilinos_pullrequest_cuda_10.1.105_uvm_off-1129
PR-9660-test-Trilinos_pullrequest_gcc_7.2.0_debug-3528
PR-9660-test-Trilinos_pullrequest_gcc_7.2.0_debug-3538
PR-9676-test-Trilinos_pullrequest_clang_10.0.0-3585
PR-9691-test-Trilinos_pullrequest_clang_10.0.0-3641
PR-9691-test-Trilinos_pullrequest_gcc_7.2.0_debug-3614
PR-9691-test-Trilinos_pullrequest_gcc_7.2.0_debug-3648
PR-9758-test-Trilinos_pullrequest_gcc_7.2.0_debug-3747
PR-9768-test-Trilinos_pullrequest_clang_10.0.0-3765
PR-9773-test-rhel7_sems-clang-7.0.1-openmpi-1.10.1-serial_release-debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-package-enables-19
PR-9810-test-Trilinos_pullrequest_gcc_7.2.0_debug-3839
PR-9836-test-Trilinos_pullrequest_clang_10.0.0-3913
PR-9859-test-rhel7_sems-clang-7.0.1-openmpi-1.10.1-serial_release-debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-package-enables-49
PR-9859-test-rhel7_sems-clang-7.0.1-openmpi-1.10.1-serial_release-debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-package-enables-53
PR-9866-test-rhel7_sems-gnu-7.2.0-openmpi-1.10.1-serial_debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-package-enables-77
PR-9876-test-rhel7_sems-gnu-7.2.0-openmpi-1.10.1-serial_debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-package-enables-81
PR-9876-test-rhel7_sems-gnu-7.2.0-openmpi-1.10.1-serial_release-debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-package-enables-142
PR-9876-test-rhel7_sems-gnu-8.3.0-openmpi-1.10.1-openmp_release-debug_static_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-package-enables-188
PR-9883-test-Trilinos_pullrequest_clang_10.0.0-3937
PR-9920-test-Trilinos_pullrequest_gcc_7.2.0_debug-4045
PR-9929-test-Trilinos_pullrequest_gcc_7.2.0_debug-4120
PR-9990-test-rhel7_sems-gnu-7.2.0-openmpi-1.10.1-serial_release-debug_shared_no-kokkos-arch_no-asan_no-complex_no-fpic_mpi_no-pt_no-rdc_no-package-enables-202
PR-9999-test-Trilinos_pullrequest_clang_10.0.0-4135
PR-Experimental-test-Trilinos_pullrequest_caraway-29
Trilinos-atdm-sems-rhel7-clang-7.0.1-openmp-shared-release
Trilinos-atdm-sems-rhel7-clang-7.0.1-openmp-shared-release-debug
Trilinos-atdm-sems-rhel7-intel-18.0.5-openmp-shared-debug
Trilinos-atdm-sems-rhel7-intel-18.0.5-openmp-shared-release-debug
started failing on testing day 2021-08-30.
When the unit test
Ifpack2Chebyshev_double_int_longlong_Test0_UnitTest
fails, it seems to miss the tolerance by just a little, as shown here.
It looks like other unit tests are also randomly failing to meet the tolerance.
If you run this query and then click "Show Matching Output", you can see by how much the tolerance is being missed in these various tests.
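To make "missing the tolerance by just a little" concrete, here is a minimal sketch of how one might quantify the margin of a relative-tolerance check. The helper names and the numbers are hypothetical and are not taken from the Ifpack2 test or the CDash output; the actual comparison used by the unit test may differ.

```python
def relative_error(computed, expected):
    """Relative error scaled by the larger magnitude (0.0 when both are zero)."""
    scale = max(abs(computed), abs(expected))
    if scale == 0.0:
        return 0.0
    return abs(computed - expected) / scale


def tolerance_margin(computed, expected, tol):
    """Return (passed, ratio), where ratio = relative error / tol.

    A ratio only slightly above 1.0 means the check failed by a small margin.
    """
    err = relative_error(computed, expected)
    return err <= tol, err / tol


# Hypothetical values for illustration only, not from the failing tests.
passed, ratio = tolerance_margin(computed=1.0000000000021, expected=1.0, tol=1e-12)
print(f"passed={passed}, error/tol={ratio:.2f}")
```

Reporting the ratio of error to tolerance, rather than only pass/fail, makes it easier to tell a marginal miss apart from a genuinely wrong result when comparing the matching output across builds.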
Current Status on CDash
Run the above query, adjusting the "Begin" and "End" dates to match today or any other date range, or just click "CURRENT" in the top bar to see results for the current testing day.
Steps to Reproduce
One should be able to reproduce this failure as described in:
and the system-specific instructions at:
Just log into any of the associated machines, copy and paste the full CDash build name
<build-name>
listed above, and run commands like:
where
<package-name>
is any package that you want to enable to reproduce build and/or test results.
Again, for exact system-specific details on what commands to run to build and run tests, see:
If you can't figure out what commands to run to reproduce the problem given this documentation, then please post a comment here and we will give you the exact minimal commands.