Tests Anasazi_Epetra_ModalSolversTester_MPI_4 and Anasazi_Epetra_OrthoManagerGenTester_[0,1]_MPI_4 failing in 'debug' builds on white/ride #2473

bartlettroscoe · 2018-03-28T23:08:31Z

CC: @trilinos/anasazi, @mhoemmen

Next Action Status

PR #2621, which re-enables the tests Anasazi_Epetra_ModalSolversTester_MPI_4 and Anasazi_Epetra_OrthoManagerGenTester_[0,1]_MPI_4, was merged on 4/24/2018. The tests ran and passed in all promoted ATDM Trilinos builds between 5/20/2018 and 6/7/2018.

Description

The tests:

Anasazi_Epetra_ModalSolversTester_MPI_4
Anasazi_Epetra_OrthoManagerGenTester_0_MPI_4
Anasazi_Epetra_OrthoManagerGenTester_1_MPI_4

failed in Trilinos-atdm-hansen-shiller-cuda-debug build on 'ride' as shown at:

This build is targeted to be an auto PR build for Trilinos (see #2464) so we desire to clean up this build more quickly.

Interestingly, these tests did not fail in what should be the identical Trilinos-atdm-hansen-shiller-cuda-debug build on the identical machine 'white' as shown at:

Strangely, those tests did fail in the Trilinos-atdm-hansen-shiller-cuda-debug build on 'white' yesterday, as shown at:

https://testing-vm.sandia.gov/cdash/viewTest.php?buildid=3396229

A) Anasazi_Epetra_ModalSolversTester_MPI_4:

The failing test Anasazi_Epetra_ModalSolversTester_MPI_4 today, with details shown at:

https://testing-vm.sandia.gov/cdash/testDetails.php?test=44732780&build=3398699

showed the failure:

************* Householder Apply Test *************

             orthonorm error of V: 7.08978e-16
            orthonorm error of VQ: 0.375867
ERROR:  V*Q failed.
    orthonorm error of applyHouse: 0.375867
ERROR:  applyHouse failed.
        error(VQ - house(V,H,tau): 2.64481e-16

************* DirectSolver Test *************
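For readers unfamiliar with the quantities in this log, the following is a minimal sketch of an orthonormality check. It assumes the reported "orthonorm error" is a norm of V^T V - I (the Frobenius norm is used here); the actual Anasazi test may compute it differently, and the function name `orthonorm_error` is hypothetical.

```python
import math

def orthonorm_error(V):
    """Frobenius norm of V^T V - I for a matrix V given as a list of rows."""
    ncols = len(V[0])
    total = 0.0
    for i in range(ncols):
        for j in range(ncols):
            # Entry (i, j) of the Gram matrix V^T V.
            gram_ij = sum(row[i] * row[j] for row in V)
            target = 1.0 if i == j else 0.0
            total += (gram_ij - target) ** 2
    return math.sqrt(total)

s = 1.0 / math.sqrt(2.0)
V_good = [[s, s], [s, -s]]          # columns are orthonormal
V_bad = [[1.0, 1.0], [0.0, 1.0]]    # columns are not orthogonal

err_good = orthonorm_error(V_good)  # at the level of rounding error
err_bad = orthonorm_error(V_bad)    # order one
```

Under this assumed definition, a value near 1e-16 (as for V in the log) means the basis is orthonormal to machine precision, while a value like 0.375867 means the basis is far from orthonormal and cannot be explained by rounding.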

Looking at all of the builds today that ran that test shown at:

https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&date=2018-03-28&limit=0&filtercount=1&showfilters=1&field1=testname&compare1=61&value1=Anasazi_Epetra_ModalSolversTester_MPI_4

this test fails in the same way (i.e. a numerical problem) on the builds Linux-gcc-4.8.4-MPI_RELEASE_12.12.1 and Linux-gcc-4.8.4-MPI_RELEASE_12.12.1_SHARED on the machine hansel.sandia.gov so this problem is not isolated to ATDM builds of Trilinos.

Also note that this test failed for the ATDM builds Trilinos-atdm-white-ride-gnu-opt-openmp and Trilinos-atdm-white-ride-gnu-opt-openmp with segfaults, but that is already being addressed by #2454 and is likely unrelated.

B) Anasazi_Epetra_OrthoManagerGenTester_0_MPI_4:

The failing test Anasazi_Epetra_OrthoManagerGenTester_0_MPI_4 today with details shown at:

https://testing-vm.sandia.gov/cdash/testDetails.php?test=44732781&build=3398699

showed:

Anasazi in Trilinos 12.13 (Dev)

 Generating Y1,Y2 for project() : testing... 
   || <Y1,Y1> - I || : 6.47718e-16
   || <Y2,Y2> - I || : 7.20309e-16
   || <X1,Y2> ||     : 1.64775e-16
   || <X1b,Y2> ||     : 6.9984e-15

p=3: *** Caught standard std::exception of type 'std::runtime_error' :

 /home/jenkins/ride/workspace/Trilinos-atdm-white-ride-cuda-debug/SRC_AND_BUILD/Trilinos/packages/anasazi/epetra/test/OrthoManager/cxx_gentest.cpp:274:
 
 Throw number = 1
 
 Throw test that evaluated to true: err > TOL
 
 New X1 did not meet tolerance: orthog(X1,Y2) == 0.547032

Looking at all of the builds today that ran that test shown at:

https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&date=2018-03-28&limit=0&filtercount=1&showfilters=1&field1=testname&compare1=61&value1=Anasazi_Epetra_OrthoManagerGenTester_0_MPI_4

you can see that this test also failed in a similar (numerical) way in the builds Linux-gcc-4.9.3-Sierra_MPI_release_DEV_ETI_SERIAL-ON_OPENMP-ON_PTHREAD-OFF_CUDA-OFF_COMPLEX-ON and Linux-GCC-4.9.3-openmpi-1.8.7_Debug_DEV_Werror, so it looks like this problem is not isolated to ATDM builds of Trilinos. Note that one of those is a 'Sierra' build of Trilinos.

bartlettroscoe · 2018-03-29T23:07:52Z

This test Anasazi_Epetra_OrthoManagerGenTester_0_MPI_4 newly failed in the build Trilinos-atdm-white-ride-cuda-debug on 'white' today as shown at:

https://testing-vm.sandia.gov/cdash/testDetails.php?test=44769659&build=3400803

showing:

Generating Y1,Y2 for project() : testing... 
   || <Y1,Y1> - I || : 7.13673e-16
   || <Y2,Y2> - I || : 7.85286e-16
   || <X1,Y2> ||     : 1.71386e-16
   || <X1b,Y2> ||     : 7.10285e-15

p=1: *** Caught standard std::exception of type 'std::runtime_error' :

 /home/rabartl/WHITE/ATDM_Driver/Trilinos-atdm-white-ride-cuda-debug/SRC_AND_BUILD/Trilinos/packages/anasazi/epetra/test/OrthoManager/cxx_gentest.cpp:274:
 
 Throw number = 1
 
 Throw test that evaluated to true: err > TOL
 
 New X1 did not meet tolerance: orthog(X1,Y2) == 0.356233

...

It passed yesterday in the same build as shown at:

https://testing-vm.sandia.gov/cdash/testDetails.php?test=44410112&build=3398630

Looking at the history of this test on this build on 'white' in the query:

https://testing-vm.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercombine=and&filtercombine=and&filtercount=4&showfilters=1&filtercombine=and&field1=buildname&compare1=61&value1=Trilinos-atdm-white-ride-cuda-debug&field2=testname&compare2=61&value2=Anasazi_Epetra_OrthoManagerGenTester_0_MPI_4&field3=site&compare3=61&value3=white&field4=buildstarttime&compare4=84&value4=now

it fails three other times on various days going back to 3/12/2018. This suggests non-deterministic behavior causing the test to randomly fail.

Does this test expose some non-deterministic behavior in Anasazi or the underlying software being used? Could this be exposing a weakness in Trilinos software that could bite a user in a CUDA build?

In any case, I think this test should be disabled for now on these CUDA debug builds so that we can promote this build Trilinos-atdm-white-ride-cuda-debug to the "ATDM" CDash Track/Group which opens the door to using it as an auto PR build for Trilinos (which will be huge for stabilizing Trilinos for ATDM customers). Then, someone can debug this test offline when they get some time.

@mhoemmen, what do you think about this? Is it okay to disable this test for now until someone can debug what is causing the non-deterministic behavior?

mhoemmen · 2018-03-30T04:55:40Z

@bartlettroscoe wrote:

Is it okay to disable this test for now until someone can debug what is causing the non-deterministic behavior?

@hkthorn may have something to say, but I think it would be best to disable the test for now, as long as we don't "forget" (e.g., as long as we open a separate issue for the failing tests).

bartlettroscoe · 2018-03-30T16:41:37Z

I think it would be best to disable the test for now, as long as we don't "forget" (e.g., as long as we open a separate issue for the failing tests).

@mhoemmen and @hkthorn,

One option is to leave these issues open with the label "Disabled Tests" and assign them to the Product Lead for the area. Who is the Product Lead for Anasazi? Is that @srajama1?

srajama1 · 2018-03-30T17:12:27Z

Anasazi is a problem child that got stuck with a (linear solvers) family where it may not belong :). Yes, I am the lead. Let us wait for what @hkthorn says.

I worry this might be exposing something non-deterministic underneath.

bartlettroscoe · 2018-03-31T13:35:13Z

These randomly failing tests triggered the following CDash error email for the newly promoted build ??? this morning.

Can I go ahead and disable these randomly failing tests in these builds? The tests will only be disabled for these builds and not others where the tests are passing consistently.

From: CDash [mailto:trilinos-regression@sandia.gov]
Sent: Saturday, March 31, 2018 2:48 AM
To: Bartlett, Roscoe A rabartl@sandia.gov
Subject: FAILED (t=2): Trilinos/Anasazi - Trilinos-atdm-white-ride-gnu-debug-openmp - ATDM

A submission to CDash for the project Trilinos has failing tests.
You have been identified as one of the authors who have checked in changes that are part of this submission or you are listed in the default contact list.

Details on the submission can be found at https://testing.sandia.gov/cdash/buildSummary.php?buildid=3474500

Project: Trilinos
SubProject: Anasazi
Site: white
Build Name: Trilinos-atdm-white-ride-gnu-debug-openmp
Build Time: 2018-03-31T06:45:53 UTC
Type: ATDM
Tests failing: 2

Tests failing
Anasazi_Epetra_ModalSolversTester_MPI_4 (https://testing.sandia.gov/cdash/testDetails.php?test=46065301&build=3474500)
Anasazi_Epetra_OrthoManagerGenTester_0_MPI_4 (https://testing.sandia.gov/cdash/testDetails.php?test=46065302&build=3474500)

-CDash on testing.sandia.gov

mhoemmen · 2018-04-01T20:22:39Z

@bartlettroscoe Please do; thanks!

hkthorn · 2018-04-02T17:25:27Z

@bartlettroscoe @srajama1 @mhoemmen Go ahead and disable the failing tests for this platform, I have seen this issue before. Thanks!

bartlettroscoe · 2018-04-03T14:47:55Z

From @hkthorn:

Go ahead and disable the failing tests for this platform, I have seen this issue before. Thanks!

Okay, I will disable these failing tests. However, also note that we saw two new failing Anasazi tests for this build today shown in the below email.

The first test Anasazi_Epetra_OrthoManagerGenTester_0_MPI_4 was a segfault. The last two look to be diffs.

Should we disable these tests as well? If not, does someone in the Linear Solvers area have time to triage these some more? We either need to fix the tests or disable them (and then leave this issue open as a reminder to fix them, along with other approaches that we can consider for keeping reminders of disabled tests).

From: CDash [mailto:trilinos-regression@sandia.gov]
Sent: Tuesday, April 03, 2018 1:32 AM
To: Bartlett, Roscoe A
Subject: FAILED (t=3): Trilinos/Anasazi - Trilinos-atdm-white-ride-gnu-debug-openmp - ATDM

A submission to CDash for the project Trilinos has failing tests.
You have been identified as one of the authors who have checked in changes
that are part of this submission or you are listed in the default contact list.

Details on the submission can be found at
https://testing.sandia.gov/cdash/buildSummary.php?buildid=3480083

Project: Trilinos
SubProject: Anasazi
Site: ride
Build Name: Trilinos-atdm-white-ride-gnu-debug-openmp
Build Time: 2018-04-03T07:30:22 UTC
Type: ATDM
Tests failing: 3

Tests failing
Anasazi_Epetra_OrthoManagerGenTester_0_MPI_4
(https://testing.sandia.gov/cdash/testDetails.php?test=46173794&build=3480083)
Anasazi_Epetra_OrthoManagerGenTester_1_MPI_4
(https://testing.sandia.gov/cdash/testDetails.php?test=46173795&build=3480083)
Anasazi_Epetra_LOBPCG_solvertest_MPI_4
(https://testing.sandia.gov/cdash/testDetails.php?test=46173813&build=3480083)

-CDash on testing.sandia.gov

bartlettroscoe · 2018-04-03T17:17:45Z

If you look at the query:

https://testing-vm.sandia.gov/cdash/queryTests.php?project=Trilinos&date=2018-04-03&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercombine=and&filtercount=9&showfilters=1&filtercombine=and&field1=buildname&compare1=65&value1=Trilinos-atdm-&field2=status&compare2=62&value2=Passed&field3=buildname&compare3=62&value3=Trilinos-atdm-white-ride-cuda-opt&field4=buildname&compare4=62&value4=Trilinos-atdm-white-ride-gnu-opt-openmp&field5=testname&compare5=65&value5=Anasazi&field6=buildstarttime&compare6=84&value6=2018-04-04&field7=buildstarttime&compare7=83&value7=2018-03-20&field8=testname&compare8=62&value8=Anasazi_Epetra_BlockDavidson_auxtest_MPI_4&field9=testname&compare9=62&value9=Anasazi_Epetra_LOBPCG_auxtest_MPI_4

(which shows all of the failing Anasazi tests in the last two weeks that have not already been disabled (see #2455) or are not in the 'opt' builds on white/ride (see #2454)), you can see that the tests:

Anasazi_Epetra_ModalSolversTester_MPI_4
Anasazi_Epetra_OrthoManagerGenTester_0_MPI_4
Anasazi_Epetra_OrthoManagerGenTester_1_MPI_4

fail multiple times on various days in the two builds:

Trilinos-atdm-white-ride-cuda-debug
Trilinos-atdm-white-ride-gnu-debug-openmp

All three of these tests failed multiple days in the Trilinos-atdm-white-ride-cuda-debug build which is being targeted for an auto PR testing build (see #2464). Therefore, these should be disabled (as @hkthorn noted above).

The test Anasazi_Epetra_LOBPCG_solvertest_MPI_4 only failed today in the build Trilinos-atdm-white-ride-gnu-debug-openmp as shown in the above query. Therefore, this might have been a fluke so we should not disable this yet.

…ide (trilinos#2473) These tests randomly fail with massive diffs. Very strange behavior. See Trilinos GitHub issue trilinos#2473 for history and more details.

bartlettroscoe · 2018-04-03T18:17:20Z

FYI: I created PR #2501 to disable these three randomly failing tests. I requested a review from @mhoemmen and/or @hkthorn.

bartlettroscoe · 2018-04-03T18:26:56Z

Just realized that the @trilinos/framework team ran into these same randomly failing tests in #1393 and they resolved the issue by disabling those tests as well. So it looks like this is the right decision to disable these tests in the ATDM builds.

But it also suggests that perhaps the problems with these tests should be studied more carefully, or these tests just need to be disabled altogether. That way, other people and projects will not run into these randomly failing tests over and over again. And if these are the only real tests for "ModalSolvers" in Anasazi, then perhaps that feature is not ready to be used by people and should be disabled by default as experimental code or something? Then we could set up some build of Trilinos for all of this "Experimental" code so at least we know how it is doing.

…ing-anasazi-tests Disable 3 Anasazi tests that randomly fail in debug builds on white/ride (#2473)

bartlettroscoe · 2018-04-03T21:07:55Z

PR #2501 was just merged (merge commit 2e9da0c). Therefore, we should see these three tests disabled for these builds on white/ride tomorrow.

Putting this issue in review

hkthorn · 2018-04-04T16:30:24Z

@bartlettroscoe @mhoemmen @srajama1 I have found the underlying issue in these tests. They use a Teuchos::SerialDenseMatrix, which is a serial object without MPI communication or implied synchronization of values. These matrices are randomized on each processor and then used to perform tests of the orthogonalization routines and modal solvers. Again, there is no explicit synchronization of Teuchos SDM objects, so when the randomization generates different matrices on different processors, the tests fail because the explicit expectations of the classes being tested, orthogonalization and modal solvers, are violated. I have a feeling this pattern might be in Belos as well. I will fix this today.
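The failure mode described here can be illustrated with a minimal sketch in plain Python (no MPI, no Trilinos). It assumes a multivector V whose rows are distributed across ranks and a small dense matrix Q that every rank is expected to hold identically. The two-rank layout, the use of rotation matrices for Q, and all function names are illustrative assumptions, not the actual test code.

```python
import math

def rotation(theta):
    """2x2 orthogonal matrix, a stand-in for a small replicated dense matrix Q."""
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s], [s, c]]

def apply_local(v_rows, Q):
    """Each rank multiplies its locally owned rows of V by its local copy of Q."""
    return [[sum(row[k] * Q[k][j] for k in range(2)) for j in range(2)]
            for row in v_rows]

def orthonorm_error(rows):
    """Frobenius norm of (VQ)^T (VQ) - I over all rows (models a global reduction)."""
    total = 0.0
    for i in range(2):
        for j in range(2):
            g = sum(r[i] * r[j] for r in rows)
            total += (g - (1.0 if i == j else 0.0)) ** 2
    return math.sqrt(total)

# V is the 2x2 identity, with one row owned by each simulated rank.
v_rank0, v_rank1 = [[1.0, 0.0]], [[0.0, 1.0]]

# Consistent case: both ranks hold the same Q, so V*Q stays orthonormal.
Q = rotation(0.3)
err_same = orthonorm_error(apply_local(v_rank0, Q) + apply_local(v_rank1, Q))

# Inconsistent case: each rank "randomized" its own Q independently.
err_diff = orthonorm_error(apply_local(v_rank0, rotation(0.3)) +
                           apply_local(v_rank1, rotation(1.2)))
```

In this sketch `err_same` is at rounding level while `err_diff` is of order one, which is the same qualitative pattern as the "orthonorm error of VQ" values in the logs above: V itself is fine, but the product is not, because the ranks disagree about Q.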

mhoemmen · 2018-04-04T17:52:19Z

@hkthorn Wow! Thanks for finding this; sounds tricky!

bartlettroscoe · 2018-04-04T18:16:40Z

@hkthorn, so this is a defect in the tests not the library code that users depend on?

Let me know when you have merged the fix into the Trilinos 'develop' branch and then I will re-enable these tests and we will let them run in the ATDM builds of Trilinos.

hkthorn · 2018-04-04T18:40:46Z

@bartlettroscoe @mhoemmen Absolutely, this is a defect in the design of the test. I will let you know when the fix is in Trilinos 'develop' branch so we can re-enable the tests for ATDM builds.

The longstanding test failures for the ModalSolvers and OrthoManager have been tracked down to the randomization of Teuchos::SerialDenseMatrix objects in parallel. There is no expectation that calling random() on an object that is locally owned to one MPI process will result in a SerialDenseMatrix that has the SAME random numbers in it on every MPI processor. It's that easy. #2473
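One way to avoid relying on identical random streams is to generate the matrix on a single rank and broadcast it to the others. The sketch below simulates that idea in plain Python. Ranks are modeled as list entries, `broadcast_from_root` is a hypothetical stand-in for an MPI broadcast, and the per-rank seeds are an assumption chosen so the streams differ. It is not the code from the actual fix.

```python
import random

NUM_RANKS = 4

def local_random_matrix(rng, nrows, ncols):
    """Fill a small dense matrix from this rank's own random stream."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(ncols)] for _ in range(nrows)]

def broadcast_from_root(per_rank_values, root=0):
    """Stand-in for an MPI broadcast: every rank receives a copy of root's matrix."""
    root_value = per_rank_values[root]
    return [[row[:] for row in root_value] for _ in per_rank_values]

# Each simulated rank has its own generator state (seeded differently here).
rngs = [random.Random(1000 + rank) for rank in range(NUM_RANKS)]

# Unsynchronized: every rank randomizes its own copy.
unsynced = [local_random_matrix(rng, 3, 3) for rng in rngs]
all_equal_before = all(m == unsynced[0] for m in unsynced)

# Synchronized: keep rank 0's matrix and broadcast it to all ranks.
synced = broadcast_from_root(unsynced, root=0)
all_equal_after = all(m == synced[0] for m in synced)
```

After the simulated broadcast every rank holds rank 0's values, so any test that treats the small dense matrix as replicated data sees a consistent operand regardless of how each rank's generator was seeded.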

bartlettroscoe · 2018-04-09T16:59:45Z

It looks like the test Anasazi_Epetra_LOBPCG_solvertest_MPI_4 may also have some random failures. We saw the following failure for this test in the build Trilinos-atdm-white-ride-gnu-debug-openmp on 'white' on 4/18/2018:

https://testing.sandia.gov/cdash/testDetails.php?test=46354623&build=3488812

which showed:

Anasazi in Trilinos 12.13 (Dev)

Testing solver(default,default) with standard eigenproblem...
Testing solver(default,default) with generalized eigenproblem...
Testing solver(nev,false) with standard eigenproblem...
Testing solver(nev,true) with standard eigenproblem...
Testing solver(nev,false) with generalized eigenproblem...
Testing solver(nev,true) with generalized eigenproblem...
Testing solver(2*nev,false) with standard eigenproblem...
Testing solver(2*nev,true) with standard eigenproblem...
Testing solver(2*nev,false) with generalized eigenproblem...
Testing solver(2*nev,true) with generalized eigenproblem...
[white25:127665] *** Process received signal ***
[white25:127665] Signal: Segmentation fault (11)
[white25:127665] Signal code: Address not mapped (1)
[white25:127665] Failing at address: 0x10024850020
[white25:127665] [ 0] [0x100000050478]
[white25:127665] [ 1] [0x3ff0000000000000]
[white25:127665] *** End of error message ***
--------------------------------------------------------------------------
mpiexec noticed that process rank 3 with PID 127665 on node white25 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

Looking at the query:

https://testing.sandia.gov/cdash/queryTests.php?project=Trilinos&filtercount=6&showfilters=1&filtercombine=and&field1=buildname&compare1=65&value1=Trilinos-atdm-&field2=testname&compare2=61&value2=Anasazi_Epetra_LOBPCG_solvertest_MPI_4&field3=status&compare3=62&value3=Passed&field4=buildstarttime&compare4=84&value4=now&field5=buildname&compare5=62&value5=Trilinos-atdm-white-ride-cuda-opt&field6=buildname&compare6=62&value6=Trilinos-atdm-white-ride-gnu-opt-openmp

it looks like this test also failed on 'ride' in the same build on 4/3/2018 with the output:


Anasazi in Trilinos 12.13 (Dev)

Testing solver(default,default) with standard eigenproblem...
Testing solver(default,default) with generalized eigenproblem...
Testing solver(nev,false) with standard eigenproblem...
Testing solver(nev,true) with standard eigenproblem...
Testing solver(nev,false) with generalized eigenproblem...
Testing solver(nev,true) with generalized eigenproblem...
Testing solver(2*nev,false) with standard eigenproblem...
Testing solver(2*nev,true) with standard eigenproblem...
[ride13:114533] *** Process received signal ***
[ride13:114533] Signal: Segmentation fault (11)
[ride13:114533] Signal code: Address not mapped (1)
[ride13:114533] Failing at address: 0x10036020010
[ride13:114533] [ 0] [0x100000050478]
[ride13:114533] [ 1] [0x3ff0000000000000]
[ride13:114533] *** End of error message ***
--------------------------------------------------------------------------
mpiexec noticed that process rank 1 with PID 114533 on node ride13 exited on signal 11 (Segmentation fault).
--------------------------------------------------------------------------

We can keep an eye to see if this test fails again in this build or some other build. But if it does, we should likely disable this test for now.

hkthorn · 2018-04-09T17:34:36Z

I'll give the test a look to see if there are any bad patterns there. I have merged the PR that fixes the testing for the OrthoManager and ModalSolvers:

#2517

Thanks!

…able-random-failing-anasazi-tests" (trilinos#2473) This reverts commit 2e9da0c, reversing changes made to c828f5a. The merge branch in PR trilinos#2517 should allow these tests to pass now.

…ests Revert "Merge pull request #2501 from bartlettroscoe/2473-disable-random-failing-anasazi-tests" (#2473) This will allow these tests to run again in these ATDM builds and then we can see if they pass or not.

bartlettroscoe · 2018-04-24T17:28:00Z

The PR #2621 was merged that re-enables these tests. Now we wait and see how they run and if they fail or not in the coming days and weeks. I am removing the "Disabled Tests" label.

bartlettroscoe · 2018-04-24T22:41:13Z

NOTE: The test Anasazi_Epetra_LOBPCG_solvertest_MPI_4 that was randomly failing as described above is still randomly failing with a segfault, as recently as 2018-04-23. Therefore, since PR #2517 did not fix this test, we can assume it is unrelated to the other Anasazi tests covered in this issue. I created the new issue #2633 to address the issues with that test.

Therefore, all that is left for this current issue is to watch and see if we see any more random failures with the tests Anasazi_Epetra_ModalSolversTester_MPI_4 and Anasazi_Epetra_OrthoManagerGenTester_[0,1]_MPI_4 ...

bartlettroscoe · 2018-06-07T19:45:52Z

Looking at the recent history for these tests on CDash after 5/19/2018 (when the NETLIB BLAS and LAPACK got put back as described in #2454 (comment)) in the following queries:

We can see these tests did not fail a single time and it shows these tests running in the Trilinos-atdm-white-ride-gnu-debug-openmp and Trilinos-atdm-white-ride-cuda-debug builds.

Therefore, this issue appears to be resolved.

Closing as complete.

While addressing issue trilinos#2473, I found other places where a random serial dense matrix was used and expected to be the same in parallel. The synchronization method that was used to address the issue in Anasazi has been moved to the Teuchos serial dense helpers file so that other packages can use this utility in the generation of tests. In particular, this utility needed to be integrated into the MVOP testers for Belos and Anasazi, as well as the Belos orthogonalization tester. The assumption that a call to generate a random variable will return the same value on all processors is false and could have unknown consequences for testing. While it is unknown if any random failures can be tracked to these changes at this time, previous issues with Anasazi have been caused by this bad assumption. So, it is better to fix it.

bartlettroscoe added type: bug The primary issue is a bug in Trilinos code or tests pkg: Anasazi client: ATDM Any issue primarily impacting the ATDM project labels Mar 28, 2018

bartlettroscoe added this to the Initial cleanup of new ATDM builds of Trilinos milestone Mar 28, 2018

mhoemmen assigned bartlettroscoe and mhoemmen Mar 30, 2018

bartlettroscoe added a commit to bartlettroscoe/Trilinos that referenced this issue Apr 3, 2018

Disable 3 Anasazi tests that randomly fail in debug builds on white/r…

e05c814

…ide (trilinos#2473) These tests randomly fail with massive diffs. Very strange behavior. See

bartlettroscoe mentioned this issue Apr 3, 2018

Disable 3 Anasazi tests that randomly fail in debug builds on white/ride (#2473) #2501

Merged

bartlettroscoe added a commit that referenced this issue Apr 3, 2018

Merge pull request #2501 from bartlettroscoe/2473-disable-random-fail…

2e9da0c

…ing-anasazi-tests Disable 3 Anasazi tests that randomly fail in debug builds on white/ride (#2473)

bartlettroscoe added the stage: in review Primary work is completed and now is just waiting for human review and/or test feedback label Apr 3, 2018

hkthorn mentioned this issue Apr 6, 2018

Anasazi: Fixes issue with Teuchos::SerialDenseMatrix randomization in parallel #2517

Merged

9 tasks

bartlettroscoe mentioned this issue Apr 23, 2018

Revert "Merge pull request #2501 from bartlettroscoe/2473-disable-random-failing-anasazi-tests" (#2473) #2621

Merged

9 tasks

bartlettroscoe closed this as completed Jun 7, 2018

bartlettroscoe mentioned this issue Jun 8, 2018

Belos_pseudo_stochastic_pcg_hb_[0,1]_MPI_4 tests failing due to max iterations limit seemingly randomly in the Trilinos-atdm-white-ride-cuda-debug build on 'white' #2920

Closed

hkthorn mentioned this issue Oct 19, 2018

Fixes randomization issue with Teuchos::SerialDenseMatrix in parallel #3683

Merged

9 tasks

bartlettroscoe added the PA: Linear Solvers Issues that fall under the Trilinos Linear Solvers Product Area label Nov 30, 2018

trilinos-autotester mentioned this issue Oct 15, 2021

Trilinos Master Merge PR Generator: Auto PR created to promote from master_merge_20211014_000549 branch to master #9812

Closed
