Pliris: Pliris tests failing in PR ATS2 CUDA build which should not even be running #10931

Closed
bartlettroscoe opened this issue Aug 25, 2022 · 7 comments
Labels

  • Disabled Tests: Issue has been partially addressed by disabling *all* of the failing tests related to the issue
  • PA: Framework: Issues that fall under the Trilinos Framework Product Area
  • PA: Linear Solvers: Issues that fall under the Trilinos Linear Solvers Product Area
  • pkg: Pliris
  • type: bug: The primary issue is a bug in Trilinos code or tests


@bartlettroscoe
Member

bartlettroscoe commented Aug 25, 2022

Bug Report

@trilinos/framework

Internal Issues:

Description

So it seems that Pliris tests:

  • Pliris_vector_random_MPI_3
  • Pliris_vector_random_MPI_4

are failing the PR CUDA build:

  • ats2_cuda-10.1.243-gnu-8.3.1-spmpi-rolling_release_static_Volta70_Power9_no-asan_no-complex_no-fpic_mpi_pt_no-rdc_uvm_deprecated-on_no-package-enables

as shown here for PR #10930.

This is the first time these Pliris tests have ever been run in this ATS2 PR build in all recorded history, as shown in this index.php query:

[image: index.php query results]

and in this queryTests.php query:

[image: queryTests.php query results]

However, these tests are explicitly disabled in the PR CUDA build:

  • rhel7_sems-cuda-11.4.2-sems-gnu-10.1.0-sems-openmpi-4.0.5_release_static_Volta70_no-asan_complex_no-fpic_mpi_pt_no-rdc_no-uvm_deprecated-on_no-package-enables-482

as shown here:

Processing enabled package: Pliris (Libs, Tests, Examples)
     Building the double precision(default) library
-- Pliris_vector_random_MPI_3: NOT added test because Pliris_vector_random_MPI_3_DISABLE='ON'!
-- Pliris_vector_random_MPI_4: NOT added test because Pliris_vector_random_MPI_4_DISABLE='ON'!
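
To check which tests a given configure log reports as disabled, the log can be scanned for lines of the form shown above. The following is a minimal sketch, assuming the message format is exactly as in the two quoted lines; the function name `disabled_tests` and the `SAMPLE` string are illustrative and not part of Trilinos or TriBITS.

```python
import re

# Pattern modeled only on the two log lines quoted in this issue;
# other TriBITS messages are not handled.
NOT_ADDED = re.compile(
    r"^-- (?P<test>\S+): NOT added test because (?P<var>\S+)='(?P<value>[^']*)'!$"
)


def disabled_tests(configure_output: str) -> dict[str, str]:
    """Map each test name to the variable that kept it from being added."""
    found = {}
    for line in configure_output.splitlines():
        match = NOT_ADDED.match(line.strip())
        if match:
            found[match.group("test")] = match.group("var")
    return found


# Sample text copied from the configure output quoted above.
SAMPLE = """\
Processing enabled package: Pliris (Libs, Tests, Examples)
     Building the double precision(default) library
-- Pliris_vector_random_MPI_3: NOT added test because Pliris_vector_random_MPI_3_DISABLE='ON'!
-- Pliris_vector_random_MPI_4: NOT added test because Pliris_vector_random_MPI_4_DISABLE='ON'!
"""

if __name__ == "__main__":
    for test, var in disabled_tests(SAMPLE).items():
        print(f"{test} disabled by {var}")
```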

Steps to Reproduce

Follow instructions at:

bartlettroscoe added the "type: bug" label Aug 25, 2022
@bartlettroscoe
Member Author

So how is it that PR #10930 is the first time these Pliris tests have even been run for the ATS2 CUDA PR build, even when Trilinos_ENABLE_ALL_PACKAGES=ON? The last PR of mine that ran with all packages enabled was PR #10813 on 2022-08-12.

Looking at the configure output for that PR build here, we see:

-- Setting Trilinos_ENABLE_ALL_PACKAGES = ON

...


Final set of non-enabled packages:  Pliris Komplex TriKota Compadre Moertel PyTrilinos NewPackage 7

Final set of non-enabled SE packages:  Pliris ... 23

The Trilinos/PackagesList.cmake file shows:

  Pliris                packages/pliris                   ST

So why is Pliris getting disabled on that build? The GenConfig fragment file for that build generatedPRFragment.cmake shows:

set(Trilinos_ENABLE_SECONDARY_TESTED_CODE OFF CACHE BOOL "from .ini configuration")

So that ATS2 PR build is only testing Primary Tested code and not Secondary Tested code?

So an ST package like Pliris will only get enabled if it is explicitly enabled, in this case, because I touched the file packages/pliris/CMakeLists.txt.
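
The behavior described above can be modeled roughly as follows. This is a simplified sketch of the enable rule as I understand it from the configure output, not the actual TriBITS logic: it ignores dependencies, explicit disables, and any other test groups. The `Package` class, the `enabled_packages` function, and the `ExamplePT` package are hypothetical names used only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Package:
    name: str
    classification: str  # "PT" or "ST", as in the PackagesList.cmake row


def enabled_packages(packages, enable_all, secondary_tested, explicit=()):
    """Simplified model: an explicit enable always wins; enable-all picks up
    PT packages, and ST packages only when secondary tested code is on."""
    result = []
    for pkg in packages:
        if pkg.name in explicit:
            result.append(pkg.name)
        elif enable_all and (pkg.classification == "PT" or secondary_tested):
            result.append(pkg.name)
    return result


# "ExamplePT" is a made-up PT package; Pliris is ST per PackagesList.cmake.
PACKAGES = [Package("ExamplePT", "PT"), Package("Pliris", "ST")]

if __name__ == "__main__":
    # ATS2-like case: all packages on, secondary tested code off.
    print(enabled_packages(PACKAGES, enable_all=True, secondary_tested=False))
    # Same build, but a change to Pliris triggers an explicit enable.
    print(enabled_packages(PACKAGES, True, False, explicit={"Pliris"}))
```

Under this model, Pliris stays out of the ATS2 build until a PR touches it, which matches what happened with PR #10930.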

Looking at the file:

  • packages/framework/ini-files/config-specs.ini

we see:

[USE-PT|YES]
opt-set-cmake-var Trilinos_ENABLE_SECONDARY_TESTED_CODE BOOL : OFF

[USE-PT|NO]
opt-set-cmake-var Trilinos_ENABLE_SECONDARY_TESTED_CODE BOOL : ON

and:

[ats2_cuda-10.1.243-gnu-8.3.1-spmpi-rolling_release_static_Volta70_Power9_no-asan_no-complex_no-fpic_mpi_pt_no-rdc_uvm_deprecated-on_no-package-enables]
...
use USE-PT|YES
...

So it seems that the only PR builds that have Trilinos_ENABLE_SECONDARY_TESTED_CODE=OFF are the ATS2 builds. Why?
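
One way to check which build sections pull in USE-PT|YES is to scan the file for `use` lines. The sketch below handles only the constructs visible in the excerpts above (section headers in square brackets and `use <option>` lines); the real config-specs.ini is processed by the framework's own tooling and supports more operations. The function name `sections_using` and the section `hypothetical_other_build` are made up for illustration.

```python
def sections_using(spec_text: str, option: str) -> list[str]:
    """Return the names of sections that contain a `use <option>` line."""
    current = None
    hits = []
    for raw in spec_text.splitlines():
        line = raw.strip()
        if line.startswith("[") and line.endswith("]"):
            current = line[1:-1]
        elif current is not None and line.split() == ["use", option]:
            hits.append(current)
    return hits


ATS2 = (
    "ats2_cuda-10.1.243-gnu-8.3.1-spmpi-rolling_release_static_Volta70_Power9"
    "_no-asan_no-complex_no-fpic_mpi_pt_no-rdc_uvm_deprecated-on_no-package-enables"
)

# Excerpt modeled on the fragments quoted above; the last section is invented.
EXCERPT = f"""\
[USE-PT|YES]
opt-set-cmake-var Trilinos_ENABLE_SECONDARY_TESTED_CODE BOOL : OFF

[USE-PT|NO]
opt-set-cmake-var Trilinos_ENABLE_SECONDARY_TESTED_CODE BOOL : ON

[{ATS2}]
use USE-PT|YES

[hypothetical_other_build]
use USE-PT|NO
"""

if __name__ == "__main__":
    for name in sections_using(EXCERPT, "USE-PT|YES"):
        print(name)
```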

@bartlettroscoe
Member Author

There are a few ways to fix this problem, but given that they are trying to get rid of this ATS2 build, the simplest thing is likely to just force disable Pliris in the ATS2 builds. There is no point in enabling Pliris in any ATS2 build, because it only gets enabled there when someone actually changes a file in Pliris. So that is what I will do in PR #10930.

bartlettroscoe added the "pkg: Pliris", "PA: Framework", and "PA: Linear Solvers" labels Aug 25, 2022
bartlettroscoe added a commit to bartlettroscoe/Trilinos that referenced this issue Aug 25, 2022
Because Secondary Tested code is disabled in ATS2 builds, we don't want a
change to Pliris to trigger the enable since it is already broken on these ATS2
CUDA builds.  For all of the details, see trilinos#10931.
bartlettroscoe added a commit to bartlettroscoe/Trilinos that referenced this issue Aug 25, 2022
For some reason, CMake is not respecting the force disable of
Trilinos_ENABLE_Pliris from the last commit.  Rather than try to debug this, I
am just disabling the tests which will allow PR trilinos#10930 to pass the PR builds
and merge.

There are other issues with the GenConfig files that will need to be addressed
that I am seeing.
@bartlettroscoe
Member Author

FYI: I have disabled these tests in PR #10930 in some new commits I pushed just now.

@bartlettroscoe
Member Author

With the merge of PR #10930, Pliris is now disabled in the ATS2 PR build. Therefore, I will put the Disabled Tests label on this issue and keep it open in case someone wants to fix this. If there is no chance this will ever get fixed, then please close it.

bartlettroscoe added the "Disabled Tests" label Aug 26, 2022
jmgate pushed a commit to tcad-charon/Trilinos that referenced this issue Aug 26, 2022
…s:develop' (8906842).

* trilinos-develop: (128 commits)
  Intrepid2: update TensorData.setFirstComponentExtentInDimension0 to modify extents_[0] (trilinos#10929)
  Tpetra: Adding configure option to disable Kokkos integration test
  Automatic snapshot commit from tribits at 142e5362
  Disable Pliris tests in ATS2 GenConfig builds (trilinos#10931)
  Force disable Pliris in ATS2 builds (trilinos#10931)
  Automatic snapshot commit from tribits at ab419429
  Change cmake_minimum_required() from 3.17.1 to 3.0 (TriBITSPub/TriBITS#522)
  Pliris: Remove local var hiding cache var Pliris_ENABLE_DREAL (trilinos#10774, TriBITSPub/TriBITS#516)
  Remove printing of vars that are now empty (TriBITSPub/TriBITS#299)
  Panzer: move periodic helper typedefs into namespace
  Revert incorrect fix in previous commit
  Fix typos in some docs
  fix scratch typos
  STK: Snapshot 08-22-22 12:44
  Phalanx: remove cuda compiler warnings and add test for new use case for vov
  changed a double to a scalar_type to compile for complex arith
  MueLu: Fix signed vs unsigned comparison in Aggregates_kokkos.cpp
  Amesos2 : trying to fix MKL header including issues
  MueLu: Add Aggregates_kokkos.ComputeNodesInAggregate
  Testing on Geminga: Do not disable Kokkos in Epetra build
  ...
@e10harvey
Contributor

> So it seems that the only PR builds that have Trilinos_ENABLE_SECONDARY_TESTED_CODE=OFF are the ATS2 builds. Why?

The answer here may be lost. This was ported over from the old PR infrastructure. Would the addition of a "secondary tested" (st) config option in ats2_cuda-10.1.243-gnu-8.3.1-spmpi-rolling_release_static_Volta70_Power9_no-asan_no-complex_no-fpic_mpi_pt_no-rdc_uvm_deprecated-on_no-package-enables clear this up?

We would then have:
ats2_cuda-10.1.243-gnu-8.3.1-spmpi-rolling_release_static_Volta70_Power9_no-asan_no-complex_no-fpic_mpi_pt_no-st_no-rdc_uvm_deprecated-on_no-package-enables

@sebrowne
Contributor

@bartlettroscoe can we call this closed since ATS2 is gone as far as PR/MM testing are concerned?

@bartlettroscoe
Member Author

Since the ATS2 PR build is gone, this is a non-issue.
