PanzerAdaptersSTK_MixedPoissonExample-ConvTest-Hex-Order-2 failing(timeout) in ATDM cuda builds #3833

Closed
fryeguy52 opened this issue Nov 8, 2018 · 3 comments
Labels: client: ATDM, PA: Discretizations, pkg: Panzer, type: bug


fryeguy52 commented Nov 8, 2018

CC: @trilinos/panzer , @mperego (Trilinos Discretizations Product Lead), @bartlettroscoe

Next Action Status

PR #3800, merged on 11/9/2018, resulted in this test passing in all CUDA 'opt' builds and being disabled in all CUDA 'debug' builds starting 11/10/2018. This has held for three consecutive days as of 11/12/2018.

Description

As shown in this query, the test:

  • PanzerAdaptersSTK_MixedPoissonExample-ConvTest-Hex-Order-2

is failing (timing out) in the builds:

  • Trilinos-atdm-waterman-cuda-9.2-opt
  • Trilinos-atdm-waterman-cuda-9.2-debug
  • Trilinos-atdm-waterman-cuda-9.2-release-debug
  • Trilinos-atdm-white-ride-cuda-9.2-opt
  • Trilinos-atdm-white-ride-cuda-9.2-debug
  • Trilinos-atdm-hansen-shiller-cuda-8.0-opt
  • Trilinos-atdm-hansen-shiller-cuda-8.0-debug
  • Trilinos-atdm-hansen-shiller-cuda-9.0-debug
  • Trilinos-atdm-hansen-shiller-cuda-9.0-opt

Are these related to #2446 or #2751?

The timeouts started on 2018-10-31 in a few of the builds and appeared in the others over the following days, as shown here for waterman cuda-9.2-debug.

The new commits pulled on 2018-11-03 that look like possible causes are:

e8051f3:  Panzer: update CurlLaplacian example
Author: Roger Pawlowski <rppawlo@sandia.gov>
Date:   Tue Oct 23 09:51:52 2018 -0600

de16afa:  Panzer: enable hierarchic parallelism in evaluators
Author: Roger Pawlowski <rppawlo@sandia.gov>
Date:   Thu Oct 18 15:18:43 2018 -0600

@rppawlo can you look and see if anything in these commits may have caused this test to start timing out?

Current Status on CDash

The current status of these tests/builds for the current testing day can be found at:

Steps to Reproduce

One should be able to reproduce this failure on waterman as described in:

$ cd <some_build_dir>/

$ source $TRILINOS_DIR/cmake/std/atdm/load-env.sh Trilinos-atdm-waterman-cuda-9.2-debug

$ cmake \
  -GNinja \
  -DTrilinos_CONFIGURE_OPTIONS_FILE:STRING=cmake/std/atdm/ATDMDevEnv.cmake \
  -DTrilinos_ENABLE_TESTS=ON -DTrilinos_ENABLE_Panzer=ON \
  $TRILINOS_DIR

$ make NP=20

$ bsub -x -Is -n 20 ctest -j20
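
When iterating on the timeout it may be quicker to run only the failing test instead of the whole Panzer suite. The following is a minimal sketch, not part of the original report: it assumes the same build directory and `bsub` wrapper as the steps above, and uses the standard `ctest` options `-R` (select tests by regex) and `--output-on-failure`. The registered test name may carry a suffix that the issue does not show, so the regex is left unanchored. The snippet only prints the command so it can be reviewed before launching.

```shell
# Test name copied from the issue title.
TEST_NAME="PanzerAdaptersSTK_MixedPoissonExample-ConvTest-Hex-Order-2"

# -R runs only tests whose names match the regex;
# --output-on-failure prints the test output if it fails or times out.
CTEST_CMD="ctest -R ${TEST_NAME} --output-on-failure"

# Print the full command (assumed to be launched through the same bsub
# wrapper as the full-suite run above) instead of executing it here.
echo "bsub -x -Is -n 20 ${CTEST_CMD}"
```

Running the printed command from the build directory on waterman should then exercise just this one test.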

fryeguy52 added the labels type: bug, pkg: Panzer, and client: ATDM on Nov 8, 2018

rppawlo commented Nov 9, 2018

This will most likely be fixed once the Trilinos ATDM CUDA builds enable hierarchic DFad support in #3800. The commits you mentioned changed Panzer to use hierarchic DFad objects, but if this capability is not enabled in Sacado at configure time, the new code can be slower. After we see results from #3800 being pushed, we can revisit this ticket.

rppawlo self-assigned this on Nov 9, 2018

rppawlo commented Nov 13, 2018

According to last night's build, everything is now passing. @fryeguy52 and @bartlettroscoe - can we close this?

bartlettroscoe commented

> According to last night's build, everything is now passing. @fryeguy52 and @bartlettroscoe - can we close this?

As shown in the tables below, for the past 3 days since the merge of PR #3800, this test has been passing in all of the CUDA 'opt' builds and has been disabled (missing) in all of the CUDA 'debug' builds.

Closing as complete!

Tests with issue trackers Passed: twip=4

| Site | Build Name | Test Name | Status | Details | Consecutive Days Pass | Nopass last 30 Days | Pass last 30 Days | Tracker |
|---|---|---|---|---|---|---|---|---|
| hansen | Trilinos-atdm-hansen-shiller-cuda-8.0-opt | PanzerAdaptersSTK_MixedPoissonExample-ConvTest-Hex-Order-2 | Passed | Completed | 3 | 13 | 25 | #3833 |
| hansen | Trilinos-atdm-hansen-shiller-cuda-9.0-opt | PanzerAdaptersSTK_MixedPoissonExample-ConvTest-Hex-Order-2 | Passed | Completed | 3 | 7 | 22 | #3833 |
| waterman | Trilinos-atdm-waterman-cuda-9.2-opt | PanzerAdaptersSTK_MixedPoissonExample-ConvTest-Hex-Order-2 | Passed | Completed | 3 | 10 | 20 | #3833 |
| white | Trilinos-atdm-white-ride-cuda-9.2-opt | PanzerAdaptersSTK_MixedPoissonExample-ConvTest-Hex-Order-2 | Passed | Completed | 3 | 7 | 22 | #3833 |

Tests with issue trackers Missing: twim=4

| Site | Build Name | Test Name | Status | Details | Consecutive Days Missing | Nopass last 30 Days | Pass last 30 Days | Tracker |
|---|---|---|---|---|---|---|---|---|
| hansen | Trilinos-atdm-hansen-shiller-cuda-8.0-debug | PanzerAdaptersSTK_MixedPoissonExample-ConvTest-Hex-Order-2 | Missing | Missing | 3 | 8 | 19 | #3833 |
| hansen | Trilinos-atdm-hansen-shiller-cuda-9.0-debug | PanzerAdaptersSTK_MixedPoissonExample-ConvTest-Hex-Order-2 | Missing | Missing | 3 | 7 | 18 | #3833 |
| waterman | Trilinos-atdm-waterman-cuda-9.2-debug | PanzerAdaptersSTK_MixedPoissonExample-ConvTest-Hex-Order-2 | Missing | Missing | 3 | 10 | 17 | #3833 |
| white | Trilinos-atdm-white-ride-cuda-9.2-debug | PanzerAdaptersSTK_MixedPoissonExample-ConvTest-Hex-Order-2 | Missing | Missing | 3 | 7 | 18 | #3833 |

P.S. The above tables were copied and pasted from emails generated by the tool being developed in #2933. In the (hopefully near) future, we will set up an automated job to update issues with info like this.
