CI system runs examples with CTest if tests are defined in examples CMakeList.txt. #57

Merged
merged 3 commits into from
Feb 16, 2020


jdeaton
Member

@jdeaton jdeaton commented Jan 17, 2020

Previously, the Travis-CI system only checked that all of the separate example executables compile successfully; it did not execute them. Here we implement a solution that lets the Travis-CI system automatically run the examples, provided that appropriate CTest "tests" are defined for them and they are identified as quick to execute.

We use CTest to handle the execution of examples on Travis-CI. As a result, running an example requires add_test() statements for its CMake target. In addition, since many of the examples have long runtimes and we currently only want to test execution of the fast ones on Travis-CI, we need to apply a "SHORT" label to the test. The following code shows what to add to an example's CMakeLists.txt to enable both a serial and a parallel MPI execution on Travis-CI. Currently, structural examples 1, 4, and 7 provide good references for "short" examples. Structural example 3 defines tests that are NOT labeled as "SHORT". They can be executed by CTest locally, but will not run on Travis-CI since this example currently has a longer runtime.

# Test on a single processor.
add_test(NAME example_name
    COMMAND $<TARGET_FILE:example_target_name> -petsc_solver_options -other_command_line_options)
# The name passed to set_tests_properties() must match the NAME given to add_test().
set_tests_properties(example_name
    PROPERTIES
        LABELS "SHORT;SEQ")

# Test on multiple processors.
add_test(NAME example_name_mpi
    COMMAND ${MPIEXEC_EXECUTABLE} -np 2 $<TARGET_FILE:example_target_name>
            -petsc_solver_options -other_command_line_options)
set_tests_properties(example_name_mpi
    PROPERTIES
        LABELS "SHORT;MPI")

Note that Travis-CI workers only have 2 processors, so we should only use -np 2 when running there. This value is currently hard-coded.

Future enhancements in new pull requests will:

  • check the results of all "SHORT" labeled examples against known verification data.
  • incorporate a wrapper to start execution of long examples, but then kill them once the solvers start. This will detect potential linking errors in build and ensure that basic problem setup is correct.

@jdeaton
Member Author

jdeaton commented Jan 17, 2020

@manavbhatia I could use some advice on what solver options to set up for our examples with different physics. Note that since the workers only have 2 processors and we are limited to 50-minute job times, we might not want to run the ones that take longer.

In this first cut of an implementation, all that is required to have examples run on Travis-CI is to add add_test() CMake statements to the corresponding example's CMakeLists.txt (see, for example, examples/structural/example_7/CMakeLists.txt in commit b8fdca5 above). CTest will pick them up automatically.

One issue I'm seeing now is that my Linux Travis-CI worker seems pretty fragile, apparently due to its older versions of PETSc/SLEPc (3.6) compared to the macOS worker (3.12). Serial runs seem to work fine, but on parallel runs the Linux worker with 3.6 has various problems:

The same jobs go through just fine on the macOS worker with 3.12. Depending on what solvers we want to check things with, I may end up needing to upgrade the Linux ones.

Anyways, I'll take whatever suggestions you have for settings for each example/physics. I figure these serve as good documentation as well.

@manavbhatia
Member

> @manavbhatia I could use some advice on what solver options to set up for our examples with different physics. Note that since the workers only have 2 processors and we are limited to 50-minute job times, we might not want to run the ones that take longer.

Are you looking to run examples from fluid/thermal/structural?

Out of the three, the structural examples require the most aggressive solver options, and I recommend using -ksp_type preonly -pc_type lu. This should automatically select mumps for the two-processor case and will default to the built-in PETSc direct solver for a single processor. If we limit the size of the analysis mesh for the fluid and thermal problems, then we could use the direct solver there as well.

Since we are limited to 2 CPUs and 50 minutes, I am not sure if we have the bandwidth to run all cases.
For the fluid examples we may be able to select some benign inviscid/viscous cases that do not have strongly transient behavior. Likewise for the FSI case that does a flutter solve.

> In this first cut of an implementation, all that is required to have examples run on Travis-CI is to add add_test() CMake statements to the corresponding example's CMakeLists.txt (see, for example, examples/structural/example_7/CMakeLists.txt in commit b8fdca5 above). CTest will pick them up automatically.

cool! Seems straightforward.

> One issue I'm seeing now is that my Linux Travis-CI worker seems pretty fragile, apparently due to its older versions of PETSc/SLEPc (3.6) compared to the macOS worker (3.12). Serial runs seem to work fine, but on parallel runs the Linux worker with 3.6 has various problems:

What do we do about this then? Do we make the code backwards compatible or upgrade to more recent versions?

I would not recommend this preconditioner for structural problems. Instead, just stick to mumps.

> The same jobs go through just fine on the macOS worker with 3.12. Depending on what solvers we want to check things with, I may end up needing to upgrade the Linux ones.
>
> Anyways, I'll take whatever suggestions you have for settings for each example/physics. I figure these serve as good documentation as well.

- Example 2 & 4 currently tagged as "SHORT".
- Example 3 takes too long to run on CI currently.
- Example 3 currently failing.
@jdeaton jdeaton force-pushed the feature-ci-runs-short-examples branch from b8fdca5 to 2e233eb Compare February 15, 2020 19:43
@jdeaton
Member Author

jdeaton commented Feb 15, 2020

I'm finally getting back around to this to close out this CI improvement.

> > @manavbhatia I could use some advice on what solver options to set up for our examples with different physics. Note that since the workers only have 2 processors and we are limited to 50-minute job times, we might not want to run the ones that take longer.
>
> Are you looking to run examples from fluid/thermal/structural?

On Travis-CI I would like to make sure that at least ALL of the examples we create will build and execute. For the ones that are really fast, let them run to completion and eventually maybe check against some reference output. The examples that take a long time, I think maybe we can come up with a strategy to start them, and then kill the execution after they run for a little bit. We could then take the full runs of the long examples offline, or onto a different system (there is something like this being slowly put together on our end for MAST and other tools).
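The start-then-kill strategy for long examples could be sketched as a small wrapper like the one below. This is only an illustration under assumptions: the function name `smoke_run`, its return strings, and the use of a fixed time budget (instead of detecting when the solvers actually start) are all hypothetical and not part of this PR.

```python
import subprocess
import sys
from typing import Sequence


def smoke_run(cmd: Sequence[str], budget_s: float) -> str:
    """Start `cmd` and let it run for at most `budget_s` seconds.

    Hypothetical helper, not part of MAST. Returns:
      "completed" if the process exits with code 0 within the budget,
      "failed"    if it exits with a nonzero code within the budget,
      "killed-ok" if it is still running when the budget expires
                  (it started and kept running, so it is treated as a pass).
    """
    proc = subprocess.Popen(cmd)
    try:
        returncode = proc.wait(timeout=budget_s)
    except subprocess.TimeoutExpired:
        # Still alive after the budget: stop it and count that as success.
        proc.kill()
        proc.wait()
        return "killed-ok"
    return "completed" if returncode == 0 else "failed"


if __name__ == "__main__":
    # Stand-in for a long-running example executable.
    long_job = [sys.executable, "-c", "import time; time.sleep(30)"]
    print(smoke_run(long_job, budget_s=1.0))
```

A crash during linking or problem setup would show up as "failed" because the process exits early with a nonzero code. Runs launched through an MPI launcher may need extra care so that all ranks are terminated; that is not handled in this sketch.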

> > In this first cut of an implementation, all that is required to have examples run on Travis-CI is to add add_test() CMake statements to the corresponding example's CMakeLists.txt (see, for example, examples/structural/example_7/CMakeLists.txt in commit b8fdca5 above). CTest will pick them up automatically.
>
> cool! Seems straightforward.

John and I figured out how to do labels on CTests. So I've gone through and tagged the examples I have implemented so far that can run to completion on Travis-CI as "SHORT", and improved the test detection so that only those are run currently. We have also been tagging both examples and unit tests as "SEQ" or "MPI" to allow selective execution of those.
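As a rough illustration of label-based selection, the sketch below assembles a `ctest` command line that runs only tests whose label matches "SHORT". The helper name `build_ctest_command` and its parameters are hypothetical, and the actual command used in the CI configuration is not shown in this conversation. The flags `-L`, `-LE`, `--timeout`, and `--output-on-failure` are standard CTest options, and the 60-second default mirrors the one-minute timeout mentioned in this PR's commit notes.

```python
from typing import List, Optional


def build_ctest_command(include_label: str = "SHORT",
                        exclude_label: Optional[str] = None,
                        timeout_s: Optional[int] = 60) -> List[str]:
    """Build a ctest invocation filtered by test label (hypothetical helper).

    -L  <regex>  runs only tests with a label matching the regex.
    -LE <regex>  excludes tests with a label matching the regex.
    --timeout    sets the default per-test timeout in seconds.
    """
    cmd = ["ctest", "--output-on-failure", "-L", include_label]
    if exclude_label is not None:
        cmd += ["-LE", exclude_label]
    if timeout_s is not None:
        cmd += ["--timeout", str(timeout_s)]
    return cmd


if __name__ == "__main__":
    # All SHORT tests, serial and MPI.
    print(" ".join(build_ctest_command()))
    # SHORT tests only, skipping anything labeled MPI.
    print(" ".join(build_ctest_command(exclude_label="MPI")))
```

Run from the build directory, the first command would execute every test labeled "SHORT", and the second would skip the ones that are also labeled "MPI".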

> > One issue I'm seeing now is that my Linux Travis-CI worker seems pretty fragile, apparently due to its older versions of PETSc/SLEPc (3.6) compared to the macOS worker (3.12). Serial runs seem to work fine, but on parallel runs the Linux worker with 3.6 has various problems:
>
> What do we do about this then? Do we make the code backwards compatible or upgrade to more recent versions?

I figured this issue out. PETSc 3.6 doesn't automatically pick an external package like mumps if you request a direct solver in an MPI run. Also, as of PETSc 3.9, the option changed from -pc_factor_mat_solver_package mumps to -pc_factor_mat_solver_type mumps. The newer versions still accept -pc_factor_mat_solver_package but have deprecated it, so I expect it will stop working in some future version. For the time being, I'm using the deprecated option to specify the external parallel linear solver until we upgrade the Linux CI system.
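If the option name ever needs to be chosen per worker instead of relying on the deprecated spelling everywhere, the version rule described above could be sketched as follows. The helper `mumps_direct_solver_options` is hypothetical and not part of this PR. It only encodes what is stated in this thread: the `-ksp_type preonly -pc_type lu` recommendation and the rename at PETSc 3.9.

```python
from typing import List, Tuple


def mumps_direct_solver_options(petsc_version: Tuple[int, int]) -> List[str]:
    """Return PETSc command-line options requesting an LU solve with mumps.

    Hypothetical helper. `petsc_version` is (major, minor), e.g. (3, 6).
    PETSc 3.9 renamed -pc_factor_mat_solver_package to
    -pc_factor_mat_solver_type, so the option name depends on the version.
    """
    if petsc_version >= (3, 9):
        solver_option = "-pc_factor_mat_solver_type"
    else:
        solver_option = "-pc_factor_mat_solver_package"
    return ["-ksp_type", "preonly", "-pc_type", "lu", solver_option, "mumps"]


if __name__ == "__main__":
    # Versions of the two CI workers mentioned in this thread.
    print(" ".join(mumps_direct_solver_options((3, 6))))
    print(" ".join(mumps_direct_solver_options((3, 12))))
```

The returned list could be appended to the COMMAND arguments of an MPI test once the PETSc version of the target worker is known.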

> I would not recommend this preconditioner for structural problems. Instead, just stick to mumps.

I have all of the structural examples running with a direct solver now.

@jdeaton
Member Author

jdeaton commented Feb 15, 2020

@manavbhatia Are there some options I need to set for structural example 2? It looks to be failing after computing a residual of 0.0 in the first evaluation, which would imply to me that it's not getting any load. Is that something that needs to be specified via the command line in this example?

You can see this by scrolling through the debug build (first one) in: https://travis-ci.com/MASTmultiphysics/mast-multiphysics/jobs/287647766#L859

Note there may be another error where the MPI run of the same example hangs at the end of the release build (second build). I'm not sure if these are related.

- Also added 1 minute timeout on example runs in CI.
@jdeaton
Member Author

jdeaton commented Feb 16, 2020

The basic functionality that I was trying to accomplish in this PR is complete. The Travis-CI system will now pick up and automatically execute all example programs that are defined with CMake/CTest add_test() statements and are also labeled as "SHORT".

I'm going to go ahead and merge this PR into master since its functionality appears fairly robust. The small issues described above are now documented in Issues #73 and #74 for us to revisit.

@jdeaton jdeaton marked this pull request as ready for review February 16, 2020 18:17
@jdeaton jdeaton merged commit 4770957 into master Feb 16, 2020
@jdeaton jdeaton deleted the feature-ci-runs-short-examples branch February 16, 2020 18:23
@manavbhatia
Member

Some of the examples may not run with the default options. I have been running them on my end with a specific set of options. It should certainly be possible to modify the default options so that the examples run without issues. I will take a look at these.

I think a version of each example can be run. For example, the continuation solver can be run with a relatively benign nonlinear case and we can compare results with a gold-standard file. It will be good to make sure that the tests cover all the relevant functionality in the library.


2 participants