Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix interference between test_tracetools and ros2lifecycle tests #96

Conversation

christophebedard
Copy link
Member

Part of #94

Relates to #95, but is still separate

test_tracetools's test_lifecycle_node test and ros2lifecycle's test_cli test can interfere with each other if run in parallel, making one of them -- or both of them -- fail. For example, test_tracetools fails in this nightly job here: https://ci.ros2.org/view/nightly/job/nightly_linux_debug/2988/.

This is mainly because both tests use a lifecycle node with the same name and without any specific namespace:

  1. test_tracetools's test_lifecycle_node test uses test_lifecycle_node
  2. ros2lifecycle's test_cli test uses ros2lifecycle_test_fixtures's simple_lifecycle_node, which is named test_lifecycle_node

This can lead to two separate failures:

  1. Since the lifecycle nodes have the same name, without any specific namespace, the tests can end up triggering transitions for the wrong lifecycle node
    • I think this sometimes results in test_tracetools's test_lifecycle_node test timing out because the state machine ends up in a weird state (not sure why the timeouts in test_lifecycle_node.cpp don't make it exit before the 60-second timeout, though)
  2. The trace that test_tracetools's test_lifecycle_node test collects can include transitions from ros2lifecycle

The two separate solutions are:

  1. Use a namespace (/test_tracetools) for the node in test_tracetools's test_lifecycle_node test to avoid transition/topic-level interference
  2. Make sure to only consider transition trace events for the right state machine instead of assuming that all transitions in the trace come from the expected state machine and then checking if they correspond to the expected state machine

Signed-off-by: Christophe Bedard <christophe.bedard@apex.ai>
@christophebedard christophebedard added the enhancement New feature or request label Mar 13, 2024
@christophebedard christophebedard self-assigned this Mar 13, 2024
Copy link
Contributor

@clalancette clalancette left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.

For CI, I think it might make sense to run CI tests both sequential and parallel, though I'll leave it up to you.

@christophebedard
Copy link
Member Author

Yeah I can do that. Tracing tests might fail in parallel without #95, but this particular test should pass.

@christophebedard
Copy link
Member Author

Sequential testing of tracetools_test/test_tracetools and above only:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

Parallel testing of everything:

  • Linux Build Status
  • Linux-aarch64 Build Status
  • Windows Build Status

@christophebedard
Copy link
Member Author

  • Linux-aarch64 Build Status

Those test failures are "expected" (see #95) and not related to this PR.

@christophebedard
Copy link
Member Author

Alright, CI results are acceptable, I'll merge this and rebase #95.

@christophebedard christophebedard merged commit f1c66eb into rolling Mar 14, 2024
7 of 9 checks passed
@christophebedard christophebedard deleted the christophebedard/fix-test-tracetools-test-lifecycle-node-interference branch March 14, 2024 21:03
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants