Disable timing tests in DEBUG mode #3012
Conversation
Force-pushed from 3ebeb65 to 350e45a
Codecov Report

```diff
@@            Coverage Diff             @@
##           master    #3012      +/-   ##
==========================================
- Coverage   61.81%   61.79%   -0.02%
==========================================
  Files         370      370
  Lines       33178    33171       -7
==========================================
- Hits        20506    20494      -12
- Misses      12672    12677       +5
```

Continue to review full report at Codecov.
That's quite pragmatic. I mostly agree that we should not test timing of debug builds.
I'm not sure that plainly disabling them (silently!) is a good idea, though.
From my side I wanted to migrate all such tests into https://github.com/captain-yoshi/moveit_benchmark_suite because timing tests should simply not be hard tests at all.
But I can't afford to spend time improving that project right now. @captain-yoshi works on it on the side and keeps making various incremental improvements. (Thank you, David!)
Lastly, please consider #2107. The main offender for runtime in debug mode is https://github.com/ros-planning/moveit/blob/master/moveit_ros/robot_interaction/test/locked_robot_state_test.cpp, which should probably also be disabled due to its overly long runtime if we go with your proposed approach.
```diff
@@ -87,6 +87,12 @@ class CollisionDetectorTest : public ::testing::Test
   std::string kinect_dae_resource_;
 };

+#ifdef NDEBUG  // Don't perform timing tests in Debug mode (but evaluate expression)
```

Suggested change:

```diff
-#ifdef NDEBUG // Don't perform timing tests in Debug mode (but evaluate expression)
+#ifndef NDEBUG // Don't perform timing tests in Debug mode (but evaluate expression)
```
Or do I misunderstand this expression?
You are right. In my original commit, it was the correct expression. Here, it is inverted.
@v4hn, thanks for pointing out #2107. However, here I'm fighting tests that are flaky due to timing, while #2107 complains about slow tests in general. Factoring the test out into a benchmark suite isn't appropriate IMO. It actually tests that collision checking is fast enough.
Somewhat off-topic for this thread, but anyway
I fully agree that all functional tests belong in the core tests, and for improved efficiency all tests should also check more functional aspects than just "doesn't crash" (at least one of them doesn't O.o).
I absolutely disagree with this aspect. The threshold is arbitrary, and there is no sane threshold that could account for the machine running the test, system load, compiler, file system, moon phase, ... Tests can be arbitrarily slow on insane configurations, but should always pass if they compute the correct results. Thus I would prefer to leave any time measurement out of our gtests. You are right that this is out of scope here, and it makes sense to keep it until we have something better.
I agree with you, Michael, that a benchmark suite, as you describe it, would be great. However, I'm not aware of any framework to compare benchmarks of different snapshots in a similar fashion to Codecov. Is there something?
> However, I'm not aware of any framework to compare benchmarks of different snapshots in a similar fashion to Codecov. Is there something?
As I wrote above, that was/is my goal for the benchmark suite started in last year's GSoC project, and [MBS aims to compare different software stacks](https://github.com/captain-yoshi/moveit_benchmark_suite/blob/master/.docs/regression.md). But I don't think it's in a state we could integrate into CI (or a custom runner for fixed ) right away. That being said, I haven't actually tested the current state in some time...
If not, we should keep some key timing tests as part of our hard tests. I agree that they are highly unreliable, but they ensure that we don't break anything completely by accident.
I agree; that's why I marked my remarks as off-topic. :) Let's hope we can remove them at some point once we have a benchmarking setup (if we ever get there...).
Debug builds are significantly slower and thus shouldn't be considered for timing-based tests.
They frequently fail, e.g. https://github.com/ros-planning/moveit/runs/4648164688?check_suite_focus=true