Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add death tests and build in debug mode #323

Merged
merged 12 commits into from
Mar 5, 2024
Merged

Conversation

nahueespinosa
Copy link
Member

@nahueespinosa nahueespinosa commented Feb 17, 2024

Proposed changes

This makes it so we can test our assertions on CI and not run into surprises when trying to create a build with debugging information.

Type of change

  • 馃悰 Bugfix (change which fixes an issue)
  • 馃殌 Feature (change which adds functionality)
  • 馃摎 Documentation (change which fixes or extends documentation)

Checklist

  • Lint and unit tests (if any) pass locally with my changes
  • I have added tests that prove my fix is effective or that my feature works
  • I have added necessary documentation (if appropriate)
  • All commits have been signed for DCO

@nahueespinosa nahueespinosa self-assigned this Feb 17, 2024
@nahueespinosa nahueespinosa added cpp Related to C++ code infra Related to infrastructure and CI labels Feb 17, 2024
@nahueespinosa nahueespinosa marked this pull request as draft February 17, 2024 03:32
@nahueespinosa nahueespinosa force-pushed the nahuel/add-death-tests branch 2 times, most recently from 7edd7ea to 688077c Compare February 21, 2024 19:19
Signed-off-by: Nahuel Espinosa <nespinosa@ekumenlabs.com>
Signed-off-by: Nahuel Espinosa <nespinosa@ekumenlabs.com>
beluga/CMakeLists.txt Outdated Show resolved Hide resolved
@nahueespinosa nahueespinosa marked this pull request as ready for review February 21, 2024 20:01
@nahueespinosa nahueespinosa added the do-not-land Do not merge these changes into the main branch label Feb 21, 2024
Signed-off-by: Nahuel Espinosa <nespinosa@ekumenlabs.com>
Signed-off-by: Nahuel Espinosa <nespinosa@ekumenlabs.com>
Signed-off-by: Nahuel Espinosa <nespinosa@ekumenlabs.com>
@nahueespinosa
Copy link
Member Author

beluga_amcl tests in Noetic have a strange error 馃.
https://github.com/Ekumen-OS/beluga/actions/runs/8010185309/job/21880630571?pr=323#step:8:9192

@hidmic Have you seen this before?

Signed-off-by: Nahuel Espinosa <nespinosa@ekumenlabs.com>
Signed-off-by: Nahuel Espinosa <nespinosa@ekumenlabs.com>
@nahueespinosa
Copy link
Member Author

nahueespinosa commented Feb 25, 2024

@hidmic I got some logs, there are "connection refused" errors preventing the test from completing: rostest-d47fb478aa75-29009.log

IIUC, ROS master is trying to communicate with the node but the connection is refused (possibly because the node is too slow?).

@hidmic
Copy link
Collaborator

hidmic commented Feb 26, 2024

Sorry for the delay. I had at least two false starts with this. test_amcl_nodelet is SEGFAULT'ing. Running on gdb I get:

0x000055a52592b606 in std::vector<Sophus::SE2<double, 0>, std::allocator<Sophus::SE2<double, 0> > >::size (this=0x18)
    at /usr/include/c++/9/bits/stl_vector.h:916             
916           { return size_type(this->_M_impl._M_finish - this->_M_impl._M_start); }                                     
(gdb) bt                                                                                                                  
#0  0x000055a52592b606 in std::vector<Sophus::SE2<double, 0>, std::allocator<Sophus::SE2<double, 0> > >::size (this=0x18) 
    at /usr/include/c++/9/bits/stl_vector.h:916
#1  0x000055a52591f6a8 in beluga::TupleContainer<beluga::Vector, std::tuple<Sophus::SE2<double, 0>, beluga::Numeric<double
, beluga::WeightTag, void> > >::size (this=0x0)
    at /ws/install/beluga/include/beluga/beluga/containers/tuple_vector.hpp:94
#2  0x00007f964c52d6cd in beluga_amcl::AmclNodelet::initialize_from_map (this=0x55a525b2ed20)
    at /ws/src/beluga/beluga_amcl/src/amcl_nodelet.cpp:581
#3  0x00007f964c526805 in beluga_amcl::AmclNodelet::handle_map_with_default_initial_pose (this=0x55a525b2ed20, map=...)
    at /ws/src/beluga/beluga_amcl/src/amcl_nodelet.cpp:363
#4  0x00007f964c525cfe in beluga_amcl::AmclNodelet::map_callback (this=0x55a525b2ed20, message=...)
    at /ws/src/beluga/beluga_amcl/src/amcl_nodelet.cpp:326
...

The weirdest thing is that I see:

(gdb) frame 2
#2  0x00007f964c52d6cd in beluga_amcl::AmclNodelet::initialize_from_map (this=0x55a525b2ed20)
    at /ws/src/beluga/beluga_amcl/src/amcl_nodelet.cpp:581
581       NODELET_INFO(
(gdb) list
576       }
577
578       particle_filter_->initialize_from_map();
579       enable_tf_broadcast_ = true;
580
581       NODELET_INFO(
582           "Particle filter initialized with %ld particles distributed across the map",
583           particle_filter_->particles().size());
584
585       return true;
(gdb) p particle_filter_
$12 = std::unique_ptr<beluga_ros::Amcl> = {get() = 0x0}

which makes no sense, the code could have never reached that point if particle_filter_ was nullptr.

Could it be a concurrency issue?

@nahueespinosa
Copy link
Member Author

馃憖

ThreadSanitizer: reported 277 warnings

There are data races and lock-order-inversion warnings. I'm packed these days, but will try to look into this at some point during the week.

Signed-off-by: Nahuel Espinosa <nespinosa@ekumenlabs.com>
Signed-off-by: Nahuel Espinosa <nespinosa@ekumenlabs.com>
Signed-off-by: Nahuel Espinosa <nespinosa@ekumenlabs.com>
@nahueespinosa nahueespinosa changed the title Add death tests and enable assertions when testing Add death tests and build in debug mode Mar 1, 2024
@nahueespinosa nahueespinosa removed the do-not-land Do not merge these changes into the main branch label Mar 1, 2024
@nahueespinosa
Copy link
Member Author

@hidmic Data races are fixed now. PTAL

Copy link
Collaborator

@hidmic hidmic left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM though codecov isn't happy (and a 5% deviation is substantial).

beluga_amcl/src/amcl_nodelet.cpp Show resolved Hide resolved
Signed-off-by: Nahuel Espinosa <nespinosa@ekumenlabs.com>
@nahueespinosa
Copy link
Member Author

The coverage report looks weird:

https://app.codecov.io/gh/Ekumen-OS/beluga/pull/323/blob/beluga_ros/include/beluga_ros/tf2_sophus.hpp#L189

@hidmic should we keep calculating coverage from the release build?

@hidmic
Copy link
Collaborator

hidmic commented Mar 4, 2024

Hmm, it does look weird. I can buy it is an issue with release optimization. Let's skip coverage computations for release builds to see what happens.

@nahueespinosa
Copy link
Member Author

Assuming the weirdness in coverage reports is because codecov is trying to compare a report from a release build in BASE and a debug build in HEAD, I will merge this PR and monitor the reports in the main branch.

@nahueespinosa nahueespinosa merged commit 0808d43 into main Mar 5, 2024
6 of 8 checks passed
@nahueespinosa nahueespinosa deleted the nahuel/add-death-tests branch March 5, 2024 14:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cpp Related to C++ code infra Related to infrastructure and CI
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants