Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

RobustPGO Teardown #22

Closed
marcusabate opened this issue Aug 6, 2019 · 3 comments · Fixed by #24
Closed

RobustPGO Teardown #22

marcusabate opened this issue Aug 6, 2019 · 3 comments · Fixed by #24
Assignees

Comments

@marcusabate
Copy link
Member

While running a unit test from Spark VIO I have a strange issue. It could be a problem in gtests, or in VIO, or here. But the issue popped up with RobustPGO so I'll put it here for now.

When I run the full test executable everything is fine. But if I just run a series of tests that have the PGO stuff (or even just the loop closure detector tests) then I a segfault. Running with gdb shows this to be the cause:

Thread 1 "testSparkVio" received signal SIGSEGV, Segmentation fault.
0x0000555555a505ec in std::default_delete<RobustPGO::OutlierRemoval>::operator() (this=0x5555563697c0, 
    __ptr=0x555556000000)
    at /usr/include/c++/7/bits/unique_ptr.h:78
78		delete __ptr;

So it looks like there is some problem with the teardown of RobustPGO class members. Again this could be a problem with the way I set them up in VIO or even just the test suite. Looking into it today, but @yunzc if you have any thoughts they'd be appreciated.

@marcusabate
Copy link
Member Author

As an update, it seems to be an issue with RobustPGO somewhere. The problem is only related to the std::unique_ptr<OutlierRemoval> outlier_removal_; class member of RobustSolver. This is initialized based on parameter selection in the RobustSolver constructor. One of the options is NONE, which calls outlier_removal_ = nullptr;. This concerns me a bit but looking through RobustPGO I see that every time outlier_removal_ is used it is first checked so that cases in which nullptr is the selection aren't used.

In VIO I have a class member of type RobustPGO:

std::unique_ptr<RobustPGO::RobustSolver> pgo_;

Which is initialized in the constructor like so:

RobustPGO::RobustSolverParams pgo_params;
  pgo_params.setPcmSimple3DParams(lcd_params_.pgo_trans_threshold_,
      lcd_params_.pgo_rot_threshold_, RobustPGO::Verbosity::VERBOSE);
pgo_ = VIO::make_unique<RobustPGO::RobustSolver>(pgo_params);

Inside of testLoopClosureDetector.cpp I simply have the class member:

std::unique_ptr<LoopClosureDetector> lcd_detector_;

Which is initialized in the fixture like so:

LoopClosureDetectorParams params;
params.parseYAML(lcd_FLAGS_test_data_path_+"/testLCDParameters.yaml");

lcd_detector_ = VIO::make_unique<LoopClosureDetector>(params, false);

Not a lot to go wrong there.

I've added a destructor to GenericSolver.h:

virtual ~GenericSolver() = default;

And a destructor to RobustSolver.h:

~RobustSolver() = default;

And I've added the EIGEN_MAKE_ALIGNED_OPERATOR_NEW macro as a public member of the pcm.h class since it has a bunch of gtsam::Matrix members which are typedefs of Eigen::Matrix.

None of these changes have solved the original problem, which still pops up when I run LoopClosureDetector tests only but not when I run the entire Spark VIO testing suite.

@yunzc I'm pretty stumped here so any thoughts you have on why unique_ptrs in RobustPGO would be misbehaving during tear down at the end of runtime would be appreciated.

@yunzc
Copy link
Member

yunzc commented Aug 8, 2019

Fixed by getting rid of c++14 dependency and defining make_unique in RobustPGO

@yunzc yunzc closed this as completed Aug 8, 2019
@marcusabate
Copy link
Member Author

Reopening because the problem is not yet solved it seems. I still get the same segfault in VIO sometimes.

Thread 1 "testSparkVio" received signal SIGSEGV, Segmentation fault.
std::unique_ptr<RobustPGO::OutlierRemoval, std::default_delete<RobustPGO::OutlierRemoval> >::~unique_ptr (
    this=0x55556515d200, __in_chrg=<optimized out>) at /usr/include/c++/7/bits/unique_ptr.h:268
268		  get_deleter()(__ptr);
(gdb) backtrace
#0  0x00007ffff792c1ad in std::unique_ptr<RobustPGO::OutlierRemoval, std::default_delete<RobustPGO::OutlierRemoval> >::~unique_ptr() (this=0x55556515d200, __in_chrg=<optimized out>) at /usr/include/c++/7/bits/unique_ptr.h:268
#1  0x00007ffff792c1ad in RobustPGO::RobustSolver::~RobustSolver() (this=0x55556515d170, __in_chrg=<optimized out>)
    at /home/marcus/code/RobustPGO/RobustPGO/RobustSolver.h:40
#2  0x00007ffff792c1ad in std::default_delete<RobustPGO::RobustSolver>::operator()(RobustPGO::RobustSolver*) const (this=<optimized out>, __ptr=0x55556515d170) at /usr/include/c++/7/bits/unique_ptr.h:78
#3  0x00007ffff792c1ad in std::unique_ptr<RobustPGO::RobustSolver, std::default_delete<RobustPGO::RobustSolver> >::~unique_ptr() (this=0x5555559bd0f0, __in_chrg=<optimized out>) at /usr/include/c++/7/bits/unique_ptr.h:268
#4  0x00007ffff792c1ad in VIO::LoopClosureDetector::~LoopClosureDetector() (this=0x5555559bcee0, __in_chrg=<optimized out>)
    at /home/marcus/code/VIO/src/LoopClosureDetector.cpp:107
#5  0x00007ffff792c4f9 in VIO::LoopClosureDetector::~LoopClosureDetector() (this=0x5555559bcee0, __in_chrg=<optimized out>)
    at /home/marcus/code/VIO/src/LoopClosureDetector.cpp:109
#6  0x00005555556206e2 in std::default_delete<VIO::LoopClosureDetector>::operator()(VIO::LoopClosureDetector*) const (this=<optimized out>, __ptr=<optimized out>) at /usr/include/c++/7/bits/unique_ptr.h:78
#7  0x00005555556206e2 in std::unique_ptr<VIO::LoopClosureDetector, std::default_delete<VIO::LoopClosureDetector> >::~unique_ptr() (this=0x5555559bc3c0, __in_chrg=<optimized out>) at /usr/include/c++/7/bits/unique_ptr.h:268
#8  0x00005555556206e2 in LCDFixture::~LCDFixture() (this=0x5555559bc390, __in_chrg=<optimized out>)
    at /home/marcus/code/VIO/tests/testLoopClosureDetector.cpp:41

Basically this shows that it is the fault of the OutlierRemoval class. Looking at that, I noticed the Base class has no constructor. The derived class Pcm calls this nonexistent constructor. It also does not initialize all data members in its own constructor. I think I'll start with these things if @yunzc you do not see any reason why they should stay that way or where else the segfault could be coming from.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants