Segmentation fault in mapping thread #2

Closed
salehahr opened this issue Jan 15, 2021 · 9 comments

@salehahr

Hello,

First of all thanks for providing the DefSLAM library! I'm getting a segmentation fault while running the DefSLAM executable (I modified simple_camera.cc to load the mandala left camera images) with the Mandala0 dataset.

The sequence runs for several frames, but then crashes due to the segmentation fault.
The error occurs in the mapping thread, specifically at line 188 of NormalEstimator::ObtainK1K2().

Here is the gdb output:

NORMAL ESTIMATOR IN - Reprojection error: 5.12225
Points considered: 268
00033 244 0 551
/home/user3/slam/datasets/mandala0/images/stereo_im_l_1560936004110.png i:  34

Thread 5 "DefSLAM" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffc2197700 (LWP 21137)]
0x00007ffff7880e48 in std::vector<defSLAM::SurfacePoint, std::allocator<defSLAM::SurfacePoint> >::operator[] (this=0x8, __n=<optimized out>)
    at /usr/include/c++/7/bits/stl_vector.h:798
798             return *(this->_M_impl._M_start + __n);

The backtrace:

(gdb) bt
#0  0x00007ffff7880e48 in std::vector<defSLAM::SurfacePoint, std::allocator<defSLAM::SurfacePoint> >::operator[](unsigned long) (this=0x8, __n=<optimized out>) at /usr/include/c++/7/bits/stl_vector.h:798
#1  0x00007ffff7880e48 in defSLAM::Surface::getNormalSurfacePoint(unsigned long, cv::Vec<float, 3>&) (this=0x0, ind=30, N=...)
    at /home/user3/slam/DefSLAM/Modules/Mapping/Surface.cc:95
#2  0x00007ffff783c828 in defSLAM::NormalEstimator::ObtainK1K2() (this=this@entry=0x7fffc218f530)
    at /home/user3/slam/DefSLAM/Modules/Mapping/NormalEstimator.cc:188
#3  0x00007ffff783a896 in defSLAM::DefLocalMapping::NRSfM() (this=this@entry=0x5555633b1470)
    at /home/user3/slam/DefSLAM/Modules/Mapping/DefLocalMapping.cc:178
#4  0x00007ffff783ad60 in defSLAM::DefLocalMapping::insideTheLoop() (this=this@entry=0x5555633b1470)
    at /home/user3/slam/DefSLAM/Modules/Mapping/DefLocalMapping.cc:125
#5  0x00007ffff783adce in defSLAM::DefLocalMapping::Run() (this=0x5555633b1470)
    at /home/user3/slam/DefSLAM/Modules/Mapping/DefLocalMapping.cc:85
#6  0x00007ffff54b66df in  () at /usr/lib/x86_64-linux-gnu/libstdc++.so.6
#7  0x00007ffff445a6db in start_thread (arg=0x7fffc2197700) at pthread_create.c:463
#8  0x00007ffff4f1171f in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:95

Any ideas what could have gone wrong here?
Thanks!

@JoseLamarca
Collaborator

Thanks for your comment! It seems it is trying to access a surface point whose index is out of range in the vector. It looks like a bug. I will take a look this weekend and let you know on Monday if I find anything.

Thanks!
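
For context, frame #1 of the backtrace shows `this=0x0` for `defSLAM::Surface::getNormalSurfacePoint`, and frame #0 shows `this=0x8` for the vector. That is consistent with the vector being a member of a null `Surface` object, possibly in addition to an out-of-range index. A minimal sketch of a defensive accessor that covers both cases follows. The `Surface` and `SurfacePoint` types below are simplified, hypothetical stand-ins and not DefSLAM's actual classes; only the function name is borrowed from the backtrace, and `tryGetNormal` is an illustrative helper.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical, simplified stand-in for the point type named in the backtrace.
struct SurfacePoint {
    std::array<float, 3> normal{0.f, 0.f, 0.f};
    bool normalSet = false;
};

// Hypothetical stand-in for the surface: it owns a vector of points.
class Surface {
public:
    explicit Surface(std::size_t numPoints) : points_(numPoints) {}

    // Stores a normal; indices outside the vector are ignored.
    void setNormal(std::size_t ind, const std::array<float, 3>& n) {
        if (ind >= points_.size()) return;
        points_[ind].normal = n;
        points_[ind].normalSet = true;
    }

    // Returns false instead of reading past the end of the vector
    // or returning a normal that was never estimated.
    bool getNormalSurfacePoint(std::size_t ind, std::array<float, 3>& n) const {
        if (ind >= points_.size() || !points_[ind].normalSet) return false;
        n = points_[ind].normal;
        return true;
    }

private:
    std::vector<SurfacePoint> points_;
};

// Caller-side guard: a null surface is reported as "no normal available"
// rather than dereferenced.
bool tryGetNormal(const Surface* surface, std::size_t ind,
                  std::array<float, 3>& n) {
    if (surface == nullptr) return false;
    return surface->getNormalSurfacePoint(ind, n);
}
```

With a guard like this, a caller such as the normal estimator could skip a keyframe whose surface is missing instead of crashing. Whether skipping is the right behaviour for DefSLAM is an open question that this sketch does not answer.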

@JoseLamarca
Collaborator

I tried it but did not get any segmentation fault. Can you send me more information so that I can try to reproduce the error?

@salehahr
Author

Thank you for looking into it! Sure, I'll include more details below.

Summary

Change comparison on Github

Compiling

  • I compiled the DefSLAM library on Ubuntu 18.04 on Windows (using WSL).
  • I used the 'Debug' build type to get debugging symbols.
  • The only other significant change in my CMakeLists.txt was switching the executable from simple_camera.cc to the modified version vi.cc.

Modifications to 'simple_camera.cc' --> 'vi.cc'

In a nutshell, I replaced the loading of a video file with the loading of images from a folder.
So instead of the video filepath (argv[3] in simple_camera.cc), my version of the DefSLAM executable now takes the path to the folder containing the images and the path to the image timestamps file as arguments.
The diff between the modified file vi.cc and the original simple_camera.cc can be found here: https://www.diffchecker.com/AreQ8R9t
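
To illustrate the kind of change described above, here is a minimal sketch of turning a timestamps file into a list of image paths. It is not the actual code from vi.cc. It assumes one timestamp per line and the `stereo_im_l_<timestamp>.png` naming that appears in the log output in this thread; `FrameEntry` and `buildFrameList` are hypothetical names.

```cpp
#include <cassert>
#include <istream>
#include <sstream>
#include <string>
#include <vector>

// One entry per frame: the timestamp string and the image path derived from it.
struct FrameEntry {
    std::string timestamp;
    std::string imagePath;
};

// Assumptions: the stream holds one timestamp per line, and the left images
// are named "stereo_im_l_<timestamp>.png" inside imageFolder.
std::vector<FrameEntry> buildFrameList(std::istream& timestamps,
                                       const std::string& imageFolder) {
    std::vector<FrameEntry> frames;
    std::string line;
    while (std::getline(timestamps, line)) {
        // Drop trailing carriage returns, spaces, and tabs
        // (e.g. from files written with Windows line endings).
        while (!line.empty() && (line.back() == '\r' || line.back() == ' ' ||
                                 line.back() == '\t')) {
            line.pop_back();
        }
        if (line.empty()) continue;  // skip blank lines
        frames.push_back({line, imageFolder + "/stereo_im_l_" + line + ".png"});
    }
    return frames;
}
```

In a real executable the stream would be a `std::ifstream` opened on the timestamps file, and each `imagePath` would be loaded with `cv::imread` before being passed to the tracker in place of a frame grabbed from the video.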

Dataset used

I used the Mandala0 dataset from this link in the readme file, i.e. from the most recent commit which provides the stereo0.yaml file for all Mandala0~4 datasets.

  • For the timestamps, I used the file 'Mandala0/timestamps/timestamps (copy).txt' which was provided.
  • The images I load in my 'vi.cc' file, however, are only the left images, as I want to test the monocular case and not stereo.

Running DefSLAM

My arguments for executing DefSLAM are

orb_voc=/home/user3/slam/DefSLAM/Vocabulary/ORBvoc.txt
yaml=/home/user3/slam/datasets/mandala0/stereo0_2.yaml
imgs=/home/user3/slam/datasets/mandala0/images
ts="/home/user3/slam/datasets/mandala0/timestamps/timestamps_copy.txt"

(1) Run with debugger

gdb --args ./DefSLAM $orb_voc $yaml $imgs $ts

Running DefSLAM in debug mode under gdb, I got the same segfault as posted above (the crash occurs after thirty-odd frames).

Some more output lines:

Defomation tracking Parameters:
- Reg. Inextensibility: 12000
- Reg. Laplacian: 700
- Reg. Temporal: 0.05
- Reg. LocalZone: 2
[New Thread 0x7fffc225c700 (LWP 8645)]
[New Thread 0x7fffc1a5b700 (LWP 8646)]
/home/user3/slam/datasets/mandala0/images/stereo_im_l_1560936003113.png i:  0
[New Thread 0x7fffc0ffe700 (LWP 8647)]
[New Thread 0x7fffbbfff700 (LWP 8648)]
[New Thread 0x7fffbb7fe700 (LWP 8649)]
New map created with 264 points
[New Thread 0x7fffb9ec4700 (LWP 8650)]
/home/user3/slam/datasets/mandala0/images/stereo_im_l_1560936003142.png i:  1
NORMAL ESTIMATOR IN - NORMALS REESTIMATED : 0 - 0
 NORMAL ESTIMATOR OUTPoints potential : 0  70
Number Of normals 0 0x555563483bd0
Not enough normals
POINTS matched:154
Reprojection error: 6.27704
Points considered: 154
00001 154 0 263
/home/user3/slam/datasets/mandala0/images/stereo_im_l_1560936003171.png i:  2

...
...
...

/home/user3/slam/datasets/mandala0/images/stereo_im_l_1560936004140.png i:  35
POINTS matched:278
Finding by Schwarp
-0.829431  0.55392 -0.550824-0.829431  0.55392 -0.550824New points : 30
Calculating final Schwarp
[New Thread 0x7fffa310c700 (LWP 9263)]
[New Thread 0x7fffa210a700 (LWP 9264)]
[Thread 0x7fffa310c700 (LWP 9263) exited]
[Thread 0x7fffa210a700 (LWP 9264) exited]
[New Thread 0x7fffa210a700 (LWP 9265)]
[New Thread 0x7fffa310c700 (LWP 9266)]
[Thread 0x7fffa210a700 (LWP 9265) exited]
[Thread 0x7fffa310c700 (LWP 9266) exited]
-0.829431  0.55392 -0.550824Reprojection error: 4.72921
Points considered: 310
00035 313 0 927
[New Thread 0x7fffa310c700 (LWP 9267)]
[New Thread 0x7fffa210a700 (LWP 9268)]
/home/user3/slam/datasets/mandala0/images/stereo_im_l_1560936004169.png i:  36
[Thread 0x7fffa210a700 (LWP 9268) exited]
[Thread 0x7fffa310c700 (LWP 9267) exited]
POINTS matched:186
[New Thread 0x7fffa210a700 (LWP 9269)]
[New Thread 0x7fffa310c700 (LWP 9270)]
[Thread 0x7fffa210a700 (LWP 9269) exited]
[Thread 0x7fffa310c700 (LWP 9270) exited]
[New Thread 0x7fffa310c700 (LWP 9271)]
[New Thread 0x7fffa210a700 (LWP 9272)]
[Thread 0x7fffa310c700 (LWP 9271) exited]
[Thread 0x7fffa210a700 (LWP 9272) exited]
Points to reestimate : 446 473
[New Thread 0x7fffa210a700 (LWP 9273)]
[Thread 0x7fffa210a700 (LWP 9273) exited]
[New Thread 0x7fffa210a700 (LWP 9274)]
[Thread 0x7fffa210a700 (LWP 9274) exited]
[New Thread 0x7fffa210a700 (LWP 9275)]
[Thread 0x7fffa210a700 (LWP 9275) exited]
[New Thread 0x7fffa210a700 (LWP 9276)]
[Thread 0x7fffa210a700 (LWP 9276) exited]
[New Thread 0x7fffa210a700 (LWP 9277)]
[Thread 0x7fffa210a700 (LWP 9277) exited]
[New Thread 0x7fffa210a700 (LWP 9278)]
[Thread 0x7fffa210a700 (LWP 9278) exited]
[New Thread 0x7fffa210a700 (LWP 9279)]
[Thread 0x7fffa210a700 (LWP 9279) exited]
[New Thread 0x7fffa210a700 (LWP 9280)]
NORMAL ESTIMATOR IN - Reprojection error: 6.11107
Points considered: 221
00036 197 0 448

Thread 5 "DefSLAM" received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7fffc225c700 (LWP 8645)]
0x00007ffff78c7c68 in std::vector<defSLAM::SurfacePoint, std::allocator<defSLAM::SurfacePoint> >::operator[]
    (this=0x100000008, __n=<optimized out>) at /usr/include/c++/7/bits/stl_vector.h:798
798             return *(this->_M_impl._M_start + __n);

Thread info from gdb:

(gdb) info threads
  Id   Target Id         Frame
  1    Thread 0x7ffff7f84040 (LWP 8637) "DefSLAM" 0x00007ffff58e60bb in cv::Mat::copyTo(cv::_OutputArray const&) const ()
   from /usr/local/lib/libopencv_core.so.4.4
  2    Thread 0x7fffd9dfb700 (LWP 8642) "DefSLAM" 0x00007ffff44f6ad3 in futex_wait_cancelable (private=<optimized out>, expected=0,
    futex_word=0x7ffff3dafbe4) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  3    Thread 0x7fffd95fa700 (LWP 8643) "DefSLAM" 0x00007ffff44f6ad3 in futex_wait_cancelable (private=<optimized out>, expected=0,
    futex_word=0x7ffff3dafc64) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  4    Thread 0x7fffd4df9700 (LWP 8644) "DefSLAM" 0x00007ffff44f6ad3 in futex_wait_cancelable (private=<optimized out>, expected=0,
    futex_word=0x7ffff3dafce4) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
* 5    Thread 0x7fffc225c700 (LWP 8645) "DefSLAM" 0x00007ffff78c7c68 in std::vector<defSLAM::SurfacePoint, std::allocator<defSLAM::SurfacePoint> >::operator[] (this=0x100000008, __n=<optimized out>) at /usr/include/c++/7/bits/stl_vector.h:798
  6    Thread 0x7fffc1a5b700 (LWP 8646) "DefSLAM" 0x00007fffebd37050 in ?? () from /lib/x86_64-linux-gnu/libz.so.1
  7    Thread 0x7fffc0ffe700 (LWP 8647) "DefSLAM" 0x00007ffff44f6ad3 in futex_wait_cancelable (private=<optimized out>, expected=0,
    futex_word=0x55555580edb8) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  8    Thread 0x7fffbbfff700 (LWP 8648) "DefSLAM" 0x00007ffff44f6ad3 in futex_wait_cancelable (private=<optimized out>, expected=0,
    futex_word=0x5555633b168c) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  9    Thread 0x7fffbb7fe700 (LWP 8649) "DefSLAM" 0x00007ffff44f6ad3 in futex_wait_cancelable (private=<optimized out>, expected=0,
    futex_word=0x5555633b1e3c) at ../sysdeps/unix/sysv/linux/futex-internal.h:88
  10   Thread 0x7fffb9ec4700 (LWP 8650) "DefSLAM" 0x00007fffee5d7f4e in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
  11   Thread 0x7fffae7cb700 (LWP 8651) "DefSLAM" 0x00007fffee5d7f4e in ?? () from /usr/lib/x86_64-linux-gnu/libgomp.so.1
  640  Thread 0x7fffa210a700 (LWP 9280) "DefSLAM" 0x00007ffff44f6ad3 in futex_wait_cancelable (private=<optimized out>, expected=0,
    futex_word=0x7fff9c0eda48) at ../sysdeps/unix/sysv/linux/futex-internal.h:88                 

(2) Normal run without debugger

./DefSLAM $orb_voc $yaml $imgs $ts

Interestingly, I also tried running DefSLAM normally, without gdb, and got the same error. However, the crash happens much sooner, at around frame 4:

Defomation tracking Parameters:
- Reg. Inextensibility: 12000
- Reg. Laplacian: 700
- Reg. Temporal: 0.05
- Reg. LocalZone: 2
/home/user3/slam/datasets/mandala0/images/stereo_im_l_1560936003113.png i:  0
New map created with 264 points
NORMAL ESTIMATOR IN - NORMALS REESTIMATED : 0 - 0
 NORMAL ESTIMATOR OUT/home/user3/slam/datasets/mandala0/images/stereo_im_l_1560936003142.png i:  1
Points potential : 0  70
Number Of normals 0 0x55e795cba220
Not enough normals
POINTS matched:154
Reprojection error: 6.27704
Points considered: 154
00001 154 0 263
/home/user3/slam/datasets/mandala0/images/stereo_im_l_1560936003171.png i:  2
POINTS matched:124
libGL error: No matching fbConfigs or visuals found
libGL error: failed to load driver: swrast
X11 Error: GLXBadFBConfig
Pangolin X11: Indirect GLX rendering context obtained
Finding by Schwarp
-0.751307  0.586089 -0.580695-0.751307  0.586089 -0.580695Reprojection error: 5.48458
Points considered: 155
New points : 0
Calculating final Schwarp
00002 155 0 263
-0.751307  0.586089 -0.580695/home/user3/slam/datasets/mandala0/images/stereo_im_l_1560936003201.png i:  3
POINTS matched:130
Reprojection error: 7.08086
Points considered: 163
00003 163 0 263
/home/user3/slam/datasets/mandala0/images/stereo_im_l_1560936003230.png i:  4
POINTS matched:130
Reprojection error: 6.50548
Points considered: 147
00004 147 0 263
./run.sh: line 28:  9293 Segmentation fault      ./DefSLAM $orb_voc $yaml $imgs $ts

Remark(s)

  • I was a bit confused to see that so many threads are apparently created... I had thought that in DefSLAM, there were only the main/tracking, mapping and viewer threads, so in total 3 as far as I'd understood the code?

Hope that was detailed enough so that you can reproduce the error. If I should include anything else, please let me know. Thanks again, and I really appreciate the help!

@JoseLamarca
Collaborator

Yes, there are three main threads, but some small parts of the program use multithreading internally, such as Ceres or the b-bspline third-party library.

I haven't managed to reproduce it yet :S. Does it work fine with simple_camera.cc?

@salehahr
Author

I see, that makes sense, thanks for clearing that up!
I have only tested simple_camera.cc with one video (f5phantom in the Hamlyn dataset), but it seemed to work fine, with no crashes.

@JoseLamarca
Collaborator

In principle, if the save_results flag is False, DefSLAMGT does not use the right image, unless I am forgetting something... To make it faster, you can always use the parallel mode by uncommenting line 22 in modules/settings/set_MAC.h. Could you make sure that the save_results flag is set to False and check whether it works with DefSLAMGT?

@salehahr
Author

salehahr commented Feb 2, 2021

Sure thing. Is it the Viewer.SaveResults parameter in the yaml file? If so, yes it was previously set to 1 and I have since changed it to 0. Tried it out with both DefSLAM and DefSLAMGT... DefSLAM still produces the same error regardless, but DefSLAMGT works fine once SaveResults was set to 0!

@salehahr
Author

salehahr commented Feb 2, 2021

Saw that you had a new commit changing the DefTracking.cc function; I just pulled and recompiled DefSLAM. Now DefSLAM runs without errors too... Closing with thanks!

@salehahr salehahr closed this as completed Feb 2, 2021
@JoseLamarca
Collaborator

I'm happy it works now! I was taking a look and I think that there was some issue with the multithreading... I am not totally sure in any case.

Let me know if the problem reappears eventually,
Thanks for your comments!
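
The thread does not confirm the root cause, and the following is not the change made in DefTracking.cc. As a general illustration of how a multithreading issue of this kind is often avoided, here is a minimal sketch in which one thread publishes a surface and another takes a snapshot of it under a mutex, so the reader never sees a pointer that is being replaced mid-read. All names (`SurfaceData`, `SharedSurface`, `readDepth`) are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical payload shared between a writer thread and a reader thread.
struct SurfaceData {
    std::vector<float> depths;
};

class SharedSurface {
public:
    // Writer side: replace the current surface atomically with respect to readers.
    void publish(std::shared_ptr<const SurfaceData> s) {
        std::lock_guard<std::mutex> lock(mutex_);
        current_ = std::move(s);
    }

    // Reader side: copy the pointer under the lock. The returned shared_ptr
    // keeps the data alive even if the writer publishes a new surface later.
    std::shared_ptr<const SurfaceData> snapshot() const {
        std::lock_guard<std::mutex> lock(mutex_);
        return current_;
    }

private:
    mutable std::mutex mutex_;
    std::shared_ptr<const SurfaceData> current_;
};

// Reads one depth value, reporting failure for a missing surface
// or an out-of-range index instead of dereferencing blindly.
bool readDepth(const SharedSurface& shared, std::size_t ind, float& out) {
    std::shared_ptr<const SurfaceData> s = shared.snapshot();
    if (!s || ind >= s->depths.size()) return false;
    out = s->depths[ind];
    return true;
}
```

The design choice here is to hold the lock only long enough to copy the pointer, so a slow reader does not block the writer while it works on its snapshot.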
