Random segfaults #31

Closed
OrdinaryMagician opened this issue Feb 18, 2016 · 6 comments

Comments

@OrdinaryMagician

After an average of 10 consecutive crashes a day while browsing, I think it's about time I reported this madness, which gets on my nerves every time.

While browsing any booru with or without any tags, sometimes the program instantly crashes, either when beginning to load posts or while scrolling. A backtrace just gives completely meaningless information to me:

#0  0x00007ffff64c249c in Glib::DispatchNotifier::pipe_io_handler(Glib::IOCondition) () from /usr/lib/libglibmm-2.4.so.1
#1  0x00007ffff64c4ff7 in Glib::IOSource::dispatch(sigc::slot_base*) () from /usr/lib/libglibmm-2.4.so.1
#2  0x00007ffff64c49ef in Glib::Source::dispatch_vfunc(_GSource*, int (*)(void*), void*) () from /usr/lib/libglibmm-2.4.so.1
#3  0x00007ffff470cc7a in g_main_context_dispatch () from /usr/lib/libglib-2.0.so.0
#4  0x00007ffff470d020 in ?? () from /usr/lib/libglib-2.0.so.0
#5  0x00007ffff470d342 in g_main_loop_run () from /usr/lib/libglib-2.0.so.0
#6  0x00007ffff7049787 in gtk_main () from /usr/lib/libgtk-x11-2.0.so.0
#7  0x00007ffff7a4016f in Gtk::Main::run(Gtk::Window&) () from /usr/lib/libgtkmm-2.4.so.1
#8  0x000000000041c803 in main (argc=1, argv=0x7fffffffdeb8) at main.cc:51

The trace is always the same, so at least the crashing is somewhat consistent.

Although I haven't been able to test if this happens on other distros, all my computers with Arch suffer this same exact problem. It seems to be more frequent on my laptop than on my desktop, though.

My dmesg is flooded with "segfault at 0" and "segfault at 19" messages from all the crashing. This has been happening for months now.

@ahodesuka
Owner

I've experienced these exact same crashes while browsing boorus on both Gentoo and Windows.
I know there is something wrong with my booru multithreading/curl_multi code, but I have not found where the problem is.
This is the only consistent crash that I am aware of (only somewhat consistent, since it's most likely a multithreading issue).

@OrdinaryMagician
Author

Funnily enough, all I could find about this issue is a Stack Overflow post from three years ago with the exact same backtrace, and their problem wasn't solved.

Oh wait, it was. It's right there at the bottom... I guess?

@ahodesuka
Owner

I actually just read through that, and it gave me a hint at where the problem might be.
It is definitely something to do with my dispatchers that are being used to send signals between threads.
My code is most likely breaking one of these rules that Glibmm has in its Dispatcher documentation:

  • Only one thread may connect to the signal and receive notification, but multiple senders are allowed even without locking.
  • The GLib main loop must run in the receiving thread (this will be the GUI thread usually).
  • The Dispatcher object must be instantiated by the receiver thread.
  • The Dispatcher object should be instantiated before creating any of the sender threads, if you want to avoid extra locking.
  • The Dispatcher object must be deleted by the receiver thread.
  • All Dispatcher objects instantiated by the same receiver thread must use the same main context.

Since it only occurs when using boorus it's either the dispatchers in my Curler or Booru::Image classes.
I'll try to take a thorough look and see if I can fix this issue.
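
To make the Dispatcher rules quoted above concrete, here is a minimal standard-library sketch of the same ownership pattern. It does not use glibmm: `ToyDispatcher`, `dispatch_pending` and `run_dispatch_demo` are hypothetical names invented for illustration, and a mutex-protected counter stands in for the pipe and main loop that the real Dispatcher relies on. The point is only the ordering: the receiver thread creates the object before any sender starts, senders do nothing but emit, and the receiver thread destroys it after the senders are gone.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <mutex>
#include <thread>
#include <utility>
#include <vector>

// Hypothetical stand-in for a cross-thread dispatcher; NOT the glibmm API.
class ToyDispatcher {
public:
    explicit ToyDispatcher(std::function<void()> handler)
        : owner_(std::this_thread::get_id()), handler_(std::move(handler)) {}

    ~ToyDispatcher() {
        // Rule: the receiver thread that created the object also deletes it.
        assert(std::this_thread::get_id() == owner_);
    }

    // May be called from any sender thread.
    void emit() {
        std::lock_guard<std::mutex> lock(mutex_);
        ++pending_;
    }

    // Receiver thread only; stands in for one main-loop iteration.
    std::size_t dispatch_pending() {
        assert(std::this_thread::get_id() == owner_);
        std::size_t n = 0;
        {
            std::lock_guard<std::mutex> lock(mutex_);
            n = pending_;
            pending_ = 0;
        }
        for (std::size_t i = 0; i < n; ++i) {
            handler_();  // the handler always runs on the receiver thread
        }
        return n;
    }

private:
    const std::thread::id owner_;
    std::function<void()> handler_;
    std::mutex mutex_;
    std::size_t pending_ = 0;
};

// Returns how many notifications the receiver (calling) thread handled.
std::size_t run_dispatch_demo(std::size_t senders, std::size_t per_sender) {
    std::size_t handled = 0;  // touched only by the receiver thread
    {
        // Created by the receiver before any sender thread exists.
        ToyDispatcher dispatcher([&handled] { ++handled; });

        std::vector<std::thread> threads;
        for (std::size_t s = 0; s < senders; ++s) {
            threads.emplace_back([&dispatcher, per_sender] {
                for (std::size_t i = 0; i < per_sender; ++i) {
                    dispatcher.emit();
                }
            });
        }
        for (auto& t : threads) {
            t.join();  // all senders finish before the dispatcher is destroyed
        }
        dispatcher.dispatch_pending();
    }  // destroyed here, on the receiver thread
    return handled;
}
```

Calling `run_dispatch_demo(4, 100)` from the main thread spawns four senders that each emit 100 times; every notification is handled on the calling thread. A sender thread that constructed or destroyed the dispatcher itself would trip the owner assertions, which is the kind of violation suspected here.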

@OrdinaryMagician
Author

With valgrind I traced it down to just the Curler class.

==12637== Invalid read of size 8
==12637==    at 0x6577470: Glib::DispatchNotifier::pipe_io_handler(Glib::IOCondition) (in /usr/lib/libglibmm-2.4.so.1.3.0)
==12637==    by 0x6579FF6: Glib::IOSource::dispatch(sigc::slot_base*) (in /usr/lib/libglibmm-2.4.so.1.3.0)
==12637==    by 0x65799EE: Glib::Source::dispatch_vfunc(_GSource*, int (*)(void*), void*) (in /usr/lib/libglibmm-2.4.so.1.3.0)
==12637==    by 0x8287C79: g_main_context_dispatch (in /usr/lib/libglib-2.0.so.0.4600.2)
==12637==    by 0x828801F: ??? (in /usr/lib/libglib-2.0.so.0.4600.2)
==12637==    by 0x8288341: g_main_loop_run (in /usr/lib/libglib-2.0.so.0.4600.2)
==12637==    by 0x57E2786: gtk_main (in /usr/lib/libgtk-x11-2.0.so.0.2400.29)
==12637==    by 0x50CB16E: Gtk::Main::run(Gtk::Window&) (in /usr/lib/libgtkmm-2.4.so.1.1.0)
==12637==    by 0x41C802: main (main.cc:51)
==12637==  Address 0x15f43888 is 24 bytes inside a block of size 80 free'd
==12637==    at 0x4C2A144: operator delete(void*) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==12637==    by 0x82CC2D1: g_private_replace (in /usr/lib/libglib-2.0.so.0.4600.2)
==12637==    by 0x6577C4B: Glib::DispatchNotifier::unreference_instance(Glib::DispatchNotifier*, Glib::Dispatcher const*) (in /usr/lib/libglibmm-2.4.so.1.3.0)
==12637==    by 0x6577CBF: Glib::Dispatcher::~Dispatcher() (in /usr/lib/libglibmm-2.4.so.1.3.0)
==12637==    by 0x429C73: AhoViewer::Booru::Curler::~Curler() (curler.cc:80)
==12637==    by 0x42AF79: AhoViewer::Booru::Image::get_thumbnail() (image.cc:49)
==12637==    by 0x44BFA0: operator() (imagelist.cc:216)
==12637==    by 0x44BFA0: operator() (adaptor_trait.h:256)
==12637==    by 0x44BFA0: sigc::internal::slot_call0<AhoViewer::ImageList::load_thumbnails()::{lambda()#1}, void>::call_it(sigc::internal::slot_rep*) (slot.h:108)
==12637==    by 0x657FCA1: ??? (in /usr/lib/libglibmm-2.4.so.1.3.0)
==12637==    by 0x82AF0AD: ??? (in /usr/lib/libglib-2.0.so.0.4600.2)
==12637==    by 0x82AE714: ??? (in /usr/lib/libglib-2.0.so.0.4600.2)
==12637==    by 0xBE434A3: start_thread (in /usr/lib/libpthread-2.22.so)
==12637==    by 0xC141DCC: clone (in /usr/lib/libc-2.22.so)
==12637==  Block was alloc'd at
==12637==    at 0x4C29118: operator new(unsigned long) (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==12637==    by 0x6577ABF: Glib::DispatchNotifier::reference_instance(Glib::RefPtr<Glib::MainContext> const&, Glib::Dispatcher const*) (in /usr/lib/libglibmm-2.4.so.1.3.0)
==12637==    by 0x6577B30: Glib::Dispatcher::Dispatcher() (in /usr/lib/libglibmm-2.4.so.1.3.0)
==12637==    by 0x429D77: AhoViewer::Booru::Curler::Curler(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) (curler.cc:57)
==12637==    by 0x42AEE7: AhoViewer::Booru::Image::get_thumbnail() (image.cc:49)
==12637==    by 0x44BFA0: operator() (imagelist.cc:216)
==12637==    by 0x44BFA0: operator() (adaptor_trait.h:256)
==12637==    by 0x44BFA0: sigc::internal::slot_call0<AhoViewer::ImageList::load_thumbnails()::{lambda()#1}, void>::call_it(sigc::internal::slot_rep*) (slot.h:108)
==12637==    by 0x657FCA1: ??? (in /usr/lib/libglibmm-2.4.so.1.3.0)
==12637==    by 0x82AF0AD: ??? (in /usr/lib/libglib-2.0.so.0.4600.2)
==12637==    by 0x82AE714: ??? (in /usr/lib/libglib-2.0.so.0.4600.2)
==12637==    by 0xBE434A3: start_thread (in /usr/lib/libpthread-2.22.so)
==12637==    by 0xC141DCC: clone (in /usr/lib/libc-2.22.so)

Of course when it gets destroyed the dispatcher just goes down with it, and Glib being a complete idiot still tries to access it.

@ahodesuka
Owner

Okay that is extremely helpful - I know where the issue is now.
The thumbnail's curlers are not being created in the main thread.
If you could try reverting commit bff1c1d it should hopefully solve the issue.
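
As a rough illustration of the diagnosis above, the following standard-library sketch shows one way to check which thread constructed an object. `ThreadAffine` and the two demo functions are hypothetical helpers, not part of AhoViewer or glibmm. The first function mirrors the pattern in the valgrind trace, where the object is constructed inside a worker thread; the second mirrors the intended pattern, where the calling thread constructs it and the worker only uses it.

```cpp
#include <cassert>
#include <thread>

// Hypothetical helper that remembers which thread constructed it.
class ThreadAffine {
public:
    ThreadAffine() : owner_(std::this_thread::get_id()) {}

    // True if the given thread is the one that constructed this object.
    bool owned_by(std::thread::id id) const { return owner_ == id; }

private:
    std::thread::id owner_;
};

// Object constructed inside a worker thread: the caller does not own it.
bool worker_constructed_is_caller_owned() {
    const std::thread::id caller_id = std::this_thread::get_id();
    bool result = true;
    std::thread worker([&result, caller_id] {
        ThreadAffine obj;  // constructed (and destroyed) on the worker
        result = obj.owned_by(caller_id);
    });
    worker.join();
    return result;
}

// Object constructed by the caller and merely used by the worker.
bool caller_constructed_is_caller_owned() {
    const std::thread::id caller_id = std::this_thread::get_id();
    ThreadAffine obj;  // constructed on the calling thread
    bool result = false;
    std::thread worker([&result, &obj, caller_id] {
        result = obj.owned_by(caller_id);
    });
    worker.join();  // worker is done before obj goes out of scope
    return result;
}
```

A check like `owned_by` placed in a constructor or destructor (for example inside an `assert`) is one possible way to catch this class of bug early in debug builds; it is a debugging aid, not the fix applied in this issue.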

@OrdinaryMagician
Author

Well, would you look at that, it's stable now.
