mysql2 related crashes when running specs #2662

Closed
sayap opened this issue Oct 7, 2013 · 26 comments
Labels
C-API Compatibility (Function and feature compatibility with the MRI C-API), Process Abort | Hang

Comments

@sayap

sayap commented Oct 7, 2013

When running the specs of my application, the process would crash intermittently. Seems like a similar issue to #2655, but specific to mysql2.

https://gist.github.com/sayap/6861102

rubinius 2.0.0 (2.1.0 2013-10-04 JI) [x86_64-linux-gnu]

Linux henry 3.11.1-ck1 #1 SMP PREEMPT Sun Sep 22 17:47:30 MYT 2013 x86_64 Intel(R) Core(TM) i5-2410M CPU @ 2.30GHz GenuineIntel GNU/Linux

@dbussink
Contributor

dbussink commented Oct 7, 2013

Do you have a way for us to reproduce the problem? Just the stacktrace gives very little information and doesn't give enough to either solve or further investigate the problem. It might be a bug in Rubinius or Mysql2 or perhaps something completely different, but it's impossible to say what is going on with only this information.

@sayap
Author

sayap commented Oct 11, 2013

I will see if I can come up with a reproducible environment.

By the way, I did also encounter GC related crashes, such as:

Backtrace:
/opt/rubies/rubinius-2.0.0/bin/rbx[0x5ec9be]
/lib64/libpthread.so.0(+0x10b10)[0x7f61437cdb10]
/opt/rubies/rubinius-2.0.0/bin/rbx(_ZN8rubinius7ImmixGC10saw_objectEPNS_6ObjectE+0x4e)[0x75ccfe]
/opt/rubies/rubinius-2.0.0/bin/rbx(rb_gc_mark+0x37)[0x7482a7]
/opt/rubies/rubinius-2.0.0/bin/rbx(_ZN8rubinius4Data4Info4markEPNS_6ObjectERNS_10ObjectMarkE+0x6e)[0x6e5aee]
/opt/rubies/rubinius-2.0.0/bin/rbx(_ZN8rubinius16GarbageCollector11scan_objectEPNS_6ObjectE+0x8b)[0x758b8b]
/opt/rubies/rubinius-2.0.0/bin/rbx(_ZN8rubinius7ImmixGC18process_mark_stackEi+0x44)[0x75a424]
/opt/rubies/rubinius-2.0.0/bin/rbx(_ZN8rubinius11ImmixMarker7performEPNS_5StateE+0x11b)[0x75f8eb]
/opt/rubies/rubinius-2.0.0/bin/rbx(_ZN8rubinius18immix_marker_trampEPNS_5StateE+0x1e)[0x75fc7e]
/opt/rubies/rubinius-2.0.0/bin/rbx(_ZN8rubinius6Thread13in_new_threadEPv+0x462)[0x735cb2]
/lib64/libpthread.so.0(+0x8f3a)[0x7f61437c5f3a]
/lib64/libc.so.6(clone+0x6d)[0x7f6142bd59ad]

#2655 (comment) seems to have fixed such crashes. 50 rspec runs and no GC related crashes yet 👍 (mysql2 crashes still happen though...)

@dbussink
Contributor

@sayap Getting a repro would be great and would really help with debugging this, since otherwise it gets really tricky and ends up being guesswork about what's going on :). Seeing rb_gc_mark in backtraces usually indicates a problem with a C extension, so that could still be the same mysql2 issue (or another C extension).

@dbussink
Contributor

Do you think you can extract the issue for the mysql2 crash so we can investigate that?

@sodabrew
Contributor

In mysql2, we recently fought hard with Ruby GC and hoped that we'd won... @sayap Can you provide a repro script so that I can try to see the errors on my own?

@dbussink
Contributor

@sodabrew What kind of issues were those?

@sodabrew
Contributor

Some internal refcounting and RB_GC_GUARD usage to prevent the GC from freeing C structures out from under us. The mysql2 0.3.13 release has this work merged:
brianmario/mysql2#381
brianmario/mysql2#378

@sodabrew
Contributor

@dbussink I also just merged a fix for mysql connections that were left dangling after a GC run. However, the specs for it require the ability to call GC.start. @sayap could you try mysql2 master and see if it improves the crash situation?

@dbussink is there a way I can pass -Xvm.gc.honor_start for Travis? Does Rubinius have an environment variable I can set command line args perhaps? https://travis-ci.org/brianmario/mysql2/jobs/14403397

@dbussink
Contributor

@sodabrew You can use the RBXOPT environment variable for options like that.
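
For anyone trying this outside Travis, a minimal Ruby sketch of preparing RBXOPT for a child process might look like the following. The helper name `rbx_env` and the `rbx -S rspec` command line are assumptions for illustration only, and the option string is simply the one quoted earlier in this thread, not verified here.

```ruby
# Hypothetical helper (not part of Rubinius or mysql2): builds an environment
# hash that appends extra VM options to any RBXOPT value already present.
def rbx_env(extra_opts, base = ENV.to_h)
  existing = base.fetch("RBXOPT", "")
  merged = [existing, *extra_opts].reject(&:empty?).join(" ")
  { "RBXOPT" => merged }
end

# Start from an empty base environment so the result is predictable.
env = rbx_env(["-Xvm.gc.honor_start"], {})
puts env["RBXOPT"]

# To launch the specs with it (assumes an `rbx` binary is on PATH):
#   system(env, "rbx", "-S", "rspec")
```

Passing the hash as the first argument to `Kernel#system` sets the variable only for the child process, so the parent environment stays untouched.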

@dbussink
Contributor

@sodabrew I checked that spec also btw, and it seems pretty problematic to me to be honest. Nothing prevents a GC from happening at other points than where you do GC.start, so it could perhaps sporadically fail even when it's not a problem. Might be more of a theoretical problem, but things like this are always very tricky with specs like that.
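
One way to reduce that timing dependence, sketched here under assumptions, is to have a spec assert on explicit cleanup rather than on what a particular collector run reclaims. `FakeConnection` and `with_connection` are hypothetical stand-ins, not mysql2's real classes, and this does not replace a test of the GC-driven cleanup path itself.

```ruby
# Hypothetical stand-in for a native connection handle (not mysql2's API).
# It tracks how many handles are currently open at the class level.
class FakeConnection
  @open_count = 0

  class << self
    attr_accessor :open_count
  end

  def initialize
    self.class.open_count += 1
    @closed = false
  end

  # Idempotent close: calling it twice only decrements the counter once.
  def close
    return if @closed
    @closed = true
    self.class.open_count -= 1
  end

  def closed?
    @closed
  end
end

# Yields a connection and always closes it, even if the block raises,
# so the open count does not depend on when the GC happens to run.
def with_connection
  conn = FakeConnection.new
  yield conn
ensure
  conn.close if conn
end

with_connection { |conn| puts "open inside block: #{FakeConnection.open_count}" }
puts "open after block: #{FakeConnection.open_count}"
```

An assertion on `FakeConnection.open_count` after the block is deterministic regardless of when or whether the collector runs, which is the property the GC.start-based spec cannot guarantee.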

@sodabrew
Contributor

Got it, spec is passing now. Yes, other GC triggers are certainly a risk for that test, but it works reliably enough to catch and demonstrate the fix to the resource problem that was there before.

All that said, I re-read the original backtrace gist, and @sayap could you recompile mysql2 with debugging symbols (CFLAGS="-g") and then repro the crash once more? There's only a memory address within mysql2 and that's not enough to narrow down the problem (or even implicate mysql2 necessarily).

@dbussink
Contributor

I don't know if @sayap also uses nokogiri, but I also sent a pull request there to fix a crash bug.

@sayap
Author

sayap commented Nov 24, 2013

Still happens with latest rbx, latest mysql2, and nokogiri-1.5.10: https://gist.github.com/sayap/7623735

Will try recompiling mysql2 with debugging symbols, along with https://github.com/dbussink/nokogiri/commit/3fc79c0744d7a00da2d961eebcc99141a3fdf99c

Unfortunately, no reproducible environment yet...

@yorickpeterse
Contributor

@sayap Did you manage to figure out a way to reproduce this in the meantime?

@yorickpeterse
Contributor

Closing this one due to the lack of feedback. Feel free to re-open if the issue still persists when using Rubinius 2.5.3.

@sodabrew
Contributor

I'm starting to see this pretty often on Travis with rbx-2.5.7: https://gist.github.com/sodabrew/656e0f44d33d6b46dc1b

I have a bunch of these, coming from different places in the Ruby trace, but the C traces are all similar. Memory pressure or GC thresholds would be triggering a GC run, at which point the mark function is called on all objects, and mysql2 calls rb_gc_mark for things we're tracking. But I don't know what about the objects we're tracking might be causing the Rubinius implementation of rb_gc_mark to crash.

@yorickpeterse reopened this Jul 13, 2015
@sodabrew
Contributor

I guess, basically, "what would cause rb_gc_mark to segfault, and should I be testing against that condition before calling it?"

@yorickpeterse
Contributor

My guess is that either our GC is messed up or somehow rb_gc_mark is passing bogus data to the GC. GDB should reveal more information but requires being able to reproduce this outside of Travis. Having seen this pattern before I wouldn't be surprised if somehow a NULL is being passed around where it shouldn't be.

@yorickpeterse added the C-API Compatibility label and removed the Needs Feedback label Jul 13, 2015
@sodabrew
Contributor

I searched for other Rubinius bugs with the text "saw_object" and traced my way to sparklemotion/nokogiri#1047 - since mysql2 is also a C extension, the issues might be related? (Edit: oh derp, you even linked to nokogiri above).

@brixen
Member

brixen commented Jul 14, 2015

@sodabrew didn't you recently fix another GC-related issue in mysql2? Could this be related? Did you run this under Valgrind? Is this consistently reproducible when running the tests? Have you built Rubinius in debug mode (./configure --debug-build) and run this so you can inspect the Ruby Data object from which rb_gc_mark is being called?

@tamird
Contributor

tamird commented Jul 14, 2015

I built rbx in debug and reproduced

@yorickpeterse
Contributor

@tamird When this occurs could you please provide GDB backtraces as explained at http://rubini.us/doc/en/how-to/obtaining-gdb-backtraces/? Also if you could print the various variables in the offending call frame (using p some-variable-name-here) that would hopefully give more insight into what kind of data is being passed around.

@sodabrew
Contributor

I haven't seen any failures on Travis today - Is it possible that Rubinius 2.5.8 fixes this?

@tamird
Contributor

tamird commented Jul 16, 2015

Running Rubocop in the same VM that ran the mysql2 tests failed here: https://travis-ci.org/brianmario/mysql2/jobs/71173451

@sodabrew
Contributor

Yep, bummer. Here's a gist of that crash for future reference: https://gist.github.com/sodabrew/e7de223e019a390a3926

@brixen added the mysql2 label Jun 7, 2016
@brixen removed the mysql2 label Jan 4, 2020
@brixen
Member

brixen commented Jan 4, 2020

In general, the MRI C-API for Rubinius is deprecated, but that doesn't mean it will be going anywhere soon. Compatibility for C-extensions will continue to be evaluated on a case-by-case basis. However, C-extensions that depend on functions that examine or manipulate built-in MRI data structures will never be supported.

The focus for Rubinius in the near term is on the following capabilities:

  1. Instruction set
  2. Debugger
  3. Profiler
  4. Just-in-time compiler
  5. Concurrency
  6. Garbage collector

Contributions in the form of PRs for any of the areas of focus above are appreciated.

@brixen closed this as completed Jan 4, 2020
6 participants