Seg fault running job under Sidekiq #164

Closed
ruckus opened this Issue · 10 comments

3 participants

@ruckus

Environment

  • OS X 10.7.3
  • Ruby 1.9.3-p194 built using RVM and osx-gcc-installer (instead of clang as I saw another post from @mperham that suggested clang can build bad rubies)
  • Sidekiq 1.1.4

Problem: running my job under Sidekiq results in a seg fault that appears to be outside of sidekiq. Running the same job multiple times results in a new seg fault but in a different place, so the fault location is not consistent.

Running the job inline in the console (just calling perform directly) results in the job running successfully. So this seg fault only happens under sidekiq.

I realize that this seg fault appears to happen outside of sidekiq, but it does at least appear to be related to Celluloid.

Full gist with stacktrace here: https://gist.github.com/2513242

Sorry if this is not an issue with sidekiq and lies elsewhere. But like I said the code runs fine when not invoked via Sidekiq.

@mperham
Owner

That's a crash in Ruby's Enumerable#zip method. I realize that Sidekiq may "seem" at fault but Sidekiq is pure Ruby: it can't cause a segfault. The only things that can do that are native extensions or a bug in Ruby itself.

@ruckus

Ugh. Yeah, it seems to shift locations. Running again yields this location:

2012-04-27T21:30:40Z 6528 TID-ovhokyg04 INFO: IntuitSyncRequest MSG-ovhokygbs start
/Users/codyc/.rvm/gems/ruby-1.9.3-p194@vino/gems/activerecord-3.2.2/lib/active_record/connection_adapters/postgresql_adapter.rb:606: [BUG] Segmentation fault
ruby 1.9.3p194 (2012-04-20 revision 35410) [x86_64-darwin11.3.0]

-- Control frame information -----------------------------------------------
c:0061 p:---- s:0291 b:0291 l:000290 d:000290 CFUNC  :values
c:0060 p:0037 s:0288 b:0288 l:000287 d:000287 METHOD /Users/codyc/.rvm/gems/ruby-1.9.3-p194@vino/gems/activerecord-3.2.2/lib/active_record/connection_adapters/postgresql_adapter.rb

I'm not even sure where to begin to debug this and/or report it to the appropriate project.

@ruckus ruckus closed this
@ruckus

As a follow-up: as much as I love Sidekiq, this was a show-stopper for me and I ended up having to migrate to Resque. The same job runs fine under Resque. :(

I'd love to file this report in the appropriate project but it's not clear where the issue is.

@mperham
Owner

It's quite possible that you have a native gem that does not behave well in a multithreaded system. I like to think that Ruby is getting better about this but there are still popular gems out there that are definitely not thread-safe.
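One way to test the thread-safety theory without Sidekiq in the picture is to call the job's perform method from several plain Ruby threads at once. The sketch below is only an illustration: `FakeJob`, `run_concurrently`, and the thread count are hypothetical names and values, not part of Sidekiq, and the real worker class would take `FakeJob`'s place. If a native extension is misbehaving under threads, a run like this may crash the same way the Sidekiq run does.

```ruby
require 'thread'

# Hypothetical stand-in for the real worker; swap in your own job class.
class FakeJob
  def perform(id)
    id * 2
  end
end

# Call job_class#perform for each argument list, spread over `concurrency`
# plain Ruby threads, and return the collected results (in no fixed order).
def run_concurrently(job_class, args_list, concurrency = 10)
  pending = Queue.new
  args_list.each { |args| pending << args }
  results = Queue.new

  threads = Array.new(concurrency) do
    Thread.new do
      loop do
        args = begin
          pending.pop(true) # non-blocking pop; raises ThreadError when empty
        rescue ThreadError
          break             # no work left, let this thread finish
        end
        results << job_class.new.perform(*args)
      end
    end
  end

  threads.each(&:join)
  Array.new(results.size) { results.pop }
end

outputs = run_concurrently(FakeJob, (1..50).map { |i| [i] }, 8)
puts "completed #{outputs.size} jobs"
```

A clean run here does not prove the gems are thread-safe, but a crash would point at the job's dependencies rather than at Sidekiq.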

Could you share the set of gems you use which have native extensions?

@ruckus

Of course.

Here are all my gems with native extensions:


Installing bcrypt-ruby (3.0.1) with native extensions 
Installing json (1.7.0) with native extensions 
Installing fastthread (1.0.7) with native extensions 
Installing nokogiri (1.5.2) with native extensions 
Installing hiredis (0.4.5) with native extensions 
Installing libxml-ruby (2.3.2) with native extensions 
Installing pg (0.13.2) with native extensions 
Installing therubyracer (0.10.1) with native extensions 
Installing yajl-ruby (1.1.0) with native extensions 

Thanks again. Sidekiq is awesome. You're doing great work.
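A list like the one above can also be pulled from RubyGems at runtime instead of from installer output. This is a minimal sketch, assuming it is run inside the application (for example from the Rails console) so that `Gem.loaded_specs` reflects the gems the app actually loads; `native_extension_gems` is a hypothetical helper name.

```ruby
require 'rubygems'

# Return "name (version)" for every gem spec that declares native extensions.
# `specs` is any collection of objects responding to name, version, extensions.
def native_extension_gems(specs)
  with_extensions = specs.select { |spec| !spec.extensions.empty? }
  with_extensions.map { |spec| "#{spec.name} (#{spec.version})" }.sort
end

# Gem.loaded_specs maps gem names to the specs activated in this process.
puts native_extension_gems(Gem.loaded_specs.values)
```

Only gems activated in the current process are reported, so gems that are installed but never loaded will not appear.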

@mperham
Owner

fastthread is legacy and only useful for Ruby 1.8.6 and older. It's definitely not appropriate for a 1.9 application. I believe therubyracer is used for asset pipeline compilation and not runtime.

Anyhow, thanks for the list.

@fred

@mperham I'm also having segmentation faults now, mostly coming from the pg_ext gem.
Here is the full trace: http://pastebin.com/gCyc5LrW

@fred

OK, I'm back on sidekiq 2.0.3 and celluloid 0.11.0:

gem 'sidekiq', '2.0.3'
gem 'celluloid', '0.11.0'

no more segfaults, I think it was celluloid 0.11.1 causing it.

@mperham
Owner

@fred I would register a bug with pg_ext and ruby. Celluloid and Sidekiq are pure ruby - they should not be able to crash a VM.
