Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Could Sidekiq use concurrent-ruby? #2583

Closed
mperham opened this Issue Oct 2, 2015 · 24 comments

Comments

Projects
None yet
@mperham
Copy link
Owner

mperham commented Oct 2, 2015

Celluloid is a big dependency and to be perfectly frank, I really don't understand it very well. Is concurrent-ruby simpler and easier to understand? Can it replace Celluloid for Sidekiq's simple needs? Anyone interested in porting Sidekiq to use concurrent-ruby and see how well it works?

@davydovanton

This comment has been minimized.

Copy link
Collaborator

davydovanton commented Oct 2, 2015

Also I know that concurrent-ruby actors faster than Celluloid actors (link)

@HoneyryderChuck

This comment has been minimized.

Copy link

HoneyryderChuck commented Oct 2, 2015

would help to know the main drawbacks of having celluloid as a main dependency, and what you perceive as the biggest benefits of switching.

@mperham

This comment has been minimized.

Copy link
Owner Author

mperham commented Oct 2, 2015

The performance looks great compared to Celluloid but the code/API is not idiomatic Ruby at all. One really nice thing about Celluloid is that it's core Actor API fits in so well with Ruby's OOP: the message is an arbitrary method call, whereas with concurrent-ruby's API, you have to explicitly create, send and process messages yourself. For example, here's Celluloid:

class SomeActor
  include Celluloid

  def initialize(count)
    @a = count
  end

  def incr(amount)
    @a += amount
  end
end

c = SomeActor.new(0)
c.async.incr(2)
c.terminate

Here's concurrent-ruby:

Message = Struct.new(:action, :value)

class SomeActor < Concurrent::Actor::RestartingContext
  def initialize(count)
    @a = count
  end

  def on_message(msg)
    case msg.action
    when :incr
      @a += msg.value
    end
  end
end

count = SomeActor.spawn(:counter, 0)
count.tell(Message.new(:incr, 2))
count.ask!(:terminate!) 

My take: Celluloid is a much nicer redesign of the Actor pattern into an OOP language. concurrent-ruby's Actor API appears to be a much more traditional Erlang-inspired API but does not feel "right". That bare metal approach will get better performance but at the expense of developer friendliness.

@mperham

This comment has been minimized.

Copy link
Owner Author

mperham commented Oct 2, 2015

@TiagoCardoso1983 I've been a little down on Celluloid recently because of the explosion in little gems it pulls in. I don't want a half-dozen little gems with no documentation about why I want/don't want a particular gem:

  celluloid-essentials (>= 0)
  celluloid-extras (>= 0)
  celluloid-fsm (>= 0)
  celluloid-pool (>= 0)
  celluloid-supervision (>= 0)
  timers (>= 4.1.1)

Which of these does Sidekiq need? I have no idea. Why are the gems broken down like this? I have no idea. But I do know that now it's harder to trace through code that's split across 6 gems.

@searls

This comment has been minimized.

Copy link

searls commented Oct 2, 2015

I suggest we yank @jdantonio in here to explain why concurrent-ruby is awesome 💚

@mperham

This comment has been minimized.

Copy link
Owner Author

mperham commented Oct 2, 2015

Also, note that concurrent-ruby actors cannot perform blocking I/O because they use a shared pool of Threads, instead you must split your Actors into compute and I/O instances? Seems even more developer-unfriendly. :-(

https://github.com/ruby-concurrency/concurrent-ruby/blob/master/doc/actor/main.md#io-cooperation

@searls I'm definitely not trying to slag c-r here but giving it an honest evaluation. An idiomatic Ruby API is very important to me. I know that c-r's Actor API is still young and under development and many of these issues can be fixed.

@jdantonio

This comment has been minimized.

Copy link
Contributor

jdantonio commented Oct 2, 2015

Although an actor-to-actor comparison makes sense on the surface, one of the advantages of c-r is that we offer multiple high-level abstractions. Sidekiq already provides a great high-level API for the end-user. Internally it can use whatever abstraction makes the most sense. For example, the new AsyncJob in Rails 5 is a functional replacement for sucker_punch and it posts directly to a thread pool, skipping over all the overhead that actors entail. See here. The relevant bit:

      def enqueue(job_data, queue: 'default') #:nodoc:
        QUEUES[queue].post(job_data) { |job| ActiveJob::Base.execute(job) }
      end

QUEUES is nothing but a collection of thread pools where where each queue is a different pool:

    QUEUES = Concurrent::Map.new do |hash, queue_name| #:nodoc:
      hash.compute_if_absent(queue_name) { ActiveJob::AsyncJob.create_thread_pool }
    end
@jdantonio

This comment has been minimized.

Copy link
Contributor

jdantonio commented Oct 2, 2015

@mperham All of our high-level abstractions support thread pool configuration via dependency injection. The patter of having one global thread pool for IO-intensive tasks and another for fast, non-blocking tasks is a patter we borrow from Clojure. Nothing more.

@jdantonio

This comment has been minimized.

Copy link
Contributor

jdantonio commented Oct 2, 2015

@mperham I'm not going to lie--I've looked at the source of Sidekiq and given a lot of thought to how c-r could be used instead of Celluloid. I'm happy to spike something out for you to look at.

@mperham

This comment has been minimized.

Copy link
Owner Author

mperham commented Oct 2, 2015

@jdantonio I see, so I could have a thread pool for one type of Actor so that Actor instances (e.g. Sidekiq::Processor) could perform blocking I/O without worry? Rad. Could I size the default thread pool size if I know how many Actors I'll have and that way every thread can do blocking I/O without issue?

Sidekiq has 3-4 singleton Actors (Manager, Fetcher, Scheduler, etc) + a set of Processors. I prefer to pre-allocate a Thread for each because the system does not dynamically allocate Actors. If you set Sidekiq concurrency to 20, you'll get about 25 Actor instances in the system.

@mperham

This comment has been minimized.

Copy link
Owner Author

mperham commented Oct 2, 2015

@jdantonio Are you going to Rubyconf? We could pair there one night and spike something if this doesn't happen before.

@jdantonio

This comment has been minimized.

Copy link
Contributor

jdantonio commented Oct 2, 2015

@mperham Yes to all the above. Not only can you configure the global thread pools, but we also have SingleThreadExecutor for when you want a single thread (another idea borrowed from Java). A STE is basically a pool of 1 with a queue. Internally it monitors the thread so should something happen to it, it will be restarted. So that's a very good option for those singleton instances.

I'm attending RubyConf (and speaking on Mon "Everything You Know About the GIL is Wrong"). I'd love to pair with you. Another member of the team, @pitr-ch will be attending and speaking as well.

@mperham

This comment has been minimized.

Copy link
Owner Author

mperham commented Oct 2, 2015

@jdantonio Sounds good, let's meet up at Rubyconf and see what needs to be done then.

@jdantonio

This comment has been minimized.

Copy link
Contributor

jdantonio commented Oct 2, 2015

👍

@pitr-ch

This comment has been minimized.

Copy link

pitr-ch commented Oct 2, 2015

Regarding making the actors to feel more OOP, see what @iNecas did here. @mperham great news that you are considering c-r :) I would love to join the discussion on RubyConf. Most of the actor implementation is my contribution so I should be able to answer any questions you might have.

@HoneyryderChuck

This comment has been minimized.

Copy link

HoneyryderChuck commented Oct 2, 2015

@mperham the explosion in little gem means that celluloid team saw (finally) that they're providing too many features inside the celluloid "framework-not-gem" (what do futures have to do with the actor pattern?). I think that they're separating now and bundling all together for backwards compatibility, in the future you only add what you really need. Which in your case, means the basic celluloid (no extras, no fsm, no pool, no supervision), since sidekiq actors mainly set timer events and call async functions.

Things you maybe should be using already: supervisor for your processor actors (you must have had your reasons not to though).

Things you should guard against and (I think) c-r will not help you with: random use of the Timeout module. For that, I'd suggest https://github.com/celluloid/timeout-extensions and overwrite of Timeout::timeout and Kernel.sleep with the proper celluloid implementations. Been using this for over a year with my celluloid-io project without issues.

Non-blocking IO support is interesting, but not really the main focus of sidekiq, right? Do you want your n workers to perform m tasks, in general? Sounds like unexpected behaviour. In the end, the libraries used in the job will define and will have to opt-in in this behaviour (celluloid-io solves this already. one could provide examples of setting one's non-blocking network fetcher inside a sidekiq worker, though).

Celluloid has IMO three main drawbacks: method dispatching (awfully slow), time spent in futex syscalls (reason why celluloid-io doesn't match eventmachine in performance, even if the code is more maintainable) and byzantine backtraces. And these haven't changed in 0.17 . I don't know c-r actor implementation that well to know whether these issues are solved there, but the benchmarks do look good.

@halorgium

This comment has been minimized.

Copy link
Contributor

halorgium commented Oct 2, 2015

@mperham I agree with @jdantonio. I have switched some simple code away from Celluloid because the gains from the abstractions were outweighed by the hidden complexity.
Sidekiq is a pretty static system, so having a few thread-pools and queues would be enough to achieve the same behaviour.
I would love to see an prototype of Sidekiq on c-r. I have a feeling that the heavy weight associated with Celluloid and also the workarounds needed because of some abstraction issues may be eliminated by using lower-level primitives.
Good luck!

@mperham

This comment has been minimized.

Copy link
Owner Author

mperham commented Oct 2, 2015

@halorgium I think you're right. Sidekiq should be simple enough to port to a simpler abstraction. We'll see what if anything comes of it.

@tarcieri

This comment has been minimized.

Copy link
Collaborator

tarcieri commented Oct 3, 2015

As the author of Celluloid and a user of concurrent-ruby, I have some Opinions™

If you want to stay on Celluloid, ideally there would be a single gem you could use for the purposes of Sidekiq. I'm not sure the current Celluloid gem structure offers that, but I think you really just need celluloid-core or something to that effect.

I think for what Sidekiq is actually doing, you want a ThreadPoolExecutor ala java.util.concurrent.ThreadPoolExecutor. You really don't need actors. So really any "actor" benchmarks are kind of a red herring.

We've tried to move apps at Square from Celluloid pools to concurrent-ruby ThreadPoolExecutors. These were actually unacceptably slow on MRI. Things may have changed.

I was working for awhile on a stream processing library based on concurrent-ruby called Turbine. In the course of developing this library, I found a lot of slow, crazy code gluing everything together, escalated to @headius, and eventually got things fixed upstream (although with no real trail of the problems, investigation, who contributed the fixes, or effectively any information about what happened). I am still encountering various thread-safety problems on Turbine, and have basically abandoned the project at this point.

We tried to switch from Celluloid::Pools to concurrent-ruby ThreadPoolExecutors at Square and saw huge performance regressions.

Ultimately I'd like to say that I think concurrent-ruby is a better fit for Sidekiq, and you should use their ThreadPoolExecutor, and that this would give you massive performance gains on JRuby, but I have personally had a really bad time trying to use concurrent-ruby. I know there has been a long history of bugs and code quality problems in Celluloid too, but when it comes down to it concurrency is a really, really, really hard problem.

I think the outstanding issues I've just been complaining about have been mostly resolved.

That said, if you try to move projects from Celluloid to concurrent-ruby, I expect the semantic change will break a lot of people's jobs.

Celluloid implements "flexible object interactions" as described in section 4.4.2 of the ATOM paper. I think if anyone is actually leveraging this behavior, they'll get deadlocks with a ThreadPoolExecutor-style model:

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.47.5074&rep=rep1&type=pdf

Anyway, not to speak too harshly of concurrent-ruby: it's a good project implementing lightweight concurrency primitives, and I don't think Sidekiq fundamentally needs anything more than a ThreadPoolExecutor.

That said, my (basically abandoned) attempt at building an executor based on that pattern isn't thread-safe. So good luck!

@jdantonio

This comment has been minimized.

Copy link
Contributor

jdantonio commented Oct 3, 2015

Thank you for the feedback @tarcieri. I wasn't aware of your efforts to use concurrent-ruby. I would have been happy to offer you assistance had I known, but you never submit any Issues to us nor contacted anyone on the team. In all of the performance tests we've done, implementations using c-r significantly outperform Celluloid--even on MRI--so I don't know why your experience was different. If you choose to work with c-r in the future please don't hesitate to reach out to me. I'd love an opportunity to work with you.

All:

Two things we need to keep in mind: 1) c-r provides many high level and low level abstractions, including thread pools and actors. This isn't a binary choice between the two. 2) Sidekiq users don't interact with Celluloid directly, they interact with a Sidekiq class which uses Celluloid.

The question that we need to explore is this: Can the behavior and guarantees of Sidekiq be retained using an internal implementation that uses c-r instead of Celluloid. I believe the answer is yes, and will spike some ideas. If I get something working, then it's a matter of running the code side-by-side on multiple platforms to compare performance and stability.

@brandonhilkert

This comment has been minimized.

Copy link
Contributor

brandonhilkert commented Oct 3, 2015

As the author/maintainer of sucker_punch, it's awesome to see alternatives crop up.

A bit of potentially unwarranted feedback, I was curious what the equivalent would look like with concurrent-ruby and the API wasn't terribly intuitive. But honestly, I don't think this level of complexity comes natural to me so it may take some time to let it simmer.

Alternatively, I've been putting off upgrading sucker_punch to celluloid ~ 0.17.x because of all the changes. It's feels like a moving target with the releases that were yanked and the recent memory leaks.

Sucker Punch was meant to be a wrapper for celluloid because of the use of it as a background job queue wasn't completely intuitive to less experienced developers IMO.

Seems like there's a place for each of these tools. Cheers and thanks for all ya'lls hard work!

@mperham

This comment has been minimized.

Copy link
Owner Author

mperham commented Oct 3, 2015

For anyone interested, I've started a spike to remove Celluloid. I've finished porting the Fetcher and Scheduler to use raw threads but the tough one will be the Manager, which performs more distinct operations than those two. We'll see if I need to pull in c-r; I tend to take a dim view of dependencies these days.

https://github.com/mperham/sidekiq/compare/concurrent-ruby

@mhenrixon

This comment has been minimized.

Copy link
Contributor

mhenrixon commented Oct 8, 2015

👍

@saouddk

This comment has been minimized.

Copy link

saouddk commented Nov 6, 2015

Awesome performance increase with c-r.

@mperham mperham closed this Nov 6, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.