Provide an optional threaded background worker #318

dsander · 2014-05-11T22:26:51Z

Due to the three background workers (scheduler, twitter stream and delayed job) huginn needs a lot of memory to run (about 520MB on my dev machine). This PR introduces an optional threaded background worker which combines the three current separated processed into one (reducing the memory footprint to ~260MB).

Since just one instance of the of every background processor is running at a time there should not be any threading related issues.

The main gotcha of this is, that it's most likely not possible to run multiple delayed job workers concurrently. The ultimate solution would probably be switching to sidekiq with sidetiq as scheduler, but that is a different task :)

When running on MRI the GIL should not be an issue because it is released for most IO
bound operations (waiting for the database/website/sleeping).

Due to the three background workers (scheduler, twitter stream and delayed job) huginn needs a lot of memory to run (about 520MB on my dev machine). This PR introduces an optional threaded background worker which combines the three current separated processed into one (reducing the memory footprint to ~260MB). Since just one instance of the of every background processor is running at a time there should not be any threading related issues. The main gotcha of this is, that it's most likely not possible to run multiple delayed job workers concurrently. The ultimate solution would probably be switching to sidekiq with sidetiq as scheduler, but that is a different task :) When running on MRI the GIL should not be an issue because it is released for most IO bound operations (waiting for the database/website/sleeping).

0xdevalias · 2014-05-12T14:03:06Z

@dsander The ultimate solution would probably be switching to sidekiq with sidetiq as scheduler, but that is a different task

I've actually been pondering (and a couple of quick googles) on whether there was a 'better' scheduling system that could be used. I was thinking along the lines of resque (partly because it's designed for distributed stuff, partly because it's designed to handle a lot of volume, mostly because I fanboi github)

dsander · 2014-05-12T17:11:11Z

@alias1 Resque would also be a good replacement for delayed job. The benefit of sidekiq is that it is also threaded, that way multiple workers consume consume less RAM

cantino · 2014-05-12T20:41:24Z

One of the downsides I see with switching away from delayed_job is that jobs stop being transactional. Right now, when a job is enqueued in DJ by a ActiveRecord hook, that job won't run until the surrounding transaction is committed. If an error causes the transaction to roll back, the DJ will correctly never be scheduled. With other systems, such as sidekiq and resque, these transactional semantics are not present.

How important do we feel this is?

dsander · 2014-05-12T20:55:24Z

Oh, I was not aware of the transactions. Actually i just mentioned sidekiq because it fit the threading context of the PR. The polling architecture of DJ might be a bit out of date, but other then that it does it's job well. That is also why I just merged the three background processes we have now into one and kept DJ as it is.
I think that right now there are more important tasks like adding OAuth consumer, agent scenarios or moving agents to gems.

cantino · 2014-05-13T05:37:32Z

Where there any changes in scheduler or twitter_stream besides pulling into libraries?

cantino · 2014-05-13T05:38:21Z

Also, the twitter_stream is using eventmachine. I guess that's okay in a thread?

0xdevalias · 2014-05-13T08:45:17Z

@dsander I think that right now there are more important tasks like adding OAuth consumer, agent scenarios or moving agents to gems.

Definitely agree.

Also, on the resque side of things, it isn't really super relevant (and might be overkill given I believe it requires redis and stuff) for a self deployed/single user type Huginn setup. Where it would become really interesting is with massive scale, and/or huge numbers of distributed agents.

dsander · 2014-05-14T18:57:50Z

lib/huginn_scheduler.rb

+  def stop
+    @rufus_scheduler.stop
+  end
+


@cantino Here I added the stop method to be able to stop the scheduler from the threaded script

dsander · 2014-05-14T19:04:24Z

I do not think running eventmachine within a thread will cause any issues. The twitter stream is running in its own thread, so it should really be just like running in a process.

cantino · 2014-05-15T05:59:15Z

Do you think we should just switch to this entirely, or leave it commented out by default? I'm tempted to switch.

cantino · 2014-05-15T06:00:43Z

bin/threaded.rb

+end
+
+# We need to wait a bit to let delayed_job set it's traps so we can override them
+sleep 0.5


This feels like it could race on a slow machine, unfortunately.

True, the problem is that DJ sets it's traps in the start method https://github.com/collectiveidea/delayed_job/blob/master/lib/delayed/worker.rb#L136-L146
I will check if there is a way to ensure our traps are set after DJs

dsander · 2014-05-15T13:26:47Z

Do you think we should just switch to this entirely, or leave it commented out by default? I'm tempted to switch.

I think once a few people tried this out and report no issues we could be safe to switch entirely.

0xdevalias · 2014-05-16T01:29:22Z

What's the benefit/risk of switching entirely? I'm not fully up on what's happening, but in layman a what's the gain from it?—
Sent from Mailbox

On Thu, May 15, 2014 at 11:26 PM, Dominik Sander notifications@github.com
wrote:

Do you think we should just switch to this entirely, or leave it commented out by default? I'm tempted to switch.

I think once a few people tried this out and report no issues we could be safe to switch entirely.

Reply to this email directly or view it on GitHub:
#318 (comment)

dsander · 2014-05-16T13:26:46Z

The risk is that we can not be 100% sure that everything will work in the threaded context. But that can be tested by just letting the background workers run for a few days.
The benefit is that the memory usage is cut in half.

0xdevalias · 2014-05-16T14:15:09Z

Cool :)

0xdevalias · 2014-05-26T00:27:59Z

@dsander Does this change how things would be started up from foreman export? I have a feeling this might make my issue on #339 (unicorns not dying properly) easier?

dsander · 2014-05-26T08:02:33Z

@alias1 I don't think so because this does not change the way foreman starts or stops unicorn

0xdevalias · 2014-05-26T08:16:28Z

Oh ok. Figured it might have smooshed all the workers into the main thread.

coveralls · 2014-06-01T10:25:28Z

Coverage remained the same when pulling df90322 on dsander:threaded-background-workers into 44853e4 on cantino:master.

dsander · 2014-06-01T22:01:39Z

@cantino The threaded worker is now the default, are we confident to enough to merge this?

cantino · 2014-06-02T05:07:44Z

Well, it's been working fine for me. Have you had any issues?

dsander · 2014-06-02T08:19:04Z

Nope it worked without issues so far.

cantino · 2014-06-02T16:15:57Z

Let's go for it. First step in getting to an easy Heroku free deploy!

On Monday, June 2, 2014, Dominik Sander notifications@github.com wrote:

Nope it worked without issues so far.

—
Reply to this email directly or view it on GitHub
#318 (comment).

dsander · 2014-06-02T17:21:39Z

Ok, merging now 😄

Provide a new threaded background worker

Use threaded workers in the chef production Procfile

ec0ec30

dsander added enhancement labels May 13, 2014

dsander reviewed May 14, 2014
View reviewed changes

cantino reviewed May 15, 2014
View reviewed changes

dsander added 2 commits June 1, 2014 12:12

Merge branch 'master' into threaded-background-workers

a505a1c

Threaded background worker now is the default

df90322

dsander mentioned this pull request Jun 2, 2014

RESTful polling #52

Open

dsander added a commit that referenced this pull request Jun 2, 2014

Merge pull request #318 from dsander/threaded-background-workers

577b95f

Provide a new threaded background worker

dsander merged commit 577b95f into huginn:master Jun 2, 2014

dsander deleted the threaded-background-workers branch June 2, 2014 17:22

DataMinerUK pushed a commit to DataMinerUK/huginn that referenced this pull request Oct 6, 2014

Merge pull request huginn#318 from dsander/threaded-background-workers

ec8e8fc

Provide a new threaded background worker

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Provide an optional threaded background worker #318

Provide an optional threaded background worker #318

dsander commented May 11, 2014

0xdevalias commented May 12, 2014

dsander commented May 12, 2014

cantino commented May 12, 2014

dsander commented May 12, 2014

cantino commented May 13, 2014

cantino commented May 13, 2014

0xdevalias commented May 13, 2014

dsander May 14, 2014

dsander commented May 14, 2014

cantino commented May 15, 2014

cantino May 15, 2014

dsander May 15, 2014

dsander commented May 15, 2014

0xdevalias commented May 16, 2014

I think once a few people tried this out and report no issues we could be safe to switch entirely.

dsander commented May 16, 2014

0xdevalias commented May 16, 2014

0xdevalias commented May 26, 2014

dsander commented May 26, 2014

0xdevalias commented May 26, 2014

coveralls commented Jun 1, 2014

dsander commented Jun 1, 2014

cantino commented Jun 2, 2014

dsander commented Jun 2, 2014

cantino commented Jun 2, 2014

dsander commented Jun 2, 2014

Provide an optional threaded background worker #318

Provide an optional threaded background worker #318

Conversation

dsander commented May 11, 2014

0xdevalias commented May 12, 2014

dsander commented May 12, 2014

cantino commented May 12, 2014

dsander commented May 12, 2014

cantino commented May 13, 2014

cantino commented May 13, 2014

0xdevalias commented May 13, 2014

dsander May 14, 2014

Choose a reason for hiding this comment

dsander commented May 14, 2014

cantino commented May 15, 2014

cantino May 15, 2014

Choose a reason for hiding this comment

dsander May 15, 2014

Choose a reason for hiding this comment

dsander commented May 15, 2014

0xdevalias commented May 16, 2014

I think once a few people tried this out and report no issues we could be safe to switch entirely.

dsander commented May 16, 2014

0xdevalias commented May 16, 2014

0xdevalias commented May 26, 2014

dsander commented May 26, 2014

0xdevalias commented May 26, 2014

coveralls commented Jun 1, 2014

dsander commented Jun 1, 2014

cantino commented Jun 2, 2014

dsander commented Jun 2, 2014

cantino commented Jun 2, 2014

dsander commented Jun 2, 2014