dj: memory leak #978
Comments
Hey @K1773R, that's frustrating! I haven't noticed that. It's possible that upstart is killing and restarting them for me when RAM runs low, but I haven't observed that. Are you running any unusual Agents? I suspect it's some Agent that I don't use often. |
Are the ones I listed unusual agents? |
XMPP seems the most likely. If you disable it, does memory stabilize? |
It leaks even when there are no XMPP agents... |
That's very odd. @dsander, have you seen memory issues with DJ? |
@cantino I did not really monitor the memory usage on my server, but I doubt the issue is DJ itself. @K1773R debugging memory leaks in Ruby is a bit annoying; if you notice the leak within a few minutes, the quickest way would probably be to disable all but one agent type, restart your worker, and check if the leak still exists. If that is not an option and you are using 2.1+, I will try to add some sort of memory profiling option tomorrow. |
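A minimal sketch of that kind of periodic object counting, using only Ruby's built-in `GC` and `ObjectSpace` (the log path and the way it gets invoked are assumptions for illustration, not part of Huginn or any branch):

```ruby
# memory_snapshot.rb -- illustrative sketch, not part of Huginn.
# Appends one JSON line per snapshot; object counts that keep growing
# across snapshots point at retained (leaked) objects.
require 'json'
require 'time'

def log_memory_snapshot(path = 'log/memory_snapshots.log')
  GC.start # collect first, so only retained objects are counted
  snapshot = {
    time:    Time.now.utc.iso8601,
    rss_kb:  `ps -o rss= -p #{Process.pid}`.to_i, # resident set size, in KB
    objects: ObjectSpace.count_objects             # counts per object type
  }
  File.open(path, 'a') { |f| f.puts(snapshot.to_json) }
end

log_memory_snapshot # e.g. call this every 5 minutes from a scheduler hook
```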
I'm using Ruby 2.2.2p95 |
I may have just run into this (where 'just' == sometime early last evening). Two virtual CPUs, four job runners, and both jobs and the web user interface died with a signal 137 after staying online for something like four days running. I had to kill the delayed_job processes and restart from scratch to get Huginn running again. As an experiment I've cut my delayed job fleet to three to see how long it takes to run out of RAM. Ruby v2.1.5p273. |
Did this issue start recently? |
The hang-up happened some time last night (maybe 1900 hours PDT). I last updated my running install on 19 August (commit c3e9545), after which memory utilization started creeping up until something broke. Going through log/development.log, there isn't anything in there because the error message was emitted by the shell Huginn was started from. It didn't say which process received the out-of-memory signal, though. I can fire up a few more DJs to artificially use up RAM and see what part falls over if it would help. |
What commit were you running before this, that seemed more stable? And are you running in RAILS_ENV=production or RAILS_ENV=development? |
I was running commit 9e8f9bf in development mode. Still in development mode. |
I have a stack trace from two days ago, from approximately the time of the crash. I don't know if it'll help debug this issue (I'm working to reproduce it independently, which is why I jumped into this ticket's discussion thread) but I can throw it on pastebin if it'll help. |
Sure, let's see it. |
Well, we shouldn't be running Huginn in development mode for production use, but there was really very little code change between your two commits. The only thing we did that I could imagine affecting RAM was to remove Spring. You should ensure Spring isn't still running. |
Here it is: http://pastebin.com/TcBR22tq Removing Spring has done a world of good - things are much more stable now! I've been keeping an eye on it, and it appears to be gone for good. |
That looks like the box running out of RAM or disk. Was this correlated with ballooning RAM? If so, do you know how high it got? |
That was my analysis also - the amount of free RAM hit zero and the OOM killer took out a process. I don't know which one because this is all the kernel message buffer has in it:
So, some aspect of Huginn got killed but it's not clear which one it was. I'm trying to recreate this problem on my instance to help debug it, so I just captured the ruby processes running right now, and if one mysteriously dies I can compare the lists and see which one was the victim. |
@K1773R I updated my memory profile branch. If you import this scenario, the MemoryProfileAgent will count the allocated objects every 5 minutes and send the data to a PeakDetectorAgent; when a peak is detected, a memory dump will be written to the configured directory. It works with multiple background workers, but the results will probably look a bit strange because the job will run on a random worker process. To get a general overview (see how many objects are leaked over time) it would probably be best to run just one worker for a few hours. To analyze a memory dump, a small script is added to the bin directory: it gives an overview of how many objects have been allocated (and still are) in each garbage collection generation. I ran the branch overnight on my production server and also see a small leak of objects, but was not able to find the source; if you are running more agents the results might be more conclusive. |
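The actual agent and analysis script live in dsander's branch; as a rough sketch of the per-generation analysis described above, a heap dump written with `ObjectSpace.dump_all` can be grouped by GC generation roughly like this (file names and the invocation are assumptions):

```ruby
# analyze_heap.rb -- rough sketch, not the script from the memory profile branch.
# A dump can be produced with:
#   require 'objspace'
#   ObjectSpace.trace_object_allocations_start
#   ... run the workload ...
#   ObjectSpace.dump_all(output: File.open('tmp/memory_dump.json', 'w'))
# Each line of the dump is a JSON object; "generation" is the GC run in which
# the object was allocated. A generation whose count keeps growing between
# dumps is a good place to look for the leak.
require 'json'

generations = Hash.new(0)

File.foreach(ARGV.fetch(0, 'tmp/memory_dump.json')) do |line|
  begin
    obj = JSON.parse(line)
    generations[obj['generation'].to_i] += 1
  rescue JSON::ParserError
    next # skip malformed lines
  end
end

generations.sort.each do |generation, count|
  puts format('generation %5d: %8d objects', generation, count)
end
```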
@dsander switching to a single worker is currently not really an option. |
My branch is based on the current master branch and just adds the memory profile agent. As long as you delete that agent before you switch back there should not be any problems. |
Data!
dj.1 was killed with a signal 137 (out of memory). The `system` process detected this and sent SIGTERMs to the web server, scheduler, Twitter subsystem, and dj0.1 and dj2.1. However, not everything is responding to SIGTERM correctly. From my currently running process list:
|
How up to date is @dsander's fork of the codebase? |
I'm trying to recreate the OOM problem by force by manually starting DJs. I'm capturing stdout to a logfile. When I have something I'll comment on this issue. |
When jobs aren't running, memory utilization doesn't change (e.g., 485 MB free and holding). I'm up to 9 delayed job runners now with my usual agent load of approximately 750 agents. |
Results! I'll post what I got in separate Screen terminals in successive comments.
|
Nothing emitted on stderr. Nothing in ~/stderr.log. |
In one of the extra workers I'd spawned:
|
The same stack trace happened in all of the extra DJs I spawned to force this to happen. It killed almost all of the spares I spawned to test this; a few survived because they were running at the time the others died, but now that the scheduler's died it's sitting there waiting for things to do. Looking at the process list:
|
You need to watch the disk usage though; by default the PeakDetectorAgent in the scenario will trigger a dump for every detected peak. I ran out of space on my root disk once 😄 |
Thanks! Just added a monitor to keep an eye on that. |
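For anyone else running the profiling scenario, a trivial guard like the following is one way to avoid filling the disk with dumps; the dump directory and threshold are assumptions, and this is not part of the branch:

```ruby
# Returns false when the filesystem holding the dump directory is low on space.
# Relies on portable `df -Pk` output (sizes in 1 KB blocks, column 4 = free).
def enough_disk_for_dump?(dir = 'tmp/memory_dumps', min_free_mb = 500)
  free_mb = `df -Pk #{dir}`.lines.last.split[3].to_i / 1024
  free_mb >= min_free_mb
end

warn 'skipping memory dump, disk is nearly full' unless enough_disk_for_dump?
```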
I'm seeing the memory leak too after a few days running. I suspect it may be the cause of some of my crashes. According to #1114, this will happen with a completely empty agent too, which makes me suspect DJ or the scheduler? |
My instance (now running with the multithreaded scheduler) is doing the same thing. I've had two lockups in as many days. Until I purged the pending jobs this morning, I had a backlog of 7900 jobs waiting to run. I'm interested to see if commit 07e2d9c fixes it. |
I'm reading in the Heroku discussion forum that some people were experiencing memory leaks running the latest versions of Ruby. Has anybody tried running Huginn with Ruby 2.0.0? More info: |
The leak's happening in v2.1.5, too. |
Are you still seeing a steady climb of memory? |
Yes. However, JabberAgent's been locking up Huginn way more than OOMs have been killing Huginn lately (two weeks or thereabouts). |
@virtadpt thanks for telling me; I was just about to update Huginn. A lockup due to JabberAgent would be horrible for me. I'll try as soon as the problem with JabberAgent is fixed. |
Pretty sure it is not an issue with the Jabber agent in general; as far as I can remember we did not change it for at least a few months, and I am using it as a receiver (JabberAgent::Worker) without any issues. |
Insofar as the memory leak, I think the article that @jn11 linked to has a lot to do with it. I'm using JabberAgent as a sender, which might be the problem. But that's a separate issue (literally). |
Thought of this issue when I read this: How Ruby 2.2 can cause an out-of-memory server crash |
@brianpetro versions below 2.2 are affected too |
I think this would only happen if recursion was interrupted by a signal, so the GC got disabled? Then we wouldn't see a slow climb in memory, but probably a fairly quick one? But I could totally be wrong. If you're having memory issues, would you mind trying the newer version of Ruby and seeing if it helps? It'd be awesome if this fixed our issues. |
@cantino Jumping in here to help troubleshoot. As I was fiddling around with Huginn today I noticed that Celluloid 0.16.0 gets installed during setup. There is a known memory leak with that version of Celluloid, see celluloid/celluloid#455. You might look at requiring Celluloid version 0.15.2 or 0.17.2, see https://gist.github.com/gazay/3b518f72266b5a7e88ff
I also bookmarked a list of "leaky gems"; delayed_job and therubyracer, both of which are in the Huginn Gemfile, are listed. See https://github.com/ASoftCo/leaky-gems
When I set up Huginn today I noticed that delayed_job version 4.0.6 was installed -- that version is known to have memory leaks: collectiveidea/delayed_job#776
therubyracer was installed with gem version 0.12.2, which fixes the memory leak that was previously reported, AND ~> 0.12.2 is specified in the Gemfile, so that one should be fine. rubyjs/therubyracer#336
Hope that helps! |
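If someone wants to experiment with pinning those gems, the Gemfile entries might look roughly like this; the exact "fixed" versions are assumptions to be checked against each project's changelog, not tested recommendations:

```ruby
# Gemfile (excerpt) -- illustrative pins only, verify upstream before using.
gem 'celluloid',    '~> 0.17.2'   # 0.16.0 has the actor leak from celluloid#455
gem 'delayed_job',  '>= 4.1.0'    # 4.0.6 appears on the leaky-gems list
gem 'therubyracer', '~> 0.12.2'   # already pinned in Huginn; 0.12.2 fixes the old leak
```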
Thanks @wasafiri, this is very helpful! I'll try updating the Gemfile. |
I haven't had issues recently, but it wasn't affecting me much. @K1773R? |
I think so. I'm running the threaded job runner (have been since I got Huginn into production mode) and it's been remarkably stable since I disabled all of my JabberAgents in favor of BeeperAgent instances. |
@cantino I should have time for an upgrade at the weekend. I'll report back. |
I've been having memory leak issues for the past couple of months. I'm using the latest codebase on an ubuntu xenial box with 1 GB memory. The logs don't seem to show any errors. |
That's not good. I've seen my RAM usage climb once in the last month I think. Seems like there's still some subtle issue. |
I did some monitoring of my installation and it looks like the memory usage stabilized after a few days. The amount of RAM used for the twitter stream process still surprises me a bit.
|
How high does your RAM usage get @HyShai? I know that Ruby's generational garbage collector can take a while before it reaches a steady state, which is always higher than where it starts. |
delayed_job is using more and more memory over time. It starts at ~150 MB and rises forever, slowly though. As I have 16 workers, I'm affected much faster.
Any way to debug this? I'm currently just restarting the workers every X hours.
I'm using Web, Mail, MailDigest, XMPP, and Trigger agents, with Web + Trigger being the most used.
Can foreman handle memory usage of dj? I.e., restart a dj if memory is higher than X and no jobs are currently running? Restarting a worker after X jobs might also be an option.
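As far as I know foreman itself doesn't restart processes based on memory, so the "restart dj if mem is higher than X" idea would need a small wrapper; a sketch under stated assumptions (pid-file layout, threshold, and the supervising process are all assumptions, not Huginn's setup):

```ruby
# restart_fat_workers.rb -- illustrative sketch, not part of Huginn.
# Sends TERM to any delayed_job worker whose resident memory exceeds a limit;
# delayed_job traps TERM and exits after finishing its current job, which
# roughly matches "restart a dj if mem is higher than X and no jobs running".
MAX_RSS_MB = 300

def rss_mb(pid)
  `ps -o rss= -p #{pid}`.to_i / 1024
end

# assumes daemonized workers (script/delayed_job -n 16 start) writing one
# pid file per worker into tmp/pids/
Dir.glob('tmp/pids/delayed_job.*.pid').each do |pid_file|
  pid = File.read(pid_file).to_i
  usage = rss_mb(pid)
  next if usage < MAX_RSS_MB

  warn "worker #{pid} is using #{usage} MB, asking it to exit"
  Process.kill('TERM', pid)
  # whatever supervises the workers (upstart, systemd, a cron job re-running
  # `script/delayed_job start`) is expected to bring up a replacement
end
```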