Permalink

Comparing changes

Choose two branches to see what’s changed or to start a new pull request. If you need to, you can also .

Open a pull request

Create a new pull request by comparing changes across two branches. If you need to, you can also .
Choose a Base Repository
benbruscella/delayed_job
collectiveidea/delayed_job
tobi/delayed_job
AlekSi/delayed_job_on_steroids
BLC/delayed_job
BrianTheCoder/delayed_job
DohMoose/delayed_job
HatemMahmoud/delayed_job
JackDanger/delayed_job
MirDoe/delayed_job
PatrickTulskie/delayed_job
Punchbowl/delayed_job
Shopify/delayed_job
Viximo/delayed_job
acosme/delayed_job
adamwiggins/delayed_job
addywaddy/delayed_job
agmiklas/delayed_job
alphabetum/delayed_job
andrewroth/delayed_job
andrewsardone/delayed_job
ascarter/delayed_job
aukan/delayed_job
betamatt/delayed_job
bhbryant/delayed_job
biro/delayed_job
blaxter/delayed_job
bmarini/delayed_job
bmizerany/delayed_job
bracken/delayed_job
brandondrew/delayed_job
bryanwoods/delayed_job
chaupt/delayed_job
chrisconley/delayed_job
chrismear/delayed_job
christophermanning/delayed_job
christophklocker/delayed_job
clicrdv/delayed_job
clifff/delayed_job
corpgovrisk/delayed_job
crispinheneise/delayed_job
cschmidt/delayed_job
davekrupinski/delayed_job
dbarros/delayed_job
dburkes/delayed_job
dekart/delayed_job
derk/delayed_job
dhh/delayed_job
dissolved/delayed_job
dmeiz/delayed_job
drewblas/delayed_job
dsander/delayed_job
edwinmoss/delayed_job
efaizal/delayed_job
efficiency20/delayed_job
elia/tobi_delayed_job
eric/delayed_job
eugenebolshakov/delayed_job
fhwang/delayed_job
fredwu/delayed_job
frison/delayed_job
getaroom/delayed_job
giom/delayed_job
gramos/delayed_job
groupme/delayed_job
gugod/delayed_job
guns/delayed_job
hagope/delayed_job
haraldmartin/delayed_job
hennk/delayed_job
hjdivad/delayed_job
hsribei/delayed_job
iamnader/delayed_job
iansheridan/delayed_job
ianwhite/delayed_job
intinno/delayed_job
itrcdevs/delayed_job
ivanvanderbyl/delayed_job
jacquescrocker/delayed_job
jakeonrails/delayed_job
jamesgolick/delayed_job
japetheape/delayed_job
jasonfox/delayed_job
jch/delayed_job
jdsiegel/delayed_job
jeremy/delayed_job
jiriknesl/delayed_job
jlecour/delayed_job
jmay/delayed_job
joevandyk/delayed_job
johndouthat/delayed_job
jorgemanrubia/delayed_job
jrichter/delayed_job
jugyo/delayed_job
kbuckler/delayed_job
kmuthupa/delayed_job
knapo/delayed_job
kusor/delayed_job
larrywright/delayed_job
latentflip/delayed_job
leongersing/delayed_job
lorensr/delayed_job
luke0x/delayed_job
lunks/delayed_job
manzhikov/delayed_job
marano/delayed_job
martinbtt/delayed_job
mattallen/delayed_job
mattbeedle/delayed_job
mguterl/delayed_job
mikezaschka/delayed_job
minofare/delayed_job
miroslawmaziarz/delayed_job
mkirchstein/delayed_job
mmangino/delayed_job
mmfa/delayed_job
nifarius/delayed_job
nightshade427/delayed_job
nitsujri/delayed_job
noriaki/delayed_job
npramanik/delayed_job
nurav123/delayed_job
nz/delayed_job
oferlin/delayed_job
oki/delayed_job
opsb/delayed_job
panterch/delayed_job-fork
pat/delayed_job
patio11/delayed_job
patrick99e99/delayed_job
pauldoerwald/delayed_job
pbrumm/delayed_job
pedro/delayed_job
peterberkenbosch/delayed_job
pftg/delayed_job
phaza/delayed_job
phildarnowsky/delayed_job
pjdavis/delayed_job
pjfitzgibbons/delayed_job
powcloud/delayed_job
pvdb/delayed_job
quasor/delayed_job
quirkey/delayed_job
rares/delayed_job
reeplayit/delayed_job
ricardochimal/delayed_job
rmatei/delayed_job
romanenko/delayed_job
saas-dev/delayed_job
saimonmoore/delayed_job
samoli/delayed_job
sandboxrem/delayed_job
sentientmonkey/delayed_job
siebertm/delayed_job
slaskis/delayed_job
slideshare/delayed_job
smasry/delayed_job
spejman/delayed_job
speramus/delayed_job
squeejee/delayed_job
ssoroka/delayed_job
steel/delayed_job
stympy/delayed_job
subdigital/delayed_job
sunkencity/delayed_job
sunstroke/delayed_job
taylorbarstow/delayed_job
tb0yd/delayed_job
technicalpickles/delayed_job
tedkulp/delayed_job
teejayvanslyke/delayed_job
terrbear/delayed_job
theneubeck/delayed_job
thibaudgg/delayed_job
thilo/delayed_job
thredup/delayed_job
tiffani/delayed_job
timmatheson/delayed_job
tjsingleton/delayed_job
tka/delayed_job
tlianza/delayed_job
topprospect/delayed_job
tosik/delayed_job
tpinto/delayed_job
tricycle/delayed_job
ttilley/delayed_job
turadg/delayed_job
tyraeltong/delayed_job
vanpelt/delayed_job
vantran/delayed_job
visnup/delayed_job
wakemaster39/delayed_job
wakiki/delayed_job
weplay/delayed_job
whittle/delayed_job
windock/dm-delayed-job
winescout/delayed_job
woahdae/delayed_job
xspond/delayed_job
yob/delayed_job
zwily/delayed_job
Nothing to show
Choose a base branch
...
Choose a Head Repository
benbruscella/delayed_job
collectiveidea/delayed_job
tobi/delayed_job
AlekSi/delayed_job_on_steroids
BLC/delayed_job
BrianTheCoder/delayed_job
DohMoose/delayed_job
HatemMahmoud/delayed_job
JackDanger/delayed_job
MirDoe/delayed_job
PatrickTulskie/delayed_job
Punchbowl/delayed_job
Shopify/delayed_job
Viximo/delayed_job
acosme/delayed_job
adamwiggins/delayed_job
addywaddy/delayed_job
agmiklas/delayed_job
alphabetum/delayed_job
andrewroth/delayed_job
andrewsardone/delayed_job
ascarter/delayed_job
aukan/delayed_job
betamatt/delayed_job
bhbryant/delayed_job
biro/delayed_job
blaxter/delayed_job
bmarini/delayed_job
bmizerany/delayed_job
bracken/delayed_job
brandondrew/delayed_job
bryanwoods/delayed_job
chaupt/delayed_job
chrisconley/delayed_job
chrismear/delayed_job
christophermanning/delayed_job
christophklocker/delayed_job
clicrdv/delayed_job
clifff/delayed_job
corpgovrisk/delayed_job
crispinheneise/delayed_job
cschmidt/delayed_job
davekrupinski/delayed_job
dbarros/delayed_job
dburkes/delayed_job
dekart/delayed_job
derk/delayed_job
dhh/delayed_job
dissolved/delayed_job
dmeiz/delayed_job
drewblas/delayed_job
dsander/delayed_job
edwinmoss/delayed_job
efaizal/delayed_job
efficiency20/delayed_job
elia/tobi_delayed_job
eric/delayed_job
eugenebolshakov/delayed_job
fhwang/delayed_job
fredwu/delayed_job
frison/delayed_job
getaroom/delayed_job
giom/delayed_job
gramos/delayed_job
groupme/delayed_job
gugod/delayed_job
guns/delayed_job
hagope/delayed_job
haraldmartin/delayed_job
hennk/delayed_job
hjdivad/delayed_job
hsribei/delayed_job
iamnader/delayed_job
iansheridan/delayed_job
ianwhite/delayed_job
intinno/delayed_job
itrcdevs/delayed_job
ivanvanderbyl/delayed_job
jacquescrocker/delayed_job
jakeonrails/delayed_job
jamesgolick/delayed_job
japetheape/delayed_job
jasonfox/delayed_job
jch/delayed_job
jdsiegel/delayed_job
jeremy/delayed_job
jiriknesl/delayed_job
jlecour/delayed_job
jmay/delayed_job
joevandyk/delayed_job
johndouthat/delayed_job
jorgemanrubia/delayed_job
jrichter/delayed_job
jugyo/delayed_job
kbuckler/delayed_job
kmuthupa/delayed_job
knapo/delayed_job
kusor/delayed_job
larrywright/delayed_job
latentflip/delayed_job
leongersing/delayed_job
lorensr/delayed_job
luke0x/delayed_job
lunks/delayed_job
manzhikov/delayed_job
marano/delayed_job
martinbtt/delayed_job
mattallen/delayed_job
mattbeedle/delayed_job
mguterl/delayed_job
mikezaschka/delayed_job
minofare/delayed_job
miroslawmaziarz/delayed_job
mkirchstein/delayed_job
mmangino/delayed_job
mmfa/delayed_job
nifarius/delayed_job
nightshade427/delayed_job
nitsujri/delayed_job
noriaki/delayed_job
npramanik/delayed_job
nurav123/delayed_job
nz/delayed_job
oferlin/delayed_job
oki/delayed_job
opsb/delayed_job
panterch/delayed_job-fork
pat/delayed_job
patio11/delayed_job
patrick99e99/delayed_job
pauldoerwald/delayed_job
pbrumm/delayed_job
pedro/delayed_job
peterberkenbosch/delayed_job
pftg/delayed_job
phaza/delayed_job
phildarnowsky/delayed_job
pjdavis/delayed_job
pjfitzgibbons/delayed_job
powcloud/delayed_job
pvdb/delayed_job
quasor/delayed_job
quirkey/delayed_job
rares/delayed_job
reeplayit/delayed_job
ricardochimal/delayed_job
rmatei/delayed_job
romanenko/delayed_job
saas-dev/delayed_job
saimonmoore/delayed_job
samoli/delayed_job
sandboxrem/delayed_job
sentientmonkey/delayed_job
siebertm/delayed_job
slaskis/delayed_job
slideshare/delayed_job
smasry/delayed_job
spejman/delayed_job
speramus/delayed_job
squeejee/delayed_job
ssoroka/delayed_job
steel/delayed_job
stympy/delayed_job
subdigital/delayed_job
sunkencity/delayed_job
sunstroke/delayed_job
taylorbarstow/delayed_job
tb0yd/delayed_job
technicalpickles/delayed_job
tedkulp/delayed_job
teejayvanslyke/delayed_job
terrbear/delayed_job
theneubeck/delayed_job
thibaudgg/delayed_job
thilo/delayed_job
thredup/delayed_job
tiffani/delayed_job
timmatheson/delayed_job
tjsingleton/delayed_job
tka/delayed_job
tlianza/delayed_job
topprospect/delayed_job
tosik/delayed_job
tpinto/delayed_job
tricycle/delayed_job
ttilley/delayed_job
turadg/delayed_job
tyraeltong/delayed_job
vanpelt/delayed_job
vantran/delayed_job
visnup/delayed_job
wakemaster39/delayed_job
wakiki/delayed_job
weplay/delayed_job
whittle/delayed_job
windock/dm-delayed-job
winescout/delayed_job
woahdae/delayed_job
xspond/delayed_job
yob/delayed_job
zwily/delayed_job
Nothing to show
Choose a head branch
Nothing to show
Checking mergeability… Don’t worry, you can still create the pull request.
Commits on Aug 17, 2010
guns
guns
Add task jobs:daemon:start, which spawns a simple forking daemon that…
… actually works

The command

        $ WORKERS=n RAILS_ENV=production rake jobs:daemon:start

spawns a simple forking daemon, which spawns and restarts `n' instances of
Delayed::Worker. Worker processes are revived by the master process on receipt
of SIGCLD.

We can restart worker instances by sending SIGHUP to the master process or by
killing them directly. Sending SIGTERM, SIGINT, or SIGQUIT to the master
instructs it to kill its children and terminate.

Alternately, there are the tasks `jobs:daemon:restart' and `jobs:daemon:stop'

Two extra features:

* To avoid CPU thrashing, if a child worker dies 4 times in 60 seconds, a
  warning message is logged and the child sleeps for 300 seconds before
  booting up

* The master polls tmp/restart.txt and restarts children on timestamp update
Commits on Aug 21, 2010
Commits on Aug 24, 2010
Commits on Sep 05, 2010
Commits on Sep 06, 2010
Commits on Sep 07, 2010
Commits on Sep 09, 2010
Commits on Sep 15, 2010
Commits on Sep 17, 2010
guns
guns
KISS and check to see if the file exists before checking its mtime
Fixes bug where the master would constantly raise SIGHUP when
'restart.txt' didn't exist
Commits on Sep 22, 2010
guns
guns
explicitly exit(0) on SIGTERM; keep it simple and trap less signals
Thanks to Brandon Keepers and Paul Gideon Dann for reporting that
`Process.kill :TERM, $$' does not work reliably.

For now, we will just exit(0) since that really won't do any harm.

Also, removed unnecessary traps on INT and QUIT.
Commits on Oct 04, 2010
Commits on Nov 27, 2010
Commits on Dec 27, 2011
Commits on Feb 02, 2012
Commits on Feb 10, 2012
Remove extant pid files with a warning.
Restarting workers on boot after a sudden shutdown was problematic.
Showing with 185 additions and 1 deletion.
  1. +181 −0 lib/delayed/daemon_tasks.rb
  2. +2 −0 lib/delayed/tasks.rb
  3. +2 −1 lib/delayed/yaml_ext.rb
View
@@ -0,0 +1,181 @@
+# The command
+#
+# $ WORKERS=n RAILS_ENV=production rake jobs:daemon:start
+#
+# spawns a simple forking daemon, which spawns and restarts `n' instances of
+# Delayed::Worker. Worker processes are revived by the master process on receipt
+# of SIGCLD.
+#
+# We can restart worker processes by sending SIGHUP to the master process or by
+# killing them directly. Sending SIGTERM to the master instructs it to kill its
+# children and terminate.
+#
+# Alternately, there are the tasks `jobs:daemon:restart' and `jobs:daemon:stop'
+#
+# Two extra features:
+#
+# * To avoid CPU thrashing, if a child worker dies 4 times in 60 seconds, a
+# warning message is logged and the child sleeps for 300 seconds before
+# booting up
+#
+# * The master polls tmp/restart.txt and restarts children on timestamp update
+
+namespace :jobs do
+ namespace :daemon do
+ desc 'Spawn a daemon which forks WORKERS=n instances of Delayed::Worker'
+ task :start do
+ # we want master and children to share a logfile, so set these before fork
+ rails_env = ENV['RAILS_ENV'] || 'development'
+ rails_root = Dir.pwd # rake sets cwd to the parent dir of the Rakefile
+ logfile = "#{rails_root}/log/delayed_worker.#{rails_env}.log"
+
+ # Ensure new file permissions are set to a standard 0755
+ File.umask 0022
+
+ # Loads the Rails environment and spawns a worker
+ worker = lambda do |id, delay|
+ fork do
+ $0 = "delayed_worker.#{id}"
+
+ # reset all inherited traps from main process
+ [:CLD, :HUP, :TERM, :EXIT].each { |sig| trap sig, 'DEFAULT' }
+
+ # lay quiet for a while before booting up if specified
+ sleep delay if delay
+
+ # Boot the rails environment and start a worker
+ Rake::Task[:environment].invoke
+ Delayed::Worker.logger = Logger.new logfile
+ Delayed::Worker.new({
+ :min_priority => ENV['MIN_PRIORITY'],
+ :max_priority => ENV['MAX_PRIORITY'],
+ :quiet => true
+ }).start
+ end
+ end
+
+ # fork a simple master process
+ master = fork do
+ $0 = 'delayed_worker.master'
+
+ # simple logger; there is some overhead due to reopening the file for
+ # every write, but it's minor, and avoids headaches with open files
+ logger = lambda do |msg|
+ File.open logfile, 'a' do |f|
+ f.puts "#{Time.now.strftime '%FT%T%z'}: [#{$0}] #{msg}"
+ end
+ end
+
+ # Create / overwrite pidfile
+ pid_dir = "#{rails_root}/tmp/pids"
+ pid_file = "#{pid_dir}/#{$0}.pid"
+ if File.exists? pid_file
+ logger.call "WARNING: PID file #{pid_file} already exists!"
+ end
+
+ # silence output like a proper daemon
+ [$stdin, $stdout, $stderr].each { |io| io.reopen '/dev/null' }
+ mkdir_p pid_dir, :verbose => false
+ File.open(pid_file, 'w') { |f| f.write $$ }
+
+ # spawn the first workers
+ children, times_dead = {}, {}
+ worker_count = (ENV['WORKERS'] || 1).to_i
+ logger.call "Spawning #{worker_count} worker(s)"
+ worker_count.times { |id| children[worker.call id, nil] = id }
+
+ # and respawn the failures
+ trap :CLD do
+ id = children.delete Process.wait
+
+ # check to see if this worker is dying repeatedly
+ times_dead[id] ||= []
+ times_dead[id] << (now = Time.now)
+ times_dead[id].reject! { |time| now - time > 60 }
+ if times_dead[id].size > 4
+ delay = 60 * 5 # time to tell the children to sleep before loading
+ logger.call %Q{
+ delayed_worker.#{id} has died four times in the past minute!
+ Something is seriously wrong!
+ Restarting worker in #{delay} seconds.
+ }.strip.gsub /\s+/, ' '
+ else
+ logger.call "Restarting dead worker: delayed_worker.#{id}"
+ end
+
+ children[worker.call id, delay] = id
+ end
+
+ # restart children on SIGHUP
+ trap :HUP do
+ logger.call 'SIGHUP received! Restarting workers.'
+ times_dead.clear if times_dead # reset death log on user restart
+ Process.kill :TERM, *children.keys
+ end
+
+ # cleanup on exit
+ trap :EXIT do
+ rm_f pid_file
+ end
+
+ # terminate children on user termination
+ trap :TERM do
+ logger.call 'SIGTERM received! Shutting down workers.'
+
+ # reset trap handlers so we don't get caught in a trap loop
+ [:CLD, :HUP, :TERM].each { |s| trap s, 'DEFAULT' }
+
+ # kill the children and reap them before terminating
+ Process.kill :TERM, *children.keys
+ Process.waitall
+ logger.call 'All workers have shut down. Exiting.'
+
+ # TODO: investigate why some users are reporting that
+ # `Process.kill :TERM, $$' isn't working
+ exit
+ end
+
+ # NOTE: We want to block on something so that Process.waitall doesn't
+ # reap dead children before the SIGCLD handler does
+ #
+ # poll passenger restart file and restart on update
+ restart_file = "#{rails_root}/tmp/restart.txt"
+ last_modified = File.mtime restart_file if File.exists? restart_file
+ loop do
+ if File.exists? restart_file
+ if (check = File.mtime restart_file) > last_modified
+ last_modified = check
+ Process.kill :HUP, $$
+ end
+ end
+ sleep 5
+ end
+
+ # reap children and remove logfile if the blocking loop is broken
+ Process.waitall
+ rm_f pid_file
+ end
+
+ # detach the master process and exit;
+ # note that Process#detach calls setsid(2)
+ Process.detach master
+ end
+
+ [%w[restart SIGHUP], %w[stop SIGTERM]].each do |name, signal|
+ instance_eval do
+ desc "#{name.capitalize} an existing delayed_worker daemon"
+ task name do
+ begin
+ pid_file = "#{Dir.pwd}/tmp/pids/delayed_worker.master.pid"
+ abort 'No pid file found!' unless File.exists? pid_file
+ pid = File.read(pid_file).to_i
+ puts "Sending #{signal} to #{pid}"
+ Process.kill signal, pid
+ rescue Errno::ESRCH => e # no such process
+ abort e.to_s
+ end
+ end
+ end
+ end
+ end
+end
View
@@ -1,3 +1,5 @@
+require File.join(File.dirname(__FILE__), 'daemon_tasks')
+
namespace :jobs do
desc "Clear the delayed_job queue."
task :clear => :environment do
View
@@ -2,6 +2,7 @@
# Classes, Modules and Structs
require 'yaml'
+YAML::ENGINE.yamler = "syck" if defined?(YAML::ENGINE)
class Module
yaml_as "tag:ruby.yaml.org,2002:module"
@@ -27,7 +28,7 @@ def yaml_tag_read_class(name)
class Class
yaml_as "tag:ruby.yaml.org,2002:class"
- remove_method :to_yaml # use Module's to_yaml
+ remove_method :to_yaml if respond_to? :to_yaml and method(:to_yaml).owner == self # use Module's to_yaml
end
class Struct

Showing you all comments on commits in this comparison.

@bkeepers

This comment has been minimized.

Show comment Hide comment
@bkeepers

bkeepers Sep 16, 2010

Can you explain what the headaches are with open files in the daemon?

Can you explain what the headaches are with open files in the daemon?

@guns

This comment has been minimized.

Show comment Hide comment
@guns

guns Sep 16, 2010

Owner

Oh just the headache of passing around open file handles. It's not really a big problem, and you solve it by using a Logger class. I opted to just use a simple Proc to keep the memory usage of the master process low. It's not a super-scalable solution, but it was dead simple.

Also, I think you are looking at it, but I had pushed an update here: http://github.com/guns/delayed_job/blob/delayed_job_daemon/lib/delayed/daemon_tasks.rb

Owner

guns commented on 031efe6 Sep 16, 2010

Oh just the headache of passing around open file handles. It's not really a big problem, and you solve it by using a Logger class. I opted to just use a simple Proc to keep the memory usage of the master process low. It's not a super-scalable solution, but it was dead simple.

Also, I think you are looking at it, but I had pushed an update here: http://github.com/guns/delayed_job/blob/delayed_job_daemon/lib/delayed/daemon_tasks.rb

@bkeepers

This comment has been minimized.

Show comment Hide comment
@bkeepers

bkeepers Sep 16, 2010

That was my guess. Thanks.

Yeah, I'm looking at the latest version, but github only lets you comment from the commit :/

That was my guess. Thanks.

Yeah, I'm looking at the latest version, but github only lets you comment from the commit :/

@giddie

This comment has been minimized.

Show comment Hide comment
@giddie

giddie Sep 21, 2010

I assume this signal propagation is intended to cause the master to terminate? Unfortunately, this doesn't work for me. When the master receives SIGTERM, it closes the workers perfectly, removes its PID file, but fails to terminate. By adding a log line to the traps (line 95 in this commit), I can see that the master receives SIGCLD, but not the SIGTERM you'd expect from this line.

For me, the solution is to replace line 101 with "exit". The master terminates immediately after its children, and all is well. Is there any reason you know of why sending SIGTERM would be better than calling "exit" here?

giddie commented on 031efe6 Sep 21, 2010

I assume this signal propagation is intended to cause the master to terminate? Unfortunately, this doesn't work for me. When the master receives SIGTERM, it closes the workers perfectly, removes its PID file, but fails to terminate. By adding a log line to the traps (line 95 in this commit), I can see that the master receives SIGCLD, but not the SIGTERM you'd expect from this line.

For me, the solution is to replace line 101 with "exit". The master terminates immediately after its children, and all is well. Is there any reason you know of why sending SIGTERM would be better than calling "exit" here?

@bkeepers

This comment has been minimized.

Show comment Hide comment
@bkeepers

bkeepers Sep 21, 2010

I was seeing the same thing. If I called Process.wait after line 101, it would kill the process.

I was seeing the same thing. If I called Process.wait after line 101, it would kill the process.

@giddie

This comment has been minimized.

Show comment Hide comment
@giddie

giddie Sep 21, 2010

But doesn't Process.wait wait for child processes to exit? According to the Ruby API:

Calling this method raises a SystemError if there are no child processes.

Surely, having just called Process.waitall, the master will have no child processes. In fact thinking about it, raising any exception here would probably have the effect of terminating the master process anyway. It might be worth checking whether your added wait is actually throwing an exception or allowing the process to end gracefully.

giddie commented on 031efe6 Sep 21, 2010

But doesn't Process.wait wait for child processes to exit? According to the Ruby API:

Calling this method raises a SystemError if there are no child processes.

Surely, having just called Process.waitall, the master will have no child processes. In fact thinking about it, raising any exception here would probably have the effect of terminating the master process anyway. It might be worth checking whether your added wait is actually throwing an exception or allowing the process to end gracefully.

@guns

This comment has been minimized.

Show comment Hide comment
@guns

guns Sep 21, 2010

Owner

giddle:

Is there any reason you know of why sending SIGTERM would be better than calling "exit" here?

In this case we could very well just call exit and be okay, but I like to reserve using exit with an explicit value for when the program exits by its own hand. When you trap a signal, you are supposed to resend it after you have done whatever it is that you wanted to do, so we don't disturb the normal rules of interprocess communication.

As to why the process doesn't properly exit for you, I'm not sure. What's your setup? Are you using the latest changes from this branch? It works on my machines, both with ruby 1.8.7 and 1.9.2. You may want to check out the other thread that brandon and I have here, but that doesn't really address this issue.

I'm also confused about the comment about adding logging to line 95 - that line resets the trap handler, so how do you inject logging into that?

Let's figure this out. I'm really curious about why Process.kill :TERM, $$ doesn't work properly...

Owner

guns commented on 031efe6 Sep 21, 2010

giddle:

Is there any reason you know of why sending SIGTERM would be better than calling "exit" here?

In this case we could very well just call exit and be okay, but I like to reserve using exit with an explicit value for when the program exits by its own hand. When you trap a signal, you are supposed to resend it after you have done whatever it is that you wanted to do, so we don't disturb the normal rules of interprocess communication.

As to why the process doesn't properly exit for you, I'm not sure. What's your setup? Are you using the latest changes from this branch? It works on my machines, both with ruby 1.8.7 and 1.9.2. You may want to check out the other thread that brandon and I have here, but that doesn't really address this issue.

I'm also confused about the comment about adding logging to line 95 - that line resets the trap handler, so how do you inject logging into that?

Let's figure this out. I'm really curious about why Process.kill :TERM, $$ doesn't work properly...

@giddie

This comment has been minimized.

Show comment Hide comment
@giddie

giddie Sep 22, 2010

My comment about logging was a bit obscure: I meant that I changed that line so that it would log the signal and do nothing, instead of resetting the handlers, just to see what signals were actually received. Obviously I then had to send SIGKILL to terminate the process. However, it did highlight for me the fact that SIGTERM was not propagated in the way that was intended: it was not caught.

I found this article that clearly agrees with what you say about needing to propagate the signal: http://www.cons.org/cracauer/sigint.html. It makes sense to me too. It's odd that it works for you and not me :s I'll try to look into this more later today.

So my setup is REE (1.8.7) in Archlinux. I've tried it on two systems, one x86 and one x86_64. I'm using your "guns" branch, so I'm pretty sure I've got the latest version. BTW, thank you so much for writing this daemon; it's a huge improvement :)

giddie commented on 031efe6 Sep 22, 2010

My comment about logging was a bit obscure: I meant that I changed that line so that it would log the signal and do nothing, instead of resetting the handlers, just to see what signals were actually received. Obviously I then had to send SIGKILL to terminate the process. However, it did highlight for me the fact that SIGTERM was not propagated in the way that was intended: it was not caught.

I found this article that clearly agrees with what you say about needing to propagate the signal: http://www.cons.org/cracauer/sigint.html. It makes sense to me too. It's odd that it works for you and not me :s I'll try to look into this more later today.

So my setup is REE (1.8.7) in Archlinux. I've tried it on two systems, one x86 and one x86_64. I'm using your "guns" branch, so I'm pretty sure I've got the latest version. BTW, thank you so much for writing this daemon; it's a huge improvement :)

@giddie

This comment has been minimized.

Show comment Hide comment
@giddie

giddie Feb 7, 2011

Maybe you can clear something up for me: what exactly will happen to a child when it receives SIGTERM? My observations have mostly led me to conclude that the current job is finished before the process terminates, but I'm not sure why, or even if this is by design or coincidence. Should we be handling a TERM signal more gracefully in the worker, and setting a boolean to terminate after the next job?

Here's my specific use-case: I have long-running jobs: sometimes 2mins or more, and I've written a little reaper script that TERMs workers if they get too bloated, but I don't want to risk shutting them down half-way through a job; I want to make sure they're only restarted between jobs.

giddie commented on 15678f4 Feb 7, 2011

Maybe you can clear something up for me: what exactly will happen to a child when it receives SIGTERM? My observations have mostly led me to conclude that the current job is finished before the process terminates, but I'm not sure why, or even if this is by design or coincidence. Should we be handling a TERM signal more gracefully in the worker, and setting a boolean to terminate after the next job?

Here's my specific use-case: I have long-running jobs: sometimes 2mins or more, and I've written a little reaper script that TERMs workers if they get too bloated, but I don't want to risk shutting them down half-way through a job; I want to make sure they're only restarted between jobs.