-
Notifications
You must be signed in to change notification settings - Fork 110
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Allow workers to timeout. #149
Conversation
3459cd0
to
ed50964
Compare
README.markdown
Outdated
@@ -271,6 +271,7 @@ optipng: | |||
* `:allow_lossy` — Allow lossy workers and optimizations *(defaults to `false`)* | |||
* `:cache_dir` — Configure cache directory | |||
* `:cache_worker_digests` - Also cache worker digests along with original file digest and worker options: updating workers invalidates cache | |||
* `:timeout` — Number of seconds before workers are timed out. *(defaults to `0`)* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What does the default of 0
mean? I assume it means "no timeout", but it would be good to clarify that, as a naive reading suggests that the default is basically "kill as soon as you start".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point 👍 I'll update the description for the option.
lib/image_optim/cmd.rb
Outdated
|
||
class ImageOptim | ||
# Helper for running commands | ||
module Cmd | ||
class Timeout < StandardError; end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A more "obviously erroneous" name might be good here. The Timeout.timeout
standard library method uses Timeout::Error
by default, which isn't great but is at least obviously some sort of Bad Thing. Perhaps TimeoutExceeded
?
lib/image_optim/cmd.rb
Outdated
end | ||
end | ||
|
||
# Run commands using `Process.spawn` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment demonstrates why mentioning implementation details in comments is fraught with peril... as far as I can tell, you're not actually using Process.spawn
in the method. Even if you were, anyone calling run_with_timeout
shouldn't need to care how you're doing what you're doing, just what is supposed to happen. "Run the specified command, and kill it off if it runs longer than :timeout
seconds" would be more descriptive.
lib/image_optim/cmd.rb
Outdated
success = thread.value.exitstatus.zero? | ||
end | ||
ensure | ||
stdin.close |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you're not going to use stdin
, it's best to close it immediately after calling popen2
, rather than in an ensure
block after everything is done. It guarantees that the child process won't hang around unnecessarily waiting on input that will never come, and makes it clearer that the pipes definitely are getting closed. If you're very (very) sure that the subprocess won't produce output, you can close stdout
too immediately, however if the subprocess does write anything, sadness will result (the process will get a SIGPIPE
and almost certainly fall over in a heap). On the other hand, failing to read from stdout
if it is left open can lead to the process stalling, because the pipe has a finite capacity, and if it writes too much to its stdout
further writes will block on the pipe emptying, and the process won't make any progress, which is also sad.
Isn't multiprocessing fun?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for explaining @mpalmer 👍
I've gone ahead and just close stdout
since the only method that uses this library redirects :out
and :err
to Path::NULL.
lib/image_optim/cmd.rb
Outdated
now = Time.now | ||
pid = thread[:pid] | ||
|
||
sleep 0.001 while (Time.now - now) < timeout && thread.alive? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Willthread.join(timeout)
work here instead?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍 Now I know that I can set a limit on Thread#join
lib/image_optim/cmd.rb
Outdated
|
||
while Time.now - now < 10 | ||
begin | ||
Process.getpgid(pid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is more idiomatically expressed as Process.kill(0, pid)
(and also raises ESRCH
when the process goes away).
lib/image_optim/cmd.rb
Outdated
|
||
def cleanup_process(pid) | ||
Thread.new do | ||
Process.kill('-TERM', pid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From Process.kill
:
If signal is negative (or starts with a minus sign), kills process groups instead of processes.
I'm pretty sure that isn't what you want to have happen here. A straight-up TERM
might be more appropriate.
(Also applies to the -KILL
on line 117)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm I want to be killing the process group here since we won't know if the command being run will spawn child processes. My understanding is that sending a signal to the process group would also send the signal to all the processes in the process group.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oooh, of course. Not sure why I didn't associate the new_pgroup
code with the "kill the process group"... Carry on.
1101d9c
to
5b4fd97
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry for the wait. Checked a lot of stuff and created an issue in ruby bug tracker.
options[:allow_lossy] = image_optim.allow_lossy | ||
|
||
[:allow_lossy, :timeout].each do |option_key| | ||
if !options.key?(option_key) && klass.method_defined?(option_key) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think a guard (or two) would be cleaner in this case
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry I don't quite what you mean here. What would the guard be used for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
next if …
and next unless …
instead of if
with block
0, | ||
'Number of seconds before worker is timed out. Must be greater than' \ | ||
'0 to enable timeout.' | ||
){ |v| v } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.to_i
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
option( | ||
:timeout, | ||
0, | ||
'Number of seconds before worker is timed out. Must be greater than' \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Space missing between than and 0
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah good catch will fix
lib/image_optim/worker/jhead.rb
Outdated
@@ -7,6 +7,8 @@ class Worker | |||
# | |||
# Jhead internally uses jpegtran which should be on path | |||
class Jhead < Worker | |||
TIMEOUT_OPTION = timeout_option |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is be better to not timeout jhead – it should not be a time consuming process, but it may time out for a big jpeg and if exif will be removed, the image will be "broken"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I understand, none of the image optimizing is done in place right? Even for optimize_image!
, it does not replace the original image if optimization fails.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right. I understood only later that you've implemented the timeout in the way that if any worker timeouts, than optimisation will not continue. But if workers that timeout are just skipped, it is better to now allow jhead to timeout.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think that jhead should not timeout
lib/image_optim/worker.rb
Outdated
@@ -151,7 +153,18 @@ def run_command(cmd_args) | |||
{:out => Path::NULL, :err => Path::NULL}, | |||
].flatten | |||
end | |||
Cmd.run(*args) | |||
|
|||
seconds_to_timeout = timeout || @image_optim.timeout |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
timeout
will already contain specific or global value, it is passed in Worker.init_all
lib/image_optim/cmd.rb
Outdated
@@ -15,6 +18,44 @@ def run(*args) | |||
success | |||
end | |||
|
|||
def support_timeout? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think supports_timeout?
or timeout_supported?
would be better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure I'll go with supports_timeout?
lib/image_optim/cmd.rb
Outdated
init_options!(args) | ||
|
||
begin | ||
stdin, stdout, thread = Open3.popen2(*args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As pipe is not really used, why not directly use spawn
and Process.detach
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also it looks like some workers don't like closed stdin
or stdout
, so maybe better to use :in
, :out
with Path::NULL
options of spawn
lib/image_optim/cmd.rb
Outdated
thread.kill | ||
fail TimeoutExceeded | ||
else | ||
success = thread.value.exitstatus.zero? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Running ruby -Ilib bin/image_optim --timeout 1 -r spec/images
dies with undefined method
zero?' for nil:NilClass`
lib/image_optim/worker.rb
Outdated
begin | ||
Cmd.run_with_timeout(seconds_to_timeout, *args) | ||
rescue Cmd::TimeoutExceeded | ||
raise ImageOptim::Worker::TimeoutExceeded |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This exception is not handled, cli interface will fail for all images and I would expect timeout to stop only one worker and try next one.
lib/image_optim/cmd.rb
Outdated
pid = thread[:pid] | ||
|
||
if thread.join(timeout).nil? | ||
cleanup_process(pid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cleanup_process
creates a thread which is never joined, then the watching thread is killed, I think in this case it is better to wait for cleanup to finish
ping @tgxworld |
@toy Noted. Give me awhile as I've been alittle busy at work, will fix those changes soon. |
c0bef26
to
e49089b
Compare
lib/image_optim/cmd.rb
Outdated
thread = Process.detach(pid) | ||
|
||
if thread.join(timeout).nil? | ||
cleanup_process(pid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@toy The reason I'm not calling join
here is because we don't want the cleanup process to be blocking the main process.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check the comment for Thread.new
of cleanup_process
lib/image_optim/cmd.rb
Outdated
Process.detach(pid) | ||
now = Time.now | ||
|
||
while Time.now - now < 10 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@toy 10 is sort of a magic number here. Any ideas about how we can handle this better?
@toy Updated per your review comments. Rubocop is broken on master so Travis is having abit of sad here as well. |
@tgxworld I've reverted the last commit, so rubocop doesn't complain. I'll try to review the changes in the nearest time |
e49089b
to
9670fb7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Lots of small comments, but main points are:
- simplifying
run_with_timeout
- cleanup in background can create problems with reusing of temporary files
- it is better not to timeout jhead
lib/image_optim.rb
Outdated
begin | ||
handler.process{ |src, dst| worker.optimize(src, dst) } | ||
rescue Worker::TimeoutExceeded | ||
next |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
next
is not needed, but a message to stderr if verbose is on can be helpful
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure I can add that
lib/image_optim/cmd.rb
Outdated
@@ -15,6 +17,40 @@ def run(*args) | |||
success | |||
end | |||
|
|||
def supports_timeout? | |||
if defined?(JRUBY_VERSION) | |||
JRUBY_VERSION >= '9.0.0.0' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There are a lot of versions to go till 10.0 and all of them >= 1.9, but next major jruby version will not be.
Does it really work well with jruby at all and only starting with version 9? I remember having lots of problems trying to just make new system
(spawn) syntax work with jruby (though maybe pre 9).
lib/image_optim/cmd.rb
Outdated
if defined?(JRUBY_VERSION) | ||
JRUBY_VERSION >= '9.0.0.0' | ||
else | ||
RUBY_VERSION >= '1.9' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe Kernel.respond_to?(:spawn)
lib/image_optim/worker.rb
Outdated
if timeout | ||
begin | ||
Cmd.run_with_timeout(timeout, *args) | ||
rescue Cmd::TimeoutExceeded |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it works having two new exceptions just to reraise?
lib/image_optim/cmd.rb
Outdated
return run(*args) unless timeout > 0 && supports_timeout? | ||
|
||
success = false | ||
init_options!(args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Method name is not descriptive, method is short and not used for anything else, is it worth extraction?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rubocop was complaining about the method being too long 😉
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This method can be shortened, so maybe it will fit inside ;)
lib/image_optim/cmd.rb
Outdated
begin | ||
Process.kill(0, pid) | ||
sleep 0.001 | ||
next |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not needed
lib/image_optim/cmd.rb
Outdated
sleep 0.001 | ||
next | ||
rescue Errno::ESRCH | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
At this place you can already return or even combine it with other rescue as in both cases there should be no such process
lib/image_optim/worker/jhead.rb
Outdated
@@ -7,6 +7,8 @@ class Worker | |||
# | |||
# Jhead internally uses jpegtran which should be on path | |||
class Jhead < Worker | |||
TIMEOUT_OPTION = timeout_option |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still think that jhead should not timeout
def timeout_option | ||
option( | ||
:timeout, | ||
nil, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Insert Integer
as next parameter so you don't need to add NilClass
in option_parser.rb
lib/image_optim/cmd.rb
Outdated
thread = Process.detach(pid) | ||
|
||
if thread.join(timeout).nil? | ||
cleanup_process(pid) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Check the comment for Thread.new
of cleanup_process
9670fb7
to
13e5d81
Compare
eb6e374
to
4feeb03
Compare
5e130be
to
8416f03
Compare
8416f03
to
0a576eb
Compare
@toy I've been thinking about my approach here and feel like it might be wrong. The use case that we had at Discourse was to timeout calls to |
@tgxworld Sure, there can be timeout per worker and timeout for the whole optimisation (two global options). I'm fine with a PR for one of them or both. Probably if you decide to only implement one for whole optimisation, better open a separate PR, so this one can also be finished later |
Guys, thanks for the lib and all the work done! I'm really waiting for this one to be merged since when I use external |
@smileart Thanks for reminding that the issue will be useful |
Closing in favor of #162 |
No description provided.