Set Faraday timeout #58

Closed
benkant opened this Issue Mar 19, 2013 · 9 comments


benkant commented Mar 19, 2013

We were having an intermittent issue where a rails runner would hang and we tracked it down to Librato using a Faraday connection without open_timeout being set.

Is there any way to configure the Faraday connection's timeout and open_timeout settings? Or perhaps there should be a sane value by default?

Contributor

nextmat commented Mar 20, 2013

Thanks for the report. A few questions:

  • For our information, how long would your runner tasks hang?
  • Are you doing things in your runner processes that need to be reported to metrics?

I'll investigate setting a better timeout value as I think that makes sense no matter what.

benkant commented Mar 22, 2013

Thanks Matt. The runner tasks can take a few minutes, up to 5 perhaps. It seems that if the runner takes longer than a certain amount of time, it will never exit because it's hanging in a Faraday connection via librato-rails.

Nothing in these particular processes requires reporting to metrics (as in, we are not explicitly making Librato calls in them); however, there are implicit calls to metrics.

Here's a stack trace (call rb_eval_string("puts caller.join(\"\\n\")") from gdb attached to the offending process):

/usr/local/lib/ruby/1.9.1/net/http.rb:762:in `initialize'
/usr/local/lib/ruby/1.9.1/net/http.rb:762:in `open'
/usr/local/lib/ruby/1.9.1/net/http.rb:762:in `block in connect'
/usr/local/lib/ruby/1.9.1/timeout.rb:54:in `timeout'
/usr/local/lib/ruby/1.9.1/timeout.rb:99:in `timeout'
/usr/local/lib/ruby/1.9.1/net/http.rb:762:in `connect'
/usr/local/lib/ruby/1.9.1/net/http.rb:755:in `do_start'
/usr/local/lib/ruby/1.9.1/net/http.rb:744:in `start'
/usr/local/lib/ruby/1.9.1/net/http.rb:1284:in `request'
$GEM_ROOT/faraday-0.8.4/lib/faraday/adapter/net_http.rb:74:in `perform_request'
$GEM_ROOT/faraday-0.8.4/lib/faraday/adapter/net_http.rb:37:in `call'
$GEM_ROOT/faraday-0.8.4/lib/faraday/response.rb:8:in `call'
$GEM_ROOT/librato-metrics-1.0.2/lib/librato/metrics/middleware/count_requests.rb:22:in `call'
$GEM_ROOT/librato-metrics-1.0.2/lib/librato/metrics/middleware/retry.rb:15:in `call'
$GEM_ROOT/librato-metrics-1.0.2/lib/librato/metrics/middleware/request_body.rb:11:in `call'
$GEM_ROOT/faraday-0.8.4/lib/faraday/connection.rb:226:in `run_request'
$GEM_ROOT/faraday-0.8.4/lib/faraday/connection.rb:99:in `post'
/usr/local/lib/ruby/1.9.1/forwardable.rb:201:in `post'
$GEM_ROOT/librato-metrics-1.0.2/lib/librato/metrics/persistence/direct.rb:22:in `block in persist'
$GEM_ROOT/librato-metrics-1.0.2/lib/librato/metrics/persistence/direct.rb:19:in `each'
$GEM_ROOT/librato-metrics-1.0.2/lib/librato/metrics/persistence/direct.rb:19:in `persist'
$GEM_ROOT/librato-metrics-1.0.2/lib/librato/metrics/processor.rb:32:in `submit'
$GEM_ROOT/librato-rails-0.8.1/lib/librato/rails/validating_queue.rb:26:in `submit'
$GEM_ROOT/librato-rails-0.8.1/lib/librato/rails.rb:81:in `flush'
$GEM_ROOT/librato-rails-0.8.1/lib/librato/rails.rb:130:in `block (2 levels) in start_worker'
$GEM_ROOT/librato-rails-0.8.1/lib/librato/rails/worker.rb:15:in `call'
$GEM_ROOT/librato-rails-0.8.1/lib/librato/rails/worker.rb:15:in `execute'
$GEM_ROOT/librato-rails-0.8.1/lib/librato/rails/worker.rb:26:in `run_periodically'
$GEM_ROOT/librato-rails-0.8.1/lib/librato/rails.rb:129:in `block in start_worker'
Contributor

nextmat commented Mar 26, 2013

@benkant After spending some time looking at this, my inclination is to add an ENV variable switch, something like LIBRATO_MODE=disabled, which you can use with your runner actions to ensure the background thread never starts. How does this sound? There doesn't seem to be a clear environment difference in runner mode, so my sense is that this will probably have to be managed manually.

Maybe a LIBRATO_DISABLED=1?

But still, other libraries seem to at least support setting a timeout and open_timeout value for Faraday (or Net::HTTP), if not also setting a sane default. Without it, the HTTP open attempt can indefinitely hang:

Perhaps two issues: a sane default value, to prevent the thread from hanging; and a LIBRATO_DISABLED flag, for processes that don't report metrics?
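
As a rough illustration of the timeout half of this, here is a minimal sketch using only Ruby's standard Net::HTTP, which is what the Faraday net_http adapter in the stack trace above ends up calling. The helper name `build_http`, the host name, and the timeout values are assumptions made for the example; they are not what librato-metrics actually does or ships.

```ruby
require "net/http"

# Hypothetical helper (not part of librato-metrics or Faraday): builds a
# Net::HTTP client with explicit timeouts. No connection is opened here;
# the timeouts only take effect once a request is made.
def build_http(host, port, open_timeout: 5, read_timeout: 30)
  http = Net::HTTP.new(host, port)
  http.open_timeout = open_timeout # seconds allowed for opening the connection
  http.read_timeout = read_timeout # seconds allowed for each read from the socket
  http
end

# Placeholder host; the values are illustrative, not recommended defaults.
http = build_http("metrics.example.com", 80)
puts "open_timeout=#{http.open_timeout}s read_timeout=#{http.read_timeout}s"
```

With an explicit open_timeout, a connect attempt that cannot complete should raise a timeout error instead of blocking the worker thread indefinitely.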

Contributor

nextmat commented Mar 26, 2013

@toolmantim Thanks for the resources. Yep, that's exactly what I'm thinking: there are two issues, one correcting the timeout problem and the other providing a better path for manually controlling reporter startup.

I considered exactly LIBRATO_DISABLED, but we also have the opposite scenario, which some folks have been asking about: they are running in a mode where the reporter doesn't start (for example, the console) and want to be able to start it. It would be nice to handle these options in a single env var rather than LIBRATO_DISABLED, LIBRATO_ENABLED, etc. It might also allow something like LIBRATO_MODE=debug, which would enable some of the commonly helpful debugging options as a group.

@nextmat cool. What about LIBRATO_AUTORUN=0 and LIBRATO_AUTORUN=1, with the default value being 1 in a normal Ruby process and 0 in a console process? You could also have a LIBRATO_DEBUG=1 for the debug-related stuff.

I've submitted ruby/ruby#269 to document those Ruby defaults.
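
To make the LIBRATO_AUTORUN proposal concrete, here is a minimal sketch of how such a switch could be read. Only the variable name and the two defaults come from this thread; the helper `autorun?`, its `console:` parameter, and the handling of unrecognized values are hypothetical and do not reflect how librato-rails implements it.

```ruby
# Hypothetical helper: decides whether the background reporter should start.
# LIBRATO_AUTORUN is the name proposed in this thread; the parsing rules
# below are assumptions made for illustration.
def autorun?(env, console: false)
  case env["LIBRATO_AUTORUN"]
  when "1" then true   # explicitly enabled, even in a console
  when "0" then false  # explicitly disabled, e.g. for runner tasks
  else !console        # unset or unrecognized: on normally, off in a console
  end
end

# Example call: treat the process as a console only if Rails::Console is loaded.
in_console = !!defined?(Rails::Console)
puts "start reporter? #{autorun?(ENV, console: in_console)}"
```

A runner task could then be launched with LIBRATO_AUTORUN=0 set in its environment so that the reporter thread never starts.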

Contributor

nextmat commented Apr 4, 2013

@toolmantim Sounds good. I like LIBRATO_AUTORUN. I'll be adding that to a new version here shortly as well as addressing the timeout issue.

👍

Contributor

nextmat commented Apr 5, 2013

Timeouts should be fixed if you upgrade your librato-metrics to 1.0.4 or later. Just added a separate ticket for the env vars. Closing, let me know if you run into further issues.

nextmat closed this Apr 5, 2013
