Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exception handling when Heroku API call fails? #21

Closed
swrobel opened this issue Aug 28, 2013 · 14 comments
Closed

Exception handling when Heroku API call fails? #21

swrobel opened this issue Aug 28, 2013 · 14 comments

Comments

@swrobel
Copy link
Contributor

swrobel commented Aug 28, 2013

I'm wondering what happens when autoscaler hits an exception while attempting to do its thing. Here's my understanding:

  1. I enqueue something in my code
  2. Sidekiq attempts to enqueue that job
  3. Autoscaler jumps in at some point and attempts to tell the API to add a worker
  4. Exception thrown
  5. Exception bubbles back up to the top, stopping Sidekiq & my code in its tracks

Does the job end up get enqueued in the Sidekiq or does that process fail because Autoscaler did? Is it worth trying to catch exceptions in Autoscaler since it isn't a "critical" process?

What led me to these questions is that I've been noticing weird Excon errors in honeybadger today. Here are the two exception messages I've gotten, both of which have backtraces that lead me to believe that it's happening when autoscaler attempts to hit the heroku api (each links to a gist of the backtrace):

Sorry, this is more just a dump of my thoughts, but hopefully it sparks a good conversation..

@JustinLove
Copy link
Owner

Good point. I may have to review Exceptional Ruby for library behavior recommendations - if it just ate exceptions, you'd have unprocessed jobs backing up for no apparent reason.

At the moment, the scale happens before the yield, so the job won't be enqueued

@swrobel
Copy link
Contributor Author

swrobel commented Aug 29, 2013

hm, yikes, that seems problematic. seems like some of my jobs got through and some didn't. can't really make sense of it.

anyway, i figure it's better to have the jobs in the queue and no workers to process them than no jobs and no workers, knowwhatimean?

@JustinLove
Copy link
Owner

How's this?

39d5bd8

@swrobel
Copy link
Contributor Author

swrobel commented Sep 3, 2013

👍

@swrobel swrobel closed this as completed Sep 3, 2013
@swrobel
Copy link
Contributor Author

swrobel commented Sep 4, 2013

Great, Heroku API is down today! Deploying w/ the github version to see what happens... Who would've thought we'd get a chance to test so soon.

EDIT: Sadly, no dice, can't deploy either...

@JustinLove
Copy link
Owner

I thought that heroku's tools used the API as well, although I had some hope that deploy was significant separate service in it's own right :-(

@swrobel
Copy link
Contributor Author

swrobel commented Sep 5, 2013

Any input on how to implement a custom handler correctly? I'm doing the following, but maybe there's a simpler/better way:

  exception_handler = ->(exception) do
    Honeybadger.notify(
      error_class: exception.class,
      error_message: exception.message
    )
  end

  Sidekiq.configure_server do |config|
    config.redis = sidekiq_redis if sidekiq_redis
    config.server_middleware do |chain|
      scaler = Autoscaler::HerokuScaler.new
      scaler.exception_handler = exception_handler
      chain.add(Autoscaler::Sidekiq::Server, scaler, 60)
    end
  end

@JustinLove
Copy link
Owner

That's about what I intended.

@swrobel
Copy link
Contributor Author

swrobel commented Sep 11, 2013

I've never been so excited to see Heroku API errors! Exceptions the last 15 minutes have been getting logged but jobs are still queuing. Yay!

Thank you, again.

@swrobel
Copy link
Contributor Author

swrobel commented Sep 16, 2013

How do you feel about rescuing Exception like sidekiq does rather than limiting it to Excon? I imagine other stuff could go wrong as well that we aren't thinking of ;)

https://github.com/mperham/sidekiq/blob/master/lib/sidekiq/processor.rb#L54

@JustinLove
Copy link
Owner

Mostly blanket exception catching makes me nervous about masking errors.

Sidekiq has a slightly different domain - it has to wrap arbitrary user code, and then respond appropriately (such as retry) One reason not to catch all exceptions is to let Sidekiq continue it's usual error handling.

@swrobel
Copy link
Contributor Author

swrobel commented Sep 17, 2013

True, but we wouldn't catch those, right? I don't think they would bubble up to Autoscaler... I'm in the same camp as you, but in situations like this where queuing is more important than performing, it just seems to make sense.

@swrobel
Copy link
Contributor Author

swrobel commented Sep 28, 2013

Another day, another Heroku API failure. Here's the latest exception that's now bubbling up to my app and causing sidekiq not to enqueue: Heroku::API::Errors::ErrorWithResponse: Expected(200) <=> Actual(503 Service Unavailable)

@swrobel
Copy link
Contributor Author

swrobel commented Sep 28, 2013

New idea: is it possible to handle the scaling after sidekiq has successfully enqueued the job? That way no matter what happens with autoscaler, we know jobs won't be lost...

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants