Managing jobs with errors #929

Closed
oyeanuj opened this issue Apr 15, 2023 · 1 comment

oyeanuj commented Apr 15, 2023

Hi @bensheldon, thank you for creating this library - it's amazing! This might be less of a bug and more of a request for best practices or documentation.

Context: I'm using good_job on Heroku in an API-only Rails app running Rails 6. Since the app is in API mode, I haven't set up the dashboard yet, so I'm looking for non-dashboard solutions to the problems below:

Sometimes I have jobs that fail, mostly related to email (inactive address, wrong address, etc.), and I notice that this creates hundreds of jobs in my good_jobs table. If that goes on for too long, my regular jobs start to be affected, even those in an urgent queue. I've tried deleting the duplicate or problematic jobs, only to have them created again, and I've tried deleting the original records, only to have good_job try them again and raise an ActiveRecord::RecordNotFound error.

So I am wondering, what's the best practice when you have problematic records? Is there a recommended way to delete the job that works best with good_job and/or are there settings that I'm overlooking? FWIW, I've tried to look through the documentation but nothing has fixed it so far, hence the questions here :)

@bensheldon (Owner)

Thanks for opening this issue. That's not good!

It sounds like it could be one or both of these:

  • You have config.good_job.retry_on_unhandled_error = true (or GoodJob.retry_on_unhandled_error = true). That will cause jobs to be immediately retried when an unrescued exception is raised. I made the default false as of GoodJob v3 to help avoid the situation I think you may be experiencing.
  • You have a retry_on handler in your job that does not have wait: :exponentially_longer set.
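
To make the second point concrete, here is a rough sketch of how the retry delay grows with and without that option. It assumes Active Job's documented schedule for `wait: :exponentially_longer` of roughly `(executions ** 4) + 2` seconds (random jitter ignored) and its default fixed wait of 3 seconds; the method name is made up for illustration.

```ruby
# Approximate delay, in seconds, before the next retry under
# `wait: :exponentially_longer`. Jitter is ignored in this sketch.
def exponentially_longer_delay(executions)
  (executions**4) + 2
end

# Active Job's default `wait:` for retry_on when none is given, in seconds.
FIXED_WAIT = 3

# Compare the two schedules for the first five executions.
(1..5).each do |n|
  puts "execution #{n}: fixed=#{FIXED_WAIT}s exponential=#{exponentially_longer_delay(n)}s"
end
```

With a short fixed wait and a high attempt count, a job that can never succeed (such as mail to a dead address) comes back every few seconds, while the growing schedule pushes each retry further out.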

I do have a section in the readme that tries to explain this stuff, but I'll admit I haven't done a pass on it recently, so there is probably room for improvement:

https://github.com/bensheldon/good_job/#retries

(Be sure to read to the end of that section, because you have to duplicate the retry_on behavior for Action Mailer deliver_later jobs too.)
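
A minimal sketch of what that could look like in an initializer, not a definitive setup: the file path, the choice of `StandardError`, and `attempts: 5` are assumptions for illustration, and `ActionMailer::MailDeliveryJob` is assumed to be the delivery job class (it is under Rails 6 defaults, but older configurations may use a different one).

```ruby
# config/initializers/good_job.rb (hypothetical path and values)

Rails.application.configure do
  # Do not immediately re-run jobs that raise an unrescued exception.
  config.good_job.retry_on_unhandled_error = false
end

# Retry failed mail deliveries with a growing delay and stop after 5 attempts,
# instead of retrying quickly and indefinitely.
ActionMailer::MailDeliveryJob.retry_on StandardError,
                                       wait: :exponentially_longer,
                                       attempts: 5
```

Capping the attempts means a permanently undeliverable email stops cycling through the queue after a few tries.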

In terms of recommendations, if you're able, I think running the GoodJob Dashboard would give you the most administrative insight into what is happening, along with actions to discard these jobs: https://github.com/bensheldon/good_job/#api-only-rails-applications

Alternatively, please take a look at the methods defined on Jobs themselves, as it's likely that you're simply destroying the record while it's executing and you should instead use one of these methods that safely take a lock on the record:

# Discard a job so that it will not be executed further.
# This action will add a {DiscardJobError} to the job's {Execution} and mark it as finished.
# @return [void]
def discard_job(message)
