fix index_objects to retry on failure #137

Open
willkg opened this Issue Apr 23, 2013 · 2 comments

@willkg
Mozilla member

Right now if index_objects fails, that's it; nothing retries. In Kitsune, we have it retry indexing at a later time to reduce the likelihood that we end up with an out-of-date index.

It's possible that retrying works differently between versions of Celery. We need to figure that out.

If adding retry code complicates index_objects too much, maybe we should create a separate index_objects_retryable task that holds the retry code and some options.

@willkg
Mozilla member

I looked at the code in Kitsune that retries the task if it fails, and there are a bunch of questions that need to be figured out:

  1. what exceptions should cause it to retry?
  2. should it retry indexing all the objects or just the ones that failed?
  3. should it notify developers that it retried?
  4. should the retry times be configurable?
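To make these four questions concrete, here is a minimal plain-Python sketch, not Celery-specific and not Kitsune's actual code. Every name in it (`index_objects_retryable`, `index_one`, `retry_on`, `max_retries`, `delay`, `notify`) is hypothetical. It assumes one possible set of answers: the caller chooses which exceptions trigger a retry, only the objects that failed are retried, an optional callback is invoked on each retry, and the retry count and delay are parameters.

```python
import time


def index_objects_retryable(objects, index_one, retry_on=(IOError,),
                            max_retries=3, delay=0.0, notify=None):
    """Index each object, retrying only the ones that failed.

    Hypothetical helper for illustration only. Returns the list of
    objects that still failed after all retries were used up.
    """
    pending = list(objects)
    for attempt in range(max_retries + 1):
        failed = []
        for obj in pending:
            try:
                index_one(obj)
            except retry_on:
                # Question 1: only the listed exception types cause a retry;
                # anything else propagates to the caller.
                failed.append(obj)
        if not failed:
            return []
        if attempt < max_retries:
            # Question 3: optionally tell someone a retry is about to happen.
            if notify is not None:
                notify(attempt + 1, failed)
            # Question 4: the wait between attempts is configurable.
            time.sleep(delay)
        # Question 2: the next pass sees only the objects that failed.
        pending = failed
    return pending
```

Even this small version has five knobs, which gives a sense of how much each of the questions adds to the task.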

Anyhow, there's a lot of stuff here, and it makes for a far more complex index_objects task. It's definitely useful, but I'm not wildly excited about implementing, testing and maintaining it.

I'll keep this issue around, but I'm removing it from the 0.7 milestone.

If someone else wants to tackle this, I'd be much obliged.

@robhudson
Mozilla member

In zamboni-land we use @task(acks_late=True). If something happens mid-execution, the task will get picked up by another worker because it's not removed from the queue until the task is completed. Since indexing is idempotent, running it a second time is harmless. You may want both retries and late acks, but late acks might be an easy thing to add that would help.
