Need better recovery from Google Sheet creation failing #65

Open
Zahariel opened this issue Jan 15, 2022 · 3 comments

Comments

@Zahariel
Contributor

We should probably catch the exception and tell the Celery task to retry.

[2022-01-15 17:34:53,694 ERROR celery.app.trace] Task puzzles.tasks.create_puzzle_sheet_and_channel[40bdd864-4629-4c74-9d9a-fcbee95565a5] raised unexpected: <HttpError 403 when requesting https://www.googleapis.com/drive/v3/files/1rWyhGIkyx4h64us3BGeFrNY_xe01cxnKCNB4AxRYTO4/copy?alt=json returned "User rate limit exceeded.">
Traceback (most recent call last):
  File "/app/.heroku/python/lib/python3.8/site-packages/celery/app/trace.py", line 385, in trace_task
    R = retval = fun(*args, **kwargs)
  File "/app/.heroku/python/lib/python3.8/site-packages/celery/app/trace.py", line 650, in protected_call
    return self.run(*args, **kwargs)
  File "/app/herring/puzzles/tasks.py", line 164, in create_puzzle_sheet_and_channel
    sheet_id = make_sheet(sheet_title)
  File "/app/herring/puzzles/spreadsheets.py", line 35, in make_sheet
    got = service.files().copy(fileId=settings.HERRING_SECRETS['gapps-doc-to-clone'], body=body).execute()
  File "/app/.heroku/python/lib/python3.8/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
    return wrapped(*args, **kwargs)
  File "/app/.heroku/python/lib/python3.8/site-packages/googleapiclient/http.py", line 856, in execute
    raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 403 when requesting https://www.googleapis.com/drive/v3/files/1rWyhGIkyx4h64us3BGeFrNY_xe01cxnKCNB4AxRYTO4/copy?alt=json returned "User rate limit exceeded."> (worker.1, v203)

This also prevented the Discord channels from being created, because the spreadsheet is created first (so that its link can go in the channel topic).
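
A minimal sketch of the "catch the exception and retry" idea, written with only the standard library so it runs on its own. `SheetRateLimited`, `create_sheet_with_retry`, and `flaky_make_sheet` are hypothetical names for illustration; the real task would catch `googleapiclient.errors.HttpError` (as seen in the traceback) and would probably hand the retry to Celery via a bound task and `self.retry(...)` rather than sleeping in-process. The attempt count and delay are placeholders, not tuned values.

```python
import time


class SheetRateLimited(Exception):
    """Hypothetical stand-in for the HttpError 403 raised by the Drive copy call."""


def create_sheet_with_retry(make_sheet, title, max_attempts=3,
                            delay_seconds=60, sleep=time.sleep):
    """Call make_sheet(title), retrying a bounded number of times.

    Returns the sheet id on success. Once max_attempts is reached, the
    last rate-limit error is re-raised so the failure stays visible.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return make_sheet(title)
        except SheetRateLimited:
            if attempt == max_attempts:
                raise  # out of attempts: let the caller see the error
            sleep(delay_seconds)  # wait before trying again


# Usage with a fake make_sheet that fails twice and then succeeds.
calls = []


def flaky_make_sheet(title):
    calls.append(title)
    if len(calls) < 3:
        raise SheetRateLimited("User rate limit exceeded.")
    return "sheet-123"


# sleep is replaced with a no-op so the example finishes immediately.
sheet_id = create_sheet_with_retry(flaky_make_sheet, "Puzzle A",
                                   sleep=lambda seconds: None)
print(sheet_id, len(calls))
```

Bounding the attempts matters here: an unbounded loop would keep the task alive indefinitely if the limit never clears.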

@gwillen
Owner

gwillen commented Jan 15, 2022

Just want to flag the obvious but easily overlooked fact that retrying in the face of a rate limit is dangerous unless we have insight into what the rate limit is. Otherwise you can end up in the meltdown we had with Zulip that one year, where each failed retry counts against the rate limit and we never recover.

Do we know why we got rate-limited? Was it probably just some kind of glitch on Google's end? I can't imagine we were doing anything intensive at the time.
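
If retries are added anyway, one common way to limit how many failed requests land inside the rate-limit window is a small, fixed number of retries with exponentially growing, capped, jittered delays. The sketch below only computes the delay schedule. `backoff_delays` and all of its numbers are assumptions for illustration, since the actual Drive quota is not known in this thread.

```python
import random


def backoff_delays(max_retries=5, base_seconds=2.0, cap_seconds=300.0,
                   rng=random.random):
    """Return the wait (in seconds) before each retry.

    The ceiling doubles on every retry and is clamped to cap_seconds.
    Each wait is drawn at random below that ceiling ("full jitter") so
    several workers do not retry in lockstep. The list has exactly
    max_retries entries, so the number of extra requests is bounded.
    """
    delays = []
    for attempt in range(max_retries):
        ceiling = min(cap_seconds, base_seconds * (2 ** attempt))
        delays.append(ceiling * rng())  # rng() is expected to be in [0, 1)
    return delays


# With jitter pinned to 1.0, the schedule shows the raw ceilings.
ceilings = backoff_delays(rng=lambda: 1.0)
print(ceilings)

# Real use: random waits, each no larger than the matching ceiling.
print(backoff_delays())
```

This does not remove the risk described above. It only caps the damage, and a retry budget shared across tasks would still be needed if many puzzles are created at once.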

@sparkyb

sparkyb commented Jan 15, 2022

"Better recovery" need not mean retrying. It could be a way to manually re-attempt or fix things after the fact. Also, when creating the sheet failed, creating the Discord channel failed too, so maybe error handling would at least be good, so that we end up in a known state without a sheet?
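
A rough sketch of what "a known state without a sheet" could look like: the sheet failure is caught and recorded, the channel is still created (without a sheet link), and the result says that a sheet is still needed so it can be re-attempted by hand later. `PuzzleSetupResult`, `set_up_puzzle`, and the two fake helpers are hypothetical; the only names taken from the traceback are the ideas of `make_sheet` and a combined sheet-and-channel task.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PuzzleSetupResult:
    sheet_id: Optional[str]
    channel_id: Optional[str]
    sheet_error: Optional[str] = None

    @property
    def needs_sheet(self) -> bool:
        """True when the puzzle exists but its sheet still has to be made."""
        return self.sheet_id is None


def set_up_puzzle(title, make_sheet, make_channel) -> PuzzleSetupResult:
    """Create the sheet, then the channel, tolerating a sheet failure."""
    sheet_id = None
    sheet_error = None
    try:
        sheet_id = make_sheet(title)
    except Exception as exc:  # real code would likely narrow this to HttpError
        sheet_error = str(exc)  # remember why, for a later manual re-attempt
    # The channel is created either way; sheet_id is None if the sheet failed.
    channel_id = make_channel(title, sheet_id)
    return PuzzleSetupResult(sheet_id, channel_id, sheet_error)


# Usage with fakes: the sheet call fails, the channel call succeeds.
def failing_make_sheet(title):
    raise RuntimeError("User rate limit exceeded.")


def fake_make_channel(title, sheet_id):
    return "channel-for-" + title


result = set_up_puzzle("Puzzle A", failing_make_sheet, fake_make_channel)
print(result.needs_sheet, result.channel_id, result.sheet_error)
```

The channel topic would then have to be updated once a sheet exists, which this sketch leaves out.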

@gwillen
Owner

gwillen commented Jan 15, 2022

Yeah, I would ideally like to see us have better handling of those kinds of states. Right now I think our model is not very robust to anything in the world being in an unexpected state.
