We should probably catch the exception and tell the Celery task to retry.
[2022-01-15 17:34:53,694 ERROR celery.app.trace] Task puzzles.tasks.create_puzzle_sheet_and_channel[40bdd864-4629-4c74-9d9a-fcbee95565a5] raised unexpected: <HttpError 403 when requesting https://www.googleapis.com/drive/v3/files/1rWyhGIkyx4h64us3BGeFrNY_xe01cxnKCNB4AxRYTO4/copy?alt=json returned "User rate limit exceeded.">
Traceback (most recent call last):
File "/app/.heroku/python/lib/python3.8/site-packages/celery/app/trace.py", line 385, in trace_task
R = retval = fun(*args, **kwargs)
File "/app/.heroku/python/lib/python3.8/site-packages/celery/app/trace.py", line 650, in __protected_call__
return self.run(*args, **kwargs)
File "/app/herring/puzzles/tasks.py", line 164, in create_puzzle_sheet_and_channel
sheet_id = make_sheet(sheet_title)
File "/app/herring/puzzles/spreadsheets.py", line 35, in make_sheet
got = service.files().copy(fileId=settings.HERRING_SECRETS['gapps-doc-to-clone'], body=body).execute()
File "/app/.heroku/python/lib/python3.8/site-packages/googleapiclient/_helpers.py", line 130, in positional_wrapper
return wrapped(*args, **kwargs)
File "/app/.heroku/python/lib/python3.8/site-packages/googleapiclient/http.py", line 856, in execute
raise HttpError(resp, content, uri=self.uri)
googleapiclient.errors.HttpError: <HttpError 403 when requesting https://www.googleapis.com/drive/v3/files/1rWyhGIkyx4h64us3BGeFrNY_xe01cxnKCNB4AxRYTO4/copy?alt=json returned "User rate limit exceeded."> (worker.1, v203)
This also caused the Discord channel not to be created, because the spreadsheet is created first (so its link can go in the channel topic).
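A catch-and-retry wrapper could look something like the sketch below. To keep it self-contained, `RateLimitError` stands in for the `HttpError` 403 above, and the callable is retried in-process; in the real Celery task you'd raise `self.retry(countdown=...)` instead of sleeping, so the worker isn't blocked.

```python
import time


class RateLimitError(Exception):
    """Stand-in for googleapiclient.errors.HttpError with a 403 status."""


def call_with_retry(fn, max_retries=5, base_delay=60, sleep=time.sleep):
    """Call fn(), retrying on RateLimitError with exponential backoff
    (60s, 120s, 240s, ...). Re-raises after max_retries so the task
    still fails loudly instead of looping forever."""
    for attempt in range(max_retries + 1):
        try:
            return fn()
        except RateLimitError:
            if attempt == max_retries:
                raise
            sleep(base_delay * 2 ** attempt)
```

In Celery specifically, roughly the same effect is available declaratively on the task decorator via `autoretry_for=(HttpError,)` together with `retry_backoff=True` and a `max_retries` cap.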
Just want to flag the obvious but easily-overlooked fact that retrying in the face of a rate limit is dangerous unless we have insight into what the limit actually is. Otherwise you can end up in the meltdown we had with Zulip that one year, where each failed retry counted against the rate limit and we never recovered.
Do we know why we got rate-limited? Was it just some kind of glitch on Google's end? I can't imagine we were doing anything intensive at the time.
"Better recovery" need not mean retrying; it could be a way to manually re-attempt or fix things after the fact. Also, when creating the sheet fails, creating the Discord channel fails too. At minimum, better error handling would let us end up in a known state without a sheet.
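One shape that "known state" could take: decouple the two steps so a sheet failure is recorded as "sheet pending" instead of aborting channel creation. This is only a sketch under assumptions — `make_sheet`/`make_channel` are passed in as parameters here, and the `SheetCreationError` and pending-topic convention are invented for illustration:

```python
class SheetCreationError(Exception):
    """Illustrative stand-in for whatever make_sheet raises on failure."""


def sheet_url(sheet_id):
    return f"https://docs.google.com/spreadsheets/d/{sheet_id}"


def create_puzzle_sheet_and_channel(puzzle_name, make_sheet, make_channel):
    """Create the sheet and channel such that a sheet failure leaves us
    in a known state (channel exists, sheet marked pending) rather than
    an unknown one (neither exists, task dead)."""
    try:
        sheet_id = make_sheet(puzzle_name)
    except SheetCreationError:
        sheet_id = None  # known state: puzzle exists, sheet can be re-attempted
    topic = sheet_url(sheet_id) if sheet_id else "(sheet pending)"
    channel = make_channel(puzzle_name, topic)
    return sheet_id, channel
```

A stored `None` sheet id is also exactly the hook a manual "re-attempt sheet creation" action would need.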
Yeah, I would ideally like to see us have better handling of those kinds of states. Right now I think our model is not very robust to anything in the world being in an unexpected state.