Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve reliability of source calendar retrieval by enabling retries #403

Merged
merged 1 commit into from
May 13, 2024

Conversation

octogonz
Copy link
Contributor

I was having problems where my entire calendar would get deleted temporarily, only to be recreated an hour later, due to intermittent HTTP errors being reported by the calendar source. It is the same issue discussed in #343.

Related work

PR #392 tries to solve this same problem by silently skipping sync when such errors occur. I'm uncomfortable with that solution, because I would prefer "obviously wrong" (entire calendar is missing) versus "deceptively wrong" (calendar looks okay but is actually telling me wrong information because it has stopped syncing). Skipping sync would be a good solution if we ALSO provided better mechanism for communicating problems, for example email notifications. Ideally we should pursue that.

A different approach

But then I found a superior fix: This PR completely solves the root cause in my case. That is why I have not invested the time to investigate better error notifications. For my own purposes at least, this PR entirely eliminated the need for PR #392.

Fixes #343

What I changed

  1. Track the failures. If an HTTP request ultimately fails (excluding success after retrying), throw a top-level exception so that the Google Apps Script "Executions" dashboard reports the execution as failing. Example shown below:

    image

  2. Reattempt HTTP requests. Improve callWithBackoff() so that it retries HTTP requests for all status codes that are known to be intermittent problems.

With these changes, I can see that the failures are no longer occurring. The longest reattempt took 8 tries over 14 seconds, as seen in this log:

image

But since yesterday, every execution ultimately succeeded. 👍

@Lonestarjeepin
Copy link
Collaborator

@octogonz I finally implemented this solution in my version and it seems to be making callWithBackoff work as intended now. The only thing I tweaked was adding an error catch in backoffRecoverableErrors for 404 (because that was the easiest way to break my calendar URL to test this!). Not sure if we would need that error in the real world though. I closed my #392 PR in favor of this solution.

@derekantrican
Copy link
Owner

Seems fine to me. @jonas0b1011001 - any thoughts?

@derekantrican derekantrican merged commit 90157f5 into derekantrican:master May 13, 2024
2 checks passed
jonas0b1011001 pushed a commit that referenced this pull request May 22, 2024
@jonas0b1011001 jonas0b1011001 linked an issue May 26, 2024 that may be closed by this pull request
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants