Improve reliability of source calendar retrieval by enabling retries #403

octogonz · 2024-01-10T21:22:20Z

I was having problems where my entire calendar would get deleted temporarily, only to be recreated an hour later, due to intermittent HTTP errors being reported by the calendar source. It is the same issue discussed in #343.

Related work

PR #392 tries to solve this same problem by silently skipping sync when such errors occur. I'm uncomfortable with that solution, because I would prefer "obviously wrong" (the entire calendar is missing) over "deceptively wrong" (the calendar looks okay but is actually telling me wrong information because it has stopped syncing). Skipping sync would be a good solution if we ALSO provided a better mechanism for communicating problems, for example email notifications. Ideally we should pursue that.

A different approach

But then I found a superior fix: This PR completely solves the root cause in my case. That is why I have not invested the time to investigate better error notifications. For my own purposes at least, this PR entirely eliminated the need for PR #392.

Fixes #343

What I changed

Track the failures. If an HTTP request ultimately fails (that is, it still fails after all retries), throw a top-level exception so that the Google Apps Script "Executions" dashboard reports the execution as failing.
Reattempt HTTP requests. Improve callWithBackoff() so that it retries HTTP requests for all status codes that are known to indicate intermittent problems.
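
To make the two changes above concrete, here is a minimal sketch of the retry-then-throw pattern. It is not the code from this PR: the function name `fetchWithRetries`, its parameters, and the list of retryable status codes are all assumptions made for illustration. The HTTP call and the sleep function are passed in so the sketch runs outside Apps Script; in a real script they could be wrappers around `UrlFetchApp.fetch` (with `muteHttpExceptions: true`) and `Utilities.sleep`.

```javascript
// Hypothetical list of status codes treated as intermittent.
// The actual list used by callWithBackoff() in this PR may differ.
const RETRYABLE_STATUS = new Set([408, 429, 500, 502, 503, 504]);

/**
 * Calls fetchOnce() until it succeeds, hits a non-retryable status,
 * or runs out of attempts. Throws if no attempt succeeded.
 *
 * fetchOnce:   () => { status: number, body: string }  (stand-in for the HTTP call)
 * maxAttempts: total number of tries, including the first one
 * sleep:       (ms) => void, called between attempts
 * baseDelayMs: delay before the second attempt; doubles after each failure
 */
function fetchWithRetries(fetchOnce, maxAttempts, sleep, baseDelayMs) {
  let lastStatus = null;
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const response = fetchOnce();
    lastStatus = response.status;

    if (response.status >= 200 && response.status < 300) {
      return response; // success, possibly after earlier failures
    }
    if (!RETRYABLE_STATUS.has(response.status)) {
      break; // not considered intermittent, so retrying is unlikely to help
    }
    if (attempt < maxAttempts) {
      // Exponential backoff: baseDelayMs, then 2x, then 4x, and so on.
      sleep(baseDelayMs * Math.pow(2, attempt - 1));
    }
  }
  // Letting this propagate to the top level is what marks the
  // execution as failed, instead of continuing with missing data.
  throw new Error('Request failed with HTTP status ' + lastStatus);
}
```

The point of throwing at the end, rather than returning an empty result, is that an uncaught exception makes the failure visible instead of letting the sync proceed as if the source calendar were empty.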

With these changes, I can see that the failures are no longer occurring. The longest reattempt took 8 tries over 14 seconds, according to the execution log.

But since yesterday, every execution ultimately succeeded. 👍

…for intermittent HTTP failures

Lonestarjeepin · 2024-05-02T19:43:25Z

@octogonz I finally implemented this solution in my version and it seems to be making callWithBackoff work as intended now. The only thing I tweaked was adding an error catch in backoffRecoverableErrors for 404 (because that was the easiest way to break my calendar URL to test this!). Not sure if we would need that error in the real world though. I closed my #392 PR in favor of this solution.

derekantrican · 2024-05-03T16:43:54Z

Seems fine to me. @jonas0b1011001 - any thoughts?

…for intermittent HTTP failures (#403)

Improve reliability of source calendar retrieval by enabling retries …

ef8a0bd

…for intermittent HTTP failures

This was referenced Jan 10, 2024

Events removed when URL returns 5xx #343

Closed

Skip on 5xx and 4xx URL errors #392

Closed

jonas0b1011001 approved these changes May 12, 2024

derekantrican merged commit 90157f5 into derekantrican:master May 13, 2024
2 checks passed

This was linked to issues May 15, 2024

Receive email when script produces an error #228

Closed

Error Code 403 when trying to process an .ics file stored in Google Drive #269

Closed

jonas0b1011001 pushed a commit that referenced this pull request May 22, 2024

Improve reliability of source calendar retrieval by enabling retries …

b235e14

…for intermittent HTTP failures (#403)

jonas0b1011001 linked an issue May 26, 2024 that may be closed by this pull request

Discussion: Notifications for failing sync #154

Closed
