This repository has been archived by the owner on Aug 25, 2023. It is now read-only.

Increasing retries and delay to make 500 errors caused by Datastore eventual consistency less likely to happen #13

Conversation

radkomateusz
Contributor

Increasing retries and delay to make 500 errors caused by Datastore eventual consistency less likely to happen.

Details:
When the table entity doesn’t exist, we create the proper table entity, and based on that entity we schedule the proper copy job.

Because of that, the copy job cannot be scheduled before the table entity is created, so the post-copy-job handling assumes that the table entity exists.

A missing table entity is treated as an error → if we really don’t have a table entity for a backup we made, it means we have inconsistent data in Datastore.

In the observed cases, Datastore’s eventual consistency means we don’t get the existing table_entity, and we retry the request until it succeeds.
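
A minimal sketch of that pattern, assuming an injectable entity lookup and an exception name of our own choosing (these identifiers are illustrative, not necessarily the repository’s actual ones):

```python
class TableEntityNotFoundRetriableException(Exception):
    """Raised when a table entity is not yet visible in Datastore."""


def get_table_entity_or_raise(lookup, project_id, dataset_id, table_id):
    """Fetch the table entity via `lookup`, a stand-in for the real
    Datastore query keyed by project/dataset/table.

    A None result is treated as a retriable error rather than a hard
    failure, because the entity is expected to appear once Datastore
    becomes consistent.
    """
    entity = lookup(project_id, dataset_id, table_id)
    if entity is None:
        raise TableEntityNotFoundRetriableException(
            'Table entity for {}:{}.{} is not visible yet'.format(
                project_id, dataset_id, table_id))
    return entity
```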

The recorded errors show that this request was retried 5 times before the task failed.
Fortunately, the whole task was retried and the 3rd retry attempt was successful.

As the error is not reproducible (it occurs randomly over time, probably depending on Datastore server load), I increased the number of retries and spread them over a longer period of time to make this kind of error less likely to happen.
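
The change itself most likely just bumps the retry count and wait parameters of the existing retry wrapper; as a rough sketch of the idea (the helper name and the numbers below are assumptions, not the project’s code):

```python
import logging
import random
import time


def call_with_retries(func, retriable_exc, max_attempts=10,
                      base_delay_secs=0.5, max_delay_secs=30.0):
    """Call func(), retrying while it raises retriable_exc.

    The numbers are illustrative; the point is that more attempts spread
    over a longer window survive a burst of eventually consistent reads
    without exhausting the retries.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retriable_exc:
            if attempt == max_attempts:
                raise
            # Exponential backoff with a small jitter, capped so a single
            # task does not wait too long between attempts.
            delay = min(base_delay_secs * 2 ** (attempt - 1), max_delay_secs)
            delay += random.uniform(0, 0.5)
            logging.warning('Retriable error (attempt %d/%d), sleeping %.1fs',
                            attempt, max_attempts, delay)
            time.sleep(delay)
```

A caller would then wrap the lookup, e.g. `call_with_retries(lambda: get_table_entity_or_raise(datastore_lookup, 'p', 'd', 't'), TableEntityNotFoundRetriableException)`.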

@coveralls

Pull Request Test Coverage Report for Build 239

  • 1 of 1 (100.0%) changed or added relevant line in 1 file are covered.
  • No unchanged relevant lines lost coverage.
  • Overall coverage remained the same at 87.813%

Totals Coverage Status
  • Change from base Build 218: 0.0%
  • Covered Lines: 1996
  • Relevant Lines: 2273

💛 - Coveralls

@marcin-kolda marcin-kolda merged commit 7870563 into master Jul 9, 2018
@marcin-kolda marcin-kolda deleted the DatastoreTableGetRetriableException500error_less_likely_to_happen branch July 9, 2018 14:29