
Fix S3 adapter retrying failing uploads with exponential backoff #12085

Merged 1 commit on Oct 6, 2019

Conversation

Gargron
Member

@Gargron Gargron commented Oct 6, 2019

The default limit of 10 retries with exponential backoff meant
that if the S3 server was timing out, you would be stuck with it
for much, much longer than the 5-second read timeout we expect.

The upload happens within a database transaction, which means
a failing S3 server could negatively affect database performance.

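The arithmetic behind this can be sketched in a few lines of Ruby. The exact backoff schedule the SDK uses is an assumption here (a simple `2**attempt` seconds between attempts), so treat this as an order-of-magnitude illustration, not the SDK's actual behavior:

```ruby
# Rough worst-case wall-clock time for an upload against an S3 server
# that times out on every attempt, assuming:
#   - each attempt waits out the full read timeout (5 s by default), and
#   - exponential backoff sleeps 2**i seconds before retry i.
def total_wait(retries, per_try_timeout: 5)
  timeouts = (retries + 1) * per_try_timeout   # every attempt hits the timeout
  backoff  = (0...retries).sum { |i| 2**i }    # sleeps between attempts
  timeouts + backoff
end

puts total_wait(10)  # default 10 retries: 1078 s, roughly 18 minutes
puts total_wait(0)   # retries disabled: just the 5 s read timeout
```

Even if the real schedule is capped or jittered, the point stands: with 10 retries a single failing upload can hold a database transaction open for many minutes instead of seconds.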
@Gargron Gargron added the performance Runtime performance label Oct 6, 2019
@Gargron Gargron merged commit 086fc7e into master Oct 6, 2019
@Gargron Gargron deleted the fix-disable-retry-limit branch October 6, 2019 04:21
@smiba
Contributor

smiba commented Jan 3, 2023

I know this is an old change, but could it in any way be the reason some people (me included) sometimes find media unavailable (as in #13739 #16520)?

Running tootctl media refresh on the affected posts fixes it, suggesting it's a temporary failure. Mastodon doesn't seem to make any attempt to retry these posts, though; they stay at "Not Available" forever (until I run said command).

I'll have to dig a bit deeper into how Mastodon handles its media cache in S3, but it's a recurring issue for me that doesn't seem to be logged either.

@smiba
Contributor

smiba commented Jan 10, 2023

I want to confirm that setting the retry_limit to 2 has reduced my image failures to nearly zero.
I've yet to see any again myself; tootctl media refresh barely does anything nowadays :)

What should we do with this? Should we slightly increase the retry_limit until we have developed a proper solution for S3 upload failures? @Gargron
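For anyone wanting to try the same tweak, a minimal sketch of what such an override could look like follows. The file location and option names are illustrative (Mastodon configures Paperclip's S3 options in an initializer, but the exact keys in your version may differ), so verify against your installation before using it:

```ruby
# Illustrative fragment only, not the exact Mastodon initializer.
# Allows a couple of retries instead of none, with short timeouts so a
# failing S3 server cannot hold a database transaction open for long.
Paperclip::Attachment.default_options[:s3_options] = {
  retry_limit: 2,          # the value reported to help in this thread
  http_open_timeout: 5,    # seconds to establish the connection
  http_read_timeout: 5,    # seconds to wait for a response
}
```

With backoff, two retries add at most a few seconds to a failed upload, which keeps the transaction-length concern from the original PR in check while absorbing transient S3 errors.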
