bug-1863007: change UPLOAD_TEMPDIR_ORPHANS_CUTOFF to 15 minutes #2839

Merged
merged 1 commit into from Nov 28, 2023

Conversation

willkg
Collaborator

@willkg willkg commented Nov 28, 2023

This drops the cutoff from 60 minutes to 15 minutes. At 60 minutes, orphaned files were kept far too long, so an instance could have disk-full problems for over an hour. At 15 minutes, instances will recover more quickly.

15 minutes far exceeds the 6-minute timeout for HTTP request handling, so files that are still around after 15 minutes are certainly orphaned.


The Tecken webapp handles upload API requests, which can take a very long time to process. If a request exceeds 6 minutes, the gunicorn worker serving it may get killed. When that happens, the symbols files that the upload API handler was processing are orphaned and remain on disk. The disk is finite, so after enough of these events the instance runs out of disk space and starts throwing out-of-disk errors.

There is also a disk cache manager process running in the docker container. The UPLOAD_TEMPDIR_ORPHANS_CUTOFF setting affects how old a file can be before the disk cache manager determines it's an orphaned file and deletes it.
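To make the cutoff behavior concrete, here is a minimal sketch of an age-based orphan sweep. It is not Tecken's disk cache manager code: the function name `delete_orphans`, the constant `ORPHANS_CUTOFF_MINUTES`, and the use of file modification time to measure age are all assumptions made for illustration.

```python
import time
from pathlib import Path

# Hypothetical constant mirroring the new 15-minute default.
ORPHANS_CUTOFF_MINUTES = 15


def delete_orphans(tempdir, cutoff_minutes=ORPHANS_CUTOFF_MINUTES, now=None):
    """Delete files under tempdir whose mtime is older than the cutoff.

    Returns a sorted list of the deleted paths (as strings).
    """
    now = time.time() if now is None else now
    cutoff_seconds = cutoff_minutes * 60
    deleted = []
    # Materialize the listing first so deletions don't affect iteration.
    for path in list(Path(tempdir).rglob("*")):
        try:
            if not path.is_file():
                continue
            age_seconds = now - path.stat().st_mtime
            if age_seconds > cutoff_seconds:
                path.unlink()
                deleted.append(str(path))
        except FileNotFoundError:
            # The file vanished between listing and stat/unlink, for
            # example because the request handler cleaned up after itself.
            continue
    return sorted(deleted)
```

With a 6-minute request timeout, any file older than 15 minutes cannot belong to a request that is still being handled, so a sweep like this would only remove files that no handler is using.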

This reduces the cutoff from 60 minutes to 15 minutes by changing the default value. We don't set this value in the infrastructure configuration, so changing the default changes it everywhere. There isn't a whole lot to review here.
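The reason a default change takes effect everywhere can be sketched as follows. This is an illustration only: it assumes the setting is read from an environment variable of the same name and expressed in minutes, and the helper `get_orphans_cutoff` is hypothetical. Tecken's actual settings mechanism is not shown in this PR.

```python
import os

# New default introduced by this change (previously 60).
DEFAULT_ORPHANS_CUTOFF_MINUTES = 15


def get_orphans_cutoff(environ=None):
    """Return the cutoff in minutes, falling back to the default.

    Hypothetical helper: assumes the value comes from an environment
    variable named UPLOAD_TEMPDIR_ORPHANS_CUTOFF.
    """
    environ = os.environ if environ is None else environ
    raw = environ.get("UPLOAD_TEMPDIR_ORPHANS_CUTOFF")
    if raw is None:
        # Nothing set in the environment, so the default applies.
        return DEFAULT_ORPHANS_CUTOFF_MINUTES
    return int(raw)
```

Because no environment sets the variable, every instance falls through to the default, and an explicit override would still win if one were added later.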

@willkg willkg merged commit 917341a into main Nov 28, 2023
1 check passed
@willkg willkg deleted the bug-1863007-orphans-cutoff branch November 28, 2023 16:44