Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make storage classes into module level vars #7908

Merged
merged 1 commit into from Feb 15, 2021

Conversation

davidfischer
Copy link
Contributor

@davidfischer davidfischer commented Feb 14, 2021

Some storage backends (notably S3) have extra startup times for the first connection to the backend (~50ms) which means that the storage object has to be reused for the best performance. This reuses them by making our storage instances into lazy loaded module level variables exactly like Django does with staticfiles_storage.

This is an alternative to #7905.

Background

After the Azure -> AWS migration, we noticed that an extremely common piece of code which proxied requests to RTD to the appropriate file in cloud storage was taking much longer than before: ~60ms instead of ~10ms. Because this code is called on every request to docs or a build media file in RTD, it is called high tens to low hundreds of times per second in production and this caused us to need 3-4x the number of instances as normal. We traced it to this startup time in S3 (see gist).

This is the code for S3 storage that results in the extra startup time. This does not occur with Azure storage or filesystem backed storage engines.

Regular Django users who aren't directly using the storages API don't generally see this with S3 because they're typically using staticfiles_storage which is lazy loaded once per process. Because RTD uses multiple storage instances which are instantiated frequently, this affects RTD more than a typical Django project.

Some storage backends (notably S3) have extra startup times
for the first connection to the backend (~50ms)
which means that the storage object has to be reused
for the best performance.
This reuses them by making our storage instances
into lazy loaded module level variables
exactly like Django does with `staticfiles_storage`.
@davidfischer davidfischer requested a review from a team February 14, 2021 04:59
Copy link
Member

@humitos humitos left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great! 💯

Copy link
Member

@ericholscher ericholscher left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Much more comprehensive 👍

I looked at .com and it doesn't seem to be using get_storage_class at all, so we should be good here!

The only other usage I found is

https://github.com/readthedocs/readthedocs-ext/blob/master/readthedocsext/embed/views.py#L225-L224

/cc @humitos :)

@ericholscher ericholscher merged commit 7a117b6 into master Feb 15, 2021
3 checks passed
@ericholscher ericholscher deleted the davidfischer/storage-classes-module-vars branch February 15, 2021 15:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants