Make storage classes into module level vars #7908
Merged
Some storage backends (notably S3) have an extra startup cost for the first connection to the backend (~50ms), which means the storage object has to be reused for the best performance. This PR reuses them by turning our storage instances into lazy-loaded module-level variables, exactly as Django does with staticfiles_storage. This is an alternative to #7905.
Background
After the Azure -> AWS migration, we noticed that an extremely common piece of code, which proxies requests to RTD to the appropriate file in cloud storage, was taking much longer than before: ~60ms instead of ~10ms. This code runs on every request for docs or a build media file in RTD, so it is called high tens to low hundreds of times per second in production, and the slowdown meant we needed 3-4x the normal number of instances. We traced it to this startup time in S3 (see gist).
This is the code for S3 storage that results in the extra startup time. This does not occur with Azure storage or filesystem backed storage engines.
Regular Django users who aren't directly using the storages API don't generally see this with S3 because they're typically using staticfiles_storage, which is lazy-loaded once per process. Because RTD uses multiple storage instances which are instantiated frequently, this affects RTD more than a typical Django project.