Active storage add proxying #34477
…torage-add-proxying-and-direct-downloads
…thub.com:fleck/rails into active-storage-add-proxying-and-direct-downloads
Thanks for the pull request, and welcome! The Rails team is excited to review your changes, and you should hear from @georgeclaghorn (or someone else) soon. If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes. This repository is being automatically checked for code quality issues using Code Climate. You can see results for this analysis in the PR status below. Newly introduced issues should be fixed before a Pull Request is considered ready to review. Please see the contribution instructions for more information. |
Sorry for commenting on a closed PR (please feel free to let me know if there's a better place for this question), but should we get rid of the session cookie in the proxy responses? Maybe a before_action in the proxy controllers like this will do:
def session_off
  request.session_options[:skip] = true
end |
@joshling1919 does this PR resolve your issue? #39286 |
@zinosama no worries commenting on a closed PR, it's a great place to document issues/solutions to problems around the PR. To prevent session cookies being set during asset responses, add the following to config/environment.rb:
Rails.application.initialize!
# The code should be added after your application is initialized
ActiveStorage::BaseController.class_eval do
  def disable_session
    request.session_options[:skip] = true
  end
end
ActiveStorage::Blobs::ProxyController.class_eval do
  before_action :disable_session
end
ActiveStorage::Representations::ProxyController.class_eval do
  before_action :disable_session
end |
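For anyone who would rather keep this out of config/environment.rb, here is a minimal sketch of the same idea as an initializer. The file name is hypothetical, and it assumes the proxy controllers exist under these names in your Rails version (they do in what was merged here). Wrapping it in `to_prepare` is my own choice so the callback is re-applied if the classes get reloaded in development.

```ruby
# config/initializers/active_storage_proxy_session.rb (hypothetical file name)
#
# Sketch only: skips the session for the two Active Storage proxy controllers
# so their responses do not carry a session cookie.
Rails.application.config.to_prepare do
  [
    ActiveStorage::Blobs::ProxyController,
    ActiveStorage::Representations::ProxyController
  ].each do |controller|
    # Same effect as the disable_session method above, written as a block.
    controller.before_action { request.session_options[:skip] = true }
  end
end
```

This only covers the session cookie. Cookies set elsewhere (other middleware or gems) are not affected.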
Thank you @fleck for the quick response. I can see that working. I'm also wondering, though, whether that should be the default behavior. The purpose of these proxy controllers is to make image caching easy. Yet (please correct me if I'm wrong) the cookies header makes the asset uncacheable, which defeats this purpose. |
@zinosama making that the default may make sense, I can't think of a good reason to include the session for assets. But, it's too late for that change to make it into rails 6.1. I missed this during development because the application I was testing on doesn't use session except for a couple routes on the admin portion of the site. As for "the cookies header makes the asset uncache-able" that's Cloudflare specific behavior, a lot of CDNs will cache assets regardless of cookies. Cloudflare can also be configured to cache with a cookie present using a page rule with the "Edge cache TTL" setting. |
Thank you @fleck for this feature. This is my application.rb:
config.active_storage.delivery_method = :proxy
config.active_storage.proxy_urls_host = "cdn.mydomain.com"
I've also tried asset_name.deliver(:proxy) in views, but that only removes the host and still doesn't call the CDN; otherwise the current host is being called. I'd appreciate any insight. |
@mcanto unfortunately the backport is in a half-finished state. At some point during this PR I upgraded my project to use the latest version of Rails and stopped updating the backport. Even if you can get the backport to work, the API is fairly different from what was merged in this PR. If possible I'd recommend upgrading to the latest Rails (it's pretty stable), or trying to re-create the backport based on this API. |
Thank you for your answer @fleck, I'll look into upgrading to the latest version. It's definitely a pretty nice feature, good job! |
I was kind of expecting to see this in 6.1.0.RC1 but instead i only find things like |
@phoet The API has changed a decent amount from the initial proposal. Here's the up to date API: https://github.com/rails/rails/blob/master/activestorage/README.md#proxying |
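For readers landing here from the older comments, a short sketch of what the merged API looks like in practice, based on the linked README section. `@user.avatar` is only a placeholder attachment; adapt it to your own models.

```ruby
# config/application.rb (inside your Application class)
#
# Make proxying the default for every Active Storage URL the app generates.
# The default value is :rails_storage_redirect.
config.active_storage.resolve_model_to_route = :rails_storage_proxy

# To proxy a single attachment without changing the default, call the
# proxy route helper explicitly in a view, for example:
#
#   <%= image_tag rails_storage_proxy_path(@user.avatar) %>
```

Note that this replaces the `delivery_method` option from the initial proposal, which is not part of what was merged.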
For what it's worth, that documentation is pretty sparse. It seems to indicate that there's some way to get the URL for the proxied version of the asset instead of the direct ActiveStorage version, but it doesn't go into detail about how to tell Rails to map the raw asset path to the proxied one. Presumably, if S3 is the primary ActiveStorage backend, then we need some way to tell Rails whether the proxy is Cloudfront or Cloudflare or Cloudinary, etc. |
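As far as I can tell, Rails does not need to know which CDN sits in front of it. The proxy routes are ordinary app routes, so the only thing to configure is the host used when generating the URL. A sketch along the lines of the Active Storage guide, assuming the CDN hostname is in a `CDN_HOST` environment variable (that variable name and the `cdn_image` helper name are choices, not requirements):

```ruby
# config/routes.rb
#
# Defines cdn_image_url / cdn_image_path helpers that point the proxy
# routes at the CDN hostname instead of the app's own host.
direct :cdn_image do |model, options|
  if model.respond_to?(:signed_id)
    # Plain blob or attachment.
    route_for(
      :rails_service_blob_proxy,
      model.signed_id,
      model.filename,
      options.merge(host: ENV["CDN_HOST"])
    )
  else
    # Variant or preview: needs the blob id plus the variation key.
    route_for(
      :rails_blob_representation_proxy,
      model.blob.signed_id,
      model.variation.key,
      model.blob.filename,
      options.merge(host: ENV["CDN_HOST"])
    )
  end
end
```

In a view this would be used as `image_tag cdn_image_url(user.avatar.variant(resize_to_limit: [128, 128]))`. The CDN is then configured with the Rails app as its origin, which works the same way for any pull-through CDN.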
Heads up for other people on Cloudflare who cannot get this to work. Apparently you can only have it ignore cookies on the $200/mo Business plan or higher. The mentioned workaround that disables the session for the relevant controllers might be a better fix if you're on one of the cheaper plans. |
@@ -106,6 +106,37 @@ Variation of image attachment:
<%= image_tag user.avatar.variant(resize_to_limit: [100, 100]) %>
```

## File serving strategies
I think it would be good if https://edgeguides.rubyonrails.org/active_storage_overview.html#linking-to-files mentioned this too
@@ -954,6 +954,14 @@ text/javascript image/svg+xml application/postscript application/x-shockwave-fla

* `config.active_storage.draw_routes` can be used to toggle Active Storage route generation. The default is `true`.

* `config.active_storage.resolve_model_to_route` can be used to globally change how Active Storage files are delivered.
@fleck FYI this is rendering a bit weird:
https://guides.rubyonrails.org/configuring.html#configuring-active-storage
#40920 should fix it
@fleck Hi, quick question. I have been using the direct block in my routes to set the CDN hostname for the proxy routes, and that works well. I would like to restrict these routes so that they only work if the incoming request actually uses the CDN hostname, and just a 404 or no content if the request uses the app's main domain. Basically I want all the requests for assets to go through the CDN. Reason: someone has been flooding my app by making lots of requests to the assets but using the app's main domain directly, bypassing the CDN caching. I would like to prevent this and make the relevant routes available only if the CDN hostname is used with some kind of constraint. Any suggestions? Thanks in advance. |
@vitobotta seems like a job for something like rack-attack. The redirect and proxy routes just point to a controller inside Active Storage (somewhere in here: https://github.com/rails/rails/tree/main/activestorage). Configure rack-attack to check additional headers (like those set by your CDN) and, if they don't exist, shut down the request. |
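If pulling in a gem feels like too much, the same check can be sketched as a tiny Rack middleware. Everything here is an assumption to adapt: the class name is made up, the path prefixes are the default Active Storage proxy routes (they change if `routes_prefix` is customized), and it only works if your CDN forwards its own hostname in the `Host` header when it fetches from the origin. If it doesn't, compare a secret header the CDN adds instead.

```ruby
# Hypothetical middleware: answer 404 for Active Storage proxy requests
# that did not arrive via the CDN hostname.
class CdnOnlyProxyRoutes
  # Default proxy route prefixes; adjust if routes_prefix is customized.
  PROXY_PREFIXES = [
    "/rails/active_storage/blobs/proxy/",
    "/rails/active_storage/representations/proxy/"
  ].freeze

  def initialize(app, cdn_host)
    @app = app
    @cdn_host = cdn_host
  end

  def call(env)
    if proxy_path?(env["PATH_INFO"].to_s) && !from_cdn?(env)
      return [404, { "content-type" => "text/plain" }, ["Not Found"]]
    end

    @app.call(env)
  end

  private

  def proxy_path?(path)
    PROXY_PREFIXES.any? { |prefix| path.start_with?(prefix) }
  end

  def from_cdn?(env)
    # HTTP_HOST may include a port, so compare the hostname part only.
    env["HTTP_HOST"].to_s.split(":").first == @cdn_host
  end
end
```

It could be registered with something like `config.middleware.use CdnOnlyProxyRoutes, "cdn.example.com"`. Keep in mind a `Host` header is easy to spoof, so this only stops casual bypassing; rate limiting is still worth having.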
Has anyone tried the proxy method with pages that have hundreds of images? Currently we use Heroku + Cloudflare for caching... I wonder how many active storage proxy requests per second we can handle with each "2x dyno". |
@collimarco I have. Since each image is going to generate an HTTP request, the number of requests that a single 2x dyno can handle will depend on: the number of Puma workers, the size of the images, and the latency/speed of your storage backend. If you are using the recommended defaults for 2x dynos, your Puma concurrency is set to The first two times the page with hundreds of images is opened, all hundreds of requests will hit your servers. If you don't have enough dynos, all your other requests (including user navigation) will have to wait until all hundreds of images have been proxied from S3 to Cloudflare through your dynos. So let's say you have 200 images on that page, and 2 dynos (12-16 images per second). We are talking about other requests waiting 12-16 seconds until they get a response. The third time, the images will be cached and, theoretically, your dynos will not have to handle them anymore. Theoretically, because Cloudflare has so many PoPs that if the page with all the images is not popular enough, your cache hit rate will be low, and your dynos will continue being forced to stream those images. I recommend you at least use lazy loading on that page, so that if the user does not scroll, you don't waste your capacity streaming those images. |
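To make the arithmetic above explicit, here is a rough back-of-the-envelope sketch. The 6 to 8 images per second per dyno is implied by the 12-16 figure for 2 dynos; the helper name is made up, and real throughput will vary with image size and storage latency.

```ruby
# Seconds needed to proxy every image on a cold cache, assuming each dyno
# sustains a fixed number of proxied images per second.
def drain_seconds(images:, dynos:, per_dyno_rate:)
  images.to_f / (dynos * per_dyno_rate)
end

slow = drain_seconds(images: 200, dynos: 2, per_dyno_rate: 6) # 200 / 12
fast = drain_seconds(images: 200, dynos: 2, per_dyno_rate: 8) # 200 / 16

puts format("%.1f to %.1f seconds", fast, slow) # prints "12.5 to 16.7 seconds"
```

That lines up with the roughly 12-16 seconds quoted above, during which other requests queue behind the image requests.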
@brenogazzola Thanks for the reply! That was my fear... 6-8 req / sec / dyno are not enough with pages that may have 300+ images. It would be too expensive (300 / 8 = 37 dynos only for images!). Maybe it's somewhat better, however, because you forgot the threads. I think the concurrency is 2 workers * 5 threads (most of the time with S3 is spent waiting). It's strange because a normal browser on a normal PC can download 300+ images in a few seconds, and from my research it seems that browsers only use 6 concurrent connections per domain. It is strange that a server cannot perform similarly 🤔 |
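A quick sketch of that estimate, with the helper name and the one-second target being assumptions for illustration. It treats the per-dyno rate as fixed, which is exactly the part that threads might change.

```ruby
# Dynos needed to serve all images of one page within a target time,
# assuming each dyno sustains per_dyno_rate images per second.
def dynos_needed(images:, per_dyno_rate:, target_seconds: 1)
  (images.to_f / (per_dyno_rate * target_seconds)).ceil
end

puts dynos_needed(images: 300, per_dyno_rate: 8)                    # 300 / 8 = 37.5, rounded up to 38
puts dynos_needed(images: 300, per_dyno_rate: 8, target_seconds: 5) # 300 / 40 = 7.5, rounded up to 8

# Request slots per dyno if threads are counted as well as workers.
workers = 2
threads_per_worker = 5
puts workers * threads_per_worker # 10
```

Ten slots per dyno does not mean ten times the throughput; it only helps to the extent that requests spend their time waiting on S3 rather than using CPU. Relaxing the target time reduces the dyno count far more directly.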
@collimarco Threads could help, yes. I tend to ignore them when calculating how much load Puma can handle since I normally have no idea which requests have I/O time and which don't. But in this case I guess you are right. As for the difference between PC and servers, that's because the browser is downloading from Cloudflare, which does its best to ensure that latency between request and content download is as short as possible. Your dynos, on the other hand, are dealing with S3, which does not care about latency at all. There might also be something in the AWS gem code, or the APIs it uses, that introduces extra latency. It's not really a level playing field. |
Even after "Disable session in ActiveStorage blobs and representations proxy controllers" (#48869) was implemented, or any monkey patch applied, that's only for session cookies. There could be other cookies being set, for example if using ahoy, you get another cookie called I found a more future-proof approach here: |
Summary
Added the option to globally change how Active Storage files are delivered. Users can now set app.config.active_storage.delivery_method to [*:redirect, :proxy]. There's also the option to override at the model level using the
has_one_attached :avatar, delivery_method: :proxy
syntax. This is just a rough draft; it could still be DRYed up a bit and needs tests and documentation. I just want to get some feedback before I go too far.