Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add CDN host option to ActiveStorage #42305

Closed
wants to merge 1 commit into from

Conversation

santib
Copy link
Contributor

@santib santib commented May 27, 2021

Summary

Some time ago #34477 was implemented to allow apps to configure a CDN in front of ActiveStorage images. But somehow there is still not a clear way to configure a CDN for ActiveStorage.

This PR attempts to make it easier (and document it better) to configure a CDN for Active Storage. It only affects routes generated with proxy mode because it wouldn't make sense to put a CDN in front of URLs that redirect to the storage service. What's more proxy mode doesn't really make sense if not using a CDN, it would just mean increasing the load of your servers without benefit. For that reason we think this PR would be very valuable to the community.

Co-authored-by: Bruno Vezoli <brunvezoli@hotmail.com>
Co-authored-by: Santiago Bartesaghi <santib@hey.com>
@lfalcao
Copy link
Contributor

lfalcao commented May 27, 2021

Thanks @santib for the contribution, but what's the difference between this and config.asset_host ?

config.asset_host sets the host for the assets. Useful when CDNs are used for hosting assets, or when you want to work around the concurrency constraints built-in in browsers using different domain aliases. Shorter version of config.action_controller.asset_host.

@zzak
Copy link
Member

zzak commented May 27, 2021

I'm also wondering if the name should be more specific to proxy-mode given it's only used in that scenario? I could be missing something here though. 🤔

@santib
Copy link
Contributor Author

santib commented May 27, 2021

Thanks @santib for the contribution, but what's the difference between this and config.asset_host ?

config.asset_host sets the host for the assets. Useful when CDNs are used for hosting assets, or when you want to work around the concurrency constraints built-in in browsers using different domain aliases. Shorter version of config.action_controller.asset_host.

@lfalcao That's to have a CDN in front of the assets (static images, CSS and JS) but it isn't used in Active Storage, right? With this PR you could serve Active Storage files through the CDN as well (same CDN or maybe a different one)

@santib
Copy link
Contributor Author

santib commented May 28, 2021

m also wondering if the name should be more specific to proxy-mode given it's only used in that scenario? I could be missing something here though. 🤔

I think I prefer it to be generic because if someone else finds another usage of the cdn_host apart from the proxy URL it should be ok I guess..

@lfalcao
Copy link
Contributor

lfalcao commented May 28, 2021

@santib I got it. I've been using helpers to prepend cdn host cause I use a service that converts and caches images based on URL structure for example https://cdn_host/300x300/webp/file_url.jpg and depending of the request/controller can change image size to 150x150 by just changing that part of the URL.

About the merge, two small considerations:

  • In Rails Guides, config.asset_host part should have something like 'for active_storage cdn see config.active_storage.cdn_host'.
  • I don't know if makes sense, but why not use config.asset_host as backup?

For example:

direct :rails_storage_proxy do |model, options = {}|
  cdn_host = ActiveStorage.cdn_host || Rails.config.asset_host
  
  if cdn_host
    cdn_uri = URI.parse(cdn_host)
    cdn_uri = URI.parse("//#{cdn_host}") if cdn_uri.scheme.nil?

    options = {
      host: cdn_uri.host,
      port: cdn_uri.port,
      protocol: cdn_uri.scheme
    }.merge(options)
  end

@zzak
Copy link
Member

zzak commented May 28, 2021

I like what @lfalcao is suggesting to use config.asset_host as the default but let you override it with config.active_storage.cdn_host (or even config.active_storage.asset_host?).

@santib
Copy link
Contributor Author

santib commented May 28, 2021

We thought about that fallback with @brunvez and @jeroig but we thought it was not worth it. I mean, in the end if you want to share the same CDN it's just having

config.asset_host = ENV['CDN_HOST']

config.active_storage.cdn_host = ENV['CDN_HOST']

If we go with the fallback we might be adding unexpected behavior to apps. I mean, probably most people have an asset_host configured and maybe some are using the proxied url for some reason, and out of the nowhere they'll be getting their images through the CDN without actively wanting it.

On another note I just remembered another usage for the proxy_url and is if you want to use disposition: :attachment. Like this: rails_storage_proxy_url(brochure, disposition: 'attachment')

@pixeltrix
Copy link
Contributor

@santib do you have a concrete example of what you're trying to do here because it seems like this could all be handled using things like a custom controller subclass and some routing constraints - thanks.

@santib
Copy link
Contributor Author

santib commented May 28, 2021

@santib do you have a concrete example of what you're trying to do here because it seems like this could all be handled using things like a custom controller subclass and some routing constraints - thanks.

Hey @pixeltrix sure, the example is being able in my app to have a CDN serving ActiveStorage attachments out of the box (just by setting 1 config and using the proxy route helper). As you said, currently in order to achieve it I need to hack my way around ActiveStorage. And with that I think everyone loses: it's harder for devs to implement, to maintain and upgrade. And for rails maintainers it's also harder to consider all the things people could be doing because ActiveStorage doesn't provide a simpler API. Maybe I'm missing something and having a CDN for ActiveStorage is simpler than I thought?

@brunvez
Copy link
Contributor

brunvez commented May 28, 2021

To provide some more context to what @santib said we had to configure the CDN and the resources we found most useful were:

This PR actually began as a way to provide some official documentation since this is something common on most Rails applications. We then found that the code on those tutorials is basically something that's already on ActiveStorage and so it would involve users basically copy-pasting that and adding just the :host option.

For us, this seemed like a config that would make having a CDN much easier and without having to get ActiveStorage's internals into your app.

@pixeltrix
Copy link
Contributor

So this is the pattern I use to do it:

constraints host: 'mycdn.com', protocol: 'https' do
  scope '/images', controller: 'active_storage/representations/proxy' do
    get '/:signed_blob_id/:variation_key/*filename', action: 'show', as: :cdn_proxy
  end
end

The host and protocol can be driven by environment variables so that it works in both development, test and production. I disable the drawing of the Active Storage default routes as I don't use direct to S3 upload.

I also add a direct url helper to extract out the path parameters like this:

direct :cdn_image do |image, options|
  signed_blob_id = image.blob.signed_id
  variation_key  = image.variation.key
  filename       = image.blob.filename

  route_for(:cdn_proxy, signed_blob_id, variation_key, filename, options)
end

This allows you to generate urls by using variations in the standard manner, e.g.

<%= cdn_image_url(user.avatar.variant(resize_to_limit: [128, 128])) %>

This is one way but there are others so it seems to me that this is more a documentation issue than adding another config.

@santib
Copy link
Contributor Author

santib commented May 28, 2021

So this is the pattern I use to do it:

constraints host: 'mycdn.com', protocol: 'https' do
  scope '/images', controller: 'active_storage/representations/proxy' do
    get '/:signed_blob_id/:variation_key/*filename', action: 'show', as: :cdn_proxy
  end
end

The host and protocol can be driven by environment variables so that it works in both development, test and production. I disable the drawing of the Active Storage default routes as I don't use direct to S3 upload.

I also add a direct url helper to extract out the path parameters like this:

direct :cdn_image do |image, options|
  signed_blob_id = image.blob.signed_id
  variation_key  = image.variation.key
  filename       = image.blob.filename

  route_for(:cdn_proxy, signed_blob_id, variation_key, filename, options)
end

This allows you to generate urls by using variations in the standard manner, e.g.

<%= cdn_image_url(user.avatar.variant(resize_to_limit: [128, 128])) %>

This is one way but there are others so it seems to me that this is more a documentation issue than adding another config.

Thanks for your replies @pixeltrix . Sounds good, we thought that the proposed solution was simpler and less intrusive to ActiveStorage details (e.g. it doesn't require app devs to know about blob.signed_id, variation.key, etc) but I get your point. Next week we'll work on a PR to improve documentation around a CDN for ActiveStorage then 👍

@zzak
Copy link
Member

zzak commented May 28, 2021

@santib Sounds good, LMK if you want a review on that whenever you're ready 🙇

@pixeltrix
Copy link
Contributor

Next week we'll work on a PR to improve documentation around a CDN for ActiveStorage then

Thanks, can you point out the simplest solution is to put a CDN in front of your whole application - the default Active Storage proxy controller uses http_cache_forever which CDNs will cache by default without any extra configuration.

@brunvez
Copy link
Contributor

brunvez commented Jun 24, 2021

@pixeltrix @zzak having documented #42363 I think this change is still relevant. It seems like it'd make adding a CDN so much easier, adding a config and having it work feels like the Rails way IMHO. It'd also be consistent with the way we add a CDN for asset hosts.

Anyway, just wanted to re-check that the current way of adding a CDN is the preferred one over a new config

@zzak
Copy link
Member

zzak commented Jun 25, 2021

@brunvez Can this be done in an initializer instead, using on_load?

@brunvez
Copy link
Contributor

brunvez commented Jun 25, 2021

@brunvez Can this be done in an initializer instead, using on_load?

Sorry I don't follow, how do you mean?

@zzak
Copy link
Member

zzak commented Jun 25, 2021

@brunvez I mean is there an alternative here where the user can still add a CDN without us having to introduce and support a new behavior?

@pixeltrix
Copy link
Contributor

@brunvez the easiest solution is to put a CDN in front of the whole application and use proxy urls rather than redirects. It's possible that integrating with the existing asset host feature may be useful but that seems to tie Active Storage and Action View together in a way we don't currently so could be tricky to implement.

@brunvez
Copy link
Contributor

brunvez commented Jun 28, 2021

@brunvez the easiest solution is to put a CDN in front of the whole application and use proxy urls rather than redirects. It's possible that integrating with the existing asset host feature may be useful but that seems to tie Active Storage and Action View together in a way we don't currently so could be tricky to implement.

Yeah I think tieing them together would be hard, we still agree that a proxy URLs are the best solution right now. We just meant that instead of having to configure a new path in your routes and have code that's similar to the one we have on ActiveStorage it'd be better to just add something that would allow users to have a CDN in a one liner.

This PR was hoping to make adding a CDN as easy as adding config.active_storage.cdn_host = ENV['CDN_HOST']. It's similar to the assets host but completely unrelated, many users might also want to use different CDNs so that might not even be a good idea.

@zzak
Copy link
Member

zzak commented Jun 29, 2021

Random thought: what if you could configure CDN for active storage the same way you can services? I imagine if your app is using disk service you don't need a CDN 🤔

@brunvez
Copy link
Contributor

brunvez commented Jun 29, 2021

Random thought: what if you could configure CDN for active storage the same way you can services? I imagine if your app is using disk service you don't need a CDN 🤔

That's also cool, we wanted to have the config next to the asset host but services could be a great place as well. We are just trying to push for that sweet one-liner for users, I think it's great in either place 😄

@zzak
Copy link
Member

zzak commented Jul 20, 2021

@brunvez @santib I have two concerns here, one I'm still not certain that @pixeltrix's suggestion to front the entire app in a CDN doesn't solve this (and saves us doing extra work). Second, if we do go down this route where is the best place to configure it, and what should the name be.

If we can get some answers on the first question, then we can proceed with two 🙏

@NobodysNightmare
Copy link
Contributor

NobodysNightmare commented Feb 28, 2022

Thought from my side: The now official (i.e. documented) state of affairs is that one should copy & paste a non-trivial piece of rails code and extend it by settings one's own CDN host.

This looks like it would easily break on any given upgrade to Rails (as soon as the if model.respond_to?(:signed_id) code changes). Edit: The currently documented state would already break passing expires_in: in the options

I see that fronting the entire application with a CDN may actually be a preferrable solution, however that's not written in the official Rails guides.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants