Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Permanent URLs for Active Storage blobs #36729

Closed

Conversation

@peterzhu2118
Copy link
Contributor

peterzhu2118 commented Jul 22, 2019

Summary

Addresses issue #31419. Allows access to permanent URL to the blob. This changes Active Storage to use two buckets (a public bucket and a private bucket), where the public bucket has permissions for public access and the private bucket only has private access.

Since public URLs from providers don't allow custom filenames (i.e. download as a different filename as it exists on the service), the directory structure of uploaded files differ in the public bucket and the private bucket. In the public bucket, the directory structure is /[key]/[filename] while the directory structure in the private bucket remains /[key] to maintain backwards compatibility. So for example, if you upload a public file kitten.jpg and Rails give it the key 5a1d6e, it will be uploaded to /5a1d6e/kitten.jpg on the cloud provider and thus can be downloaded as a JPEG in the future.

Note that you can also choose to just have one of public and private buckets.

Note that this requires #36715 to be merged for Azure to work.

Other Information

I haven't updated much of the documentation, let me know how the code looks, if you think this is a good idea, I'll update more of the documentation.

It also currently does not support moving a file from private to public (and vice versa), nor changing the filename of a public file. This is something that I can implement in the future (not in the scope of this PR however).

Copy link
Member

gmcgibbon left a comment

  • Can you explain why we need to denote public vs private files with a column on active_storage_blobs? I'm don't see why this is necessary, it is contextual to the generated URL and can technically be both, right?
  • Why two different services? Surely it would make more sense to just work with one service.
@peterzhu2118

This comment has been minimized.

Copy link
Contributor Author

peterzhu2118 commented Jul 22, 2019

@gmcgibbon I chose to use two different services because of two reasons:

  1. Backwards compatibility. Active Storage right now stores files with just the key as the filename, so this file has its filename and extension data lost. This is not a problem with the service URLs because the cloud providers provide the ability to set a different filename when generating this temporary, signed URL. This is however a problem with public URLs because most of the time it becomes mycloudprovider.com/[key], but it actually should be mycloudprovider.com/[key]/[filename]. By using two providers, I can keep the old filename style for the private bucket and the new filename for the public bucket.
  2. It's much easier to set bucket-level permissions. Permissions per file is more difficult to manage, especially in the case when one performs a direct upload (to the cloud provider) rather than uploading through the Rails app. I played around with using only one service originally, and because of the direct upload, every time I wanted the public URL, I had to check the permissions of the file and set it to public if that had not been set. This means extra API calls and ultimately means it's slower. This can be overcome by adding a flag in the database, but that seems to be more smelly to me.

Because I need to know where to look for the blob (in the private or public service), is why I need the public_file boolean column in active_storage_blobs.

@georgeclaghorn

This comment has been minimized.

Copy link
Member

georgeclaghorn commented Jul 28, 2019

+1 for bucket-level instead of file-level permissions.

Did you consider allowing a blob to be stored in any named service? The reason I ask is that we’d like to make that possible eventually, and layering it on top of the public-private split implemented here is going to be harder down the road.

The blobs table would have a service column containing the name of a configured service. Each service would have a public option that defaults to false. That option would determine how the service generates URLs.

@peterzhu2118

This comment has been minimized.

Copy link
Contributor Author

peterzhu2118 commented Jul 28, 2019

Hey @georgeclaghorn, thanks for taking a look at this PR and giving comments!

I agree that supporting multiple services to be used simultaneously is a very valuable feature and would be a better solution to this. Are you already working on this feature? If not, I can attempt to tackle it.

@georgeclaghorn

This comment has been minimized.

Copy link
Member

georgeclaghorn commented Jul 28, 2019

I am not. All yours.

@peterzhu2118 peterzhu2118 referenced this pull request Aug 2, 2019
0 of 2 tasks complete
@georgeclaghorn georgeclaghorn added this to the 6.1.0 milestone Sep 23, 2019
@courtsimas

This comment has been minimized.

Copy link

courtsimas commented Sep 24, 2019

@peterzhu2118 out of curiosity, did you end up tackling this?

@peterzhu2118

This comment has been minimized.

Copy link
Contributor Author

peterzhu2118 commented Sep 24, 2019

Hey @courtsimas! I had a proposal in #36835, however, it appears that #34935 is more active, so it's more likely that one will get merged.

Copy link
Member

gmcgibbon left a comment

@peterzhu2118 are you able to rebase your work now that #34935 is merged? If we denote public v private at the service level (eg. in storage.yml) we should be able to switch between services without too much branching logic.

@peterzhu2118 peterzhu2118 force-pushed the peterzhu2118:activestorage-permanent-2 branch from daa0183 to 792ffef Oct 2, 2019
@rails-bot rails-bot bot added the railties label Oct 2, 2019
@peterzhu2118 peterzhu2118 force-pushed the peterzhu2118:activestorage-permanent-2 branch 2 times, most recently from b215650 to bfd276f Oct 2, 2019
@peterzhu2118

This comment has been minimized.

Copy link
Contributor Author

peterzhu2118 commented Oct 2, 2019

Hey @gmcgibbon,

I've made a rebase, and I've updated the code to follow what's been done in the merged PR. Instead of the access level of a blob being per blob level stored in the database, it is now per service level (i.e. public_service key in configurations.yml). Note that in order for CI to pass on master, configurations.yml.enc must be updated with gcs_public, azure_public, and s3_public services that allow public blobs.

I'm introducing a deprecation in Blob#service_url by renaming into Blob#url since service_url implies that it's a short-lived, expiring URL. However, this method will now return a non-expiring URL for public files.

Please let me know how it looks, and I can update documentation accordingly when the code is more finalized.

else
head :not_found
end
end

This comment has been minimized.

Copy link
@georgeclaghorn

georgeclaghorn Oct 2, 2019

Member

First, I’d like to see this moved to a separate controller: ActiveStorage::Public::DiskController#show.

serve_file is general enough that it could move to Action Controller eventually, but for now we can just move it to ActiveStorage::BaseController so ActiveStorage::DiskController and ActiveStorage::Public::DiskController can share it.

Second, you can also inline the key local variable:

def show
  if blob = ActiveStorage::Blob.find_by(key: params[:key])
    if blob.service.public_service?
      serve_file disk_service.path_for(blob.key), content_type: blob.content_type, disposition: :inline
    else
      head :unauthorized
    end
  else
    head :not_found
  end
end

Third, this action should honor the blob’s potentially-custom disk service by using blob.service in place of disk_service.

Fourth, I’d like to add ActiveStorage::Blob#public? and use it here:

class ActiveStorage::Blob
  def public?
    service.public?
  end
end

This comment has been minimized.

Copy link
@peterzhu2118

peterzhu2118 Oct 2, 2019

Author Contributor

I've refactored by splitting the public controller portion to ActiveStorage::PublicDiskController. I didn't see an advantage in placing this controller in another level of namespace nesting.

@peterzhu2118 peterzhu2118 force-pushed the peterzhu2118:activestorage-permanent-2 branch 2 times, most recently from e4839c4 to 7fb5870 Oct 3, 2019
@peterzhu2118 peterzhu2118 force-pushed the peterzhu2118:activestorage-permanent-2 branch 3 times, most recently from 0b866c1 to c969d36 Oct 4, 2019
@peterzhu2118 peterzhu2118 force-pushed the peterzhu2118:activestorage-permanent-2 branch from c969d36 to 38ceb9e Oct 4, 2019
require "service/shared_service_tests"
require "uri"

if SERVICE_CONFIGURATIONS[:azure]

This comment has been minimized.

Copy link
@gmcgibbon

gmcgibbon Oct 7, 2019

Member
Suggested change
if SERVICE_CONFIGURATIONS[:azure]
if SERVICE_CONFIGURATIONS[:azure_public]
require "net/http"
require "database/setup"

if SERVICE_CONFIGURATIONS[:s3]

This comment has been minimized.

Copy link
@gmcgibbon

gmcgibbon Oct 7, 2019

Member
Suggested change
if SERVICE_CONFIGURATIONS[:s3]
if SERVICE_CONFIGURATIONS[:s3_public]
@@ -1,3 +1,13 @@
* Permanent URLs for blobs.

This comment has been minimized.

Copy link
@gmcgibbon

gmcgibbon Oct 7, 2019

Member
Suggested change
* Permanent URLs for blobs.
* Permanent URLs for public storage blobs.
@@ -1,3 +1,13 @@
* Permanent URLs for blobs.

Services can be configured in `configurations.yml` with a new key

This comment has been minimized.

Copy link
@gmcgibbon

gmcgibbon Oct 7, 2019

Member
Suggested change
Services can be configured in `configurations.yml` with a new key
Services can be configured in `config/storage.yml` with a new key
@gmcgibbon gmcgibbon closed this in 94584c2 Oct 11, 2019
@gmcgibbon

This comment has been minimized.

Copy link
Member

gmcgibbon commented Oct 11, 2019

Squashed and merged with some minor edits in 94584c2 ❤️

Thanks @peterzhu2118!!

@peterzhu2118

This comment has been minimized.

Copy link
Contributor Author

peterzhu2118 commented Oct 11, 2019

@gmcgibbon Thanks for applying the changes and merging! I just realized that I forgot to address your comments, really sorry about that!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.