Figure out how to properly and securely enforce restrictions on access to files on S3, and implement it in the app #832
Comments
Don't forget about combined audio derivatives too
Some outstanding questions:
Also recall that we can't break the OHMS editor, which may not follow redirects and isn't okay with query parameters in URLs for audio files. Doh!
Other challenges with putting presigned URLs directly on the page as an This is probably why ActiveStorage does the redirect technique.
The "every img src a redirect" approach in ActiveStorage DOES give some people trouble with too much traffic to the Rails server (rails/rails#34552). There are some discussions on the web about what to do about this. Most of the solutions proposed for ActiveStorage assume all your files are public; they are about getting ActiveStorage to do something more like what we do right now.
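As a rough illustration of the redirect technique discussed in these comments: the app checks whether the user may see the file, then answers with a redirect to a short-lived signed URL instead of putting that URL on the page. This is only a sketch in plain Ruby. `derivative_redirect`, `RedirectResult`, and the `signer` callable are hypothetical names, not code from this app. In a real app the signer might wrap `Aws::S3::Presigner#presigned_url` from the aws-sdk-s3 gem, which is an assumption about how it would be wired up.

```ruby
# Hypothetical sketch of "redirect to a signed URL" gatekeeping.
# None of these names come from the app itself.
RedirectResult = Struct.new(:status, :location)

# signer: any callable taking (key, expires_in_seconds) and returning a URL.
# In production this could wrap Aws::S3::Presigner#presigned_url (assumption).
def derivative_redirect(user_can_read:, key:, signer:, expires_in: 300)
  # Refuse before ever generating a signed URL for unauthorized users.
  return RedirectResult.new(403, nil) unless user_can_read

  # Authorized: hand back a 302 pointing at a short-lived signed URL.
  RedirectResult.new(302, signer.call(key, expires_in))
end
```

Every file request still hits the app once with this approach, which is the traffic cost noted above, and a client that does not follow redirects or rejects query parameters would not work with it.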
We could go over to all "signed" S3 URLs, with the app gatekeeping whether a user can get a signed URL, but the problems with this are potentially:
Plan
Our originals are already treated as non-public. We will make sure the originals bucket has public access blocked; the app is already set up to work with that.
Derivatives are the problem, due to performance and efficiency problems with delivering all the thumbnails.
After much analysis, the least bad thing to do is: Leave existing derivatives how they are on a public S3 bucket, but provide an additional flag on Assets for "secure derivative storage type". If an asset has that flag set, derivatives will be stored in a different location, in a public-access-blocked S3 bucket.
Such assets for now will generally not have thumbnails that can be delivered. Our use case is restricted OH content (PDF and MP3) that won't be shown to the public with thumbnails, and doesn't really have useful thumbs to show staff either. So we'll work with what we got.
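The plan above boils down to a per-asset flag that decides which storage derivatives go to. A minimal sketch of that decision in plain Ruby follows. The flag values `public` and `restricted` and the storage name `restricted_kithe_derivatives` come from this issue; the module name, the method names, and the public storage key `kithe_derivatives` are assumptions for illustration only.

```ruby
# Hypothetical sketch: map the asset's derivative storage type flag
# to the storage its derivatives should live in.
module DerivativeStorageChoice
  STORAGE_KEYS = {
    "public"     => :kithe_derivatives,            # assumed name of the existing public storage
    "restricted" => :restricted_kithe_derivatives  # storage name proposed in this issue
  }.freeze

  # Returns the storage key for a flag value; raises on unknown values
  # so a typo never silently falls back to public storage.
  def self.storage_key_for(derivative_storage_type)
    STORAGE_KEYS.fetch(derivative_storage_type.to_s) do
      raise ArgumentError, "unknown derivative storage type: #{derivative_storage_type.inspect}"
    end
  end

  # Per the plan, only public derivatives can be delivered as thumbnails for now.
  def self.thumbnails_deliverable?(derivative_storage_type)
    derivative_storage_type.to_s == "public"
  end
end
```

Raising on an unknown value rather than defaulting is a deliberate choice in this sketch: defaulting to public would be the unsafe failure mode.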
Steps
bucket — in S3, ansible, Rails Env setting.
attribute on asset that can toggle between `public` and `restricted` storage type. First part of flag to store derivatives on restricted access S3 #855
Configure shrine so at time of ingest, look at flag to decide which storage to put it in. By also configuring a new Shrine `storage` called `restricted_kithe_derivatives`, with support in ScihistDigicoll::Env like existing storages. First part of flag to store derivatives on restricted access S3 #855
Q Where will secure derivatives actually be stored? Brand new bucket (does it need to be backed up? yet another bucket, or even two?)? Or sub-dir of originals bucket (a bit confusing, requires no change to architectural decision to keep originals secure). --> Decided in a sub-path on originals bucket for now!
Asset admin display needs to tell you if derivatives are secure, and link to them so you can view them even if they are. First part of flag to store derivatives on restricted access S3 #855
Q DZI should not be created for secure derivatives (we don’t have secure DZI)? Don't create DZI if derivative_storage_type != public #861
If setting is CHANGED to/from restricted derivatives, need to move files to other storage -- May need to delete/create DZI? Facilities to move derivatives between public and restricted storage type #857 (deleting DZI if present on switch to restricted; not currently creating on switch to public)
Q What about derivatives-backups though? Do we need to remove files from backup if you change the setting? The backups are public so they can be used as a swappable failover, and backups keep old versions. Oh man, this is a pain. We may just not back up the secure derivatives though (originals still backed up!). --> We considered getting rid of derivatives backup entirely but decided not to for now. There is a DZI backups bucket too! See Analysis of access control issues with S3 backup buckets #866 for more analysis, and remove all versions and backups when moving derivatives to restricted #868 for implementation.
Need routine for ‘audit’ -- what should it actually check? Just the db metadata? More somehow? (More can be expensive in terms of both time and money if it actually is going to touch S3) (should also check to make sure there's nothing published with secure derivatives?) -- do we need to check for orphans too? See basic derivative storage type auditor #867 and Fix S3 derivative "orphan" file checker, and run checkers regularly as cronjob, with report #864
Should we make our ‘cache’ bucket enforced non-public too? I think so. uploads and uploads-mount. Temporarily uploaded files could be confidential too! https://bitbucket.org/ChemicalHeritageFoundation/ansible-inventory/pull-requests/119/block-public-access-on-s3-uploads-buckets
Paragraph of text in github README about this. See Technical Notes on File Security and Derivative Storage Type #892
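For the audit step above, a metadata-only check might look roughly like the sketch below. It never touches S3: it only compares each asset's flag against where its derivatives are recorded as stored, and flags published assets with restricted derivatives. `AssetRecord`, its fields, `audit_derivative_storage`, and the public storage name `kithe_derivatives` are all hypothetical stand-ins, not the app's real model or the auditor in #867.

```ruby
# Hypothetical sketch of a db-metadata-only audit (no S3 requests).
AssetRecord = Struct.new(:id, :derivative_storage_type, :published,
                         :derivative_storage_keys, keyword_init: true)

# Assumed mapping of flag value to the storage derivatives should be recorded in.
EXPECTED_DERIVATIVE_STORAGE = {
  "public"     => "kithe_derivatives",            # assumed existing storage name
  "restricted" => "restricted_kithe_derivatives"  # name from this issue
}.freeze

# Returns an array of human-readable problem strings; empty means all good.
def audit_derivative_storage(assets)
  assets.flat_map do |asset|
    expected = EXPECTED_DERIVATIVE_STORAGE[asset.derivative_storage_type]
    next ["asset #{asset.id}: unknown storage type #{asset.derivative_storage_type.inspect}"] if expected.nil?

    problems = []
    # Any derivative recorded in a storage other than the expected one is misplaced.
    misplaced = asset.derivative_storage_keys.reject { |key| key == expected }.uniq
    unless misplaced.empty?
      problems << "asset #{asset.id}: derivatives in #{misplaced.join(', ')}, expected #{expected}"
    end
    # The issue asks whether published + restricted should be flagged; this sketch assumes yes.
    if asset.published && asset.derivative_storage_type == "restricted"
      problems << "asset #{asset.id}: published but has restricted derivatives"
    end
    problems
  end
end
```

A check like this is cheap enough to run regularly. Finding orphaned files would need a separate pass that actually lists the buckets, which is where the time and money costs mentioned above come in.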