-
Notifications
You must be signed in to change notification settings - Fork 41
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Investigate using signed URLs when downloading addon files from the CDN #4018
Comments
AFAIK there are only 2 ways to download an addon file in addons-server:
The latter is what I'm worried about, need to investigate more to figure out if using a signed URL, which has a short expiration date, would be fine for the update service. The alternative would be to force the update service to use our downloads view - which would then redirect properly - but I'm not sure we can sustain the load it would generate. |
Also, we don't want to rewrite all our code that uses posix filesystem API to use django storage APIs - that's why we use EFS. I need to figure out if we can have the two coexist - don't change the way we read/write the files, but still be able to access them through the storage API when we want to expose them directly. |
More digging: we don't actually need to use the storage APIs - and in fact, we can't, Cloudfront mechanism for signed URLs appears to be different from S3. We can just generate the signed url using boto API: http://boto.readthedocs.io/en/latest/ref/cloudfront.html#boto.cloudfront.distribution.Distribution.create_signed_url We might need a different cloudfront distribution specifically for the downloads if we're doing this, to prevent access without the signed url by default. Not sure how that works yet exactly. |
@jasonthomas 3 questions for you: First, my plan is that, once we're done with this issue:
Could you validate that this sounds doable ? Second, could you help me clarify how user-uploaded media are handled in AMO ? As far as I can tell:
Is that right ? Finally, related to the previous question, when I go to the AWS console I can't quite find the media I'm looking for. I'd like to find this for instance: https://addons-dev-cdn.allizom.org/user-media/previews/thumbs/76/76403.png Am I looking at the wrong distribution / S3 bucket ? Or could it be a permission issue ? Thanks! |
Let me give a background on what our CDN offloads today. Most of the routing logic is managed at the CDN origin (nginx is what use) such that we can switch to any CDN provider (or use multiple) without having a custom configuration at the CDN [1]. Today we use Cloudfront globally except for China where we use a China based CDN provider [2]. -dev and stage do not use a CDN today, they just use the CDN origin directly. I have a todo to configure -dev and stage to use Cloudfront so that we can finally test HTTP/2.0. The CDN serves the following:
[1] https://github.com/mozilla-services/cloudops-deployment/blob/master/projects/addons/puppet/modules/amo_proxy/templates/nginx.addons.conf.erb#L24-L153
For our current setup this isn't possible since we are using a non Cloudfront CDN provider for China. Let's say we didn't have a China CDN, then based on the documentation we can use a Cloudfront Custom Policy to allow Signed URLs for specific paths, so we would want to enable this on /user-media/addons/. Since light weight themes lives in this path as well we would have to do it for those as well, which is probably not ideal. Also this may complicate how we serve add-on/theme updates via versioncheck since we will need to generate signed URLs for all the permutations of add-on update requests for at a minimum of the current nginx cache TTL (1 hour).
I explained some of this above, but there is no S3 bucket for user generated content. AMO writes directly to EFS and served via EFS.
Also explained above, these assets live in EFS storage. Please let me know if anything is unclear. We may want to have a quick meeting to discuss options/solutions. Just let me know. |
Also we would need figure out how CDN caching is affected by Signed URLs. If it increases the number of requests to our origins this might be a problem. |
/cc @bqbn if he has any input. |
Thanks a lot for clarifying. So, to sum up:
All of these points are not trivial, it certainly makes things difficult, but only the first one really worries me because we don't have a good solution for it :( |
I agree. We can discuss removing it with Mozilla China team but I think they would be against it. I would prefer a solution that allows us to be CDN agnostic so that we have flexibility to move from one provider. We are using Cloudfront today but we have changed our CDN providers several times over the life of AMO. |
On IRC @jasonthomas mentionned https://www.nginx.com/blog/securing-urls-secure-link-module-nginx-plus/ which might work, if we can make sure it's not possible to bypass nginx and download the file directly by guessing the CDN domain name/path. Will look into this. |
This issue has been automatically marked as stale because it has not had recent activity. If you think this bug should stay open, please comment on the issue with further details. Thank you for your contributions. |
The way we currently prevent disabled files from being accessed by the public is by moving them to a special directory,
GUARDED_ADDONS_PATH
, which is not publicly accessible. When a developer or reviewer wants to download a disabled file, we serve it ourselves in a django view usingX-Accel-Redirect
header.This causes several problems, because moving files around is costly and not instantaneous. To make matters more complicated, you can disable/re-enable an entire add-on, and all its files should follow and be moved accordingly.
Instead of moving files around, when a user hits our download view, we should use signed URLs when redirecting to the CDN. That way, we can store all files in the same place, because the URLs are no longer publicly guessable, so it's impossible to download a disabled file if you don't have the permission for it. Bonus point, it would even mean we wouldn't need to be using
X-Accel-Redirect
when the user does have the right permission.Doing this would make it possible to get rid of the crons that fix the file paths (#3548) and store unlisted files like disabled ones (#3546)
The text was updated successfully, but these errors were encountered: