Investigate using signed URLs when downloading addon files from the CDN #4018

Closed
diox opened this issue Feb 9, 2017 · 11 comments
Labels
repository:addons-server Issue relating to addons-server state:stale Issues marked as stale. These can be re-opened should there be plans to fix them.

Comments

@diox
Member

diox commented Feb 9, 2017

The way we currently prevent disabled files from being accessed by the public is by moving them to a special directory, GUARDED_ADDONS_PATH, which is not publicly accessible. When a developer or reviewer wants to download a disabled file, we serve it ourselves from a Django view using the X-Accel-Redirect header.
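
For readers unfamiliar with the mechanism, here is a minimal, framework-free sketch of the X-Accel-Redirect pattern described above. The `/guarded-addons/` internal location and the `guarded_download` helper are hypothetical names chosen for illustration; they are not taken from addons-server, where the real code is a Django view.

```python
import posixpath

# Hypothetical nginx location, marked `internal;`, that maps onto
# GUARDED_ADDONS_PATH. The name is an assumption for illustration only.
INTERNAL_PREFIX = "/guarded-addons/"


def guarded_download(relative_path, user_is_allowed):
    """Return (status, headers) for a request to download a disabled file."""
    if not user_is_allowed:
        return 403, {}
    # Normalise the path and refuse anything that would leave the guarded
    # directory (absolute paths or paths climbing out with "..").
    cleaned = posixpath.normpath(relative_path)
    if cleaned == ".." or cleaned.startswith(("../", "/")):
        return 400, {}
    # The response body stays empty: nginx sees the header and serves the
    # file itself from the internal location, so the application never
    # streams the bytes.
    return 200, {"X-Accel-Redirect": INTERNAL_PREFIX + cleaned}
```

The permission check happens in the application, while the file transfer is handed back to nginx, which is why the files have to live somewhere nginx does not expose publicly.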

This causes several problems, because moving files around is costly and not instantaneous. To make matters more complicated, you can disable/re-enable an entire add-on, and all its files should follow and be moved accordingly.

Instead of moving files around, when a user hits our download view, we should use signed URLs when redirecting to the CDN. That way, we can store all files in the same place, because the URLs are no longer publicly guessable, so it's impossible to download a disabled file if you don't have the permission for it. Bonus point, it would even mean we wouldn't need to be using X-Accel-Redirect when the user does have the right permission.
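
To make the idea concrete, here is a generic sketch of what "signed URL" means: the URL carries an expiry time and a signature that only the holder of a secret can produce. It is not the scheme of any particular CDN. The secret, the `expires`/`sig` parameter names, and the helper names are all assumptions made for this example.

```python
import hashlib
import hmac
import time
from urllib.parse import parse_qs, urlencode, urlsplit

# Assumption: a key shared by whoever signs URLs and whoever verifies them.
SECRET = b"not-a-real-secret"


def _signature(path, expires):
    # Sign both the path and the expiry so neither can be altered.
    message = f"{path}:{expires}".encode()
    return hmac.new(SECRET, message, hashlib.sha256).hexdigest()


def sign_url(base, path, ttl_seconds, now=None):
    """Build a URL for `path` that stops validating after `ttl_seconds`."""
    expires = int(now if now is not None else time.time()) + ttl_seconds
    query = urlencode({"expires": expires, "sig": _signature(path, expires)})
    return f"{base}{path}?{query}"


def is_valid(url, now=None):
    """Check the expiry and the signature of a URL made by sign_url()."""
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    try:
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False  # unsigned or malformed URL
    if expires < int(now if now is not None else time.time()):
        return False  # link has expired
    expected = _signature(parts.path, expires)
    return hmac.compare_digest(sig.encode(), expected.encode())
```

With something like this in front of the files, knowing the path of a disabled file is not enough to download it: the request also needs a signature that only the download view can generate after checking permissions.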

Doing this would make it possible to get rid of the crons that fix the file paths (#3548) and store unlisted files like disabled ones (#3546)

@diox diox self-assigned this Feb 9, 2017
@diox
Member Author

diox commented Feb 9, 2017

AFAIK there are only 2 ways to download an addon file in addons-server:

  • Through the download_file view, which redirects to the CDN
  • Through the update service, which bypasses the download view and gives you the CDN URL directly.

The latter is what I'm worried about. I need to investigate more to figure out whether a signed URL, which has a short expiration time, would be fine for the update service. The alternative would be to force the update service to use our downloads view, which would then redirect properly, but I'm not sure we can sustain the load it would generate.

@diox
Member Author

diox commented Feb 9, 2017

Also, we don't want to rewrite all our code that uses the POSIX filesystem API to use the Django storage APIs - that's why we use EFS.

I need to figure out if we can have the two coexist - don't change the way we read/write the files, but still be able to access them through the storage API when we want to expose them directly.

@diox
Member Author

diox commented Feb 10, 2017

More digging: we don't actually need to use the storage APIs, and in fact we can't: the Cloudfront mechanism for signed URLs appears to be different from the S3 one. We can just generate the signed URL using the boto API: http://boto.readthedocs.io/en/latest/ref/cloudfront.html#boto.cloudfront.distribution.Distribution.create_signed_url
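
As a rough illustration of what such a helper produces, the sketch below assembles a Cloudfront signed URL with a "canned" policy, as I understand the format from the AWS documentation. It is not the boto implementation. The `rsa_sha1_sign` callable is an assumption standing in for RSA-SHA1 signing with the private key of a Cloudfront key pair, which needs a crypto library and is left out here; the function names are hypothetical too.

```python
import base64
import json


def cloudfront_b64(data):
    # Cloudfront uses standard base64 with three characters swapped so the
    # result can sit in a query string: "+" -> "-", "=" -> "_", "/" -> "~".
    encoded = base64.b64encode(data).decode()
    return encoded.replace("+", "-").replace("=", "_").replace("/", "~")


def canned_policy(url, expires_epoch):
    # A canned policy names exactly one resource and one expiry time.
    # It is serialised as compact JSON, without whitespace.
    policy = {
        "Statement": [{
            "Resource": url,
            "Condition": {"DateLessThan": {"AWS:EpochTime": expires_epoch}},
        }]
    }
    return json.dumps(policy, separators=(",", ":"))


def signed_url(url, expires_epoch, key_pair_id, rsa_sha1_sign):
    # rsa_sha1_sign: assumed callable taking the policy bytes and returning
    # the raw signature bytes made with the key pair's private key.
    signature = rsa_sha1_sign(canned_policy(url, expires_epoch).encode())
    separator = "&" if "?" in url else "?"
    return (f"{url}{separator}Expires={expires_epoch}"
            f"&Signature={cloudfront_b64(signature)}"
            f"&Key-Pair-Id={key_pair_id}")
```

Because the signature covers the full resource URL and the expiry, each file and each expiry time gives a different URL, which matters for the caching and update-service concerns in this thread.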

We might need a different cloudfront distribution specifically for the downloads if we're doing this, to prevent access without the signed url by default. Not sure how that works yet exactly.

@diox
Member Author

diox commented Feb 13, 2017

@jasonthomas 3 questions for you:

First, my plan is that, once we're done with this issue:

  • static files (i.e. our own css/js/img assets) would work exactly as before
  • previews, icons, any user-uploaded media that is not an add-on file would work exactly as before
  • it would no longer be possible to access add-on downloads directly without going through a cloudfront signed URL that we generate.

Could you validate that this sounds doable?


Second, could you help me clarify how user-uploaded media are handled in AMO? As far as I can tell:

  • An S3 bucket is mounted through EFS, making it look like a regular filesystem.
  • AMO saves files to that bucket through EFS.
  • When a user tries to download the file, AMO redirects them to a Cloudfront distribution, which is configured to pull data from the right S3 bucket.

Is that right?


Finally, related to the previous question, when I go to the AWS console I can't quite find the media I'm looking for. I'd like to find this for instance: https://addons-dev-cdn.allizom.org/user-media/previews/thumbs/76/76403.png
I see we have a Cloudfront distribution which has the addons-amo-dev-cdn.allizom.org CNAME set, but that's not the same domain name. Its origin is net-mozaws-dev-amo-amo-static-amodev1.s3.amazonaws.com but when browsing this S3 bucket I can't find user-media/, or previews/.

Am I looking at the wrong distribution / S3 bucket? Or could it be a permission issue?


Thanks!

@jasonthomas
Member

jasonthomas commented Feb 13, 2017

Let me give some background on what our CDN offloads today. Most of the routing logic is managed at the CDN origin (nginx is what we use) so that we can switch to any CDN provider (or use multiple) without having a custom configuration at the CDN [1]. Today we use Cloudfront globally, except for China, where we use a China-based CDN provider [2]. -dev and stage do not use a CDN today; they just use the CDN origin directly. I have a todo to configure -dev and stage to use Cloudfront so that we can finally test HTTP/2.

The CDN serves the following:

  • /static - These are build assets (css, js, fonts, etc.) which are copied to an S3 bucket at code deployment time. We do this so that we don't experience the cache busting issues we have seen in the past.
    Request Flow: User -> CDN -> Nginx Caching -> Nginx -> S3
    Example https://addons.cdn.mozilla.net/static/js/preload-min.js?build=2841cb7-589b262c
  • /user-media - User generated content (add-ons, user profile pics, lightweight themes, etc.) that is stored on an EFS. The EFS is mounted via NFS on all the web instances and CDN origins. We don't have an actual /user-media directory, but we manage a few symlinks so that we can make use of nginx's root directive and not have to write a ton of internal rewrites. If you log in to a -dev instance and look at /mnt/efs/addons-dev.allizom.org/shared_storage/uploads you can see what is served from /user-media.
    Request Flow: User -> CDN -> Nginx -> EFS
    Example: https://addons-dev-cdn.allizom.org/user-media/previews/thumbs/76/76403.png on the file system would be /mnt/efs/addons-dev.allizom.org/shared_storage/uploads/previews/thumbs/76/76403.png
  • ~* /.*.(css|gif|ico|jar|jwk|jpg|js|ogv|pdf|png|rdf|svg|webm|zip)$ - everything else, this includes assets that are dynamically generated by addons-server.
    Request Flow: User -> CDN -> Nginx -> addons-server web worker
    Example https://addons.cdn.mozilla.net/en-US/firefox/addons/buttons.js?b=2841cb7-589b262c
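
The three rules above can be sketched as a small routing function. This is only an illustration of the description, not the actual nginx configuration: the order of the checks is an assumption (real nginx can prefer regex locations over plain prefix ones), the dot before the extension group is escaped on the assumption that the pattern as pasted lost a backslash, and `route` and the `"unmatched"` fallback are hypothetical.

```python
import re

# Assumption: escaped dot before the extension group (see note above).
DYNAMIC_ASSET = re.compile(
    r"/.*\.(css|gif|ico|jar|jwk|jpg|js|ogv|pdf|png|rdf|svg|webm|zip)$",
    re.IGNORECASE,  # nginx's `~*` modifier means a case-insensitive regex
)

# Filesystem location quoted in the comment for the -dev environment.
UPLOADS_ROOT = "/mnt/efs/addons-dev.allizom.org/shared_storage/uploads"


def route(path):
    """Return (backend, location) for a request path without query string."""
    if path.startswith("/static/"):
        # Build assets copied to S3 at deployment time.
        return "s3", path
    if path.startswith("/user-media/"):
        # User generated content: strip the prefix and read from EFS.
        return "efs", UPLOADS_ROOT + path[len("/user-media"):]
    if DYNAMIC_ASSET.search(path):
        # Everything else with a known extension goes to a web worker.
        return "addons-server", path
    return "unmatched", path
```

For example, `route("/user-media/previews/thumbs/76/76403.png")` maps to the EFS path given in the /user-media example above.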

[1] https://github.com/mozilla-services/cloudops-deployment/blob/master/projects/addons/puppet/modules/amo_proxy/templates/nginx.addons.conf.erb#L24-L153
[2] https://bugzilla.mozilla.org/show_bug.cgi?id=1143062

> First, my plan is that, once we're done with this issue:
> it would no longer be possible to access add-on downloads directly without going through a cloudfront signed URL that we generate.
> Could you validate that this sounds doable ?

For our current setup this isn't possible, since we are using a non-Cloudfront CDN provider for China. We could potentially get rid of the China CDN, but it's going to make a lot of China and Mozilla China users sad.

Let's say we didn't have a China CDN. Then, based on the documentation, we could use a Cloudfront Custom Policy to allow Signed URLs for specific paths, so we would want to enable this on /user-media/addons/. Since lightweight themes live in this path as well, we would have to do it for those too, which is probably not ideal. This may also complicate how we serve add-on/theme updates via versioncheck, since we would need to generate signed URLs for all the permutations of add-on update requests, valid for at least the current nginx cache TTL (1 hour).

> Second, could you help me clarify how user-uploaded media are handled in AMO ? As far as I can tell: ... Is that right ?

I explained some of this above, but there is no S3 bucket for user generated content. AMO writes directly to EFS and the files are served from EFS.

> Am I looking at the wrong distribution / S3 bucket ? Or could it be a permission issue ?

Also explained above, these assets live in EFS storage.

Please let me know if anything is unclear. We may want to have a quick meeting to discuss options/solutions. Just let me know.

@jasonthomas
Member

Also, we would need to figure out how CDN caching is affected by Signed URLs. If it increases the number of requests to our origins, this might be a problem.

@jasonthomas
Member

/cc @bqbn if he has any input.

@diox
Member Author

diox commented Feb 14, 2017

Thanks a lot for clarifying.

So, to sum up:

  1. China CDN makes the whole thing more complex indeed. Might invalidate the whole thing.
  2. We might need to separate add-on downloads from lightweight themes assets - not sure, because maybe lightweight themes would benefit from this as well (they too need to support disabling/enabling better)
  3. Signed URLs would need to be valid for at least one hour, probably more
  4. We'd need to make sure CDN caching is not affected

None of these points are trivial, and they certainly make things difficult, but only the first one really worries me because we don't have a good solution for it :(

@jasonthomas
Member

> China CDN makes the whole thing more complex indeed. Might invalidate the whole thing.

I agree. We can discuss removing it with Mozilla China team but I think they would be against it.

I would prefer a solution that allows us to be CDN agnostic, so that we have the flexibility to move from one provider to another. We are using Cloudfront today, but we have changed CDN providers several times over the life of AMO.

@diox
Member Author

diox commented Feb 15, 2017

On IRC @jasonthomas mentioned https://www.nginx.com/blog/securing-urls-secure-link-module-nginx-plus/ which might work, if we can make sure it's not possible to bypass nginx and download the file directly by guessing the CDN domain name/path. Will look into this.
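
For reference, a link for nginx's secure_link module could be generated on the application side roughly as follows. This is a sketch under assumptions: it presumes an nginx configuration along the lines of `secure_link $arg_md5,$arg_expires;` with `secure_link_md5 "$secure_link_expires$uri <secret>";`, and the `md5`/`expires` parameter names, the secret, and the `secure_link` helper are illustrative, not anything deployed for AMO.

```python
import base64
import hashlib
import time


def secure_link(uri, secret, ttl_seconds, now=None):
    """Build a time-limited link for nginx's secure_link module."""
    expires = int(now if now is not None else time.time()) + ttl_seconds
    # The hashed string must mirror the secure_link_md5 expression in the
    # nginx configuration exactly: expiry, then URI, a space, the secret.
    digest = hashlib.md5(f"{expires}{uri} {secret}".encode()).digest()
    # nginx expects the binary MD5 as URL-safe base64 without padding.
    token = base64.urlsafe_b64encode(digest).decode().rstrip("=")
    return f"{uri}?md5={token}&expires={expires}"
```

Since the check happens at the nginx origin rather than at the CDN, this approach would not depend on a specific CDN provider, but it only protects files if the origin cannot be bypassed.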

@stale

stale bot commented Sep 6, 2019

This issue has been automatically marked as stale because it has not had recent activity. If you think this bug should stay open, please comment on the issue with further details. Thank you for your contributions.

@stale stale bot added the state:stale Issues marked as stale. These can be re-opened should there be plans to fix them. label Sep 6, 2019
@stale stale bot closed this as completed Sep 20, 2019
@KevinMind KevinMind transferred this issue from mozilla/addons-server May 4, 2024
@KevinMind KevinMind added repository:addons-server Issue relating to addons-server migration:2024 labels May 4, 2024