s3.getSignedUrl() with custom domain and path via CloudFront #1753

Closed
nalbion opened this issue Oct 11, 2017 · 6 comments
Labels
guidance Question that needs advice or information.

Comments

@nalbion

nalbion commented Oct 11, 2017

I'm trying to use s3.getSignedUrl() to download a file '1234.txt' from S3 'my-download-bucket'.

Some users are behind proxies which do not allow access to *.amazonaws.com, so I'm trying to use CloudFront to map the S3 origin my-download-bucket.s3.amazonaws.com with a behavior path pattern downloads/*. With DNS mapping the CloudFront distribution to files.mydomain.com users should be able to download https://files.mydomain.com/downloads/1234.txt (with extra params provided by getSignedUrl())

(I also have an upload bucket which is mapped to the default path of the CloudFront distribution, and everything is working well there)

let s3 = new AWS.S3({
    region: 'ap-southeast-2',
    signatureVersion: 'v4',
    // downloads were working okay until I added the following 2 lines:
    endpoint:  'files.mydomain.com/downloads',
    s3BucketEndpoint: true
});

return s3.getSignedUrl('getObject', {
    Bucket: 'my-download-bucket',
    Key: '1234.txt',
    Expires: 10
});

The code above generates the URL as desired, and the request appears to be passed through to my-download-bucket, but an error is returned:

<?xml version="1.0" encoding="UTF-8"?>
<Error>
    <Code>SignatureDoesNotMatch</Code>
    <Message>The request signature we calculated does not match the signature you provided. Check your key and signing method.
    </Message>
    <AWSAccessKeyId>....</AWSAccessKeyId>
    <StringToSign>AWS4-HMAC-SHA256 20171011T024435Z 20171011/ap-southeast-2/s3/aws4_request blabla</StringToSign>
    <SignatureProvided>57a...</SignatureProvided>
    <StringToSignBytes>41 57 ...</StringToSignBytes>
    <CanonicalRequest>
      GET /downloads/1234.txt X-Amz-Algorithm=AWS4-HMAC-SHA256&amp;X-Amz-Credential=A...%2F20171011%2Fap-southeast-2%2Fs3%2Faws4_request&amp;X-Amz-Date=20171011T024435Z&amp;X-Amz-Expires=10&amp;X-Amz-Security-Token=GENERATED_TOKEN...&amp;X-Amz-SignedHeaders=host
host:my-downlod-bucket.s3.amazonaws.com host UNSIGNED-PAYLOAD
    </CanonicalRequest>
    <CanonicalRequestBytes>47 45 54 ...</CanonicalRequestBytes>
    <RequestId>E8ED893EED3CA0FC</RequestId>
    <HostId>xxxx=</HostId>
</Error>

2 things stand out to me:

  • GET /downloads/1234.txt should be GET /1234.txt as far as the bucket is concerned
  • should host:my-downlod-bucket.s3.amazonaws.com be files.mydomain.com?
@nalbion
Author

nalbion commented Oct 11, 2017

...oh, the CloudFront behavior path pattern is not a prefix that gets removed. CloudFront does not (and cannot) rewrite the path. That's a shame.

@jeskew
Contributor

jeskew commented Oct 11, 2017

It looks like CloudFront is forwarding the request path (including the signature parameters in the query string) on to S3 but overwriting the host. This is expected behavior for CloudFront — it’s meant to act as a caching proxy whose DNS record is distinct from that of the backend to which it forwards requests — so I’m not sure it would be possible to access a presigned S3 URL through a CloudFront web distribution.

CloudFront has its own presigning mechanism that you might want to look into. The developer guide for the feature can be found here, and the API docs for the AWS SDK for JavaScript’s CloudFront presigner can be found here. Because CloudFront presigning uses RSA private keys, it is only available server-side and is not supported in the browser SDK.
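
For illustration, here is a minimal sketch of what a CloudFront "canned policy" signed URL looks like, built with Node's `crypto` module rather than the SDK's own presigner. The helper name `signCloudFrontUrl`, the key pair ID, and the example URL are hypothetical; in practice the SDK's CloudFront presigner mentioned above would normally be used instead of hand-rolling this.

```javascript
const crypto = require('crypto');

// Hypothetical helper (not part of the AWS SDK): signs `url` with a
// CloudFront canned policy that expires at `expiresEpochSeconds`.
function signCloudFrontUrl(url, keyPairId, privateKeyPem, expiresEpochSeconds) {
  // The canned policy is compact JSON with no whitespace.
  const policy = JSON.stringify({
    Statement: [{
      Resource: url,
      Condition: { DateLessThan: { 'AWS:EpochTime': expiresEpochSeconds } }
    }]
  });

  // RSA-SHA1 signature, base64-encoded, then made URL-safe the way
  // CloudFront expects: '+' -> '-', '=' -> '_', '/' -> '~'.
  const signature = crypto.createSign('RSA-SHA1')
    .update(policy)
    .sign(privateKeyPem, 'base64')
    .replace(/\+/g, '-')
    .replace(/=/g, '_')
    .replace(/\//g, '~');

  const separator = url.includes('?') ? '&' : '?';
  return `${url}${separator}Expires=${expiresEpochSeconds}` +
    `&Signature=${signature}&Key-Pair-Id=${keyPairId}`;
}
```

Because the private key is needed to produce the signature, this has to run server-side, which matches the note above about the browser SDK.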

@sqlbot

sqlbot commented Oct 13, 2017

I’m not sure it would be possible to access a presigned S3 URL through a CloudFront web distribution.

@jeskew for what it's worth... it's technically "possible" but it's awkward, rarely useful, and not the intended solution. Here's what's required:

  • You have to sign the request with the exact Host: header that CloudFront will use in the request it sends the bucket (testing suggests that this is always ${bucket}.s3.amazonaws.com, regardless of the bucket's region), and
  • You have to specify the object key with the path constructed the way the bucket will see it in the final request, not as it will appear in the URL sent to CloudFront (the path that S3 sees may have more components on the beginning if you use an origin path to prepend a string, or may differ if you're using Lambda@Edge to rewrite part of the path), and
  • You have to configure the distribution's Cache Behavior to "forward all, cache based on all" of the query string, and
  • You have to string-replace the path components in the resulting signed URL to match what CloudFront expects, if it differs from the actual object key (again, this is only in cases where you have configured CloudFront or Lambda@Edge to prepend or rewrite part of the path), and
  • You have to string-replace the hostname in the host portion (only) of the resulting signed URL with the hostname of the CloudFront distribution.

Some of these steps seem counter-intuitive, particularly the idea of modifying a pre-signed URL without invalidating it -- but it works because you're modifying the URL in such a way that it's valid after CloudFront modifies it (again) such that it will match the canonical request that was originally signed. You're basically tweaking the signed URL so that it's for the final request, not the initial one, and then modifying the result to match what CloudFront is expecting.
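
The last two steps (rewriting the path and the host of an already-signed URL) can be sketched roughly as follows. This assumes the URL was first produced by s3.getSignedUrl() against the default S3 endpoint, with the Key set to the path S3 will finally see. The helper name `toCloudFrontUrl` and the `originPath` parameter are hypothetical names for illustration, not part of the SDK.

```javascript
// Hypothetical helper: converts an S3 presigned URL into the URL the viewer
// should send to CloudFront. The query string (the signature) is untouched.
function toCloudFrontUrl(signedS3Url, cloudFrontHost, originPath = '') {
  const url = new URL(signedS3Url);

  // If the distribution prepends an origin path, the signed key contains it,
  // but the viewer-facing URL must not.
  if (originPath) {
    if (!url.pathname.startsWith(originPath + '/')) {
      throw new Error('signed key does not start with the origin path');
    }
    url.pathname = url.pathname.slice(originPath.length);
  }

  // Replace only the host portion; CloudFront swaps it back when forwarding.
  url.hostname = cloudFrontHost;
  return url.toString();
}
```

For the setup in this issue (path pattern downloads/* and no origin path), the object would need to live under the key downloads/1234.txt, and only the host replacement applies: `toCloudFrontUrl(signedUrl, 'files.mydomain.com')`.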

When you are done, you end up with a signed URL that will "work," but when actually used, it results in CloudFront caching responses that it will never again actually serve from the cache... unless you're actually reusing that exact same identical signed URL, signature and all (unlikely). This is because CloudFront (correctly) uses the entire forwarded request -- including the forwarded query string -- as the cache key used when doing a cache lookup. So the requests will always be X-Cache: Miss from CloudFront, because CloudFront has never seen that identical request before (the signature and much of the rest of the query string differs).
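
A toy model of that cache-key behavior, to make the point concrete. This is a deliberate simplification and an assumption about how to picture it, not CloudFront's actual implementation; the function names and the signature values in the usage example are made up.

```javascript
// Simplified model: when the whole query string is forwarded, it becomes
// part of the cache key, so every distinct signature is a distinct key.
function cacheKeyForwardingAllQuery(rawUrl) {
  const url = new URL(rawUrl);
  return url.pathname + url.search;
}

// For contrast: a key that ignores the query string identifies the object only.
function cacheKeyIgnoringQuery(rawUrl) {
  return new URL(rawUrl).pathname;
}
```

Two presigned URLs for the same object, generated a few seconds apart, differ in at least X-Amz-Date and X-Amz-Signature, so `cacheKeyForwardingAllQuery` gives them different keys while `cacheKeyIgnoringQuery` gives them the same one.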

The net result of this approach is that you end up with something very similar to S3 Transfer Acceleration -- you're transporting the request on the AWS "edge network," which may speed up transfers (particularly to viewers more distant from the bucket), but not actually doing any caching.

This also requires that the distribution be configured without an origin access identity, and with "Restrict Viewer Access" set to disabled, because CloudFront isn't an active participant in the authentication -- it just passes through the requests. Without the signature, S3 will still deny the requests if the bucket and objects aren't public.

Using the CloudFront signing solution is definitely the way to go, except perhaps in rare cases.

@jeskew
Contributor

jeskew commented Oct 13, 2017

Thanks for the deep dive on that, @sqlbot! It didn't occur to me that even if you could get S3 presigned URLs to work, using them would make nearly all requests uncacheable.

@nalbion, I would strongly recommend using CloudFront's presigning mechanism. In addition to avoiding the complexity described above, CloudFront will be able to cache objects based on their identifiers (bucket name + key name) rather than on the exact signature used.

@jeskew closed this as completed on Oct 13, 2017
@nalbion
Author

nalbion commented Oct 17, 2017

thanks @jeskew and @sqlbot

@srchase added the guidance label and removed the Question label on Jan 4, 2019
@lock

lock bot commented Sep 28, 2019

This thread has been automatically locked since there has not been any recent activity after it was closed. Please open a new issue for related bugs and link to relevant comments in this thread.

@lock bot locked this as resolved and limited conversation to collaborators on Sep 28, 2019