This repository has been archived by the owner on Jul 25, 2020. It is now read-only.

add aws s3 signature v4 #678

Closed
wants to merge 14 commits into from

Conversation

zhaojin0
Contributor

AWS S3 Signature Version 4 implementation

@zhaojin0
Contributor Author

Add AWS S3 Signature V4; please review.

   return Optional.absent();
}

@SuppressWarnings("unchecked")
Member

Seems like a lot of gratuitous reindentation.

@gaul
Member

gaul commented Feb 11, 2015

@zhaojin0 Please address Checkstyle violations and remove gratuitous reindentation. I will look at this some more afterwards.

@gaul
Member

gaul commented Feb 13, 2015

Could you tag this commit with JCLOUDS-480 so that JIRA will note it?

@gaul
Member

gaul commented Feb 19, 2015

@zhaojin0 Any updates on this pull request? I would like to include it in the upcoming 1.9.0 release.

@zhaojin0
Contributor Author

Sorry for being slow...
I added the temporary-access signature code, but I don't know how to test it. It should work in the AWS region cn-north-1.

public class AWSS3BlobStoreContextModule extends S3BlobStoreContextModule {
   //...
   @Override
   protected void bindRequestSigner() {
      bind(BlobRequestSigner.class).to(AWSS3BlobRequestSignerV4.class);
   }
}

@gaul
Member

gaul commented Feb 26, 2015

@zhaojin0 I do not understand how this is supposed to work -- you should bind the V4 signer somewhere like AWSS3BlobStoreContextModule:

bind(BlobRequestSigner.class).to(AWSS3BlobRequestSignerV4.class);

I also do not understand the overall strategy here, but I believe that jclouds should use the new V4 signer for all AWS calls and regions. The generic S3 can continue to use the V2 code.

I played around with this a bit and do not see any new live tests. I added one to AWSBucketsLiveTest which requires a region with a V4 signer:

public void testV4Region() throws Exception {
    String bucketName = getScratchContainerName() + "-v4-only";
    getApi().putBucketInRegion(Region.EU_CENTRAL_1, bucketName);
    try {
        assertEquals(Region.EU_CENTRAL_1, getApi().getBucketLocation(bucketName));
    } finally {
        destroyContainer(bucketName);
    } 
}

However I see a variety of signing errors of the form:

org.jclouds.aws.AWSResponseException: request GET https://gaul-blobstore-2489004073675868056-v4-only.s3-eu-central-1.amazonaws.com/ HTTP/1.1 failed with code 400, error: AWSError{requestId='7EF96215BA50542A', requestToken='t2sOzdsOU5tJqer0pLcfbuAQnuASYf4X3xf1qfWQkjSESCp5hBcgKhMuoO3zRJ6u14OLnNwHokE=', code='InvalidRequest', message='The authorization mechanism you have provided is not supported. Please use AWS4-HMAC-SHA256.', context='{HostId=t2sOzdsOU5tJqer0pLcfbuAQnuASYf4X3xf1qfWQkjSESCp5hBcgKhMuoO3zRJ6u14OLnNwHokE=}'}

Finally there are many spurious changes and Checkstyle violations in this pull request which make it hard to read. You must correct these before I can continue to review.

As for the signer tests, the general ones in BaseBlobSignerLiveTest should exercise your code if you bind the signer correctly as per my first comment.

@zhaojin0
Contributor Author

It's used to sign temporary access (pre-signed requests)...
I provided the test case AWSS3BlobSignerV4ExpectTest.
Sorry, this is my first pull request; I reformatted some code...

}
String contentSha256 = base16().lowerCase().encode(hash(payloadStream));
try {
   payloadStream.reset();
Member

What happens when the payload is not repeatable?

Contributor Author

The payload stream is used to calculate the content hash.
If it cannot be reset and replayed, the payload cannot then be appended to the HTTP request body. (See the sketch below.)
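
A minimal sketch of the repeatability guard being discussed, assuming a mark/reset-capable stream (the helper name contentSha256 is illustrative, not the PR's code):

import static com.google.common.io.BaseEncoding.base16;

import java.io.IOException;
import java.io.InputStream;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class PayloadHashing {
   // Hash the payload for the x-amz-content-sha256 header, then rewind it so
   // the same bytes can still be written to the request body.
   static String contentSha256(InputStream payloadStream) throws IOException {
      if (!payloadStream.markSupported()) {
         // A non-repeatable stream would be consumed by hashing; callers must
         // buffer it first (or use the chunked-upload path discussed below).
         throw new IllegalArgumentException("payload must support mark/reset");
      }
      payloadStream.mark(Integer.MAX_VALUE);
      MessageDigest digest;
      try {
         digest = MessageDigest.getInstance("SHA-256");
      } catch (NoSuchAlgorithmException e) {
         throw new AssertionError(e);  // SHA-256 is always available in the JDK
      }
      byte[] buffer = new byte[8192];
      for (int n = payloadStream.read(buffer); n != -1; n = payloadStream.read(buffer)) {
         digest.update(buffer, 0, n);
      }
      payloadStream.reset();
      return base16().lowerCase().encode(digest.digest());
   }
}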

@gaul
Member

gaul commented Mar 31, 2015

Any plans to continue with this pull request?

@zhaojin0
Contributor Author

zhaojin0 commented Apr 1, 2015

AWS Signature V4 uses a SHA-256 content hash. If the payload cannot be reset, AWS supports chunked uploads, as sketched after the link:
http://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-streaming.html
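
For context, the linked document chains a signature through every chunk: each chunk's string-to-sign includes the previous chunk's signature. A minimal sketch of that string-to-sign, assuming the usual JDK crypto primitives (helper names are illustrative, not the PR's code):

import java.nio.charset.StandardCharsets;

import javax.crypto.Mac;
import javax.crypto.spec.SecretKeySpec;

final class ChunkSigning {
   // SHA-256 of the empty string, a fixed field in every chunk string-to-sign.
   private static final String EMPTY_SHA256 =
         "e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855";

   // timestamp is the ISO-8601 request time, scope is date/region/s3/aws4_request,
   // and previousSignature chains the chunks (seeded with the request signature).
   static String chunkStringToSign(String timestamp, String scope,
         String previousSignature, String chunkDataSha256Hex) {
      return "AWS4-HMAC-SHA256-PAYLOAD\n"
            + timestamp + "\n"
            + scope + "\n"
            + previousSignature + "\n"
            + EMPTY_SHA256 + "\n"
            + chunkDataSha256Hex;
   }

   // Sign the string-to-sign with the derived SigV4 signing key.
   static byte[] hmacSha256(byte[] signingKey, String data) throws Exception {
      Mac mac = Mac.getInstance("HmacSHA256");
      mac.init(new SecretKeySpec(signingKey, "HmacSHA256"));
      return mac.doFinal(data.getBytes(StandardCharsets.UTF_8));
   }
}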

@gaul
Member

gaul commented Apr 17, 2015

@zhaojin0 I am testing this and see many errors of the form:

org.jclouds.aws.AWSResponseException: request GET https://gaul-blobstore3760643340725640912-v4-only.s3-eu-central-1.amazonaws.com/ HTTP/1.1 failed with code 400, error: AWSError{requestId='0F84681CE6013127', requestToken='THXHcOkpHKTdN7DedbIb8qCj7/MuvvvGyu31O42h6BjkTFSvHdKNggDL/aQ6Mm1IGROwsAO58fw=', code='InvalidRequest', message='The authorization mechanism you have provided is not supported. Please use AWS4-HMAC-SHA256.', context='{HostId=THXHcOkpHKTdN7DedbIb8qCj7/MuvvvGyu31O42h6BjkTFSvHdKNggDL/aQ6Mm1IGROwsAO58fw=}'}

when deleting items in the container between runs. Any suggestions on this? Also do all the integration tests pass for you? I see a few errors:

  AWSS3ContainerIntegrationLiveTest>BaseContainerIntegrationTest.deleteContainerIfEmptyWithoutContents:315 expected [true] but found [false]
  AWSS3ContainerLiveTest>BaseContainerLiveTest.testPublicAccessInNonDefaultLocationWithBigBlob:112->BaseContainerLiveTest.runCreateContainerInLocation:124->BaseBlobStoreIntegrationTest.assertConsistencyAwareContainerExists:361->BaseBlobStoreIntegrationTest.assertConsistencyAware:248->BaseBlobStoreIntegrationTest.assertConsistencyAware:235 » HttpResponse
  AWSS3ServiceIntegrationLiveTest>BaseServiceIntegrationTest.testAllLocations:52->BaseBlobStoreIntegrationTest.assertConsistencyAware:248->BaseBlobStoreIntegrationTest.assertConsistencyAware:235 » HttpResponse
  AWSS3ServiceIntegrationLiveTest>BaseServiceIntegrationTest.testGetAssignableLocations:93 {scope=REGION, id=eu-central-1, description=eu-central-1, parent=aws-s3, iso3166Codes=[DE-HE]} ||{scope=PROVIDER, id=aws-s3, description=https://s3.amazonaws.com, iso3166Codes=[US, US-CA, US-OR, BR-SP, IE, SG, AU-NSW, JP-13]}

Tests run: 99, Failures: 4, Errors: 0, Skipped: 1

import java.io.IOException;

import static org.testng.Assert.assertEquals;

Member

Spurious code movement?

@zhaojin0
Contributor Author

It seems that region eu-central-1 does not support AWS Signature V2.


@zhaojin0
Contributor Author

zhaojin0 commented May 4, 2015

Hi, I implemented the AWS S3 Signature V4 chunked upload; it is used when putting an object whose payload cannot be repeated.

@pswvg

pswvg commented May 15, 2015

For our project, we took the PR, applied some minor fixes (zhaojin0#1), and backported it to stable 1.8.1. It works very well with our integration tests. We tested it against Frankfurt, which supports V4 only.

public InputStream nextElement() {
   int bytesRead;
   try {
      bytesRead = inputStream.read(buffer, 0, buffer.length);

bytesRead can be less than buffer.length!

This caused problems during our tests when the underlying inputStream only allows 8k blocks. The result was that the pre-calculated content length (using 64k chunks) was less than the actual sent content length (using the 8k chunks read from the InputStream).

That leads to HttpURLConnection's "too many bytes written" exception later on. ByteArrayInputStream and FileInputStream normally return the length as requested, so our integration tests didn't fail in the first place. When we live-tested it, it failed because the inputStream we got from Tomcat (from an upload) uses 8k blocks :(.

We fixed that on our side with a wrapping InputStream that always returns the requested length when read(buf, off, len) is invoked, performing additional reads from the underlying inputStream as necessary.

This way, the expected 64k blocks were sent and the pre-calculated content length matched the actual content length.

Maybe a loop should be added here to read exactly buffer.length bytes from the inputStream, matching the chunkedBlockSize, like we did in our InputStream wrapper (see the sketch below).
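
A minimal sketch of such a fill loop (the names are illustrative, not the PR's code):

import java.io.IOException;
import java.io.InputStream;

final class ChunkReader {
   // Read exactly buffer.length bytes unless EOF intervenes, so every chunk
   // matches the pre-calculated chunked block size; only the final chunk may
   // be short. Returns the number of bytes read, or -1 at end of stream.
   static int readFullChunk(InputStream in, byte[] buffer) throws IOException {
      int offset = 0;
      while (offset < buffer.length) {
         int n = in.read(buffer, offset, buffer.length - offset);
         if (n == -1) {
            break;
         }
         offset += n;
      }
      return offset == 0 ? -1 : offset;
   }
}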

Member

Can also call ByteStreams.readFully(InputStream, byte[]).

@pswvg

pswvg commented May 20, 2015

I thought about the content length again. The old V2 implementation wasn't able to handle unknown content lengths, and neither can the V4. But I think it wouldn't be difficult to add: chunking is already implemented nicely, so instead of sending the content length of the complete payload, one could switch to transfer encoding, as noted here:

http://docs.aws.amazon.com/AmazonS3/latest/API/sigv4-streaming.html

That would mean:

Transfer-Encoding: chunked

should do the trick, omitting Content-Length completely.

That would be really nice, since in our project we aren't always able to get the complete content length in advance (live streaming from browser clients, etc.). So in the worst case, we have to temporarily store the whole stream on disk (to get the content length) before we can send it to S3.

What do you guys think?

@zhaojin0
Contributor Author

Try multipart upload: initiate the multipart upload, temporarily store a part of a specific length, upload each part of the stream, then complete the multipart upload.
Each part requires a Content-Length, but not the whole file. I have not tried it, though :) A sketch of that flow follows.
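
A rough sketch of that flow against the jclouds S3Client multipart calls as I understand them (the 5 MB part size and the readFully helper are assumptions, and this is untested, as noted above):

import java.io.InputStream;
import java.util.Arrays;

import org.jclouds.io.Payload;
import org.jclouds.io.Payloads;
import org.jclouds.s3.S3Client;
import org.jclouds.s3.domain.ObjectMetadataBuilder;

import com.google.common.collect.ImmutableMap;

final class StreamingMultipartUpload {
   private static final int PART_SIZE = 5 * 1024 * 1024;  // S3's minimum part size

   // Upload a stream of unknown total length by buffering one part at a time:
   // each part has a known Content-Length even though the total is unknown.
   static void upload(S3Client client, String bucket, String key, InputStream in)
         throws Exception {
      String uploadId = client.initiateMultipartUpload(bucket,
            ObjectMetadataBuilder.create().key(key).build());
      try {
         ImmutableMap.Builder<Integer, String> etags = ImmutableMap.builder();
         byte[] buffer = new byte[PART_SIZE];
         int partNumber = 1;
         int length;
         while ((length = readFully(in, buffer)) > 0) {
            Payload part = Payloads.newByteArrayPayload(Arrays.copyOf(buffer, length));
            etags.put(partNumber, client.uploadPart(bucket, key, partNumber, uploadId, part));
            partNumber++;
         }
         client.completeMultipartUpload(bucket, key, uploadId, etags.build());
      } catch (Exception e) {
         client.abortMultipartUpload(bucket, key, uploadId);
         throw e;
      }
   }

   // Fill the buffer completely unless EOF intervenes; returns bytes read, 0 at EOF.
   private static int readFully(InputStream in, byte[] buffer) throws Exception {
      int offset = 0;
      while (offset < buffer.length) {
         int n = in.read(buffer, offset, buffer.length - offset);
         if (n == -1) {
            break;
         }
         offset += n;
      }
      return offset;
   }
}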

@pswvg

pswvg commented Jun 22, 2015

The last time I tried Multipart Upload with jclouds aws-s3, it also forced me to specify the content length in advance. See my post here:

https://www.mail-archive.com/user@jclouds.apache.org/msg01562.html.

But that's another story.

What is the state of this PR? What is missing before it can be merged? I think the build failure was due to a Checkstyle error.

@gaul
Member

gaul commented Jun 22, 2015

@pswvg This PR still needs a thorough review -- S3 is our most popular provider and I want to ensure we do not introduce any regressions for our users.

@zhaojin0 Sorry about the delay; I will try to get to this soon.

@freddy33

Any reason for this not to have been merged?
Is it targeted at the 2.0 version only?
Could it be merged on the 1.9 branch?

@pswvg

pswvg commented Sep 11, 2015

AFAIK this is targeted against 1.8.x. I've applied this PR to a 1.8.1 fork without a problem.

@manzke

manzke commented Dec 16, 2015

Hey guys, any progress on this? I hate that we have to use a patched version in production, but we have already used it for some time and have load-tested it as well. 👍

@demobox
Member

demobox commented Dec 16, 2015

@andrewgaul Ping? Is this something you'd be able to take a look at?

@gaul
Member

gaul commented Dec 19, 2015

@demobox I will try to make a run at this early next week when I work on some related V4 issues. Sorry for the delay!

@gaul
Member

gaul commented Jan 13, 2016

@zhaojin0 I made some stylistic modifications, squashed the commits, and pushed to master as 8bddbb4. Thank you for your contribution and for your patience with this extraordinarily long review!

@gaul gaul closed this Jan 13, 2016