Why ETAG is sent as base16 encoded string of contentMD5 #151

Closed
kishore25kumar opened this Issue Sep 12, 2016 · 8 comments

Contributor

kishore25kumar commented Sep 12, 2016

Why isn't the cloud provider's MD5 sent directly to the client? Why is the ETag set to the base16 encoding of contentMD5?

Here is the link to the code

https://github.com/andrewgaul/s3proxy/blob/master/src/main/java/org/gaul/s3proxy/S3ProxyHandler.java#L2387

When I initiate a partial download with a range request, Azure does not provide a contentMD5, but it does provide an ETag. Because S3Proxy sends the ETag as the base16-encoded contentMD5, I do not get any ETag in the response to a partial download.

Owner

gaul commented Sep 12, 2016

S3Proxy has this behavior due to a jclouds limitation. jclouds has a higher-level concept of MD5 and decodes a variety of headers to populate this field. Instead it should provide the opaque ETag as sent by the provider, which is not MD5 for Azure and B2. This is simple to fix but requires touching a bunch of call sites in jclouds.

jpoon commented Sep 12, 2016

Hey @andrewgaul,

I'd like to help figure out how we can fix this. Can you elaborate on the various call sites in jclouds that would need to be modified in order to return the appropriate ETag?

Contributor

ritazh commented Sep 12, 2016

Hi @andrewgaul, I think this is a good opportunity to get this fixed in jclouds once and for all. @jpoon has offered to help us 😃 Can you please help identify the files to update in jclouds? We can then work with you to get this done.

Contributor

kishore25kumar commented Sep 13, 2016

@andrewgaul Why can't we send the ETag provided by Azure directly to the client? What is the issue with the code below?

if (metadata.getETag() != null) {
    byte[] etagBytes = metadata.getETag().getBytes();
    response.addHeader(HttpHeaders.ETAG, BaseEncoding.base16().lowerCase().encode(etagBytes));
}

I understand that the ETag sent by Azure is not the MD5 of the content.
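
For illustration, here is a minimal sketch of what the snippet above would place in the header. It assumes a plain-Java stand-in for Guava's BaseEncoding.base16().lowerCase() and a made-up Azure-style ETag value; neither is taken from S3Proxy. Base16-encoding the ETag's characters yields the hex of its ASCII bytes, which is neither the provider's ETag nor an MD5.

```java
import java.nio.charset.StandardCharsets;

final class EtagEncodingSketch {
    // Plain-Java stand-in (assumption) for Guava's
    // BaseEncoding.base16().lowerCase().encode(bytes).
    static String toLowerHex(byte[] bytes) {
        StringBuilder sb = new StringBuilder(bytes.length * 2);
        for (byte b : bytes) {
            sb.append(Character.forDigit((b >> 4) & 0xF, 16));
            sb.append(Character.forDigit(b & 0xF, 16));
        }
        return sb.toString();
    }

    // Mirrors the proposed snippet: hex-encode the characters of the ETag.
    static String encodeEtag(String etag) {
        return toLowerHex(etag.getBytes(StandardCharsets.US_ASCII));
    }

    public static void main(String[] args) {
        // Hypothetical ETag value, used only for illustration.
        String providerEtag = "0x8D3DB2A1B2C3D4E";
        String encoded = encodeEtag(providerEtag);
        // The encoded value is twice as long and no longer equals the
        // provider's ETag, so conditional requests against the provider
        // could not reuse it without decoding it again.
        System.out.println(providerEtag + " -> " + encoded);
    }
}
```

Passing the provider's ETag through unchanged avoids this mismatch, at the cost that clients can no longer assume the ETag is an MD5.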

gaul closed this in bb64884 on Sep 13, 2016

Owner

gaul commented Sep 13, 2016

Sorry, my explanation was bogus; jclouds does populate the ETag, which is how operations like copy blob work. I fixed this in a slightly different way; @kishore25kumar could you please test the latest commit?

Contributor

kishore25kumar commented Sep 13, 2016

@andrewgaul No, it's not working. I am getting the error below:

java.lang.IllegalArgumentException: Input is expected to be encoded in multiple of 2 bytes but found: 17
    at com.amazonaws.util.Base16Codec.decode(Base16Codec.java:76)
    at com.amazonaws.util.Base16Lower.decode(Base16Lower.java:53)
    at com.amazonaws.util.BinaryUtils.fromHex(BinaryUtils.java:48)
    at com.amazonaws.services.s3.AmazonS3Client.getObject(AmazonS3Client.java:1334)
    at S3ProxyTest.main(S3ProxyTest.java:29)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at com.intellij.rt.execution.application.AppMain.main(AppMain.java:147)

BTW, where are we setting the Content-MD5 header?
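
The trace shows the client trying to hex-decode the ETag during getObject. A rough sketch of why that fails for a non-MD5 ETag follows. The decoder below is a simplified stand-in written for this note, not the AWS SDK's code, and the 17-character ETag is a hypothetical Azure-style value.

```java
final class StrictHexSketch {
    // Decodes a hex string into bytes; rejects odd lengths and non-hex
    // characters, as a strict base16 decoder would.
    static byte[] decodeHex(String hex) {
        if (hex.length() % 2 != 0) {
            throw new IllegalArgumentException(
                    "hex input must have an even length, found: " + hex.length());
        }
        byte[] out = new byte[hex.length() / 2];
        for (int i = 0; i < out.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException(
                        "non-hex character near offset " + (2 * i));
            }
            out[i] = (byte) ((hi << 4) | lo);
        }
        return out;
    }

    static boolean isDecodable(String value) {
        try {
            decodeHex(value);
            return true;
        } catch (IllegalArgumentException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        // A 32-character hex MD5 decodes cleanly.
        System.out.println(isDecodable("d41d8cd98f00b204e9800998ecf8427e"));
        // A hypothetical 17-character opaque ETag does not.
        System.out.println(isDecodable("0x8D3DB2A1B2C3D4E"));
    }
}
```

Any client that assumes the ETag is a hex MD5 will hit a similar failure once the proxy returns the provider's opaque ETag.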

Contributor

kishore25kumar commented Sep 13, 2016

@andrewgaul Sorry, it worked with your suggested change. I had forgotten to add System.setProperty("com.amazonaws.services.s3.disableGetObjectMD5Validation", "true"); to my Java code.

Owner

gaul commented Sep 13, 2016

To complete the discussion: Azure returns both a Content-MD5 and an ETag header, with the latter having some opaque meaning. AWS S3 returns the content MD5 (hex-encoded) as the ETag for single-part uploads. S3Proxy returns the Azure ETag as the S3 ETag instead of the MD5 because other operations, like conditional copies and gets, require this. Thus some clients need configuration, as you discovered, to operate with Azure.
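
As an alternative to disabling validation entirely, a client could check the MD5 only when the ETag looks like one. The sketch below is one possible approach under that assumption; the class and method names are hypothetical and this is not how the AWS SDK or S3Proxy behaves.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;

final class EtagValidationSketch {
    // True when the value has the shape of a hex-encoded MD5 (32 hex digits).
    static boolean looksLikeMd5(String etag) {
        return etag.matches("[0-9a-fA-F]{32}");
    }

    static String md5Hex(byte[] data) {
        try {
            MessageDigest md = MessageDigest.getInstance("MD5");
            StringBuilder sb = new StringBuilder(32);
            for (byte b : md.digest(data)) {
                sb.append(String.format("%02x", b & 0xFF));
            }
            return sb.toString();
        } catch (NoSuchAlgorithmException e) {
            // MD5 is a required algorithm on standard Java platforms.
            throw new IllegalStateException(e);
        }
    }

    // Returns false only when the ETag is MD5-shaped and does not match
    // the body; opaque ETags are accepted without a check.
    static boolean passesIntegrityCheck(String etag, byte[] body) {
        String unquoted = etag.replace("\"", "");
        if (!looksLikeMd5(unquoted)) {
            return true; // opaque ETag: nothing to compare against
        }
        return unquoted.equalsIgnoreCase(md5Hex(body));
    }

    public static void main(String[] args) {
        byte[] body = "hello".getBytes(StandardCharsets.UTF_8);
        System.out.println(passesIntegrityCheck("\"5d41402abc4b2a76b9719d911017c592\"", body));
        System.out.println(passesIntegrityCheck("\"0x8D3DB2A1B2C3D4E\"", body));
    }
}
```

The trade-off is that objects with opaque ETags get no integrity check from the ETag at all, so a separate Content-MD5 header, where the provider returns one, would have to serve that purpose.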
