Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PutObject streams with unknown size use streaming V4. #750

Merged
merged 1 commit into from Jul 18, 2017

Conversation

harshavardhana
Copy link
Member

Currently we used to fall back and use non-streaming
version but we can instead just use the streaming
version instead.

Does not fix any apparent bug, but it just allows
for avoiding any checksum calculation for large buffers.

api.go Outdated
@@ -700,7 +700,8 @@ func (c Client) newRequest(method string, metadata requestMetadata) (req *http.R
case signerType.IsV2():
// Add signature version '2' authorization header.
req = s3signer.SignV2(*req, accessKeyID, secretAccessKey)
case signerType.IsStreamingV4() && method == "PUT":
case metadata.objectName != "" && method == "PUT":
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This change has the following user-visible impact.

  1. signature type set in configuration or overridden via overrideSignerType` is not honored, for all PUT requests on object resources. This may confuse existing users.

  2. removes some performance advantages that were available to users previously.
    E.g, user uploading an object of known size over HTTPS would like to benefit from not having to compute sha256 sum of the payload. It looks like this request will be signed using streaming signature forcing sha256 sum computation for every 'chunk'.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct we just default to sha256 anyways.. This is faster than actually doing md5sum with https in the benchmarks.. Defaulting to streaming was already done for all PUT operations, this change just goes one step further and enables that also for situation where size of the object is not known.

Streaming signature is a subset of V4 not a different auth mechanism in itself . That is why the differentiation is removed and implemented in such a way that .. if V4 is configured and object is being uploaded they all default to streaming signature implementation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Correct we just default to sha256 anyways.. This is faster than actually doing md5sum with https in the benchmarks..

How does the performance of streaming signature compare with not computing sha256 sum of payload when object is being uploaded over a HTTPS connection using signature v4? Is streaming signature based upload (which computes sha256 sum of every chunk) faster than UNSIGNED PAYLOAD using signature v4 over HTTPS?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Even with unsigned payload we calculate content-md5 . That is what slows it down.. but you are right I can change that for secure connection can just skip this as well.

@harshavardhana harshavardhana force-pushed the avoid-mutation branch 2 times, most recently from 5f97bac to fdfe15c Compare July 11, 2017 18:44
Copy link
Member

@krisis krisis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The code changes look good to me. I have a minor (yet important) comment w.r.t a comment.

api.go Outdated
@@ -700,7 +700,11 @@ func (c Client) newRequest(method string, metadata requestMetadata) (req *http.R
case signerType.IsV2():
// Add signature version '2' authorization header.
req = s3signer.SignV2(*req, accessKeyID, secretAccessKey)
case signerType.IsStreamingV4() && method == "PUT":
case metadata.objectName != "" && method == "PUT" && metadata.customHeader.Get("X-Amz-Copy-Source") == "" && !c.secure:
// Following condition is only invoked when objectName is set,
Copy link
Member

@krisis krisis Jul 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This comment describes the low-level details which are available in the code. We could describe the same at a higher-level as,
// Streaming signature is used by default for a PUT object request on a HTTP object storage server.
In a higher-level description the reader is required to make the connection between the goal and the case expression that represents it. This seems reasonable to me. Thoughts?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes.. done.

@deekoder deekoder requested review from vadmeste and removed request for donatello July 14, 2017 21:43
@@ -700,7 +700,8 @@ func (c Client) newRequest(method string, metadata requestMetadata) (req *http.R
case signerType.IsV2():
// Add signature version '2' authorization header.
req = s3signer.SignV2(*req, accessKeyID, secretAccessKey)
case signerType.IsStreamingV4() && method == "PUT":
case metadata.objectName != "" && method == "PUT" && metadata.customHeader.Get("X-Amz-Copy-Source") == "" && !c.secure:
Copy link
Member

@vadmeste vadmeste Jul 14, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@harshavardhana, can you add a comment to explain why we check c.secure ?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Because we don't want to do TLS sha256.

@vadmeste
Copy link
Member

Looks like minio-go tests are not passing here:

--- FAIL: TestCorePutObject (5.90s)
        core_test.go:354: Error expected: nil, got:  <nil>

Could be not related to this PR.

@harshavardhana
Copy link
Member Author

Looks like minio-go tests are not passing here:

--- FAIL: TestCorePutObject (5.90s)
core_test.go:354: Error expected: nil, got:
Could be not related to this PR.

This looks like some other problem. There shouldn't be an error here.

Currently we used to fall back and use non-streaming
version but we can instead just use the streaming
version instead.

Does not fix any apparent bug, but it just allows
for avoiding any checksum calculation for large buffers.
@harshavardhana
Copy link
Member Author

@vadmeste can you take a look again the PR is updated.

if size < minPartSize {
return c.putObjectNoChecksum(bucketName, objectName, reader, size, metadata, progress)
}

// For all sizes greater than 64MiB do multipart.
return c.putObjectMultipartStream(bucketName, objectName, reader, size, metadata, progress)
}

func (c Client) putObjectMultipartStreamNoLength(bucketName, objectName string, reader io.Reader, size int64,
Copy link
Member

@vadmeste vadmeste Jul 16, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry for the delay. I don't understand one thing: why are we passing size parameter here though this function name is "putObjectMultipartStreamNoLength" (nolength). Besides, this function seems to be called only when size < 0.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess we can change that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be done in a later PR which refactors PutObject and get this merged ?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can this be done in a later PR which refactors PutObject and get this merged ?

Sure

Copy link
Member

@vadmeste vadmeste left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@vadmeste vadmeste merged commit 75c1ccd into minio:master Jul 18, 2017
@harshavardhana harshavardhana deleted the avoid-mutation branch July 18, 2017 16:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants