
Require that early-return compressed-blobs bytestream uploads set committed_size -1 #213

Conversation

@mostynb (Contributor) commented Jan 30, 2022

We require that uncompressed bytestream uploads set committed_size to the size of the blob when returning early (i.e. when the blob already exists on the server).

We also require that for compressed bytestream uploads, committed_size refers to the initial write offset plus the number of compressed bytes uploaded. But if the server wants to return early in this case, it doesn't know how many compressed bytes would have been uploaded (the client might not know this ahead of time either). So let's require that the server set committed_size to -1 in this case.

For early return to work, we also need to ensure that the server does not return an error code.
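
To illustrate the proposed rule, here is a minimal server-side sketch in Go; the helper names (parsedResource, maybeEarlyReturn, blobExists) are hypothetical, not taken from any real server:

```go
package server

import (
	repb "github.com/bazelbuild/remote-apis/build/bazel/remote/execution/v2"
	bspb "google.golang.org/genproto/googleapis/bytestream"
)

// parsedResource is a hypothetical stand-in for whatever the server
// parses out of the ByteStream upload resource name.
type parsedResource struct {
	Digest     *repb.Digest
	Compressor repb.Compressor_Value
}

// maybeEarlyReturn decides what to send when the blob already exists.
// blobExists is assumed to be provided by the server's storage layer.
func maybeEarlyReturn(r parsedResource, blobExists func(*repb.Digest) bool) (*bspb.WriteResponse, bool) {
	if !blobExists(r.Digest) {
		return nil, false // No early return; consume the upload as usual.
	}
	if r.Compressor == repb.Compressor_IDENTITY {
		// Uncompressed upload: committed_size is the full blob size.
		return &bspb.WriteResponse{CommittedSize: r.Digest.SizeBytes}, true
	}
	// Compressed upload: the server cannot know how many compressed bytes
	// would have been sent, so it reports the -1 sentinel. Crucially, this
	// response is sent with an OK status, not an error code.
	return &bspb.WriteResponse{CommittedSize: -1}, true
}
```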

Resolves #212.

@bduffany commented Jan 31, 2022

IIUC, based on the reasoning in #212 (comment) we don't want to return AlreadyExists in this case because it is backwards incompatible with existing clients (notably, all versions of Bazel that currently exist in the wild).

But it seems like this approach of returning -1 is also incompatible, no? Clients currently treat committed_size == -1 as an error. Bazel, in particular, effectively checks for committed_size == total length of the client-compressed stream whenever the server has sent back a WriteResponse and Bazel has finished sending all bytestream chunks to the server, irrespective of whether the server sent back the WriteResponse early or not. If this check fails, Bazel returns the following error: https://github.com/bazelbuild/bazel/blob/master/src/main/java/com/google/devtools/build/lib/remote/ByteStreamUploader.java#L474
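
For concreteness, a rough Go paraphrase of that check (the real code is Java, in Bazel's ByteStreamUploader linked above; all names here are illustrative):

```go
package client

import (
	"fmt"

	bspb "google.golang.org/genproto/googleapis/bytestream"
)

// checkCommittedSize mirrors the check described above. bytesSent is the
// total length of the client-compressed stream.
func checkCommittedSize(resp *bspb.WriteResponse, bytesSent int64) error {
	if resp.CommittedSize != bytesSent {
		// A -1 sentinel (or any early return partway through the stream)
		// trips this check, even when the server considers the upload done.
		return fmt.Errorf("write incomplete: committed_size %d, expected %d",
			resp.CommittedSize, bytesSent)
	}
	return nil
}
```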

@bduffany commented Jan 31, 2022

Just spitballing here, but another option might be to make use of the new /optional_metadata field that was added to the ByteStream resource name specification, and allow the client to specify that they support short-circuiting, maybe like /?shortCircuit=true. If they do, then the server can safely return AlreadyExists.
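
For example, such an upload resource name might look like the following (purely hypothetical; the spec only reserves the optional_metadata segment and doesn't define any keys):

```
{instance_name}/uploads/{uuid}/compressed-blobs/zstd/{uncompressed_hash}/{uncompressed_size}/?shortCircuit=true
```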

@mostynb (Contributor, Author) commented Jan 31, 2022

To clarify, I think we should maintain backwards compatibility for uncompressed uploads (i.e. clients can check for committed_size == digest size), but we need to make a breaking change for compressed uploads, because some existing clients are always going to be broken in the current setup unless we forbid servers from doing an early return for compressed uploads (which doesn't sound great to me).

This feature is still experimental in bazel, so IMO we can decide on a reasonable behaviour that supports early-return for compressed uploads without changing the requirements for uncompressed uploads, and then submit a fix for bazel. I think specifying -1 for early-returned compressed uploads is simple, but I would also be ok with returning AlreadyExists (though it would be a slightly larger bazel-side fix).

Re using the bytestream resource metadata field, it's generally disliked (and anyway it would require a client-side fix, in which case we can just as well pick another client-side fix).

@mostynb (Contributor, Author) commented Feb 4, 2022

CC @AlessandroPatti, who implemented this functionality in Bazel; it looks like you tried to account for this scenario. Maybe there's a corner case where the uploader sends all the data but the server "returns early" in the last chunk? What do you think about the server returning -1 in this case (or alternatively returning an AlreadyExists error)?

@bergsieker (Collaborator)

My first impulse here is to make compressed and uncompressed uploads match. Both should return the total number of uncompressed bytes captured in the blob (effectively, the size field in the blob's digest). This means that the server must uncompress the uploaded data to calculate the size, but servers must already do this to validate the digest. Of course, then we get back to the issue of returning compressed vs. uncompressed bytes that plagues our implementation of Bytestream.Write.

I think that if we're going to make a breaking change--which it does indeed seem like this situation requires--we should aim for the "correct" semantic of returning AlreadyExists.

@EricBurnett for any relevant context on the history of why we didn't do this in the first place.

@coeuvre for review on the Bazel side.

@bergsieker bergsieker removed the request for review from buchgr February 7, 2022 22:46
@EricBurnett (Collaborator) commented Feb 7, 2022 via email

@mostynb (Contributor, Author) commented Feb 7, 2022

> My first impulse here is to make compressed and uncompressed uploads match. Both should return the total number of uncompressed bytes captured in the blob (effectively, the size field in the blob's digest). This means that the server must uncompress the uploaded data to calculate the size, but servers must already do this to validate the digest. Of course, then we get back to the issue of returning compressed vs. uncompressed bytes that plagues our implementation of Bytestream.Write.

I don't think there is necessarily a mapping from "N compressed bytes transferred" to "M uncompressed bytes transferred" for arbitrary compression formats, unless you allow for possibly confusing scenarios like "the client wrote a bunch of compressed data successfully, but not enough for the server to decode another frame, so the number of uncompressed bytes written did not increase". This is why we chose a kind of fake definition of committed_size for compressed blobs: initial uncompressed offset + compressed bytes written (so, for example, resuming at uncompressed offset 1024 and then sending 700 compressed bytes gives committed_size 1724).

> +1 for setting Committed Size to the uncompressed size of the blob for early returns. That also has the advantage of matching QueryWriteStatusResponse, which should be returning the uncompressed size for a complete blob irrespective of encoding.

The downside of that approach is that committed_size might decrease during a successful upload, which feels super confusing (e.g. it increases monotonically as expected, for zstd-encoded data which turns out to be larger than the uncompressed data, and then the server does an early return near the end of the transfer).

@bergsieker (Collaborator)

> > +1 for setting Committed Size to the uncompressed size of the blob for early returns. That also has the advantage of matching QueryWriteStatusResponse, which should be returning the uncompressed size for a complete blob irrespective of encoding.
>
> The downside of that approach is that committed_size might decrease during a successful upload, which feels super confusing (e.g. it increases monotonically as expected, for zstd-encoded data which turns out to be larger than the uncompressed data, and then the server does an early return near the end of the transfer).

Hmmm, looking at this again, I think that returning uncompressed size contradicts the general guidance that we've given that committed_bytes should be (uncompressed offset) + (sum of compressed data bundles). So in that case successful, full compressed uploads should return the compressed data size, while successful, early-terminated compressed uploads would return the uncompressed data size. That makes no sense.

Side note: I thought we'd settled on returning the uncompressed data size for compressed uploads, for the reason that Eric mentioned--it's the only value that makes sense to provide as an offset to future calls. Am I misremembering?

I continue to believe that returning ALREADY_EXISTS for early termination of compressed uploads is the right approach. I think the only reason we didn't do that for uncompressed uploads is that it would have been a breaking change for Bazel, but if Bazel support for compressed uploads is experimental and we're OK with breaking it, we might as well break it in the right way.

@mostynb (Contributor, Author) commented Feb 8, 2022

> Side note: I thought we'd settled on returning the uncompressed data size for compressed uploads, for the reason that Eric mentioned--it's the only value that makes sense to provide as an offset to future calls. Am I misremembering?

We decided on this mostly useless value in @nodirg's PR #193 but we missed the early-return corner case.

As I see it we probably have two kinds of clients that support compressed-blobs bytestream uploads:

  1. Clients like Bazel which perform the wrong committed_size check and will randomly consider early-returned compressed-blobs uploads as failed. These will need updating regardless of whether we switch to an AlreadyExists error or a -1 sentinel value (the sketch after this list shows the corrected check).

  2. Clients which ignore the committed_size check, but will consider any bytestream upload as failed if any non-EOF error is returned (since we specify no other behaviour for error values, and streaming gRPC is underdocumented IMO). These work now and will continue working if we use a -1 sentinel value, but will break if servers change to use an AlreadyExists error.
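
To make category 1 concrete, here is a sketch in Go of the check such clients would need to adopt under the -1 proposal (names are illustrative; Bazel's real fix would live in ByteStreamUploader.java):

```go
package client

import bspb "google.golang.org/genproto/googleapis/bytestream"

// uploadSucceeded decides whether a finished Write call succeeded, assuming
// bytesSent is the number of bytes the client actually wrote on the stream
// and blobSize is the uncompressed size from the digest.
func uploadSucceeded(resp *bspb.WriteResponse, compressed bool, bytesSent, blobSize int64) bool {
	if compressed {
		if resp.CommittedSize == -1 {
			// Early return: the blob already exists on the server.
			return true
		}
		// Full upload: committed_size counts the initial uncompressed
		// write offset (zero here) plus compressed bytes written.
		return resp.CommittedSize == bytesSent
	}
	// Uncompressed uploads, early return or not: committed_size must
	// equal the blob size from the digest.
	return resp.CommittedSize == blobSize
}
```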

@bergsieker (Collaborator)

Per comments in the meeting: we believe that taking the less-disruptive course of returning -1 is a good idea here. The semantics of compressed vs. uncompressed bytes are already a mess, but we can't meaningfully clean them up until we move away from Bytestream.

Note that returning -1 still breaks Bazel's (experimental) support for compressed uploads.

@bergsieker (Collaborator)

Adding Sven for approval from Bazel before merging. He can stand in for Chi, who is having trouble with his GitHub account.

@AlessandroPatti

I see there has been a consensus towards using -1, so I might be late to the party. FWIW, I'd also lean towards using AlreadyExists, which seems the most semantically correct solution here.

> Re using the bytestream resource metadata field, it's generally disliked

@mostynb Could you elaborate on why that is? It seems like it could allow using AlreadyExists while keeping backward compatibility with existing clients.

@mostynb (Contributor, Author) commented Feb 10, 2022

> @mostynb Could you elaborate on why that is? It seems like it could allow using AlreadyExists while keeping backward compatibility with existing clients.

Using the bytestream metadata came up early in the compressed-blobs feature planning discussions for other purposes, but IIRC it was considered too free-form. We don't know what, if anything, existing clients use it for, so we can't rule out causing conflicts.

Another point against using the AlreadyExists error code came up in the meeting: apparently it was used in an early (pre-v1?) version of the spec and was removed for a reason that nobody can quite remember, but one that was significant enough at the time to justify a breaking change.

mostynb added a commit to mostynb/bazel-remote that referenced this pull request Feb 19, 2022
This is an implementation of this REAPI spec update:
bazelbuild/remote-apis#213

Which is part of the solution to this issue:
bazelbuild/bazel#14654
mostynb added a commit to mostynb/bazel that referenced this pull request Feb 19, 2022
This is an implementation of this REAPI spec update:
bazelbuild/remote-apis#213

Here's a bazel-remote build that can be used to test this change:
buchgr/bazel-remote#527

Fixes bazelbuild#14654
@mostynb (Contributor, Author) commented Feb 19, 2022

Here's a bazel-remote PR that can be used to test this change: buchgr/bazel-remote#527
And a bazel PR: bazelbuild/bazel#14870

@coeuvre (Member) left a comment

Sorry for the delay. My GitHub account was suspended for no reason, so I lost all the notifications during that period.

LGTM from Bazel side.

mostynb added a commit to mostynb/bazel that referenced this pull request Feb 21, 2022
This is an implementation of this REAPI spec update:
bazelbuild/remote-apis#213

Here's a bazel-remote build that can be used to test this change:
buchgr/bazel-remote#527

Fixes bazelbuild#14654
bazel-io pushed a commit to bazelbuild/bazel that referenced this pull request Feb 22, 2022
This is an implementation of this REAPI spec update:
bazelbuild/remote-apis#213

Here's a bazel-remote build that can be used to test this change:
buchgr/bazel-remote#527

Fixes #14654

Closes #14870.

PiperOrigin-RevId: 430167812
brentleyjones pushed a commit to brentleyjones/bazel that referenced this pull request Feb 22, 2022
This is an implementation of this REAPI spec update:
bazelbuild/remote-apis#213

Here's a bazel-remote build that can be used to test this change:
buchgr/bazel-remote#527

Fixes bazelbuild#14654

Closes bazelbuild#14870.

PiperOrigin-RevId: 430167812
(cherry picked from commit d184e48)
Wyverald pushed a commit to bazelbuild/bazel that referenced this pull request Feb 22, 2022
This is an implementation of this REAPI spec update:
bazelbuild/remote-apis#213

Here's a bazel-remote build that can be used to test this change:
buchgr/bazel-remote#527

Fixes #14654

Closes #14870.

PiperOrigin-RevId: 430167812
(cherry picked from commit d184e48)

Co-authored-by: Mostyn Bramley-Moore <mostyn@antipode.se>
@mostynb (Contributor, Author) commented Feb 23, 2022

The fix has landed in Bazel now. Any objections to merging this so I can make a new bazel-remote release?

@bergsieker bergsieker merged commit 04784f4 into bazelbuild:main Feb 23, 2022
@mostynb mostynb deleted the clarify_compressed_bytestream_write_committed_size branch February 23, 2022 18:37
mostynb added a commit to buchgr/bazel-remote that referenced this pull request Feb 23, 2022
This is an implementation of this REAPI spec update:
bazelbuild/remote-apis#213

Which is part of the solution to this issue:
bazelbuild/bazel#14654