
Compression: Further specify ByteStream WriteResponse committed_size field for compressed blobs #212

Closed
bduffany opened this issue Jan 27, 2022 · 1 comment · Fixed by #213

bduffany commented Jan 27, 2022

Context: bazelbuild/bazel#14654

There is an ambiguity in how the ByteStream.Write protocol is currently specified for compressed blobs. The relevant part of the spec is here:

// When attempting an upload, if another client has already completed the upload
// (which may occur in the middle of a single upload if another client uploads
// the same blob concurrently), the request will terminate immediately with
// a response whose `committed_size` is the full size of the uploaded file
// (regardless of how much data was transmitted by the client). If the client
// completes the upload but the
// [Digest][build.bazel.remote.execution.v2.Digest] does not match, an
// `INVALID_ARGUMENT` error will be returned. In either case, the client should
// not attempt to retry the upload.

The ambiguity is: what does "full size of the uploaded file" mean — the compressed size or the uncompressed size? If it means the compressed size, it is not useful information to return in the response, since the compressed size can vary depending on the compression level used by the other client that uploaded the file.

Note also that Bazel 5.0.0 effectively expects this to match the compressed size that it uploaded. Whether or not this is a bug, it means that servers have to wait for Bazel 5.0.0 to upload all chunks before they can reply with a committed_size that Bazel 5.0.0 will be happy with, effectively preventing the "early-exit" strategy when an object already exists in the cache.

From @mostynb on that thread:

Maybe the best we can do is update the REAPI spec to advise clients to ignore committed_size for compressed writes and to rely on the error status instead in that case?

Actually, I'm not sure how useful the early-exit mechanism is in practice. As you mentioned, the client calls Send until it thinks it has sent all the data, and only then calls CloseAndRecv to get the WriteResponse (at least in the Go bindings). At that point the client has already sent all the data even if the server decided to return early. So instead of returning early, the server could discard all the received data, count how much compressed data was sent, and return that number. So maybe we should instead update the REAPI spec to advise servers to do that for compressed-blobs writes instead of returning early?

This workaround of discarding all received data will work, but it's unclear how significant a performance impact it would have in practice compared to early-exit. Hoping that some folks with REAPI experience can chime in.
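The discard-and-count approach described above could be sketched roughly as follows. This is an illustrative sketch, not real server code: chunkStream and drainAndCount are hypothetical stand-ins for the generated bytestream.ByteStream_WriteServer stream and a server-side helper.

```go
package main

import (
	"fmt"
	"io"
)

// chunkStream mimics the Recv side of a ByteStream_WriteServer stream:
// it yields WriteRequest data chunks until the client is done.
// (Hypothetical stand-in for the generated gRPC type.)
type chunkStream struct {
	chunks [][]byte
}

func (s *chunkStream) Recv() ([]byte, error) {
	if len(s.chunks) == 0 {
		return nil, io.EOF
	}
	c := s.chunks[0]
	s.chunks = s.chunks[1:]
	return c, nil
}

// drainAndCount discards incoming compressed chunks, tallying only how many
// bytes the client sent, so the server can reply with a committed_size the
// client will accept even though the blob already exists.
func drainAndCount(initialOffset int64, s *chunkStream) (int64, error) {
	committed := initialOffset
	for {
		chunk, err := s.Recv()
		if err == io.EOF {
			return committed, nil
		}
		if err != nil {
			return 0, err
		}
		committed += int64(len(chunk)) // count, then discard
	}
}

func main() {
	s := &chunkStream{chunks: [][]byte{[]byte("abcd"), []byte("efg")}}
	n, _ := drainAndCount(0, s)
	fmt.Println(n) // 7
}
```

The cost of this approach is exactly what the comment worries about: the server keeps reading bytes it has no use for, trading bandwidth for an unambiguous committed_size.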

mostynb commented Jan 30, 2022

I dug into this a bit, and early-return can work if the following happens (referring to the generated Go bindings):

In the server's Write(srv bytestream.ByteStream_WriteServer) method:

  1. Immediately detect that the blob already exists (in this example, but it could also be detected later).
  2. Call srv.SendAndClose(&resp) with a non-nil *WriteResponse as described (with committed_size set to some value that the client will check).
  3. Return a nil error.
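The three server-side steps above could be sketched like this. It is a hypothetical sketch: writeResponse, fakeWriteServer, and handleWrite stand in for the generated bytestream types and a real Write handler.

```go
package main

import "fmt"

// writeResponse stands in for the generated WriteResponse message.
type writeResponse struct{ CommittedSize int64 }

// fakeWriteServer mimics the SendAndClose side of ByteStream_WriteServer.
type fakeWriteServer struct{ sent *writeResponse }

func (s *fakeWriteServer) SendAndClose(r *writeResponse) error {
	s.sent = r
	return nil
}

// handleWrite sketches the early-return path: the blob already exists, so
// the server replies immediately and, crucially, returns a nil error so
// that the client's CloseAndRecv still receives the WriteResponse.
func handleWrite(srv *fakeWriteServer, blobExists bool, fullSize int64) error {
	if blobExists {
		// Step 2: reply with a committed_size the client will check.
		if err := srv.SendAndClose(&writeResponse{CommittedSize: fullSize}); err != nil {
			return err
		}
		return nil // Step 3: nil error, NOT a gRPC AlreadyExists status.
	}
	// ... normal upload path elided ...
	return nil
}

func main() {
	srv := &fakeWriteServer{}
	_ = handleWrite(srv, true, 1234)
	fmt.Println(srv.sent.CommittedSize) // 1234
}
```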

On the client side:

  1. Create the ByteStream_WriteClient.
  2. Send the first chunk of a blob that already exists, get nil error.
  3. Send the second chunk of a blob that already exists, get io.EOF (which signifies that the server stopped listening).
  4. Call CloseAndRecv(), receive the *WriteResponse that the server specified in step (2) and a nil error.
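The client-side sequence above could be sketched as follows, with fakeWriteClient as a hypothetical stand-in for the generated ByteStream_WriteClient: Send reports io.EOF once the server has stopped listening, and CloseAndRecv then yields the server's response.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// fakeWriteClient mimics the generated ByteStream_WriteClient after an
// early server return: Send starts reporting io.EOF, and CloseAndRecv
// yields the committed_size the server set. (Hypothetical stand-in.)
type fakeWriteClient struct {
	sendsBeforeEOF int
	committedSize  int64
}

func (c *fakeWriteClient) Send(chunk []byte) error {
	if c.sendsBeforeEOF == 0 {
		return io.EOF // server stopped listening (early return)
	}
	c.sendsBeforeEOF--
	return nil
}

func (c *fakeWriteClient) CloseAndRecv() (int64, error) {
	return c.committedSize, nil
}

// upload sends chunks until done or until the server closes the stream
// early, then fetches committed_size from the WriteResponse.
func upload(c *fakeWriteClient, chunks [][]byte) (int64, error) {
	for _, chunk := range chunks {
		err := c.Send(chunk)
		if errors.Is(err, io.EOF) {
			break // step 3: server returned early; get the response below
		}
		if err != nil {
			return 0, err
		}
	}
	return c.CloseAndRecv() // step 4
}

func main() {
	c := &fakeWriteClient{sendsBeforeEOF: 1, committedSize: 42}
	n, _ := upload(c, [][]byte{[]byte("c1"), []byte("c2"), []byte("c3")})
	fmt.Println(n) // 42
}
```

The key point of the sketch is that io.EOF from Send is not treated as a failure; it just means the response must be fetched via CloseAndRecv.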

Note that in step (3) of the server, if we return a non-nil error like the gRPC AlreadyExists code instead of nil, then in step (4) of the client CloseAndRecv() returns a nil *WriteResponse and the error set by the server (e.g. AlreadyExists). I think this makes more sense, but changing to this behaviour would break existing clients, so I don't think we should do that. Instead, I think we should state explicitly in the quoted remote_execution.proto block that the server's step (3) should return a nil error.

Then I think we should require the server to set a placeholder committed_size value of -1 for duplicated compressed-blobs, and clients would have an unambiguous value to check for and still be able to support early returns.
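Under that proposal, the client-side check becomes trivial. A minimal sketch (checkCommittedSize is an illustrative name, not part of any API; the -1 placeholder is the proposal in this thread, not yet spec text at this point):

```go
package main

import "fmt"

// checkCommittedSize validates a WriteResponse's committed_size under the
// proposed rule: for compressed-blobs uploads the server may return -1 as
// an unambiguous "blob already exists" placeholder; otherwise the value
// must match what the client expects to have committed.
func checkCommittedSize(compressed bool, committed, expected int64) bool {
	if compressed && committed == -1 {
		return true // early return: blob already existed on the server
	}
	return committed == expected
}

func main() {
	fmt.Println(checkCommittedSize(true, -1, 1234))   // true: placeholder
	fmt.Println(checkCommittedSize(false, 1234, 1234)) // true: exact match
	fmt.Println(checkCommittedSize(true, 999, 1234))   // false: mismatch
}
```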

mostynb added a commit to mostynb/remote-apis that referenced this issue Jan 30, 2022
…mitted_size -1

We require that uncompressed bytestream uploads specify committed_size
set to the size of the blob when returning early (if the blob already
exists on the server).

We also require that for compressed bytestream uploads committed_size
refers to the initial write offset plus the number of compressed bytes
uploaded. But if the server wants to return early in this case it doesn't
know how many compressed bytes would have been uploaded (the client might
not know this ahead of time either). So let's require that the server
set committed_size to -1 in this case.

For early return to work, we also need to ensure that the server does
not return an error code.

Resolves bazelbuild#212.
bergsieker pushed a commit that referenced this issue Feb 23, 2022
…mitted_size -1 (#213)
