File upload (Blob streaming) #414

Closed
labkode opened this Issue Oct 25, 2015 · 13 comments

labkode commented Oct 25, 2015

Is it possible to easily stream binary blobs with gRPC?

With an HTTP/1.1 server I can pipe the request.Body to a Writer easily.

The only way I've found to stream a big blob with gRPC is to define a chunk protobuf message

message DataChunk {
    bytes data = 1;
}

and to create a server streaming endpoint.

Then the client reads from a Reader into a small buffer, and each filled buffer is sent through the gRPC connection as a DataChunk message.

It would be great to have the same Reader/Writer easiness with gRPC streaming.

I've tried to find some info about file upload/blob upload, and the only resource I've found is this article.

It shows that file uploads are handled by HTTP and metadata handling by gRPC.

Is gRPC able/designed to handle these file uploads without using my poor approach?

Contributor

dsymonds commented Oct 25, 2015

gRPC isn't designed for arbitrary streams, only streams of messages. Your DataChunk is exactly what's supported.

labkode commented Oct 25, 2015

@dsymonds Thanks for your quick response. We can close this issue.

labkode closed this Oct 25, 2015

Contributor

c4milo commented Oct 30, 2016

So the suggestion is not to use gRPC for uploading arbitrary blobs of data? Is it better to use HTTP multipart requests?

-- Apologies for commenting on closed issues.

Contributor

vitalyisaev2 commented Dec 26, 2016

@c4milo I guess that everyone who needs to transfer big objects via gRPC will have to implement chunking on top of gRPC streaming.

Contributor

c4milo commented Mar 6, 2017

@vitalyisaev2 Do you have any suggestions on how to do this? I came up with an approach I'm not very proud of (using unsafe.Sizeof on unserialized messages and wild guesses).

Serializing beforehand in order to get the real size seems too wasteful since gRPC would encode the message again before sending.

Perhaps having a custom codec with a timer and flushing whenever it reaches a size limit or a timeout?
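For the specific case of a chunk message that carries only a bytes field, the encoded size can be worked out arithmetically without serializing anything. The sketch below assumes a message shaped like DataChunk (one bytes field with a field number from 1 to 15) and an uncompressed gRPC message; the function names are hypothetical. For arbitrary messages this shortcut does not apply, and the Go protobuf library's proto.Size function is the usual way to ask for the encoded size.

```go
package main

import "fmt"

// varintLen returns how many bytes protobuf needs to encode v as a varint
// (7 payload bits per byte).
func varintLen(v uint64) int {
	n := 1
	for v >= 0x80 {
		v >>= 7
		n++
	}
	return n
}

// chunkWireSize returns the encoded size of a message whose only populated
// field is a bytes field with a field number from 1 to 15: one tag byte,
// a varint length, then the payload. proto3 omits an empty bytes field.
func chunkWireSize(payloadLen int) int {
	if payloadLen == 0 {
		return 0
	}
	return 1 + varintLen(uint64(payloadLen)) + payloadLen
}

// grpcFrameSize adds gRPC's 5-byte message prefix (1-byte compression flag
// plus 4-byte length) to the encoded message size.
func grpcFrameSize(payloadLen int) int {
	return 5 + chunkWireSize(payloadLen)
}

func main() {
	for _, n := range []int{100, 16 * 1024, 64 * 1024} {
		fmt.Printf("payload=%d message=%d framed=%d\n",
			n, chunkWireSize(n), grpcFrameSize(n))
	}
}
```

With this, a client can pick a payload size that keeps each message under whatever limit the server enforces, rather than guessing.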

Contributor

vitalyisaev2 commented Mar 6, 2017

@c4milo I would suggest trying the following approach:

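syntax = "proto3";

service BlobKeeper {
    // Client-streaming RPC: the client sends many PutRequest messages
    // and receives a single PutResponse.
    rpc Put (stream PutRequest) returns (PutResponse);
}

message PutRequest {
    // Metadata identifying the blob.
    message Key {
        string key = 1;
    }
    // One piece of the blob and its position within it.
    message Chunk {
        bytes data = 1;
        int64 position = 2;
    }
    oneof value {
        Key key = 1;
        Chunk chunk = 2;
    }
}

// Placeholder so the file compiles; the original snippet did not define it.
message PutResponse {}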

On the client side you'll have to split your data into chunks and send them sequentially within a stream. The first message should probably carry some kind of metadata, such as the data key. On the server side you can do whatever you want: you may buffer the chunks, or you can put them on disk immediately.

One thing is for sure: you'll have to do a lot of manual work here, but it's quite straightforward.

Contributor

c4milo commented Mar 6, 2017

@vitalyisaev2 nice, I had something very similar. I'll keep using this approach then. Thank you!

cirocosta commented Mar 8, 2017

Is there any information about how people are doing this? Any libraries that already do this on top of gRPC? Multipart, as suggested?
Thx!

matiasinsaurralde commented Jul 25, 2017

@cirocosta I'm interested in this as well!

zhaohansprt commented Aug 11, 2017

I'm interested in this too.

Sicos1977 commented Aug 11, 2017

Same here

tzutalin commented Aug 17, 2017

I tried implementing chunking in Java and C++, and wrote some sample code and protobuf definitions. Hope it helps. Any feedback is appreciated.

LakeCarrot commented Feb 6, 2018

@tzutalin
Never mind the previous post.
I have two more concerns:

  1. https://github.com/tzutalin/example-grpc/blob/0bde7f0ed4a9b0961429697307be88314734cbe5/java/src/main/java/UploadFileClient.java#L74
    This line needs to be changed to
    ByteString byteString = ByteString.copyFrom(buffer, 0, tmp);
    so that only the first tmp bytes of the buffer are sent, which avoids wasting space and changing the original file size.

  2. For a file that is sent as multiple chunks, we need to wait a while after sending each chunk; otherwise the server will refuse to process the request.

lock bot locked as resolved and limited conversation to collaborators Sep 26, 2018
