Output all rows as single list in Protobuf format #16436
Comments
I would like to work on this |
Is there any new progress on this issue? @akuzm If not, could you assign this issue to me : ) I would like to work on it. |
Sure, your help would be much appreciated! |
I found that the gRPC interface may already support multiple rows. Tested with grpcurl:
Response:
We can decode the output base64 string:
It seems that the issue has already been resolved. Is that right? @akuzm |
Hi @akuzm, as I said above, the multi-row format has already been implemented in gRPC: all rows are included in a single bytes field. Don't you mean to say that we can split the output bytes array into |
@An-DJ We want to have this option at the format level. |
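For illustration, here is a minimal sketch of splitting such a bytes field back into per-row messages. It assumes the field carries rows in the length-delimited layout (each message preceded by its size as a varint) and that the value arrives base64-encoded, as in the grpcurl output. The function names and the sample payload are hypothetical and not taken from the actual response.

```python
import base64
from typing import List, Tuple


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Decode a base-128 varint starting at pos; return (value, next_pos)."""
    result = 0
    shift = 0
    while True:
        if pos >= len(buf):
            raise ValueError("truncated varint")
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:  # high bit clear: last byte of the varint
            return result, pos
        shift += 7


def split_delimited(buf: bytes) -> List[bytes]:
    """Split a length-delimited stream into the raw bytes of each message."""
    messages = []
    pos = 0
    while pos < len(buf):
        size, pos = read_varint(buf, pos)
        end = pos + size
        if end > len(buf):
            raise ValueError("truncated message")
        messages.append(buf[pos:end])
        pos = end
    return messages


# Hypothetical payload: two row messages of 2 and 3 bytes, each size-prefixed.
raw = bytes([2, 0x08, 0x01, 3, 0x08, 0xAC, 0x02])
encoded = base64.b64encode(raw).decode("ascii")  # what a JSON client would show
rows = split_delimited(base64.b64decode(encoded))
```

Each element of `rows` would then be parsed with the row message type from the schema.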
Robert Schulze will be doing it. |
Created a GitHub account in the meantime. Feel free to assign to me. |
Just for my understanding: The docs for the Protobuf input/output format state that the schemafile for the current format Protobuf looks like this:
The table will be serialized as a sequence of messages (one per row). Because each message is prefixed with its byte size (as a varint, i.e. the "length-delimited" format), space is wasted unnecessarily. This becomes more painful as the ratio of table rows to columns grows. So the goal would be to add a format ProtobufList with a schemafile
which produces a list of rows within a single message, see akuzm's first comment. We would save the repeated per-message size prefix. However, protobuf would still need to delimit the individual Rows in the serialized representation, and as far as I understand, this is done using a standard 1-byte key before each Row which encodes the field id (1 for Row) and the wire type (I am not sure what is used for composite structures like this one, perhaps "Start Group" as per the protobuf encoding documentation). As a result, the space savings will be less significant than desired. So my question is whether the format proposed above is what you had in mind, or something else? |
The schema for |
Currently Protobuf outputs each row as a separate message, but some consumers require a list. We could add a new format ProtobufList that does that.