[Go][Rust] Deserializing a Byte array to a schema #13853
That looks correct for retrieving the schema, but that byte payload looks incorrect. The first four bytes should be either a continuation indicator (0xFFFFFFFF) followed by the message length as a 32-bit integer, or just the message length as a 32-bit integer. In your example byte array, the first four bytes are the 32-bit integer
@alamb Can you or someone on the arrow-rs side take a look here? |
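To make the two accepted layouts concrete, here is a minimal Go sketch that inspects the start of a payload. `readPrefix` is a hypothetical helper written for this illustration, not part of the Arrow Go library, and it assumes little-endian encoding of the length word.

```go
package main

import (
	"encoding/binary"
	"errors"
	"fmt"
)

// continuationToken is the 0xFFFFFFFF marker that may precede the length.
const continuationToken = 0xFFFFFFFF

// readPrefix (hypothetical helper) returns the declared metadata length and
// whether a continuation indicator was present at the start of buf.
func readPrefix(buf []byte) (metaLen int32, hasContinuation bool, err error) {
	if len(buf) < 4 {
		return 0, false, errors.New("payload shorter than 4 bytes")
	}
	first := binary.LittleEndian.Uint32(buf)
	if first != continuationToken {
		// No continuation indicator: the first word is read as the length.
		return int32(first), false, nil
	}
	if len(buf) < 8 {
		return 0, true, errors.New("continuation indicator without a length")
	}
	return int32(binary.LittleEndian.Uint32(buf[4:])), true, nil
}

func main() {
	// First eight bytes of the payload quoted in this issue.
	head := []byte{16, 0, 0, 0, 0, 0, 10, 0}
	n, cont, err := readPrefix(head)
	fmt.Println(n, cont, err) // 16 false <nil>

	// Bytes 4..8 of the same payload, read as a little-endian integer.
	fmt.Println(binary.LittleEndian.Uint32(head[4:])) // 655360
}
```

Read this way, the payload declares a length of 16 with no continuation indicator, which is far shorter than the array that was actually received. The second value, 655360, is the same number that appears in the reported `slice bounds out of range [655360:16]` error, which is at least consistent with the reader interpreting raw FlatBuffers bytes as a length prefix.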
I tried setting up the same process in a Python script, but I don't know if I am doing this correctly either, as I don't have much Python experience.
But I get this error, which seems weird to me:
When I print the SchemaResult some object is definitely there( Is my code broken? |
This is the Rust code if anyone experienced with that decides to pitch in:
|
@agneborn98 It looks like it's an issue with the Rust server. I'm going to guess that they made a similar mistake to one I made with my original implementation of the Go Flight server/client. The original .proto file appeared to claim that the return from SchemaResult should just be the FlatBuffers Schema message on its own. But the convention that was established by the C++ and Java implementations was that it should be a full IPC encoded schema message, so my original implementations didn't interact well with those until I fixed it to be a full IPC encoded message. It's possible the Rust server implementation makes the same mistake? Not sure. But given that you're getting a similar issue from two different Flight client implementations, I'm inclined to think the issue is with the Rust server, but I'm not familiar enough with Rust to debug that. |
@agneborn98 Out of curiosity, can you provide the schema you're serializing? I want to try serializing it with Go and comparing the byte slice to what you received. |
@zeroshade I'll look into my rust implementation again, and see if I can fix it. The schema I am trying to serialize is:
|
So I tried serializing that schema myself in Go and got the following:
Note the
To prove my case I took your original bytes, appended the bytes [255 255 255 255 0 0 0 0] to the end of them, then prepended it with [255 255 255 255] and the length of the byte slice, then deserialized the schema. Here's the code and the result:
data := []byte{16, 0, 0, 0, 0, 0, 10, 0, 14, 0, 12, 0, 11, 0, 4, 0, 10, 0,
0, 0, 20, 0, 0, 0, 0, 0, 0, 1, 4, 0, 10, 0, 12, 0, 0, 0, 8, 0,
4, 0, 10, 0, 0, 0, 8, 0, 0, 0, 8, 0, 0, 0, 0, 0, 0, 0, 3, 0, 0,
0, 136, 0, 0, 0, 52, 0, 0, 0, 4, 0, 0, 0, 148, 255, 255, 255,
16, 0, 0, 0, 20, 0, 0, 0, 0, 0, 0, 3, 16, 0, 0, 0, 206, 255, 255,
255, 0, 0, 1, 0, 0, 0, 0, 0, 5, 0, 0, 0, 118, 97, 108, 117, 101,
0, 0, 0, 192, 255, 255, 255, 28, 0, 0, 0, 12, 0, 0, 0, 0, 0, 0,
10, 32, 0, 0, 0, 0, 0, 0, 0, 0, 0, 6, 0, 8, 0, 6, 0, 6, 0, 0, 0,
0, 0, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 9, 0, 0, 0, 116, 105, 109,
101, 115, 116, 97, 109, 112, 0, 0, 0, 16, 0, 20, 0, 16, 0, 0, 0,
15, 0, 4, 0, 0, 0, 8, 0, 16, 0, 0, 0, 24, 0, 0, 0, 32, 0, 0, 0, 0,
0, 0, 2, 28, 0, 0, 0, 8, 0, 12, 0, 4, 0, 11, 0, 8, 0, 0, 0, 32,
0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 3, 0, 0, 0, 116, 105, 100, 0}
data = append(data, []byte{255, 255, 255, 255, 0, 0, 0, 0}...)
v := uint32(len(data))
data = append([]byte{255, 255, 255, 255, 0, 0, 0, 0}, data...)
binary.LittleEndian.PutUint32(data[4:], v)
fmt.Println(flight.DeserializeSchema(data, memory.DefaultAllocator))
Output:
So it's definitely an issue with the message coming from the Rust server |
Case definitely proven 😄 Thanks again! |
@agneborn98 You can create an issue on https://github.com/apache/arrow-rs or file a Jira card for this if you like! |
Thank you @zeroshade. I have to check my own Rust implementation to see if there is something I screwed up before I embarrass myself posting something that was easily solvable. If the issue still persists, I'll return in a few days and post it on one of those forums. |
@zeroshade sounds like a very plausible explanation. @agneborn98 I don't fully understand all the details as I am not an IPC expert, but I think filing an issue on https://github.com/apache/arrow-rs is the correct way to make the people who are experts aware of the issue. |
@alamb I'll try debugging my rust server and see if I've missed some option when converting my schema to an IPC message. |
Why do C++ and Java return the full IPC encoded schema message for the method
Hi @zeroshade, I am from the arrow-rs community and am trying to resolve apache/arrow-rs#2445 about the I think the gap between arrow-rs and C++/Go is the different implementation for the |
@liukun4515 Slight correction: The C++/Go does not convert it to a full flight message, it just returns the full IPC Schema message bytes which can then be slotted into the flight SchemaResult protobuf. |
That means C++/Go add the prefix Is there any documentation that explains this interface |
I think we can close this issue. Thanks @zeroshade @agneborn98 |
@zeroshade |
C++/Java I think actually had this incompatibility before, but it was fixed very early on, like in 0.12.0 or something, so there wasn't any consideration of backwards compatibility. You could just try to parse it twice (checking if the stream starts with the IPC continuation token or not to see if it's even worthwhile). |
@liukun4515 Go also fixed the incompatibility early on and so didn't have any consideration of backwards compatibility. I agree with @lidavidm that you should check whether the continuation token is present and try parsing each way. |
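The "check for the continuation token, then handle each case" suggestion could be sketched in Go roughly as follows. `ensureFramed` is a hypothetical helper, not an Arrow API. It assumes that a payload without the token is a bare Message flatbuffer, as the Rust server in this thread appeared to send, and it does not handle the older length-only prefix. Whether the result is accepted by a particular reader such as `flight.DeserializeSchema` is not verified here.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// continuation is the 4-byte IPC continuation indicator.
var continuation = []byte{0xFF, 0xFF, 0xFF, 0xFF}

// ensureFramed (hypothetical helper) returns payload unchanged when it
// already starts with the continuation token. Otherwise it treats payload
// as bare metadata and prefixes it with the token and a 32-bit length,
// padding the metadata to an 8-byte boundary.
func ensureFramed(payload []byte) []byte {
	if bytes.HasPrefix(payload, continuation) {
		return payload
	}
	padded := (len(payload) + 7) &^ 7 // round up to a multiple of 8
	out := make([]byte, 8+padded)     // zero-filled, so padding bytes are 0
	copy(out, continuation)
	binary.LittleEndian.PutUint32(out[4:], uint32(padded))
	copy(out[8:], payload)
	return out
}

func main() {
	// A truncated, illustrative stand-in for bare metadata bytes.
	bare := []byte{16, 0, 0, 0, 0, 0, 10, 0, 14, 0}
	framed := ensureFramed(bare)
	fmt.Println(framed[:8], len(framed)) // [255 255 255 255 16 0 0 0] 24
}
```

Because an already framed payload passes through untouched, a client could call a helper like this on every `SchemaResult` it receives, regardless of which server implementation produced it.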
Did we file a Jira for this? We should add more methods to the integration tests to try to catch this kind of thing earlier. (Another thing that would be useful is to finally tackle ARROW-4419. C♯ exposed an issue that such a test would presumably have caught earlier.) |
Do you mean that Go only supports the format with the continuation token? |
@liukun4515 what are you referring to? The Java code uses an IPC message with the continuation token. (Note that arrow/java/flight/flight-core/src/main/java/org/apache/arrow/flight/SchemaResult.java Lines 64 to 95 in 93b63e8
|
Filed ARROW-17568 |
@lidavidm @zeroshade I think I got it. It's a good opportunity to implement integration tests for the RPC method. I can take on the integration test on the Rust side for the RPC method. |
@zeroshade Sorry to interrupt. The data I read from the Java interface is 0xffffffff 0x00000300 0x00000010 0x000a0000. Does this mean that the length of the subsequent data is 768 bytes? |
@MicroGery It means that the length of the metadata (the flatbuffer bytes + padding to an 8-byte boundary) is 768 bytes. Once you decode the flatbuffer metadata, it'll contain the length of the body data as a member, and that body should follow the 768 bytes you read. Does that make sense? You can refer to the documentation for more detail on interpreting the bytes.
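As a worked version of that arithmetic, the following Go sketch computes where the metadata ends. It assumes the quoted words were printed after little-endian decoding, so the first eight bytes on the wire are `ff ff ff ff 00 03 00 00`. `bodyOffset` is a hypothetical helper for this illustration.

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// bodyOffset (hypothetical helper) reads the 32-bit metadata length that
// follows the continuation indicator and returns it together with the
// offset at which the message body would start.
func bodyOffset(head []byte) (metaLen, bodyStart int) {
	metaLen = int(binary.LittleEndian.Uint32(head[4:8]))
	// 4 bytes of continuation indicator + 4 bytes of length, then metadata.
	return metaLen, 8 + metaLen
}

func main() {
	head := []byte{0xFF, 0xFF, 0xFF, 0xFF, 0x00, 0x03, 0x00, 0x00}
	metaLen, bodyStart := bodyOffset(head)
	fmt.Println(metaLen, bodyStart) // 768 776
}
```

So the flatbuffer metadata occupies bytes 8 through 775, and any body data described by that metadata would begin at byte 776.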
I am trying to retrieve a schema for a flight descriptor from a server written in Rust onto a client written in Go.
To do this, I am using the func DeserializeSchema to convert it from the byte array received from the server back to the arrow.Schema format used by the client. I do this by doing the following:
The Byte array I receive looks good enough, I think:
[16 0 0 0 0 0 10 0 14 0 12 0 11 0 4 0 10 0 0 0 20 0 0 0 0 0 0 1 4 0 10 0 12 0 0 0 8 0 4 0 10 0 0 0 8 0 0 0 8 0 0 0 0 0 0 0 3 0 0 0 136 0 0 0 52 0 0 0 4 0 0 0 148 255 255 255 16 0 0 0 20 0 0 0 0 0 0 3 16 0 0 0 206 255 255 255 0 0 1 0 0 0 0 0 5 0 0 0 118 97 108 117 101 0 0 0 192 255 255 255 28 0 0 0 12 0 0 0 0 0 0 10 32 0 0 0 0 0 0 0 0 0 6 0 8 0 6 0 6 0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 9 0 0 0 116 105 109 101 115 116 97 109 112 0 0 0 16 0 20 0 16 0 0 0 15 0 4 0 0 0 8 0 16 0 0 0 24 0 0 0 32 0 0 0 0 0 0 2 28 0 0 0 8 0 12 0 4 0 11 0 8 0 0 0 32 0 0 0 0 0 0 1 0 0 0 0 3 0 0 0 116 105 100 0]
Unfortunately, I get the following error:
stderr: 2022/08/11 15:21:10 arrow/ipc: unknown error while reading: runtime error: slice bounds out of range [655360:16]
Am I doing something wrong here? If so, what?
Thank you