Encode_into might not write data at the intended offset location if buffer is too small. #697
Comments
This is the currently intended (or at least not unintended) behavior. In my mind there are 3 main use cases for `encode_into`:
1. Reusing one buffer across messages. In this case the offset is 0 (the default).
```python
buffer = bytearray(1024)
for msg in msgs:
    enc.encode_into(msg, buffer)
    socket.sendall(buffer)  # send the encoded bytes, not the original object
```
2. Writing a message after a fixed-size prefix, such as a length prefix. In this case the offset is a small hardcoded number and the buffer is allocated larger than it.
```python
buffer = bytearray(64)
encoder.encode_into(msg, buffer, 4)  # offset for prefix, but 4 is less than 64 and both are hardcoded sizes (not data dependent)
n = len(buffer) - 4  # the buffer is truncated to the end of the message, so this is the message length
buffer[:4] = n.to_bytes(4, "big")
socket.sendall(buffer)
```
3. Accumulating several writes into the same buffer, perhaps with a delimiter. In this case you're always writing to the end of the buffer, so offset is -1
```python
buffer = bytearray()
for msg in msgs:
    encoder.encode_into(msg, buffer, -1)
    buffer.extend(b"\n")  # perhaps we're writing line-delimited json
```
That all said, I can see how the current behavior would be surprising, and would be open to changing it to automatically grow the buffer to support the specified offset.

Out of curiosity, what was your use case that led to finding this issue?
The use case was actually case 2.
I just forgot to add the size on one of my buffers. Python has no problem creating a bytearray with or without a size, and since both just work, it was quite hard to pinpoint where my message got "corrupted". A mention of this in the docs would've saved me some time here.

For context, I'm writing an ASGI RPC app (and client) using msgspec for serialisation.
Hmmm, in that case maybe we should error instead if the offset is past the end of the buffer?
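Until the library settles on a behaviour, a guard along these lines can be sketched in user code. `checked_encode_into` is a hypothetical helper, not part of msgspec, and it assumes the condition to reject is `offset > len(buffer)`. It accepts any callable with the `encode_into(msg, buffer, offset)` shape.

```python
def checked_encode_into(encode_into, msg, buffer, offset=0):
    """Call encode_into only if offset lies within the buffer.

    Hypothetical helper, not a msgspec API. Negative offsets are passed
    through unchanged, since -1 is used in this thread to mean "append".
    """
    if offset > len(buffer):
        raise ValueError(
            f"offset {offset} is past the end of the buffer (len={len(buffer)})"
        )
    encode_into(msg, buffer, offset)
```

With a real encoder this would be called as `checked_encode_into(encoder.encode_into, msg, buffer, 4)`, which would have turned the forgotten buffer size from case 2 into an immediate exception.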
I can see benefits for both options. Expanding the buffer will "just work" and I can't really see a use case where you might not want that to happen, especially since
On the other hand, going by the Zen of Python:
and
If anyone ever has a use case where they don't want this expanding to happen, it might be hard to figure out what is going on.
Description
When encoding into a buffer at an offset that is larger than the buffer size, the msgpack `Encoder.encode_into()` function will start writing at `offset = -1` instead.

Example:
Notice it started at offset 2 instead of offset 4.

Expected behaviour:
`encode_into` starts writing at the requested offset, no matter how large the buffer currently is.

Note
Just mentioning this in the docs as intended behaviour is fine for me as long as it's documented somewhere.
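The difference between the reported and the expected behaviour can be modelled in plain Python. Both functions below are hypothetical stand-ins that operate on an already-encoded payload; they are not msgspec's implementation. `write_current` follows the description above, where an offset past the end of the buffer is treated like `-1` (append). `write_expected` assumes the gap would be filled with zero bytes, which this issue does not specify.

```python
def write_current(buffer: bytearray, payload: bytes, offset: int = 0) -> None:
    """Model of the reported behaviour: an out-of-range offset appends."""
    if offset == -1 or offset > len(buffer):
        offset = len(buffer)
    buffer[offset:] = payload


def write_expected(buffer: bytearray, payload: bytes, offset: int = 0) -> None:
    """Model of the requested behaviour: grow the buffer up to offset first."""
    if offset == -1:
        offset = len(buffer)
    if offset > len(buffer):
        # Assumption: pad the gap with zero bytes.
        buffer.extend(bytes(offset - len(buffer)))
    buffer[offset:] = payload


# A 2-byte buffer written at offset 4, as in the report.
current = bytearray(2)
write_current(current, b"\x01", 4)
assert current.index(b"\x01") == 2  # payload landed at the old end of the buffer

expected = bytearray(2)
write_expected(expected, b"\x01", 4)
assert expected.index(b"\x01") == 4  # payload landed at the requested offset
```

In the first model the payload silently lands at index 2, so any prefix written into `buffer[:4]` afterwards would overwrite it, which matches the "corrupted" message described in the thread.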