Describe the feature
Motivation
Many high-performance serialization frameworks expose messages as multiple independent memory segments instead of a single contiguous buffer.
Examples include:
- Cap'n Proto (to_segments() / from_segments())
- Apache Arrow
- FlatBuffers (advanced builders)
- Custom shared-memory allocators
Currently, zenoh-python primarily exposes payloads as a single bytes-like object. As a result, applications often need to:
- Serialize into multiple segments
- Copy and concatenate those segments into one contiguous buffer
- Publish through zenoh
- Potentially split them again on the receiving side
This introduces unnecessary memory copies and allocation overhead, especially for large messages and high-frequency data streams.
For robotics, simulation, perception, and ML workloads, payloads can easily reach several megabytes per sample, making these copies a significant bottleneck.
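The cost of the concatenation step can be illustrated with the standard library alone. The following is a minimal sketch, not zenoh code: the three bytearray objects are stand-ins for segments a serializer might produce, and all names are illustrative.

```python
# Stand-ins for the segments a serializer such as Cap'n Proto might produce.
segments = [bytearray(b"header"), bytearray(b"-body-"), bytearray(b"footer")]

# Current path: joining allocates a new buffer and copies every byte once.
joined = b"".join(segments)

# Segment-preserving path: memoryviews reference the original storage.
views = [memoryview(seg) for seg in segments]

# Mutating a source segment is visible through its view but not in the joined
# copy, which shows that the join duplicated the data.
segments[0][0] = ord("H")

copied_bytes = len(joined)
view_sees_change = bytes(views[0][:1]) == b"H"
copy_sees_change = joined[:1] == b"H"
```

For a payload of several megabytes, the join is a full extra pass over the data on every publish, and a matching split on the receiving side would add another.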
Proposed API
Expose a scatter-gather / multi-segment payload interface in zenoh-python.
Publisher Side
Example:
segments = msg.to_segments()
payload = zenoh.ZBytes.from_segments(
    memoryview(seg) for seg in segments
)
pub.put(payload)
or
pub.put_segments([
    memoryview(seg0),
    memoryview(seg1),
    memoryview(seg2),
])
Subscriber Side
Example:
def callback(sample):
    segments = sample.payload.segments()
    msg = MyCapnpType.from_segments(segments)
where each returned segment is exposed as a zero-copy Python buffer (memoryview or equivalent).
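To make the intended shape of the interface concrete, here is a pure-Python mock of a multi-segment payload. SegmentedPayload is a hypothetical stand-in written for this issue; it is not an existing zenoh class and says nothing about how the binding would be implemented. It only shows that from_segments() and segments() can round-trip buffers without copying.

```python
from typing import Iterable, List


class SegmentedPayload:
    """Hypothetical stand-in for a multi-segment payload (not a zenoh class)."""

    def __init__(self, buffers: Iterable) -> None:
        # cast("B") gives a flat byte view without copying; it raises
        # TypeError for buffers that are not C-contiguous.
        self._views: List[memoryview] = [memoryview(b).cast("B") for b in buffers]

    @classmethod
    def from_segments(cls, buffers: Iterable) -> "SegmentedPayload":
        return cls(buffers)

    def segments(self) -> List[memoryview]:
        # Read-only views over the same memory, one per original segment.
        return [v.toreadonly() for v in self._views]

    def __len__(self) -> int:
        return sum(v.nbytes for v in self._views)


seg0 = bytearray(b"abc")
seg1 = bytearray(b"defgh")
payload = SegmentedPayload.from_segments([seg0, seg1])
received = payload.segments()

# The write below is visible through the returned view, so no copy was made.
seg0[0] = ord("X")
first_segment = bytes(received[0])
```

Returning read-only views is one possible choice for the subscriber side; whether received segments should be writable is left open here.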
Benefits
- Zero-copy integration
  Enables efficient integration with:
  - Cap'n Proto
  - Apache Arrow
  - Shared-memory allocators
  - Custom serialization frameworks
- Reduced memory bandwidth
  Avoids unnecessary buffer concatenation and splitting.
- Better SHM utilization
  Fits naturally with zenoh shared-memory transports where payloads may already be represented as multiple buffers internally.
- Alignment with Rust APIs
  The Rust implementation already contains abstractions such as ZBytes and slice-based payload handling.
  Exposing similar capabilities in Python would allow advanced users to leverage the same performance characteristics.
Lifetime and Ownership Considerations
One important aspect of this feature is the lifetime relationship between Python buffer objects and the underlying zenoh payload.
For publisher-side APIs, it should be explicit whether ZBytes.from_segments(...):
- Copies the input buffers immediately, or
- Retains references to the Python buffer owners until the payload is no longer needed by zenoh.
For zero-copy behavior, the second option (retaining references) is preferred, but the API must guarantee that the referenced Python objects remain alive for the full lifetime of the constructed ZBytes object and any asynchronous send operation using it.
For example:
segments = msg.to_segments()
payload = zenoh.ZBytes.from_segments(segments)
pub.put(payload)
In this case, payload should keep the original segment owners alive until zenoh no longer needs the data.
Similarly, for subscriber-side APIs, if sample.payload.segments() returns memoryview objects, those memoryviews must keep the underlying payload alive for as long as the memoryviews are accessible.
A safe design could follow these principles:
- ZBytes.from_segments(...) stores strong references to the Python buffer-exporting objects.
- The returned ZBytes owns or references those Python objects for its full lifetime.
- pub.put(payload) must either synchronously complete the necessary handoff or retain the payload internally until transmission is complete.
- sample.payload.segments() returns memoryviews whose base object keeps the Sample / ZBytes payload alive.
- The API should clearly document whether the returned views are valid only inside the callback or can outlive it.
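The principles above can be sketched in plain Python to show the expected lifetime behavior. OwningPayload and Segment are hypothetical names used only for this illustration and are not part of zenoh. The last check relies on CPython's reference counting, which is an assumption about the interpreter.

```python
import weakref


class Segment(bytearray):
    """bytearray subclass so that instances can be weakly referenced."""


class OwningPayload:
    """Hypothetical payload that retains its segment owners (not a zenoh class)."""

    def __init__(self, owners):
        self._owners = list(owners)  # strong references keep the owners alive
        self._views = [memoryview(o) for o in self._owners]

    def segments(self):
        return list(self._views)


seg = Segment(b"sensor-frame")
probe = weakref.ref(seg)  # lets us observe when the segment is freed

payload = OwningPayload([seg])
del seg  # the caller drops its own reference
alive_while_payload_exists = probe() is not None

views = payload.segments()
del payload  # the views alone still pin the underlying buffer
alive_while_view_exists = probe() is not None
first_byte = views[0][0]

del views
# On CPython, reference counting frees the segment once nothing refers to it.
freed_after_release = probe() is None
```

A native binding would need the same guarantee on the Rust side: the payload has to hold a reference to each Python owner until zenoh has finished with the data.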
This is especially important for integration with Python buffer protocol objects such as:
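- memoryview
- bytes
- bytearray
- mmap
- numpy.ndarray
- torch.Tensor on CPU (reached through tensor.numpy(), since torch.Tensor does not itself export the buffer protocol)
- Cap'n Proto segment buffers
- shared-memory-backed arrays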
Without clear ownership and lifetime guarantees, exposing a scatter-gather zero-copy API could lead to use-after-free bugs or hidden copies that defeat the purpose of the feature.
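One way to avoid hidden copies is to reject buffers that cannot be exposed as a flat byte range, rather than silently copying them. The helper below is a sketch using only the standard library; as_byte_view is a hypothetical name, not a zenoh function.

```python
import array
import mmap


def as_byte_view(obj) -> memoryview:
    """Return a flat, zero-copy byte view, or raise if that would need a copy."""
    view = memoryview(obj)
    if not view.c_contiguous:
        raise ValueError("non-contiguous buffer: exposing it would need a copy")
    return view.cast("B")


samples = array.array("d", [1.0, 2.0, 3.0])  # three 8-byte doubles
region = mmap.mmap(-1, 16)                   # anonymous 16-byte mapping

# Different exporters all reduce to the same kind of byte view.
sizes = [as_byte_view(o).nbytes for o in (b"abc", bytearray(4), samples, region)]

strided = memoryview(bytearray(8))[::2]      # every other byte: not contiguous
try:
    as_byte_view(strided)
    rejected = False
except ValueError:
    rejected = True

region.close()
```

Raising on non-contiguous input keeps the copy visible to the caller, who can then decide whether to make it explicitly.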
Additional Consideration
The proposal is not intended to define transport-level multipart semantics.
Applications that require stable message framing can still encode framing information in their payload format.
The goal is simply to expose payloads as multiple memory segments when possible, allowing applications to avoid unnecessary copies.
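As an illustration of application-level framing, the sketch below prefixes a payload with a small segment table and splits it again on receipt. The frame layout and the names encode_frame and decode_frame are assumptions made for this example; they are not a zenoh or Cap'n Proto wire format. Note that the encoder concatenates, which is the copy this proposal aims to avoid, while the decoder splits without copying.

```python
import struct
from typing import List, Sequence


def encode_frame(segments: Sequence[bytes]) -> bytes:
    """Prefix the segments with a table: count, then one length per segment."""
    header = struct.pack(f"<I{len(segments)}I", len(segments), *map(len, segments))
    return header + b"".join(segments)


def decode_frame(frame) -> List[memoryview]:
    """Split a received frame back into segments without copying the payload."""
    view = memoryview(frame)
    (count,) = struct.unpack_from("<I", view, 0)
    lengths = struct.unpack_from(f"<{count}I", view, 4)
    offset = 4 + 4 * count
    out = []
    for length in lengths:
        out.append(view[offset:offset + length])  # slicing a view does not copy
        offset += length
    return out


frame = encode_frame([b"abc", b"hello"])
parts = decode_frame(frame)
decoded = [bytes(p) for p in parts]
```

With a segment-aware payload API, the table could be sent as its own segment and the join in encode_frame would no longer be needed.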
Use Case
One concrete example is Cap'n Proto:
Current path:
Capnp Segments
↓
Concatenate
↓
Python bytes
↓
zenoh.put()
Desired path:
Capnp Segments
↓
ZBytes / Segments
↓
zenoh.put()
and on the receiving side:
zenoh Payload Segments
↓
Capnp.from_segments()
allowing true end-to-end zero-copy operation.