
Use of RecyclableMemoryStreamManager with Confluent.Kafka #342

Closed
NicolaAtorino opened this issue Apr 29, 2024 · 2 comments

NicolaAtorino commented Apr 29, 2024

Hello, I would like to use RecyclableMemoryStreamManager to handle serialization when producing messages with the C# Kafka client, Confluent.Kafka.

This is the interface that Confluent exposes:

public interface ISerializer<T>
{
    byte[] Serialize(T data, SerializationContext context);
}

Since this requires returning a plain byte array, I am not sure there is any benefit to using RecyclableMemoryStream here.

An implementation I tried was this one:

public byte[] Serialize(T data, SerializationContext context)
{
    if (data == null) return null;
    // RecyclableMemoryStream implements IBufferWriter<byte>,
    // so MemoryPack can write into the pooled blocks directly.
    using var stream = manager.GetStream();
    MemoryPackSerializer.Serialize(stream as IBufferWriter<byte>, data, MemoryPackSerializerOptions.Utf16);
    var ros = stream.GetReadOnlySequence();
    return ros.ToArray();
}

The documentation says that calling stream.ToArray() defeats the purpose of the library. Is the same true for GetReadOnlySequence().ToArray()?

And another question: in 90% of cases, the resulting stream after serialization is around 2 megabytes. What would be the optimal configuration of BlockSize and LargeBufferMultiple in that case? I'm trying to understand how to configure these settings properly, but I haven't been able to pin them down.

Any help is appreciated. Thank you very much.

benmwatson (Member) commented:

That's really unfortunate for that interface. Anything that takes a pure byte array is just going to be inefficient and almost certainly require copying the bytes to a new array.

That doesn't necessarily mean RMS won't help you. There may be significant internal work it can save, so you could still get some benefit even with the built-in interface.

To get the full benefit, you'd have to find a way to use a different interface that takes a byte range instead. A new version of the library? Something that takes Span<byte>? A completely different serialization interface/method?
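For illustration, a copy-free serializer shape could look something like the sketch below. This interface is purely hypothetical — Confluent.Kafka does not expose it — but it shows the idea: the producer would own an IBufferWriter<byte>, so the pooled RecyclableMemoryStream blocks could be written into directly and no final ToArray() copy would be needed.

```csharp
using System.Buffers;
using Confluent.Kafka; // for SerializationContext

// Hypothetical interface -- NOT part of Confluent.Kafka.
// Instead of returning a freshly allocated byte[], the serializer
// writes into a buffer supplied by the caller.
public interface IBufferSerializer<T>
{
    void Serialize(T data, IBufferWriter<byte> writer, SerializationContext context);
}
```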

I hesitate to prescribe optimal settings because so much depends on more than just buffer size: how it's used, how many simultaneous usages there are, whether you want to avoid LOH allocations, and more.

Certainly, if most of the cases end up being 2 MB, you could "collapse" the two types of buffers into one flat pool and just use a 2 MB (or larger) block size. That would avoid a memory copy whenever you need that full buffer. A little different from the original use case, but I think it could work fine.
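A minimal sketch of that "collapsed pool" idea, assuming the classic int-based RecyclableMemoryStreamManager constructor (in v3.x the same settings move to a RecyclableMemoryStreamManager.Options object); the specific sizes here are illustrative, not recommendations:

```csharp
using Microsoft.IO;

// One flat pool: with a 2 MB block size, a typical ~2 MB payload fits in a
// single block, so reads never need to stitch many small blocks together.
var manager = new RecyclableMemoryStreamManager(
    blockSize: 2 * 1024 * 1024,           // small-pool block size = 2 MB
    largeBufferMultiple: 2 * 1024 * 1024, // large buffers grow in 2 MB steps
    maximumBufferSize: 16 * 1024 * 1024); // anything larger is not pooled
```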

Only real answer is to measure and see!

NicolaAtorino (Author) commented:

Thanks for your answer. Unfortunately we cannot change the interface, but I could try opening a request on the Confluent repo to see whether this would even be doable. Even if the interface were changed, if the system internally requires a byte array (and it may, since it's a wrapper around a C++ library), we still wouldn't get very far.

Thanks for the tips - will continue exploring.
