Protobuf deserializer is allocating unnecessary memory #1701

MichalBrylka · 2021-10-16T10:07:31Z

Description

I'd like to address an issue in the DeserializeAsync method of the ProtobufDeserializer class:
https://github.com/confluentinc/confluent-kafka-dotnet/blob/master/src/Confluent.SchemaRegistry.Serdes.Protobuf/ProtobufDeserializer.cs#L97

I like that the IAsyncDeserializer API takes ReadOnlyMemory (or ReadOnlySpan for the sync version). The deserialization code, however, performs a lot of unnecessary allocations:

var array = data.ToArray();
using (var stream = new MemoryStream(array))
using (var reader = new BinaryReader(stream))

I understand that they make the code look cleaner, but please have a look at this benchmark:
https://github.com/nemesissoft/KafkaProtobufSyncOverAsyncPerf/blob/main/DeserializerBenchmarks.cs
I've attached my results:
https://github.com/nemesissoft/KafkaProtobufSyncOverAsyncPerf/blob/main/README.md

Could we improve this class by removing these allocations? I've already proposed a quick solution:
https://github.com/nemesissoft/KafkaProtobufSyncOverAsyncPerf/blob/main/Deserializers/NonAllocProtobufDeserializer.cs

Sadly, there is no SpanStream class that would make the coding easier (though I plan to provide one in the future), but the solution I've proposed (a span plus a position to mimic a stream's read-and-advance behavior) should be enough for now.
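
The span-plus-position idea can be sketched roughly as follows. This is a minimal illustration, not the code from the linked repository: SpanReader and HeaderDemo are hypothetical names, and the header layout used in the demo (a marker byte, a 4-byte big-endian ID, and a varint-prefixed list of varints) is an assumption chosen for illustration rather than a statement of the exact wire format.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper (not part of Confluent.Kafka): a cursor over a span that
// mimics BinaryReader's read-and-advance behavior without copying the input.
public ref struct SpanReader
{
    private readonly ReadOnlySpan<byte> _data;
    private int _position;

    public SpanReader(ReadOnlySpan<byte> data)
    {
        _data = data;
        _position = 0;
    }

    public int Position => _position;

    // Everything not yet consumed; slicing a span does not allocate.
    public ReadOnlySpan<byte> Remaining => _data.Slice(_position);

    public byte ReadByte() => _data[_position++];

    public int ReadInt32BigEndian()
    {
        int value = BinaryPrimitives.ReadInt32BigEndian(_data.Slice(_position, 4));
        _position += 4;
        return value;
    }

    // Base-128 varint: 7 payload bits per byte, high bit set on all but the last byte.
    public uint ReadUnsignedVarint()
    {
        uint result = 0;
        for (int shift = 0; shift < 35; shift += 7)
        {
            byte b = ReadByte();
            result |= (uint)(b & 0x7F) << shift;
            if ((b & 0x80) == 0) return result;
        }
        throw new FormatException("Varint is longer than 5 bytes.");
    }
}

public static class HeaderDemo
{
    // Assumed layout (illustrative only): marker byte, big-endian int ID,
    // varint count, then that many varints. Returns the ID and payload offset.
    public static (int Id, int PayloadOffset) SkipHeader(ReadOnlySpan<byte> data)
    {
        var reader = new SpanReader(data);
        if (reader.ReadByte() != 0) throw new FormatException("Unexpected marker byte.");
        int id = reader.ReadInt32BigEndian();
        uint count = reader.ReadUnsignedVarint();
        for (uint i = 0; i < count; i++) reader.ReadUnsignedVarint();
        return (id, reader.Position);
    }
}
```

After the header has been skipped, the payload is simply data.Slice(payloadOffset), so no intermediate array, MemoryStream, or BinaryReader is needed to locate it.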

Personally, I do not see the benefit of having an async deserializer just to call .ConfigureAwait(continueOnCapturedContext: false).GetAwaiter().GetResult(); but this overhead seems negligible. So even if we leave the Deserialize method async (or potentially add a sync version as well), could we remove the allocations?

How to reproduce

Run the provided benchmark or look at the results:
https://github.com/nemesissoft/KafkaProtobufSyncOverAsyncPerf/blob/main/README.md

Checklist

Please provide the following information:

A complete (i.e. we can run it), minimal program demonstrating the problem. No need to supply a project file: see description
Confluent.Kafka nuget version: 1.8.1
Apache Kafka version: N/A
Client configuration: N/A
Operating system: Windows 10 but should not matter
Provide logs (with "debug" : "..." as necessary in configuration): N/A
Provide broker log excerpts: N/A
Critical issue: NO


MichalBrylka · 2021-10-18T08:47:02Z

Regarding data.ToArray(): the following is copied from the documentation:
"Copies the contents of this read-only span into a new array. This heap allocates, so should generally be avoided, however it is sometimes necessary to bridge the gap with APIs written in terms of arrays."
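
Where a Stream-based API has to stay, one possible way to avoid the copy is to ask for the array that already backs the ReadOnlyMemory. The sketch below is a different technique from the span-based reader proposed in this issue, and MemoryStreamBridge is a hypothetical name. It uses MemoryMarshal.TryGetArray and falls back to copying only when the memory is not array-backed.

```csharp
using System;
using System.IO;
using System.Runtime.InteropServices;

public static class MemoryStreamBridge
{
    // Hypothetical helper: exposes a ReadOnlyMemory<byte> as a read-only
    // MemoryStream, reusing the backing array when there is one.
    public static MemoryStream AsStream(ReadOnlyMemory<byte> data)
    {
        if (MemoryMarshal.TryGetArray(data, out ArraySegment<byte> segment))
        {
            // No copy: the stream reads the caller's array in place.
            return new MemoryStream(segment.Array, segment.Offset, segment.Count, writable: false);
        }

        // Fallback for memory that is not backed by a managed array: this copies.
        return new MemoryStream(data.ToArray(), writable: false);
    }
}
```

The MemoryStream object itself is still allocated, and the caller must not modify the underlying array while the stream is in use, so this only removes the buffer copy, not every allocation.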

MichalBrylka · 2021-11-02T09:28:13Z

Hello Team,
would you like me to provide a pull request with a solution?

MichalBrylka · 2022-04-23T09:52:06Z

Hi, would you like to accept a PR with an improved serializer and deserializer?

MichalBrylka · 2022-05-30T13:31:11Z

Hi,
I took the liberty of creating an efficient deserializer for both the sync and async versions:
EfficientProtobufDeserializer

Would you like me to create a pull request for your codebase with such a deserializer?

mhowlett · 2022-10-17T14:29:35Z

Thanks. Yes, we'd like to, but we need to prioritize reviewing, etc.
At some point in the future, I expect we'll review many aspects of the serdes for performance, and we'll consider this then.

MichalBrylka · 2022-10-17T14:58:43Z

@mhowlett let me know if/when I should prepare a PR. The code itself is ready.

mhowlett added the enhancement and LOW labels on Oct 17, 2022

bjornbouetsmith mentioned this issue Mar 28, 2023

Shave bytes #2020

Merged
