Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Client support for ReadOnlySequence as payload #1918

Closed
wants to merge 5 commits into from

Conversation

mregen
Copy link

@mregen mregen commented Jan 27, 2024

Draft prototype to natively support ReadOnlySequence as a payload in the MQTTnet client.

The MQTTNet write method only accepts an ArraySegment which is a view into a single buffer. It is currently necessary to call ToArray to support a RecyclableMemoryStream.
As a result, perfomance is affected by additional copies and memory is fragment by the additional random allocation of a big buffer.

Please provide the feedback if and how such an enhancement can be accepted by the project.

see #1917 with the issue explained.

@mregen mregen marked this pull request as ready for review February 7, 2024 08:40
@mregen
Copy link
Author

mregen commented Feb 7, 2024

note: intentionally not fixed all tests, until there is some feedback from the maintainers that this is a workable approach

@mregen mregen marked this pull request as draft February 11, 2024 07:26
@chkr1011
Copy link
Collaborator

chkr1011 commented Feb 14, 2024

@mregen You mentioned that the library is currently doing additional copies. But I was not able to identify them. The entire packet is sent in basically two parts (the header and the payload). There are not combined together before sending. Only if packet fragmentation is not enabled, they got combined. So, does your change only affect the payload itself (you changed the type of the payload) or am I lost? 😄

Update: I was on the completely wrong track 😵‍💫 . I assume you have the data you want to publish in the form of a ReadOnlySequence and for this library you have to convert it into an ArraySegment which will require the additional allocation you want to avoid right?

@mregen
Copy link
Author

mregen commented Feb 14, 2024

Hi @chkr1011, thanks for taking a look!

Update: I was on the completely wrong track 😵‍💫 . I assume you have the data you want to publish in the form of a ReadOnlySequence and for this library you have to convert it into an ArraySegment which will require the additional allocation you want to avoid right?

Yes, exactly, the additional ToArray is required outside the library to be able to feed the ReadOnlySequence buffers to the existing API. ToArray or RecyclableMemoryStream.GetBuffer causes a memory allocation and a copy, if the stream size exceeds the buffer size. Our pipeline produces OPC UA PubSub compliant payload with an option to compress. The size of the packets varies by small amounts, in the 10s to 100s of kb ranges. By using the fixed size buffers of the RecyclableMemoryStream the implementation can become quite efficient with regards to buffer reuse and memory fragmentation. But only if all parties support the ReadOnlySequence end to end.

In this PR I just try to outline the changes which were required to support the alternate payload. I would like to have some guidance what would be an acceptable implementation for the project to support, e.g. should it continue to support ArraySegment as payload (imho yes, otherwise everything needs to be wrapped in a ReadOnlySequence).
Another idea could be a batching mode which allows to send multiple payload chunks after the header.
Also which tests/benchmarks we should provide etc.

Another option were to get it into the next major release, as proposed in #1923.

Thanks,
Martin

@chkr1011
Copy link
Collaborator

@mregen In my opinion we should not drop support for an array segment right now. From what I already see in the PR we should be able to make it work mostly by adding some platform specific code. The only challenge I see is dealing with the class MqttApplicationMessage. It is part of the public API and thus changing that will lead to breaking changes. Is there a way to keep it compatible with some implicit operators etc? Or do you see another way of making it non-breaking?

@mregen
Copy link
Author

mregen commented Feb 15, 2024

Hi @chkr1011, I fully agree with your suggestion,

It is part of the public API and thus changing that will lead to breaking changes. Is there a way to keep it compatible with some implicit operators etc?

let me do some research and I get back to you.

@mregen
Copy link
Author

mregen commented Feb 20, 2024

Hi @chkr1011, I ended up a little head scratching as to how to add this to the data structures without having too much impact in the API and the codebase and keep the efficiency of the current implementation.

Previously the change from PayLoad to PayloadSegment was done in a way that both options are still available.
So my current thinking would be to expose the PayloadSequence as the third option in MqttApplicationMessage, because it is only useful for large buffers and PayloadSegment may still be the most efficient use for many cases. Enforcing the conversion to ReadOnlySequence may be possible here but doesn't seem right to avoid extra conversions.
But the use of PayloadSegment and PayloadSequence must be mutually exclusive. If somebody picks to do cross conversions by setting a sequence but reading a segment, they may end up with buffer conversion penalties, but thats then app developer who needs to be aware.
From a standpoint of breaking API its really not much, an existing App would still work as before by just using the Payload or PayloadSegment.

Internally its then the question how to forward the buffer(s) to e.g. MQTTPublishPacket.
I'd also prefer to pass along PayloadSegment and PayloadSequence´and let the writer choose the available one.

I haven't thought and looked into a subscriber support. Currently it can not produce a PayloadSegment and it maybe difficult without MemoryStream to support it in the subscriber.

What do you think?

@chkr1011
Copy link
Collaborator

chkr1011 commented Mar 2, 2024

Having a new property at the MqttApplicationMessage sounds good to me. For the upcoming version 5 we can delete all other variations of the payload because we may have several breaking changes anyways.

Regarding the internal handling: What do you think about having either the segment or the sequence as the payload (MqttPublishPacket) by using #if NET_50_OR_GREATER etc.? This is probably more code to develop but also a preparation for version 5 of this library so that we can simply delete the "old" code.

@mregen
Copy link
Author

mregen commented Mar 5, 2024

Hi @chkr1011 , thanks for your guidance.
In MqttPublishPacket sounds good to only carry PayloadSequence for newer .NET versions.
I'm fine with your suggestion and hopefully provide an update soon.

@chkr1011
Copy link
Collaborator

chkr1011 commented May 7, 2024

Closing this PR because we can address this issue with version 5 of this library which will not require converting other data types properly. The ReadOnlySequence will become the default.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants