For many small stream writes, RecyclableMemoryStream is slower than other implementations #339

Open · mregen opened this issue Mar 27, 2024 · 4 comments

Comments


mregen commented Mar 27, 2024

For a project that implements its own buffered memory stream, I added the int Read(Span<byte> buffer) and void Write(ReadOnlySpan<byte> buffer) overloads and compared the performance of MemoryStream, ArraySegmentStream (a custom stream based on ArrayPool), and RecyclableMemoryStream.

To my surprise, RecyclableMemoryStream is consistently outperformed by MemoryStream and ArraySegmentStream by a factor of two, as if Write(ReadOnlyMemory<byte>) were not properly implemented or were disabled in RecyclableMemoryStream.

See the benchmark results in OPCFoundation/UA-.NETStandard#2556, in particular the comparison of BinaryEncoderRecyclableMemoryStream vs. BinaryEncoderArraySegmentStream and BinaryEncoderMemoryStream.

Any idea what could be wrong in the benchmark or why RecyclableMemoryStream.Write is always slower?
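
To make the workload concrete, the pattern the encoder exercises is roughly the following; this is a simplified stand-in written for illustration (the value mix and counts are assumptions, not the actual benchmark from the linked issue):

```csharp
// Simplified stand-in for the benchmarked workload (names and value mix are
// illustrative): BinaryWriter issues many 1-8 byte writes, so every value
// becomes a small Write call on the underlying stream.
using System.IO;
using System.Text;
using Microsoft.IO;

static class SmallWriteWorkload
{
    public static long Run(Stream stream, int count)
    {
        using var writer = new BinaryWriter(stream, Encoding.UTF8, leaveOpen: true);
        for (int i = 0; i < count; i++)
        {
            writer.Write((byte)0x2A);  // 1-byte write
            writer.Write(i);           // 4-byte write
            writer.Write((double)i);   // 8-byte write
        }
        writer.Flush();
        return stream.Length;
    }

    public static void Compare()
    {
        var manager = new RecyclableMemoryStreamManager();
        using (var rms = manager.GetStream())
            Run(rms, 100_000);

        using (var ms = new MemoryStream())
            Run(ms, 100_000);
    }
}
```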

sgorozco commented Mar 28, 2024 via email

mregen (Author) commented Mar 28, 2024

Thanks @sgorozco, that is in fact what's happening here: a BinaryWriter writes a lot of little chunks (e.g. int, byte, etc.) for an OPC UA binary encoder. The Write implementation should only copy the small chunks into the current buffer, keeping two indexes (one for the current buffer, one for the write pointer within that buffer). An overflow may split the write and either allocate a new buffer or move to the next one. That's it. I would be interested to see why a division is necessary.
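
As a rough illustration of that write path (a sketch, not the RecyclableMemoryStream source), the hot path only needs a copy and an offset increment, and an overflow splits the copy across blocks without any division:

```csharp
// Minimal sketch of the write path described above (illustrative only, not the
// library's implementation): small writes copy into the current block and bump
// an offset; an overflow splits the copy and moves on to the next block.
using System;
using System.Collections.Generic;

sealed class BlockWriteSketch
{
    private const int BlockSize = 128 * 1024;     // assumed fixed block size
    private readonly List<byte[]> blocks = new();
    private int blockIndex;                       // block holding the write pointer
    private int blockOffset;                      // write position inside that block

    public void Write(ReadOnlySpan<byte> buffer)
    {
        while (!buffer.IsEmpty)
        {
            if (blockIndex == blocks.Count)
                blocks.Add(new byte[BlockSize]);  // a real implementation would rent from a pool

            int toCopy = Math.Min(BlockSize - blockOffset, buffer.Length);
            buffer.Slice(0, toCopy).CopyTo(blocks[blockIndex].AsSpan(blockOffset));
            blockOffset += toCopy;
            buffer = buffer.Slice(toCopy);

            if (blockOffset == BlockSize)         // overflow: advance, no division needed
            {
                blockIndex++;
                blockOffset = 0;
            }
        }
    }
}
```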

Update: I checked, and GetBlockAndRelativeOffset uses a division.

I guess it's a design decision whether to make Write faster or the calculation of the position, i.e. what to use as the source of truth.
My expectation was that Write would be almost as fast as the Write in the MemoryStream base class. Wishful thinking :-)
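
For context, mapping a single long position back to a block and a relative offset looks roughly like this (an illustrative sketch; the library's actual GetBlockAndRelativeOffset may differ in details), and it is this per-write translation that costs the division:

```csharp
// Illustrative sketch: when the long position is the source of truth, every
// write has to recover (block, offset) with a division and a modulo.
static (int Block, int Offset) GetBlockAndRelativeOffset(long position, int blockSize)
{
    int block = (int)(position / blockSize);
    int offset = (int)(position % blockSize);
    return (block, offset);
}
```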

mregen (Author) commented Apr 2, 2024

As an additional side note, it turns out that in my JSONEncoder tests, where writing uses a StreamWriter in contrast to the BinaryWriter above, RecyclableMemoryStream outperforms all the other stream implementations. This is currently our main use case, so technically I'm a happy camper with the current implementation.

benmwatson (Member) commented

Yeah, the issue with lots of small writes is something that has come up a few times before. Unfortunately, it just wasn't optimized for that scenario. The workaround would be to operate directly on the buffer, but not all APIs will support that, and that does defeat the purpose of using a stream.

A way to fix this would be to change the source of truth for the current position. Currently, we store a long position, that we then convert to block/offset on every necessary operation, which involves, as you noted, two divisions. We could change this to track the block/offset instead, and then when we need the position, do the multiply+add, which should be much faster. Alternatively, we could just track both values and keep them in sync, but I suspect this doesn't offer any additional benefit (we'd be doing the multiply+add anyway, or a lot of individual adds).
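
A rough sketch of that alternative (assumptions only, not a committed design): keep the block index and in-block offset as the source of truth, advance them with plain additions on the write path, and reconstruct the long Position with a multiply+add only when it is requested.

```csharp
// Sketch of tracking (blockIndex, offsetInBlock) as the source of truth
// (illustrative only): writes advance the pair with additions, and Position
// is reconstructed on demand with a multiply+add instead of a division.
sealed class BlockPositionSketch
{
    private const int BlockSize = 128 * 1024;   // assumed fixed block size
    private int blockIndex;
    private int offsetInBlock;

    // Hot path: advancing past a small write is an add plus an occasional carry.
    public void Advance(int count)
    {
        offsetInBlock += count;
        while (offsetInBlock >= BlockSize)
        {
            offsetInBlock -= BlockSize;
            blockIndex++;
        }
    }

    // Cold path: the long position is derived, not stored.
    public long Position
    {
        get => (long)blockIndex * BlockSize + offsetInBlock;
        set
        {
            blockIndex = (int)(value / BlockSize);      // division only on explicit seeks
            offsetInBlock = (int)(value % BlockSize);
        }
    }
}
```

Seeking would still need a division, but that moves the cost off the per-write hot path.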

I'd have to think about any negative performance ramifications (e.g., calling Position would become much more expensive), but perhaps it's not so bad. I'll ponder it.
