
Fix 'fs2.io.readInputStreamGeneric' overallocation of underlying buffers #3318

Merged

Conversation

@seigert (Contributor) commented Oct 11, 2023

The current implementation of fs2.io.readInputStream allocates a new array of chunkSize bytes for every invocation of InputStream#read(..). The problem is that the read spec does not require InputStream implementations to fill the provided buffer fully or until stream exhaustion.

This leads to a situation where, if the readInputStream chunk size is big (megabytes) and the underlying input stream's inner 'chunks' are small (bytes or kilobytes), the returned Stream[F, Byte] is very 'sparse': in every chunk only a small fraction of the allocated capacity is actually used.

This can be demonstrated easily by combining fs2.io.toInputStream with fs2.io.readInputStream.
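For illustration, a minimal sketch (not part of the PR; the object name and exact sizes are assumptions) of how the sparsity can be observed by round-tripping small chunks through toInputStream and reading them back with a large chunkSize:

```scala
import cats.effect.{IO, IOApp}
import fs2.{Chunk, Stream}

// Hypothetical demo, not from the PR: emit 1 KiB chunks, expose them as an
// InputStream, then read them back with a 1 MiB chunkSize and inspect how
// many bytes each resulting chunk actually holds.
object SparseChunksDemo extends IOApp.Simple {
  val run: IO[Unit] =
    Stream
      .emits(List.fill(1024)(Chunk.array(new Array[Byte](1024)))) // 1 KiB inner chunks
      .covary[IO]
      .flatMap(Stream.chunk)
      .through(fs2.io.toInputStream)
      .flatMap { is =>
        fs2.io.readInputStream(IO.pure(is), chunkSize = 1024 * 1024, closeAfterUse = false)
      }
      .chunks
      .map(_.size)
      .compile
      .toList
      .flatMap(sizes => IO.println(s"sizes of the first few chunks: ${sizes.take(5)}"))
}
```

Each read sees at most one upstream 1 KiB chunk, so (before this fix) every ~1 KiB chunk in the output is backed by a freshly allocated 1 MiB array.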

This PR fixes this by reusing the leftover portion of the allocated Array[Byte] for subsequent InputStream reads until either the buffer is fully written or the stream is exhausted. As in the current implementation, fs2.Stream chunks are published with each successful InputStream#read invocation, but they may share the underlying array buffer.
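A rough sketch of the idea (simplified; the name readReusingBuffer and the Pull structure are illustrative assumptions, not the actual readInputStreamGeneric internals):

```scala
import java.io.InputStream
import cats.effect.Sync
import fs2.{Chunk, Pull, Stream}

// Sketch only: keep reading into the unused tail of the same buffer,
// emitting a slice per successful read; allocate a fresh buffer only
// once the current one is completely filled.
def readReusingBuffer[F[_]](is: InputStream, chunkSize: Int)(implicit F: Sync[F]): Stream[F, Byte] = {
  def loop(buf: Array[Byte], offset: Int): Pull[F, Byte, Unit] =
    Pull.eval(F.blocking(is.read(buf, offset, buf.length - offset))).flatMap {
      case -1 => Pull.done // stream exhausted
      case n =>
        // Emit only the bytes just read; the chunk shares `buf` as its backing array.
        Pull.output(Chunk.array(buf, offset, n)) >> {
          val next = offset + n
          if (next == buf.length) loop(new Array[Byte](chunkSize), 0) // buffer full: new buffer
          else loop(buf, next) // reuse the leftover capacity
        }
    }

  loop(new Array[Byte](chunkSize), 0).stream
}
```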

@armanbilge (Member) left a comment

Thanks for the PR! Unfortunately I do not think this is the right way to solve this problem. The concern is: what if there is actually not enough data available on the InputStream at a particular moment, but it is not yet closed? E.g. consider interacting with an external process over stdin/stdout.

Have you seen this issue?

I proposed a couple different ideas to solve it in #3106 (comment).

Ah, I'm sorry, I missed your edit. This is interesting :)

> Like current implementation fs2.Stream chunks are published with each successful InputStream#read invocation but may share underlying array buffer.

@armanbilge (Member) commented Oct 11, 2023

> chunks ... may share underlying array buffer.

I guess my only concern with this approach is that a single Chunk that is retained for a long time may still have a surprisingly large memory footprint due to a large backing array. I'm not sure how realistic this is in practice :)

To avoid that would require allocating an appropriately sized array and copying into that, as suggested in #3106 (comment).
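To make the concern concrete, a tiny illustration (not from the thread; variable names are assumptions) of how a small slice can pin a large backing array, and how copying via Chunk#compact avoids it:

```scala
import fs2.Chunk

// Hypothetical illustration: a 16-byte chunk that shares a 1 MiB backing array.
val big: Array[Byte] = new Array[Byte](1024 * 1024)
val slice: Chunk[Byte] = Chunk.array(big, 0, 16)

// `compact` copies the 16 bytes into a right-sized array, so retaining
// `trimmed` no longer keeps the 1 MiB array alive (assuming nothing else
// still references `big`).
val trimmed: Chunk[Byte] = slice.compact
```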

@seigert (Contributor, Author) commented Oct 12, 2023

@armanbilge, thanks for your input!

The issue I'm trying to resolve is that we have network data on the order of megabytes and, naturally, we assume that a buffer of 1 MiB would be good enough. The only problem is that the underlying InputStream#read decides to write data in 1 KiB chunks. Suddenly, for every mere megabyte of data we have a whopping 1 GiB of allocated RAM (1024 reads, each allocating a fresh 1 MiB buffer) that is only 0.1% populated.

What's worse, it is really hard to diagnose: at first everything works, maybe using more memory than expected (it's the JVM after all); then suddenly there is an OOM error; then you spend some time looking for a leak, except there are no leaks; and only finally do you find that memory is full of fs2.Chunk.ArraySlice instances of 1 MiB each with only the first kilobyte written.

To fix this currently we need to put a custom Pull or something like .chunkN(chunkSize).map(_.compact).unchunks after every readInputStream. :(
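For reference, that workaround could be packaged as a reusable pipe (a sketch; the name recompact is assumed):

```scala
import cats.effect.IO
import fs2.Pipe

// Hypothetical helper: regroup into chunkSize-sized chunks and copy each
// into a right-sized backing array, dropping references to the oversized buffers.
def recompact(chunkSize: Int): Pipe[IO, Byte, Byte] =
  _.chunkN(chunkSize).map(_.compact).unchunks
```

It would be used as `fs2.io.readInputStream(...).through(recompact(chunkSize))`.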

> I guess my only concern with this approach is that a single Chunk that is retained for a long time may still have a surprisingly large memory footprint due to a large backing array. I'm not sure how realistic this is in practice :)
>
> To avoid that would require allocating an appropriately sized array and copying into that, as suggested in #3106 (comment).

I think the situation of 'short data streams' is, if not less frequent, then at least much more detectable than the issue above.

To replicate the issue above you would need 1024 unconsumed streams of 1 KiB each, and I would say that it is more a question of wrongly guessing the median data size.

Also, the use of a copy introduces some questions:

  • What should we do in unsafeReadInputStream? Copy, write into the buffer from index zero each time, or write in a circular pattern?
  • At what ratio should we copy? 1:1000, 1:10, 1:2?
  • With every copy we introduce an additional allocation and pointer indirection. Yes, currently we allocate a new ArraySlice too, but at least the backing array is the same. Maybe we could even optimise Chunk's ++ to detect concatenation of two adjacent slices with a common backing array (maybe .compact already does that?); a rough sketch of such a check follows below. That way unchunkN, unchunkMin, etc. won't add another nesting level.
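A rough sketch (assumption: no such optimisation exists in fs2 today; the helper name is illustrative) of detecting two adjacent slices over the same backing array and merging them without copying:

```scala
import fs2.Chunk
import fs2.Chunk.ArraySlice

// Hypothetical helper: if `a` and `b` are contiguous slices of the same
// backing array, widen the slice instead of copying; otherwise fall back
// to ordinary concatenation.
def concatAdjacent(a: ArraySlice[Byte], b: ArraySlice[Byte]): Chunk[Byte] =
  if ((a.values eq b.values) && (a.offset + a.length == b.offset))
    Chunk.array(a.values, a.offset, a.length + b.length)
  else
    (a: Chunk[Byte]) ++ b
```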

I would argue that this implementation tries its best to translate the user's intention: 'write data from the stream into contiguous memory regions of chunkSize bytes'.

@seigert seigert force-pushed the fix/io-read_input_stream-overallocation branch from 96e2787 to aea8f85 on October 12, 2023 10:03
@seigert seigert force-pushed the fix/io-read_input_stream-overallocation branch from ae7ced1 to 71bd067 on October 20, 2023 14:18
@seigert seigert force-pushed the fix/io-read_input_stream-overallocation branch from 71bd067 to 4e10816 on October 20, 2023 15:19
@mpilquist (Member) commented

I like this approach and I'm curious if we should do something similar for TCP sockets.
