AoC PR: lib: performance and API improvements for stdin and reader #2343

fridis · 2023-12-09T22:59:56Z

There are three changes in this patch:

1. big performance improvements by avoiding list iteration
2. smaller performance improvements by reading several bytes at once
3. additional convenience APIs for reading a file fully

Replaced `while (mutate.flat_map_sequence u8 x->x).count < n` with `n_read < n`. This was by far the biggest performance issue when reading bytes: the old loop condition re-counted the bytes collected so far on every iteration, so reading n bytes took O(n²) time.
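To illustrate why the old loop condition was quadratic, here is a minimal Python sketch (not the Fuzion code from this patch). The function names `read_recounting` and `read_counting` and the `steps` counter are hypothetical and exist only for this illustration; the sketch assumes bytes arrive one at a time, as they did before this change.

```python
import io


def read_recounting(source, n):
    """Re-count all buffered chunks in the loop condition (quadratic)."""
    chunks = []
    steps = 0  # hypothetical counter: elements visited while counting
    while True:
        count = 0
        for chunk in chunks:  # walks everything collected so far, every time
            count += len(chunk)
            steps += 1
        if count >= n:
            break
        b = source.read(1)
        if not b:  # end of input
            break
        chunks.append(b)
    return b"".join(chunks), steps


def read_counting(source, n):
    """Keep a running counter n_read instead of re-counting (linear)."""
    chunks = []
    n_read = 0
    steps = 0  # hypothetical counter: one step per byte read
    while n_read < n:
        b = source.read(1)
        if not b:  # end of input
            break
        chunks.append(b)
        n_read += len(b)
        steps += 1
    return b"".join(chunks), steps


if __name__ == "__main__":
    for reader in (read_recounting, read_counting):
        data, steps = reader(io.BytesIO(bytes(200)), 100)
        print(reader.__name__, len(data), steps)
```

Both functions return the same bytes; only the amount of bookkeeping differs, and the gap grows with the square of n in the first variant.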

Replaced `drop` in `reader.discard` with `slice`, which is O(1) and does not cause deep recursion.
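As a rough analogy in Python (not the Fuzion implementation), the difference resembles skipping elements of a linked list one call at a time versus taking a view on an array. Both helper names below are hypothetical, and the cons-list representation is an assumption made purely for illustration.

```python
def drop_recursive(cell, k):
    """Skip k elements of a cons list built from (head, tail) tuples.

    Uses one call per skipped element, so a large k means deep recursion
    (in Python this can exceed the interpreter's recursion limit).
    """
    if k == 0 or cell is None:
        return cell
    return drop_recursive(cell[1], k - 1)


def discard_by_slice(view, k):
    """Skip k bytes by slicing a memoryview.

    The slice is a new view on the same buffer: no bytes are copied and
    the cost does not depend on k.
    """
    return view[k:]


if __name__ == "__main__":
    cons = None
    for value in reversed(range(5)):  # builds (0, (1, (2, (3, (4, None)))))
        cons = (value, cons)
    print(drop_recursive(cons, 2)[0])

    buffer = memoryview(b"abcdef")
    print(discard_by_slice(buffer, 2).tobytes())
```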

Added a version of `io.buffered.reader.read` with an argument `max_n` that gives the desired number of bytes to be read.

`io.stdin.read_provider.read` no longer restricts the number of bytes to just one, but tries to read `count` bytes. This change currently breaks the C backend, since the C intrinsic apparently attempts to read all bytes until `count` or end-of-file is reached. The C backend should instead check how many bytes can be read without blocking and read only those. Only if none are available should it block until data arrives, and then read all available data up to the given limit.
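The two behaviours can be contrasted with a small Python sketch that uses `os.read` on a pipe; `os.read` returns as soon as some data is available, up to the requested limit. This is only an illustration of the semantics described above, not the C intrinsic itself, and the function names `read_available` and `read_until_full` are hypothetical.

```python
import os


def read_available(fd, count):
    """Desired behaviour: block only until some data is available (or EOF),
    then return whatever is there, up to count bytes."""
    return os.read(fd, count)


def read_until_full(fd, count):
    """Behaviour described as problematic: keep reading until count bytes
    have arrived or end-of-file is reached."""
    parts = []
    remaining = count
    while remaining > 0:
        chunk = os.read(fd, remaining)
        if not chunk:  # end-of-file
            break
        parts.append(chunk)
        remaining -= len(chunk)
    return b"".join(parts)


if __name__ == "__main__":
    r, w = os.pipe()
    os.write(w, b"abc")
    # Only 3 bytes are in the pipe; the call returns them without
    # waiting for the other 7.
    print(read_available(r, 10))
    os.write(w, b"de")
    os.close(w)  # without this, read_until_full would wait for more data
    print(read_until_full(r, 10))
    os.close(r)
```

With an interactive stdin, the second variant would keep a program waiting for input the user has not typed yet, which is presumably why the first behaviour is the one requested for the C backend.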

I will create an issue for this C backend problem to be fixed later.

Added two convenience features, `io.buffered.read_fully` and `io.buffered.read_lines`, to read all bytes into a large array or read all lines of a file into an array of Strings, respectively.
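For readers unfamiliar with this kind of helper, the following Python sketch shows roughly what such convenience functions do. It borrows the names `read_fully` and `read_lines` for readability only; the signatures, the chunk size, and the UTF-8 and line-splitting rules are assumptions and are not taken from the Fuzion library.

```python
import io


def read_fully(stream, chunk_size=4096):
    """Read chunks until end-of-file and return all bytes as one object."""
    chunks = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:  # end-of-file
            break
        chunks.append(chunk)
    return b"".join(chunks)


def read_lines(stream):
    """Read everything, decode as UTF-8 (an assumption) and split into lines."""
    return read_fully(stream).decode("utf-8").splitlines()


if __name__ == "__main__":
    print(read_fully(io.BytesIO(b"a\nbc\n")))
    print(read_lines(io.BytesIO(b"a\nbc\n")))
```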


fridis requested a review from michaellilltokiwa December 9, 2023 22:59

michaellilltokiwa approved these changes Dec 11, 2023


michaellilltokiwa merged commit b3bfdc5 into main Dec 11, 2023
5 checks passed

michaellilltokiwa deleted the lib_stdin_and_reader_improvements branch December 11, 2023 09:14
