AoC PR: lib: performance and API improvements for stdin and reader #2343

Merged
merged 1 commit into from
Dec 11, 2023

@fridis fridis commented Dec 9, 2023

There are three changes in this patch:

  1. big performance improvements by avoiding list iteration
  2. smaller performance improvements by reading several bytes at once
  3. additional convenience APIs for reading a file fully

Replaced the loop condition `while (mutate.flat_map_sequence u8 x->x).count < n` with `n_read < n`. The old condition was by far the biggest performance issue when reading bytes, since it made reading n bytes take O(n²) time.
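To illustrate why such a loop condition is quadratic, here is a minimal Python sketch (not Fuzion, and not the library's actual code). The names `read_counting_each_time` and `read_with_counter` are hypothetical, and the sketch assumes the old condition re-counted everything read so far on each test. The `steps` value is only there to make the cost visible.

```python
def read_counting_each_time(source, n):
    """Re-count everything read so far on every loop test: O(n^2) overall."""
    chunks = []
    steps = 0  # chunks visited while re-counting, for illustration only
    it = iter(source)
    while True:
        # Analogue of a `.count < n` test on the accumulated data:
        # walk all chunks collected so far.
        total = 0
        for c in chunks:
            total += len(c)
            steps += 1
        if total >= n:
            break
        chunk = next(it, None)
        if chunk is None:  # source exhausted
            break
        chunks.append(chunk)
    return b"".join(chunks), steps


def read_with_counter(source, n):
    """Keep a running n_read counter: O(n) overall."""
    chunks = []
    n_read = 0
    steps = 0  # one step per chunk read
    it = iter(source)
    while n_read < n:
        chunk = next(it, None)
        if chunk is None:  # source exhausted
            break
        chunks.append(chunk)
        n_read += len(chunk)
        steps += 1
    return b"".join(chunks), steps
```

With single-byte chunks, the first variant visits 0 + 1 + ... + n chunks in total, while the second visits each chunk once.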

Replaced `drop` in `reader.discard` with `slice`, which is O(1) and does not cause deep recursion.
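The difference between the two operations can be sketched in Python as follows. This is an illustration under assumptions, not the Fuzion implementation: `Cons`, `drop` and `ArraySlice` are hypothetical names, and the linked-list shape is only what the remark about deep recursion suggests.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Cons:
    """A node of a singly linked list (hypothetical stand-in for a lazy list)."""
    head: int
    tail: Optional["Cons"]


def drop(lst, k):
    """Recursive drop: O(k) time and O(k) call depth."""
    if k == 0 or lst is None:
        return lst
    return drop(lst.tail, k - 1)


@dataclass
class ArraySlice:
    """A view on shared data, described by its bounds only."""
    data: bytes
    start: int
    end: int

    def slice(self, frm, to):
        # O(1): only the bounds change, the underlying data is shared.
        return ArraySlice(self.data, self.start + frm, self.start + to)

    def to_bytes(self):
        return self.data[self.start:self.end]
```

Discarding the first k elements via `drop` costs k recursive calls, which can exhaust the call stack for large k; `slice` does a constant amount of work regardless of k.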

Added a version of `io.buffered.reader.read` with an argument `max_n` that gives the desired number of bytes to be read.

`io.stdin.read_provider.read` no longer restricts the number of bytes to just one, but tries to read `count` bytes. This change currently breaks the C backend, since the C intrinsic apparently attempts to read until `count` bytes or end-of-file is reached. The C backend should instead check how many bytes can be read without blocking and read only those. Only if none are available should it block until data arrives and then read all available data, up to the given limit of bytes.
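The two read behaviours contrasted above can be sketched in Python on top of `os.read`, which performs a single read on the file descriptor. On a pipe this returns whatever is available, up to the limit, and blocks only while nothing is available. The function names are hypothetical and this is not the C backend's code, only an illustration of the intended semantics.

```python
import os


def read_available(fd, count):
    """Desired behaviour: return 1..count bytes, or b"" at end-of-file.

    A single os.read call blocks only while no data is available and
    then returns what is there, up to `count` bytes.
    """
    return os.read(fd, count)


def read_until_count_or_eof(fd, count):
    """Problematic behaviour: keep reading until `count` bytes or EOF.

    On an interactive stdin this blocks even though some data has
    already arrived.
    """
    parts = []
    remaining = count
    while remaining > 0:
        chunk = os.read(fd, remaining)
        if not chunk:  # end-of-file
            break
        parts.append(chunk)
        remaining -= len(chunk)
    return b"".join(parts)
```

For example, after writing five bytes to a pipe, `read_available(r, 1024)` returns those five bytes immediately, whereas `read_until_count_or_eof(r, 1024)` would wait for more data until the write end is closed.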

I will create an issue for this C backend problem to be fixed later.

Added two convenience features, `io.buffered.read_fully` and `io.buffered.read_lines`, which read all bytes into a large array or all lines of a file into an array of Strings, respectively.
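As a rough idea of what such helpers do, here is a Python sketch built on a `read(max_n)` style callable. The function bodies, the `chunk_size` parameter, the UTF-8 decoding and the handling of line terminators are all assumptions for illustration; the Fuzion features may differ in each of these points.

```python
import io


def read_fully(read, chunk_size=4096):
    """Collect all bytes until `read` returns b"" (end-of-file)."""
    buf = bytearray()
    while True:
        chunk = read(chunk_size)
        if not chunk:
            return bytes(buf)
        buf += chunk


def read_lines(read, chunk_size=4096):
    """Read everything, decode as UTF-8, split into lines without terminators."""
    return read_fully(read, chunk_size).decode("utf-8").splitlines()


# Usage with an in-memory stream standing in for a file:
lines = read_lines(io.BytesIO(b"one\ntwo\nthree\n").read, chunk_size=4)
```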

@michaellilltokiwa michaellilltokiwa merged commit b3bfdc5 into main Dec 11, 2023
5 checks passed
@michaellilltokiwa michaellilltokiwa deleted the lib_stdin_and_reader_improvements branch December 11, 2023 09:14