
Read more fasterer #8552

Merged
merged 2 commits into fish-shell:master on Mar 13, 2022
Conversation

@faho (Member) commented Dec 14, 2021

We can't always read in chunks because we often can't bear to
overread:

```fish
echo foo\nbar | begin
    read -l foo
    read -l bar
end
```

needs to have the first read read `foo` and the second read `bar`. So
here we can only read one byte at a time.

However, when we are directly redirected:

```fish
echo foo | read foo
```

we can, because the data is only for us anyway. The stream will be
closed after, so anything not read just goes away. Nobody else is
there to read.
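
For illustration, a minimal sketch of that drain behavior (names are arbitrary, not from the PR): everything `read` leaves behind vanishes with the pipe, because `read` is its sole consumer.

```fish
# only the first line is assigned; "bar" and "baz" are discarded
# when the pipe closes, since nothing else reads from it
printf '%s\n' foo bar baz | read -l first
echo $first
# Output: foo
```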

This dramatically speeds up `read` of long lines through a pipe. How
much depends on the length of the line.

With lines of 5000 characters it's about 15x, with lines of 50
characters about 2x, lines of 5 characters about 1.07x.
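
A rough way to reproduce numbers like these (a sketch, not from the PR; the `hyperfine` invocation and the fish-before/fish-after binary names are illustrative):

```fish
# compare builds without and with this change on the 5000-character case
hyperfine \
    "./fish-before -c 'string repeat -n 5000 a | read -l foo'" \
    "./fish-after -c 'string repeat -n 5000 a | read -l foo'"
```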

See #8542.

TODOs:

  • [N/A] Changes to fish usage are reflected in user documentation/manpages.
  • Tests have been added for regressions fixed
  • User-visible changes noted in CHANGELOG.rst

Tentatively scheduling this for after 3.4 because I am not entirely sure if the assumptions hold. Is there a case where we are directly redirected but the fd is used?

@faho added the performance label (Purely performance-related enhancement without any changes in black box output) on Dec 14, 2021
@faho added this to the fish-future milestone on Dec 14, 2021
@faho (Member, Author) commented Dec 14, 2021

With lines of 5000 characters it's about 15x, with lines of 50
characters about 2x, lines of 5 characters about 1.07x.

The theoretical upper limit on this is 128x, because we read in 128-byte chunks: one chunked read() fetches up to 128 times as many bytes as a single-byte read(). I don't think it's likely to be reached because there are other factors in play and it would have to line up perfectly.

In practice the maximum I can seem to reach with an end-to-end benchmark of

```fish
for i in (seq NUM1)
    string repeat -n NUM2 a | read -l foo
end
```

is about 24x, with lines of 1 million bytes.
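
For concreteness, a filled-in instance of that benchmark (illustrative values, not from the PR; absolute timings vary by machine, measured via fish's `time` keyword):

```fish
# 1000 iterations of a 5000-character line through a pipe
time fish -c '
    for i in (seq 1000)
        string repeat -n 5000 a | read -l foo
    end
'
```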

@faho (Member, Author) commented Dec 14, 2021

Alternative branch name: how-much-read-would-a-builtin-read-if-a-builtin-would-read-chunks.

And this also removes the lseek when we're directly redirected to a seekable fd. That's the case in `read foo < file`. Here the performance improves more for short lines (because it's just one less thing to do). For a 50-char line it's 10%, for a 5000-char one it's 1%. It shouldn't ever be slower.
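
For illustration, the directly-redirected seekable case in question (a sketch; the scratch-file path is hypothetical):

```fish
# read consumes only the first line; with this change fish no longer
# lseek()s back, the leftover data is simply left unread as the fd closes
printf '%s\n' alpha beta gamma > /tmp/lines.txt
read -l first < /tmp/lines.txt
echo $first
# Output: alpha
```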

@IlanCosman (Contributor):

Bwahahah, now there will never be a reason to use command substitution ever again! 😈

how-much-read-would-a-builtin-read-if-a-builtin-would-read-chunks

Alternative-alternative branch name, now with 33% more read + past tense read:
what-would-the-read-speed-of-builtin-read-be-if-builtin-read-read-chunks

Tentatively scheduling this for after 3.4

Noooooo, give performance now! 😋

@faho (Member, Author) commented Dec 14, 2021

It's important to note that this is a microbenchmark. It's only really a thing in really really tight loops, like in that benchmark.

In an actual useful fish script, most of the time is spent in an external command. E.g. I believe this would not speed up fish_git_prompt by any perceptible amount, because that one is stuck waiting on `git status` whenever it takes any significant amount of time.

And when it is stuck inside fish, it's stuck in something like functions buffering output, heavy stringery in e.g. `string`, or heavy file crunching in globs. Which is why things like #5635 or 6af3896 (plus the other recent glob performance work) are much more important for actual perceptible performance.

That's why this is rather low priority, and why I would rather wait a release to ensure there's not an edge case I've missed.

```cpp
// under the assumption that the stream will be closed soon anyway.
// You don't rewind VHS tapes before throwing them in the trash.
// TODO: Do this when nchars is set by seeking back.
exit_res = read_in_chunks(streams.stdin_fd, buff, opts.split_null,
                          !streams.stdin_is_directly_redirected);
```
Review comment (Member):

Looks like a reasonable change.

// TODO: Do this when nchars is set by seeking back.

I'm not sure what that would look like. The stream may not be seekable, so we can't actually lseek() back.
But the same assumption holds: we're draining the stream, so it's okay to spill.
Passing nchars to read_in_chunks seems ugly, so I'd probably just disable this optimization for `read --nchars`.
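
For reference, the `--nchars` case under discussion (a minimal sketch): over a direct pipe redirection the spill would be harmless, because the leftover bytes have no other consumer.

```fish
# read exactly 3 characters; anything a chunked read over-consumed
# would be discarded along with the pipe
echo abcdef | read --nchars 3 -l chars
echo $chars
# Output: abc
```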

Is there a case where we are directly redirected but the fd is used?

Probably not today, but in bash we can do something like

```sh
exec 3< README.rst
read line1 <&3
read line2 <&3
```

which would break if we read too much.
That's a pretty low-level API. In this case it's more elegant than a loop, but it's also obscure and easy to misuse.
I don't expect that fish wants to expose fds directly. It would be interesting to know where this feature is used, but I wouldn't worry about that now.

Review comment (@faho, Member, Author):

I'm not sure what that would look like. The stream may not be seekable, so we can't actually lseek() back.
But the same assumption holds: we're draining the stream, so it's okay to spill.

That would be the idea, yes. Either seek back to after the last byte of interest, or don't seek.

Passing nchars to read_in_chunks seems ugly, so I'd probably just disable this optimization for read --nchars.

Yeah, it's also probably not super important for nchars, because there you most likely have a small n, and that's your linear factor. If you have an n of 5, you'd have a maximum of 5 read()s vs 1 read().

Probably not today but in bash we can do something like

Tbh I quite dislike how bash does it, especially how it abuses exec. However it is a case of fd reuse and we should keep that in mind, either by picking a different design that doesn't look directly redirected, or by detecting if it's redirected like that (e.g. checking if the redirection is to an explicit fd, or asking the streams object if the redirection is about to be closed after this).

I'm not sure if "directly redirected" is the absolute correct condition here - it's possible that e.g. we should be able to handle redirecting a block to another block, which (I think) buffers the output. I don't know off-hand if our buffers are seekable, but they theoretically could be, but probably not via lseek(). So reading directly from the fd might be too low-level, and maybe the right abstraction is to implement seeking on the different io types.

But it's a reasonable proxy for the simple case, and an easy win. And since it's purely an optimization, we can just walk it back in the worst case if we decide to enable additional semantics that prohibit it.

Review comment (@faho, Member, Author):

Okay, I believe this could be triggered via `read <&0` already. I'll have to check later.

Review comment (@faho, Member, Author):

#5714. Squint and it looks like a feature.

Review comment (Member):

What would be a better way to do this without "abusing" exec? I would like to see the feature land here someday but agree maybe something more intuitive could be found.

Review comment (@floam, Member, Dec 17, 2021):

`exec --man` in ksh93:

```
If command is specified, then the current shell process will be replaced by
command rather than running command and waiting for it to complete. Note that
there is no need to use exec to enhance performance since the shell
implicitly uses the exec mechanism internally whenever possible.

If no operands are specified, exec can be used to open or close files, or to
manipulate file descriptors from 0 to 9 in the current shell environment
using the standard redirection mechanism available with all commands. The
close-on-exec flags will be set on file descriptor numbers greater than 2
that are opened this way so that they will be closed when another program is
invoked.
```

Going by their documentation, this other modality under which exec is used is basically for manipulating fds. I would suggest that if we don't ever allow the second form, we add a builtin `fd` or something. But I also think this is a somewhat rare, advanced thing to need to do, and maybe it's just simpler to follow the existing behaviors out there in the name of discoverability. Probably anybody needing to do this is already familiar with them.

Review comment (@faho, Member, Author):

`exec --man` in ksh93:

It's documented, sure. But how does redirecting arbitrary file descriptors belong in the "make this other command replace me" builtin?

It's two entirely disparate things to do! That's why I call it "abuse".

But anyway: This has very little to do with read, we have other issues open about it.

Review comment (@mqudsi, Contributor, Dec 26, 2021):

bash's `exec n< foo` can be subbed with `mkfifo` (and I actually have fish scripts that create, pipe to, and read from FIFOs for advanced parallel processing). Would that be an issue here?

nvm, @ridiculousfish already raised the same concern.

@ridiculousfish (Member):

This is clever but unfortunately it will run afoul of fifos as written, and perhaps some other exotic file-like objects. An example of how to reproduce a failure:

```fish
# terminal 1
mkfifo fifo
echo abc\ndef > fifo

# terminal 2
read var1 < fifo
read var2 < fifo
```

With this change, the first read reads both lines, so the second read hangs.

However, I suspect it's safe with pipes: pipes themselves buffer, so there's no way to stop a pipe from over-reading. Maybe we could make this safer by adding a `stdin_is_piped` bool, analogous to `out_is_piped` and `err_is_piped`, and only apply this if the stream is seekable or stdin is piped, ignoring file redirections.
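
A sketch of the distinction being proposed (not code from the PR; the file name is hypothetical):

```fish
# piped stdin: the pipe has no other consumer, and the kernel pipe
# buffer already decouples writer and reader, so chunking is safe
echo data | read -l x

# file redirection: the underlying object might be a fifo or another
# exotic file, where over-reading steals data from a later reader
read -l y < some_fifo_or_file
```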

@krobelus (Member) commented Dec 19, 2021

With this change, the first read reads both lines, so the second read hangs.

This is expected: the second read already hung before, same in bash.
This is because fifos are just (named) pipes.
Opening a fifo for writing will block until someone else opens it for reading (and vice versa).
After that, a normal pipe is established for communication between the reader and writer processes, and no one else is supposed to tap the pipe.

Of course both readers are the same process but since we open and close the fifo in between, we can't share data.

We also have this problem with `read_one_char_at_a_time()`:

```fish
# terminal 1
read a < fifo; read b < fifo; echo $a $b

# terminal 2
echo 1 > fifo; echo 2 > fifo
```

On my system, the first read occasionally consumes both lines.
I believe this is because

  1. the OS buffers pipe reads/writes, so even if we only read the first line, we might have consumed more.
  2. we don't close the fifo after the first echo
    We could probably try to fix that but the same thing happens with bash, so I don't think it's important.
    Would be great if we could somehow warn/error on such races but that seems very hard.

One interesting difference from bash is that `/bin/echo > fifo &` (without a reader) blocks in fish, while bash forks a process first.

It gets even trickier if one process uses a fifo in two places concurrently.
Commands like `read a < fifo & read b < fifo` are racy; we might try to warn/error in that case.

I'm not sure if "directly redirected" is the absolute correct condition here

Yeah, it feels close, but it's probably not the exact criterion.

@ridiculousfish (Member):

This is expected: the second read already hung before, same in bash. This is because fifos are just (named) pipes. Opening a fifo for writing will block until someone else opens it for reading (and vice versa). After that, a normal pipe is established for communication between the reader and writer processes, and no one else is supposed to tap the pipe.

Whoa, for years I had thought that FIFOs were always unbuffered at the kernel level, and that writes to a FIFO would block until there was a reader. TIL! Ok, then there's no behavior change here. I wonder why I saw inconsistent results.

```
@@ -366,3 +366,12 @@ function-scoped-read
# CHECK: $skamtebord[1]: |bar|
# CHECK: $craaab: set in local scope, unexported, with 1 elements
# CHECK: $craaab[1]: |baz|

echo foo\nbar\nbaz | begin
```
Review comment (Contributor):

I would suggest leaving a comment indicating that this is specifically testing chunked vs byte-by-byte reads, just to have pity on whoever needs to track down a problem at a later date :)

@mqudsi (Contributor) commented Dec 26, 2021

@krobelus

One interesting difference from bash is that "/bin/echo > fifo &" (without a reader) blocks in fish, while bash forks a process first.

See #7422 - I thought it was a more fundamental issue but it turns out it's just for saner error handling and we can technically patch this today.

@mqudsi (Contributor) commented Dec 26, 2021

Previous discussion on read performance issues caused by unbuffered reading here: #4972

@faho faho merged commit 9ada7d9 into fish-shell:master Mar 13, 2022
@faho faho modified the milestones: fish-future, fish 3.5.0 Mar 13, 2022
@faho (Member, Author) commented Mar 13, 2022

Merged now, for 3.5.0. If we find any issues with it, it's trivial to back out.

@IlanCosman (Contributor):

Hmm yah, this did break stuff unfortunately 😞

```fish
begin
    echo 1
    echo 2
end | read -l --line foo bar
echo $foo $bar
```

On fish 3.4 this prints `1 2`. On fish version 3.4.0-20-g97127c3e0, it prints `1`.

faho added a commit that referenced this pull request Mar 14, 2022
@faho (Member, Author) commented Mar 14, 2022

Fixed. Yeah, `--line` is a bit of a mess. It's implemented in terms of reading one line per run through the loop, n times, and that doesn't "really" work if we're just doing direct buffering like that.

So let's just let it read byte-by-byte like it did before.
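
For reference, the restored `--line` behavior (a minimal sketch matching the report above):

```fish
# --line assigns one line of input per variable, so the first read
# must not consume the line destined for the second variable
begin
    echo 1
    echo 2
end | read -l --line foo bar
echo $foo $bar
# Output: 1 2
```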

@faho faho deleted the read-more-fasterer branch May 24, 2022 10:14
@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 29, 2023