ripgrep doesn't stop when its pipe is closed #200

BurntSushi · 2016-10-29T12:58:03Z

For example, the following two rg commands take the same amount of time, but the second one should finish much sooner:

[andrew@Cheetah subtitles] ls -lh OpenSubtitles2016.raw.en 
-rw-r--r-- 1 andrew users 9.3G Sep 10 11:51 OpenSubtitles2016.raw.en
[andrew@Cheetah subtitles] time rg 'Sherlock Holmes' OpenSubtitles2016.raw.en | wc -l
5107

real    0m1.602s
user    0m1.250s
sys     0m0.350s
[andrew@Cheetah subtitles] time rg 'Sherlock Holmes' OpenSubtitles2016.raw.en | head -n1
You read Sherlock Holmes to deduce that?

real    0m1.626s
user    0m1.247s
sys     0m0.377s

BurntSushi · 2016-10-30T02:06:35Z

This one isn't going to be fun to fix. This is what I get for ignoring any errors that occur when printing to stdout. (And I really should know better, I handle this correctly in xsv.) The issue is that a write to stdout in this case will fail with a pipe error, at which point, we should stop searching and quit.

The primary difficulty at present is bubbling up an error from the printing code all the way through the search code. Both the search and print code assume no errors can happen. We could just thread an error through everything.

My question for this in terms of UX is: do we treat all IO errors equally when writing output? Should we do something different if we see a pipe error versus, say, a permission error? Maybe any type of error causes ripgrep to stop whatever it's doing, but a pipe error indicates normal termination, whereas anything else results in a non-zero exit code and the error being printed to stderr.
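
The policy floated above can be sketched roughly as follows. This is a minimal illustration, not ripgrep's actual code: `Outcome`, `print_matches`, `exit_code`, and the `FailingWriter` test double are hypothetical names, and the exit code value 2 is an assumption.

```rust
use std::io::{self, Write};

/// How a print loop ended (a hypothetical type for this sketch).
#[derive(Debug, PartialEq)]
enum Outcome {
    /// Every match was written.
    Finished,
    /// The reader closed the pipe, so we stopped early and quietly.
    ReaderGone,
}

/// Write each match on its own line, stopping at the first failed write.
fn print_matches<W: Write>(out: &mut W, matches: &[&str]) -> io::Result<Outcome> {
    for m in matches {
        match writeln!(out, "{}", m) {
            Ok(()) => {}
            // A broken pipe means the consumer has seen enough: normal termination.
            Err(e) if e.kind() == io::ErrorKind::BrokenPipe => return Ok(Outcome::ReaderGone),
            // Anything else (e.g. a permission error) is a real failure.
            Err(e) => return Err(e),
        }
    }
    Ok(Outcome::Finished)
}

/// Map the result to a process exit code (the value 2 is an assumption).
fn exit_code(result: &io::Result<Outcome>) -> i32 {
    match result {
        Ok(_) => 0,
        Err(err) => {
            eprintln!("error writing output: {}", err);
            2
        }
    }
}

/// Test double: accepts `limit` bytes, then fails every write with `kind`.
struct FailingWriter {
    limit: usize,
    written: usize,
    kind: io::ErrorKind,
}

impl Write for FailingWriter {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        if self.written >= self.limit {
            return Err(io::Error::from(self.kind));
        }
        self.written += buf.len();
        Ok(buf.len())
    }

    fn flush(&mut self) -> io::Result<()> {
        Ok(())
    }
}
```

Both kinds of error stop the loop at the first failed write; the only difference is whether the caller reports a failure afterwards.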

BurntSushi · 2016-10-30T02:09:09Z

I might elect to forgo fixing this until I factor the search code out into a separate crate. (Which will be a while. My guess is at least a month.)

danr · 2017-02-08T09:57:09Z

I upgraded yesterday from rg 0.3.2-1 to 0.4 and now these commands are essentially useless:

rg --files | head
rg --files | fzf

This is too bad, since I use rg+fzf in my workflow.

BurntSushi · 2017-02-18T21:37:10Z

@danr There was no regression. This bug was in 0.3.2 as well.

"useless" sounds like a bit of an exaggeration.

danr · 2017-02-27T09:37:48Z

Sorry, I really did not mean to sound harsh. Thank you for your work on ripgrep!

kpp · 2017-03-31T19:45:41Z

This is how grep acts:

$ strace grep Holmes subtitles/OpenSubtitles2016.raw.en 2>grep.log  | head -n 1

...
read(3, "s\nWell, Watson, what about him?\n"..., 32768) = 32768
write(1, "Roy Holmes.\nThen Simon Harrison "..., 4096) = 4096
read(3, "me, because he married against h"..., 32768) = 32768
...
read(3, "uch.\nWell, you built a real nice"..., 32768) = 32768
write(1, " a swamp.\nMr. Holmes, do not rea"..., 4096) = -1 EPIPE (Broken pipe)
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=11211, si_uid=1000} ---
+++ killed by SIGPIPE +++

This is how rg acts:

$ strace rg Holmes subtitles/OpenSubtitles2016.raw.en 2>rg.log  | head -n 1 

...
write(1, "Roy Holmes.\n", 12)           = 12
write(1, "Then Simon Harrison killing Barr"..., 80) = -1 EPIPE (Broken pipe)
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=11310, si_uid=1000} ---
write(1, "Then Simon Harrison killing Barr"..., 80) = -1 EPIPE (Broken pipe)
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=11310, si_uid=1000} ---
write(1, "Then Simon Harrison killing Barr"..., 80) = -1 EPIPE (Broken pipe)
--- SIGPIPE {si_signo=SIGPIPE, si_code=SI_USER, si_pid=11310, si_uid=1000} ---
...

BurntSushi · 2017-03-31T19:47:39Z

@kpp Right, this bug is understood, it's just a pain to fix. Please do not fix it. I would like to fix it personally when I refactor this code into a separate crate.

kpp · 2017-03-31T19:50:49Z

grep is faster than rg! This issue is the proof! All hail grep! grep, grep, grep!

glandium · 2017-03-31T22:07:10Z

Actually, what that strace says is that grep does an explicit "panic" of some sort (it raises SIGPIPE) when it gets a EPIPE error from write. It doesn't bubble the error up or anything, which is what your proposed fix needs refactoring for.

BurntSushi · 2017-03-31T22:14:06Z

@glandium Right. Rust's standard library (IIRC) suppresses the SIGPIPE signal so that consumers need to explicitly handle it. The downside of that design is precisely that bugs like this occur, but the upside is that you get a bit more control. I was just being stupid when I wrote down the initial printer code. :-)

oconnor663 · 2017-08-24T19:56:33Z

Could we just libc::signal(libc::SIGPIPE, libc::SIG_DFL) somewhere early in main(), if we wanted an easy workaround for the short term? Not sure what the Windows equivalent is though. Maybe the more portable thing would be to std::process::exit on write errors? Gross in library code, but anyway just until the error plumbing is there.
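
For reference, a rough Unix-only sketch of that workaround. To keep it free of external crates it declares the C `signal` function by hand instead of using the libc crate; the binding, the constants, and the name `restore_default_sigpipe` are assumptions made for illustration, and the value 13 for SIGPIPE holds on Linux and macOS but is not guaranteed elsewhere.

```rust
use std::os::raw::c_int;

// Hand-written binding so the sketch needs no external crate. With the libc
// crate the call would be `libc::signal(libc::SIGPIPE, libc::SIG_DFL)`.
extern "C" {
    fn signal(signum: c_int, handler: usize) -> usize;
}

const SIGPIPE: c_int = 13; // Linux and macOS value (assumption elsewhere)
const SIG_DFL: usize = 0; // default action: terminate the process
const SIG_ERR: usize = usize::MAX; // returned by signal() on failure

/// Restore the default SIGPIPE action and return the previous disposition.
/// Meant to be called once, early in main(), before any output is written.
fn restore_default_sigpipe() -> usize {
    // SAFETY: signal() is called with a valid signal number and SIG_DFL.
    unsafe { signal(SIGPIPE, SIG_DFL) }
}
```

Once the default action is restored, a write to a closed pipe kills the process immediately, as in the grep trace above, so no error has to be passed back through the search code.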

BurntSushi · 2017-08-27T19:01:52Z

@oconnor663 graciously submitted a PR implementing the libc::signal idea, which means this is fixed for now on Unix. I'm going to leave this open to track a proper fix.

junegunn · 2017-09-27T03:36:57Z

@BurntSushi Hi, any plans for a patch release including the fix? Many users, myself included, would want to install ripgrep using Homebrew/Linuxbrew, and use it with a secondary filter like fzf.

BurntSushi · 2017-09-27T10:27:21Z

@junegunn Sure, I can do that. Hopefully soon.

junegunn · 2017-10-23T02:37:12Z

Confirmed fixed in 0.7.1 on macOS. Thanks.

Since BurntSushi/ripgrep#200 is fixed in 0.7.1, we can safely suggest ripgrep as the candidate generator as it has a more precise implementation of gitignore filtering than the silver searcher.

dpnova · 2017-11-01T01:20:09Z

@junegunn it's still an issue for me on 0.7.1 on ubuntu - did you use a binary release?

Using this command:

export FZF_DEFAULT_COMMAND='rg --files --hidden --follow --glob "!.git/*" --glob "!target/*"'

BurntSushi · 2017-11-01T01:27:25Z

@dpnova Could you please provide a reproducible example on a corpus that is public without using fzf? I've tested this myself on Linux and it works fine.

dpnova · 2017-11-01T02:04:10Z

Just confirming... to repro I should be able to run

rg --files --hidden --follow --glob "" --glob '!target/*' in the same folder I'm starting vim (with the fzf.vim plugin)

dpnova · 2017-11-01T10:06:40Z

FWIW I can't repro without fzf. My test case is simply running the command. I'm sure I'm missing something though.

pbogut · 2017-11-01T10:17:12Z

Do you have any big files in your repo? I had this problem when there was a roughly 2GB SQL file in my folder. Once I added this file to .gitignore, it started to work without an issue.

BurntSushi · 2017-11-01T11:06:40Z

@pbogut The rg command in question is using --files, which means it isn't searching files.

@dpnova I don't see how that is a complete test case. Could you please provide more details? This bug doesn't impact the correctness of ripgrep (so whether you're able to "run" an rg command or not is not relevant). Rather, you need to check whether rg ... | head -n1 is noticeably faster than simply rg .... For this to make any sense at all, the rg ... command needs to run over a directory tree that is somewhat larger, otherwise you're unlikely to see a difference anyway.

For example, if I run rg --files | wc -l in a checkout of the Chromium repository (git://github.com/nwjs/chromium.src), then it takes 0.320 seconds to complete. But if I run rg --files | head -n1 | wc -l in the same repository, then it takes 0.023 seconds to complete. Before this fix landed, the latter command would always take the same amount of time as the former command because ripgrep wouldn't quit when its output pipe closed.

People, please, I'm begging you. Don't simply stop at "it doesn't work." Describe what you observe in as much detail as possible so that other people can diagnose your problem. Please, understand that not everyone uses FZF, so saying, "here's this FZF config and it doesn't work" will not get us anywhere.

dpnova · 2017-11-01T21:39:50Z

Sorry, I wasn't clear enough that I was asking for help to repro. I didn't get a response, so I tried something. Now I have a concrete case, thanks.

I'm currently waiting for the chromium repo to clone (yay Australian broadband).

In my own repo where I'm seeing the issue in fzf.vim the two cases look like this:

( rg --files; )  0.01s user 0.01s system 129% cpu 0.017 total
( rg --files | head -n1; )  0.01s user 0.01s system 119% cpu 0.015 total

dpnova · 2017-11-01T21:49:28Z

This is running it in my home directory (fairly new formatted machine).

( rg --files; )  0.10s user 0.16s system 116% cpu 0.222 total
( rg --files | head -n1; )  0.00s user 0.00s system 116% cpu 0.008 total

To me this says this specific github issue isn't the case I'm talking about, despite the fzf github referencing this one. I'll take my discussion back over there. Sorry for any confusion.

BurntSushi · 2017-11-02T11:54:47Z

@dpnova I agree with your conclusion. :-) Thanks for sticking it out and confirming that this particular bug isn't it!

ghost · 2017-12-04T07:46:40Z

@BurntSushi: Can you have a look at this issue that uses rg with fzf? I can't find a way to reproduce it without fzf.
junegunn/fzf.vim#539

junegunn · 2017-12-04T08:32:27Z

@tuyenpm9 If you read the thread, you can see that this issue is not related to your problem.

ghost · 2017-12-04T08:38:10Z

Yes, sorry about that.

This commit updates the CHANGELOG to reflect all the work done to make libripgrep a reality.

* Closes #162 (libripgrep)
* Closes #176 (multiline search)
* Closes #188 (opt-in PCRE2 support)
* Closes #244 (JSON output)
* Closes #416 (Windows CRLF support)
* Closes #917 (trim prefix whitespace)
* Closes #993 (add --null-data flag)
* Closes #997 (--passthru works with --replace)
* Fixes #2 (memory maps and context handling work)
* Fixes #200 (ripgrep stops when pipe is closed)
* Fixes #389 (more intuitive `-w/--word-regexp`)
* Fixes #643 (detection of stdin on Windows is better)
* Fixes #441, Fixes #690, Fixes #980 (empty matching lines are weird)
* Fixes #764 (coalesce color escapes)
* Fixes #922 (memory maps failing is no big deal)
* Fixes #937 (color escapes no longer used for empty matches)
* Fixes #940 (--passthru does not impact exit status)
* Fixes #1013 (show runtime CPU features in --version output)

bpstahlman · 2020-04-21T00:01:26Z

Although the fix definitely seems to have helped, ripgrep can still take a long time to notice SIGPIPE. I haven't looked at the source, but the behavior I'm seeing leads me to suspect that ripgrep doesn't notice the pipe has closed until the next time it attempts to write to it. This is problematic in a long-running search that has entered a phase in which matches are found infrequently (or not at all). I first noticed the problem with an rg | fzf pipeline on a large directory tree, but I can reproduce it with a simple shell script that just forwards its stdin to stdout after setting up a signal trap that allows me to tell it when to close the pipe. If I instruct the script to close when ripgrep is finding lots of matches, ripgrep terminates almost immediately, but if I wait until ripgrep is no longer finding matches (but hasn't yet finished the search), the pipeline continues to run (presumably until ripgrep finds another match or the search is complete). Obviously, the time during which the pipeline is effectively hung is highly dependent on the search parameters and the size of the directory tree being searched. Given that some very common use cases for ripgrep involve pipelines (e.g., vim $(rg --files-with-matches foo | fzf)), this seems like a significant issue. Does the Rust framework make it inordinately difficult to handle SIGPIPE asynchronously?

BurntSushi · 2020-04-21T00:49:37Z

@bpstahlman It would help if you could provide a more concrete reproduction that I can try. With that said, the behavior you're describing makes sense and it's what I would expect to happen.

> but the behavior I'm seeing leads me to suspect that ripgrep doesn't notice the pipe has closed until the next time it attempts to write to it

That is correct and expected. That's when a pipe error occurs:

ripgrep/crates/core/main.rs, lines 95 to 98 in 73103df:

    // A broken pipe means graceful termination.
    if err.kind() == io::ErrorKind::BrokenPipe {
        break;
    }

> this seems like a significant issue

AFAIK, you are the first one to report this as a significant problem.

> Does the Rust framework make it inordinately difficult to handle SIGPIPE asynchronously?

ripgrep does not use any Rust "framework."

For more context, Rust by default ignores SIGPIPE: rust-lang/rust#62569 (I make an appearance in that thread asking for something to be done, but there hasn't been any movement on that issue AFAIK.)

This means that pipe errors are only detected once an actual write occurs. At that point, the pipe error is reported "in band" instead of as a signal.
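
A small Unix-only sketch of what "in band" means here. It assumes a `true` executable is on the PATH to act as a reader that exits without reading anything; the function name `write_after_reader_exit` is made up for illustration.

```rust
use std::io::{self, Write};
use std::process::{Command, Stdio};

/// Spawn a reader that exits immediately, then write into its stdin pipe.
/// Returns the kind of error produced by the write, if any.
fn write_after_reader_exit() -> io::Result<Option<io::ErrorKind>> {
    // `true` stands in for a consumer like `head` that has already finished.
    let mut child = Command::new("true").stdin(Stdio::piped()).spawn()?;

    // Take the write end first: `wait()` would otherwise close it for us.
    let mut pipe = child.stdin.take().expect("stdin was requested as piped");

    // After this returns, the read end of the pipe is closed.
    child.wait()?;

    // Nothing has notified this process so far. Because Rust ignores SIGPIPE
    // by default, the failure only shows up as the result of this write.
    Ok(pipe.write_all(b"a match\n").err().map(|e| e.kind()))
}
```

On Linux and macOS this should return `Some(io::ErrorKind::BrokenPipe)`. A search that finds no further matches never reaches a write like this, so it never learns the reader is gone.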

It is trivial to stop ignoring SIGPIPE. This would make ripgrep behave like a "normal" C UNIX application. SIGPIPE gets sent to the process, and since there is no signal handler for it, the process (by default) will terminate immediately. This likely achieves the behavior you want.

ripgrep currently does not do that because it's not portable. I'm not a Windows expert, but AFAIK, there is no SIGPIPE on Windows. So ripgrep has to deal with in-band pipe errors correctly anyway for compatibility with Windows. Dealing with these types of errors has been subtle and difficult to get right. For that reason, I don't really want to support both in-band (synchronous) and out-of-band (asynchronous) ways of terminating on pipe errors. Because now I won't be dog-fooding the handling of in-band pipe errors on Unix.

bpstahlman · 2020-04-21T15:29:08Z

I understand your reasoning and appreciate the detailed explanation. I suspect the reason for the lack of complaints regarding the status quo is that for small to medium-sized projects with suitable .gitignore files, the ripgrep search will often complete before the fzf user has finished making his file selection. And if fzf occasionally appears to hang after Enter is pressed, a user is unlikely to bother reporting unless it happens often or the delays are exceptionally long. I don't recall noticing the issue until I started using ripgrep on my home directory (which is fairly large and doesn't have a .gitignore).

As for the desire to avoid maintaining both in and out-of-band SIGPIPE handling logic... I wouldn't think it would be necessary to remove the "in-band" logic to add Linux-only "out-of-band" SIGPIPE handling. The "in-band" logic needed for Windows could still be tested in a Linux build compiled without the "out-of-band" logic, while the "out-of-band" logic would simply be compiled out of the Windows build. E.g.,

Windows Release: in-band only
Linux Release: out-of-band / in-band (optional)
Linux for Windows Test: in-band only

BurntSushi · 2020-04-21T15:53:12Z

> I wouldn't think it would be necessary to remove the "in-band" logic

It isn't. The fact is that now two error paths need to be tested. That was my point.

bpstahlman · 2020-04-25T16:32:09Z

@BurntSushi @junegunn

> This means that pipe errors are only detected once an actual write occurs. At that point, the pipe error is reported "in band" instead of as a signal.
>
> It is trivial to stop ignoring SIGPIPE. This would make ripgrep behave like a "normal" C UNIX application. SIGPIPE gets sent to the process, and since there is no signal handler for it, the process (by default) will terminate immediately. This likely achieves the behavior you want.

Having looked into this a bit more, I'm not convinced that a change to SIGPIPE handling would have any impact on the behavior I'm seeing. IIUC, SIGPIPE isn't even generated until the writer process attempts to write to the closed pipe, which is too late. And I'm not even sure there's a way for a writer process to check for a closed pipe without attempting to write. I've tried using select(), fstat(), write() of 0 bytes, etc., and the only thing that gave any indication that stdout had closed was an attempt to write actual data to it. I would have expected there would be some way - even if it involved a relatively costly system call - for a writer process to check the status of a pipe without writing unwanted data if it happens to be open.

I'm not sure exactly how the fzf Vim plugin creates the rg|fzf pipeline, but I doubt it's possible for the plugin to force early termination, since it would have no way of knowing when the user's selection has been made. The fzf executable knows when the selection has been made, but may not have a clean way to kill the process at the write end of its pipeline. Ah well, perhaps the solution is to ensure that everything I want to search with rg|fzf has a good .ignore file...

Quaddroo · 2023-11-23T06:29:45Z

I believe some people that find this issue (in relation to rg | fzf) will find their woes relieved by this issue:
junegunn/fzf#2288
check the "process substitution" answer. Fixed what I wanted.

BurntSushi added the bug A bug. label Oct 29, 2016

BurntSushi mentioned this issue Dec 1, 2016

High CPU usage when piping into less? #258

Closed

magikid mentioned this issue Dec 18, 2016

[PENDING] Manual interruption #281

Closed

BurntSushi modified the milestone: libripgrep Jan 10, 2017

BurntSushi added the libripgrep An issue related to modularizing ripgrep into libraries. label Jan 11, 2017

BurntSushi mentioned this issue Jan 14, 2017

Many file skipped for unknown reasons #318

Closed

BurntSushi mentioned this issue Mar 15, 2017

Should stop when output pipe is closed #409

Closed

oconnor663 added a commit to oconnor663/ripgrep that referenced this issue Aug 27, 2017

restore the default SIGPIPE behavior as a temporary workaround

252dd97

See BurntSushi#200.

oconnor663 mentioned this issue Aug 27, 2017

restore the default SIGPIPE behavior as a temporary workaround #586

Merged

BurntSushi pushed a commit that referenced this issue Aug 27, 2017

restore the default SIGPIPE behavior as a temporary workaround

3065a8c

See #200.

junegunn mentioned this issue Sep 27, 2017

Significant delay when opening the selected file from fzf panel in bash junegunn/fzf#1064

Closed

15 tasks

junegunn mentioned this issue Oct 2, 2017

Suggestion for documentation junegunn/fzf#1067

Closed

3 tasks

This was referenced Oct 14, 2017

Update README to add _fzf_compgen_dir in addition to _fzf_compgen_path junegunn/fzf#1083

Closed

Do you have a plan to support ripgrep? junegunn/fzf.vim#468

Closed

junegunn mentioned this issue Dec 4, 2017

Soft kill terminal buffer junegunn/fzf.vim#539

Open

3 tasks

mdko mentioned this issue Jun 28, 2018

Zombie Processes Generated on File Name Generation junegunn/fzf#1324

Closed

15 tasks

BurntSushi mentioned this issue Aug 19, 2018

libripgrep: PCRE2 support, multiline search, JSON output and more #1017

Merged

BurntSushi closed this as completed in #1017 Aug 20, 2018

BurntSushi mentioned this issue Jan 22, 2020

Better interaction with pagers (Broken pipe) crev-dev/cargo-crev#287

Closed

rirze mentioned this issue Feb 29, 2020

exit on broken pipe lotabout/skim#279

Closed
