walk/git: split ls-files and stat paths concurrently#694
Conversation
…allel When --cached and --others are passed together, git only flushes output after the untracked scan is done. On nixpkgs that is ~1.5s before the first formatter can be spawned. After that we still Lstat every path sequentially. So run the two ls-files queries concurrently (the index-only one streams right away) and do the Lstat calls in a worker pool. On nixpkgs with --no-cache and nixfmt-rs this brings the wall time from 3.26s to 1.79s on a 64-core machine; the first formatter spawns after 80ms instead of 1564ms. The same paths are emitted, just no longer in a deterministic order. As far as I can tell nothing in the scheduler depends on that.
When the format loop exits early (Ctrl+C, or `on-unmatched = fatal`), walker.Close() is called while producers are still blocked on send -- git/jj on a full 64 KB stdout pipe (~1.6k paths), the filesystem walker on a full filesCh -- so errgroup.Wait() hangs until kill -9. Have Close() fire a cancellation signal first: a context for the git/jj readers (also kills the child via exec.CommandContext) and a done channel that filepath.Walk selects against. Regression tests overflow the buffer and assert Close() returns without Read() being called.
|
I noticed that, while working on nixfmt-rs and treefmt was slower than it should be... |
jfly
left a comment
There was a problem hiding this comment.
When --cached and --others are passed together, git only flushes output
after the untracked scan is done. On nixpkgs that is ~1.5s before the
first formatter can be spawned.
I'm not sure I'm able to reproduce this behavior. On my laptop, the entire scan of nixpkgs finishes in a lot less than 1.5s:
~/src/github.com/NixOS/nixpkgs❯ python time-to-first-line.py git ls-files --cached --others --exclude-standard --stage
0.01 ms: Starting git ls-files --cached --others --exclude-standard --stage
147.87 ms (+147.86 ms): I got a line of output
166.88 ms (+19.01 ms): Command returned 52473 more lines and exited with code 0I do see that removing --others is a lot faster, and perhaps has some effect on buffering:
~/src/github.com/NixOS/nixpkgs❯ python time-to-first-line.py git ls-files --cached --exclude-standard --stage
0.01 ms: Starting git ls-files --cached --exclude-standard --stage
11.75 ms (+11.74 ms): I got a line of output
32.86 ms (+21.12 ms): Command returned 52473 more lines and exited with code 0time-to-first-line.py
import subprocess
import sys
import shlex
import time
NANOS_PER_MILLI = 10**6
def pretty_elapsed(elapsed_ns: int) -> str:
millis = elapsed_ns / NANOS_PER_MILLI
return f"{millis:.2f} ms"
start = time.perf_counter_ns()
prev_log_ns : int|None = None
def log(msg: str):
global prev_log_ns
now = time.perf_counter_ns()
if prev_log_ns is not None:
delta = now - prev_log_ns
delta_suffix = f" (+{pretty_elapsed(delta)})"
else:
delta_suffix = ""
print(f"{pretty_elapsed(now - start)}{delta_suffix}: {msg}")
prev_log_ns = now
cmd = sys.argv[1:]
log(f"Starting {shlex.join(cmd)}")
p = subprocess.Popen(
cmd,
text=True,
stdout=subprocess.PIPE,
)
assert p.stdout is not None
first_line = p.stdout.readline()
log("I got a line of output")
stdout, _ = p.communicate()
exit_code = p.wait()
log(f"Command returned {len(stdout.splitlines())} more lines and exited with code {exit_code}")
assert exit_code == 0|
|
||
| // `--cached` and `--others` are queried separately because git buffers all | ||
| // output until the untracked scan finishes when both are combined; the | ||
| // index-only query streams immediately so formatters start without waiting. |
There was a problem hiding this comment.
Why? Is this an issue that could be fixed upstream in git? It seems like it would be a lot cleaner to fix there then the contortions we're going through here.
At the least, this claim needs some sort of citation so future readers can check if it still applies.
jfly
left a comment
There was a problem hiding this comment.
I just skimmed the "walk: unblock producers when Close() is called before EOF" commit. I was hoping we could land it independently of the walk split logic, but it looks like it touches some of the same code.
walk/git: run ls-files --cached / --others separately and stat in parallel
When --cached and --others are passed together, git only flushes output
after the untracked scan is done. On nixpkgs that is ~1.5s before the
first formatter can be spawned. After that we still Lstat every path
sequentially.
So run the two ls-files queries concurrently (the index-only one streams
right away) and do the Lstat calls in a worker pool.
On nixpkgs with --no-cache and nixfmt-rs this brings the wall time from
3.26s to 1.79s on a 64-core machine; the first formatter spawns after
80ms instead of 1564ms.
The same paths are emitted, just no longer in a deterministic order. As
far as I can tell nothing in the scheduler depends on that.