Skip to content

walk/git: split ls-files and stat paths concurrently#694

Open
Mic92 wants to merge 2 commits into
mainfrom
git-walker-parallel
Open

walk/git: split ls-files and stat paths concurrently#694
Mic92 wants to merge 2 commits into
mainfrom
git-walker-parallel

Conversation

@Mic92
Copy link
Copy Markdown
Member

@Mic92 Mic92 commented May 3, 2026

walk/git: run ls-files --cached / --others separately and stat in parallel
When --cached and --others are passed together, git only flushes output
after the untracked scan is done. On nixpkgs that is ~1.5s before the
first formatter can be spawned. After that we still Lstat every path
sequentially.

So run the two ls-files queries concurrently (the index-only one streams
right away) and do the Lstat calls in a worker pool.

On nixpkgs with --no-cache and nixfmt-rs this brings the wall time from
3.26s to 1.79s on a 64-core machine; the first formatter spawns after
80ms instead of 1564ms.

The same paths are emitted, just no longer in a deterministic order. As
far as I can tell nothing in the scheduler depends on that.

Mic92 added 2 commits May 3, 2026 22:23
…allel

When --cached and --others are passed together, git only flushes output
after the untracked scan is done. On nixpkgs that is ~1.5s before the
first formatter can be spawned. After that we still Lstat every path
sequentially.

So run the two ls-files queries concurrently (the index-only one streams
right away) and do the Lstat calls in a worker pool.

On nixpkgs with --no-cache and nixfmt-rs this brings the wall time from
3.26s to 1.79s on a 64-core machine; the first formatter spawns after
80ms instead of 1564ms.

The same paths are emitted, just no longer in a deterministic order. As
far as I can tell nothing in the scheduler depends on that.
When the format loop exits early (Ctrl+C, or `on-unmatched = fatal`),
walker.Close() is called while producers are still blocked on send --
git/jj on a full 64 KB stdout pipe (~1.6k paths), the filesystem walker
on a full filesCh -- so errgroup.Wait() hangs until kill -9.

Have Close() fire a cancellation signal first: a context for the git/jj
readers (also kills the child via exec.CommandContext) and a done
channel that filepath.Walk selects against. Regression tests overflow
the buffer and assert Close() returns without Read() being called.
@Mic92
Copy link
Copy Markdown
Member Author

Mic92 commented May 3, 2026

I noticed that, while working on nixfmt-rs and treefmt was slower than it should be...

Copy link
Copy Markdown
Collaborator

@jfly jfly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When --cached and --others are passed together, git only flushes output
after the untracked scan is done. On nixpkgs that is ~1.5s before the
first formatter can be spawned.

I'm not sure I'm able to reproduce this behavior. On my laptop, the entire scan of nixpkgs finishes in a lot less than 1.5s:

~/src/github.com/NixOS/nixpkgs❯ python time-to-first-line.py git ls-files --cached --others --exclude-standard --stage
0.01 ms: Starting git ls-files --cached --others --exclude-standard --stage
147.87 ms (+147.86 ms): I got a line of output
166.88 ms (+19.01 ms): Command returned 52473 more lines and exited with code 0

I do see that removing --others is a lot faster, and perhaps has some effect on buffering:

~/src/github.com/NixOS/nixpkgs❯ python time-to-first-line.py git ls-files --cached --exclude-standard --stage
0.01 ms: Starting git ls-files --cached --exclude-standard --stage
11.75 ms (+11.74 ms): I got a line of output
32.86 ms (+21.12 ms): Command returned 52473 more lines and exited with code 0
time-to-first-line.py
import subprocess
import sys
import shlex
import time

NANOS_PER_MILLI = 10**6

def pretty_elapsed(elapsed_ns: int) -> str:
    millis = elapsed_ns / NANOS_PER_MILLI
    return f"{millis:.2f} ms"

start = time.perf_counter_ns()
prev_log_ns : int|None = None

def log(msg: str):
    global prev_log_ns
    now = time.perf_counter_ns()
    if prev_log_ns is not None:
        delta = now - prev_log_ns
        delta_suffix = f" (+{pretty_elapsed(delta)})"
    else:
        delta_suffix = ""

    print(f"{pretty_elapsed(now - start)}{delta_suffix}: {msg}")
    prev_log_ns = now

cmd = sys.argv[1:]
log(f"Starting {shlex.join(cmd)}")
p = subprocess.Popen(
    cmd,
    text=True,
    stdout=subprocess.PIPE,
)

assert p.stdout is not None
first_line = p.stdout.readline()
log("I got a line of output")

stdout, _ = p.communicate()
exit_code = p.wait()
log(f"Command returned {len(stdout.splitlines())} more lines and exited with code {exit_code}")
assert exit_code == 0

Comment thread walk/git.go

// `--cached` and `--others` are queried separately because git buffers all
// output until the untracked scan finishes when both are combined; the
// index-only query streams immediately so formatters start without waiting.
Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why? Is this an issue that could be fixed upstream in git? It seems like it would be a lot cleaner to fix there then the contortions we're going through here.

At the least, this claim needs some sort of citation so future readers can check if it still applies.

Copy link
Copy Markdown
Collaborator

@jfly jfly left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just skimmed the "walk: unblock producers when Close() is called before EOF" commit. I was hoping we could land it independently of the walk split logic, but it looks like it touches some of the same code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants