refactor(GitService): make getStatus faster #1265
arnestrickmann merged 2 commits into generalaction:main
…tracking and make it fast by extraction diff from one api call instead of looped call
@anuragts is attempting to deploy a commit to the General Action Team on Vercel. A member of the Team first needs to authorize it.
Greptile Summary
This PR refactors getStatus in src/main/services/GitService.ts. The key changes and the issue found are summarized in the file overview below.
Confidence Score: 3/5
| Filename | Overview |
|---|---|
| src/main/services/GitService.ts | Refactors getStatus to run two bulk git diff --numstat calls instead of 2N per-file calls, with a map-based O(1) lookup per entry. Significant performance improvement, but the batch command's rename-detection output format (old => new) doesn't match the new-path key used for lookup, causing additions/deletions to be silently zeroed for renamed+modified files. |
Sequence Diagram
sequenceDiagram
participant C as Caller
participant GS as getStatus
participant Git as git (child process)
participant FS as Filesystem
Note over C,FS: ── Before (O(n) spawns) ──
C->>GS: getStatus(taskPath)
GS->>Git: git status --porcelain
Git-->>GS: N status lines
loop for each of N files
GS->>Git: git diff --numstat --cached -- fileN
Git-->>GS: staged stats
GS->>Git: git diff --numstat -- fileN
Git-->>GS: unstaged stats
end
GS-->>C: GitChange[]
Note over C,FS: ── After (O(1) spawns) ──
C->>GS: getStatus(taskPath)
GS->>Git: git status --porcelain
Git-->>GS: N status lines
par staged batch
GS->>Git: git diff --numstat --cached
Git-->>GS: all staged stats
and unstaged batch
GS->>Git: git diff --numstat
Git-->>GS: all unstaged stats
end
GS->>GS: parseNumstatMap → stagedMap, unstagedMap
opt untracked files (in parallel)
GS->>FS: countFileNewlinesCapped (×M)
FS-->>GS: line counts
end
GS->>GS: entries.map → lookup in maps O(1)
GS-->>C: GitChange[]
Last reviewed commit: d5c2bc6
const parseNumstatMap = (stdout: string): Map<string, { add: number; del: number }> => {
  const map = new Map<string, { add: number; del: number }>();
  if (!stdout || !stdout.trim()) return map;
  for (const line of stdout.trim().split('\n')) {
    if (!line.trim()) continue;
    // Each numstat line is "<added>\t<deleted>\t<path>"; binary files report "-".
    const parts = line.split('\t');
    if (parts.length >= 3) {
      const add = parts[0] === '-' ? 0 : parseInt(parts[0], 10) || 0;
      const del = parts[1] === '-' ? 0 : parseInt(parts[1], 10) || 0;
      // Re-join in case the path itself contains a tab.
      const file = parts.slice(2).join('\t');
      const existing = map.get(file);
      if (existing) {
        existing.add += add;
        existing.del += del;
      } else {
        map.set(file, { add, del });
      }
    }
  }
  return map;
};
Renamed files with modifications will silently report 0 additions/deletions
When git diff --numstat --cached is invoked without a path filter, git's rename detection is active by default. For a staged rename-with-modification, the output line looks like:
5 3 old_file.ts => new_file.ts
or in the in-place format:
5 3 src/components/{OldButton => NewButton}.tsx
parseNumstatMap stores the key as the full third tab-field ("old_file.ts => new_file.ts" or "src/components/{OldButton => NewButton}.tsx"). The corresponding element of entries, however, has filePath = "new_file.ts" (parsed from git status --porcelain). Because stagedMap.get("new_file.ts") finds nothing, additions and deletions fall back to 0 + 0 = 0.
In the old per-file approach, git diff --numstat --cached -- new_file.ts forced git to emit just new_file.ts as the key, so the lookup always matched.
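The lookup miss described above can be reproduced in isolation. The following is a minimal sketch, not the project's code: `parseNumstatLine`, `LineStat`, and the sample line are hypothetical stand-ins that mirror how parseNumstatMap builds its key.

```typescript
type LineStat = { add: number; del: number };

// Parses one numstat line the way parseNumstatMap does (no rename handling).
function parseNumstatLine(line: string): [string, LineStat] {
  const [a = '', d = '', ...rest] = line.split('\t');
  const add = parseInt(a, 10) || 0; // '-' (binary file) parses to NaN, so it becomes 0
  const del = parseInt(d, 10) || 0;
  return [rest.join('\t'), { add, del }];
}

// Illustrative line for a staged rename with modifications.
const demoMap = new Map<string, LineStat>([
  parseNumstatLine('5\t3\told_file.ts => new_file.ts'),
]);

// git status --porcelain yields the new path, so this is the key that gets looked up.
const demoHit = demoMap.get('new_file.ts'); // undefined: the stored key is the whole rename string
const demoAdditions = demoHit?.add ?? 0;    // falls back to 0 although 5 lines were added
const demoKeys = Array.from(demoMap.keys());
```

The only key in the map is the full rename string, which is why the per-entry lookup silently degrades to zero instead of failing loudly.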
A fix is to normalise rename notation when inserting into the map:
  // After: const file = parts.slice(2).join('\t');
  let resolvedFile = file;
  if (file.includes(' => ')) {
    // "old => new" or "prefix/{old => new}/suffix"
    resolvedFile = file
      .replace(/\{[^}]+ => ([^}]+)\}/g, '$1') // in-place: {old => new}
      .replace(/^.+ => (.+)$/, '$1'); // full rename: old => new
  }
  const existing = map.get(resolvedFile);
  if (existing) {
    existing.add += add;
    existing.del += del;
  } else {
    map.set(resolvedFile, { add, del });
  }
Note: pure renames (zero content changes) produce 0\t0\t..., so the zero fallback happens to be correct for them. Any rename-with-modification (which refactoring tools frequently produce) will show incorrect stats.
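The normalisation suggested above can also be written as a standalone helper, which makes it easier to unit-test. This is a sketch under assumptions: `resolveNumstatPath` is a hypothetical name, and the handling of an empty side inside the braces (for example `src/{ => sub}/file.ts`) reflects how git is understood to abbreviate directory moves rather than anything stated in this PR.

```typescript
// Hypothetical helper: maps the path field of a numstat line to the post-rename path.
function resolveNumstatPath(field: string): string {
  if (!field.includes(' => ')) return field;

  // In-place form: "prefix/{old => new}/suffix". Either side may be empty.
  const braced = /\{([^{}]*) => ([^{}]*)\}/;
  if (braced.test(field)) {
    return field
      .replace(braced, '$2')      // keep only the new side
      .replace(/\/{2,}/g, '/');   // "src/{sub => }/f.ts" would otherwise leave "src//f.ts"
  }

  // Full form: "old => new". Keep everything after the first separator.
  return field.slice(field.indexOf(' => ') + ' => '.length);
}

const resolvedFull = resolveNumstatPath('old_file.ts => new_file.ts');
const resolvedBraced = resolveNumstatPath('src/components/{OldButton => NewButton}.tsx');
const resolvedIntoDir = resolveNumstatPath('src/{ => sub}/file.ts');
const resolvedOutOfDir = resolveNumstatPath('src/{sub => }/file.ts');
const resolvedPlain = resolveNumstatPath('README.md');
```

String-based parsing stays ambiguous for file names that themselves contain " => ". Two alternatives that avoid the problem, both worth verifying against the git documentation before adopting: passing `--no-renames` so each path is reported on its own, or passing `-z` so old and new paths arrive as separate NUL-terminated fields.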
…urate file tracking
Thanks for the PR! @anuragts
Problem
getStatus() spawns 2 git diff --numstat processes per file in a sequential loop. 100 files = 200 process spawns, blocking the main thread.
Fix
#1264
Run 2 total git diff --numstat calls (staged + unstaged) for all files at once via Promise.all(), parse the output into a Map, then look up each file in O(1).
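The batched flow described above might look roughly like the sketch below. It is not the code from this PR: `GitRunner`, `parseNumstat`, and `batchedStats` are hypothetical names, and the git invocation is injected as a function so the example can run without a repository.

```typescript
type NumStat = { add: number; del: number };

// Hypothetical injection point: receives git arguments, resolves to stdout.
type GitRunner = (args: string[]) => Promise<string>;

// Builds a path -> stats map from "added\tdeleted\tpath" lines.
function parseNumstat(stdout: string): Map<string, NumStat> {
  const map = new Map<string, NumStat>();
  for (const line of stdout.split('\n')) {
    const [a, d, ...rest] = line.split('\t');
    if (a === undefined || d === undefined || rest.length === 0) continue;
    const file = rest.join('\t');
    const prev = map.get(file) ?? { add: 0, del: 0 };
    map.set(file, {
      add: prev.add + (parseInt(a, 10) || 0), // '-' (binary) counts as 0
      del: prev.del + (parseInt(d, 10) || 0),
    });
  }
  return map;
}

// Two git calls in total, regardless of how many files changed.
async function batchedStats(files: string[], runGit: GitRunner) {
  const [staged, unstaged] = await Promise.all([
    runGit(['diff', '--numstat', '--cached']),
    runGit(['diff', '--numstat']),
  ]);
  const stagedStats = parseNumstat(staged);
  const unstagedStats = parseNumstat(unstaged);
  return files.map((path) => {
    const s = stagedStats.get(path);
    const u = unstagedStats.get(path);
    return {
      path,
      additions: (s?.add ?? 0) + (u?.add ?? 0),
      deletions: (s?.del ?? 0) + (u?.del ?? 0),
    };
  });
}
```

In a real service the runner could wrap `execFile` from `node:child_process` (for instance through `util.promisify`) with the task path as `cwd`. Injecting it keeps the parsing and lookup logic testable on its own.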
Benchmark
Process spawns drop from O(n) to O(1). All 391 tests pass.
Old: old.mp4
New: CleanShot.2026-03-04.at.07.04.45.mp4