
fix(diff): skip huge file rendering#266

Merged
benvinegar merged 1 commit into main from bentlegen/fix-large-untracked-width
May 9, 2026
Member

@benvinegar benvinegar commented May 9, 2026

Summary

  • Skip very large tracked and untracked file diffs before rendering them as full patch rows.
  • Render skipped large files as review-stream placeholders while preserving exact tracked stats and bounded/lower-bound untracked stats when a full count would be expensive.
  • Exclude skipped tracked files from the generated git diff patch so large changes do not slow startup.
  • Replace spread-based width measurement so large metadata arrays cannot overflow the JS call stack.
  • Add generated 100k-line regression coverage and a manual render-check script for large tracked/untracked files.
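The width-measurement change in the list above can be illustrated with a minimal sketch. It assumes a hypothetical helper named `maxLineWidth`; the real function in `src/ui/diff/codeColumns.ts` may be shaped differently. Spreading an array into `Math.max` passes every element as a separate call argument, which can exceed the engine's argument or stack limit for very large arrays, whereas a plain loop uses constant stack space.

```typescript
// Hypothetical helper for illustration; not the actual codeColumns.ts code.
function maxLineWidth(lines: readonly string[]): number {
  let max = 0;
  // One comparison per element, no spread, so array length never
  // translates into call arguments.
  for (const line of lines) {
    if (line.length > max) max = line.length;
  }
  return max;
}

// Generated fixture, similar in spirit to the 100k-line regression test.
const lines = Array.from({ length: 200_000 }, (_, i) => `line ${i}`);
console.log(maxLineWidth(lines));
```

The loop returns 0 for an empty array, matching what `Math.max(0, ...array)` would produce.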

Fixes #218.

Testing

bun run format:check
bun run typecheck
bun run lint
bun test src/core/loaders.test.ts src/ui/diff/codeColumns.test.ts src/ui/lib/ui-lib.test.ts
bun run scripts/test-large-untracked-render.tsx 700000
bun run scripts/test-large-untracked-render.tsx 700000 tracked

This PR description was generated by Pi using OpenAI GPT-5

Contributor

greptile-apps Bot commented May 9, 2026

Greptile Summary

This PR guards against large untracked files stalling startup by short-circuiting the expensive git diff --no-index patch synthesis, replacing it with a lightweight placeholder, and fixes the related Math.max(...hugeArray) call-stack overflow in width measurement.

  • Large-file bypass in loaders.ts: shouldSkipLargeUntrackedFile skips files that exceed 1 MB or 20k lines in the first 256 KB, builds a placeholder DiffFile with isTooLarge: true, and uses countLinesInFile to provide accurate addition stats; countLinesInFile reads the full file even for multi-GB inputs, which can still cause startup hangs beyond the 1 MB threshold.
  • Call-stack fix in codeColumns.ts: spreads over 100k-element arrays replaced with an explicit loop, eliminating RangeError: Maximum call stack size exceeded.
  • UI message in renderRows.tsx: diffMessage surfaces a "File too large to render" hint with the --exclude-untracked workaround when isTooLarge is set.
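The skip decision described in the first bullet can be sketched as a pure function. This is an illustration under stated assumptions, not the code in `loaders.ts`: the constant and function names are invented here, and the sketch takes the file size and an already-read prefix as inputs instead of touching the file system. The thresholds (1 MB, 256 KB, 20k lines) are the ones quoted in the review.

```typescript
// Assumed names; thresholds taken from the review summary.
const MAX_BYTES = 1024 * 1024;    // size threshold: 1 MB
const SNIFF_BYTES = 256 * 1024;   // only the first 256 KB is inspected
const MAX_SNIFFED_LINES = 20_000; // line-density threshold within the prefix

// Count newline bytes in at most `limit` bytes of `bytes`.
function countNewlines(bytes: Uint8Array, limit: number): number {
  const end = Math.min(bytes.length, limit);
  let count = 0;
  for (let i = 0; i < end; i++) {
    if (bytes[i] === 0x0a) count++;
  }
  return count;
}

function shouldSkipLargeFile(sizeBytes: number, prefix: Uint8Array): boolean {
  // The size check needs only a stat result, so it runs first.
  if (sizeBytes > MAX_BYTES) return true;
  return countNewlines(prefix, SNIFF_BYTES) > MAX_SNIFFED_LINES;
}

// 200 KB with 100k lines: under the size threshold, over the density threshold.
const dense = new TextEncoder().encode("x\n".repeat(100_000));
const small = new TextEncoder().encode("hello\nworld\n");
console.log(shouldSkipLargeFile(dense.length, dense)); // true
console.log(shouldSkipLargeFile(small.length, small)); // false
```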

Confidence Score: 3/5

The call-stack fix and placeholder rendering are safe to merge, but the line-counting path can still block the process for a long time on multi-hundred-MB untracked files.

For files that exceed the 1 MB byte-size threshold, countLinesInFile reads the entire file synchronously with no upper bound — a 500 MB or 2 GB untracked file would still stall startup, leaving the fix incomplete for the extreme cases the PR targets.

src/core/loaders.ts — specifically countLinesInFile and how it is called unconditionally after the size-based skip check fires.

Important Files Changed

| Filename | Overview |
| --- | --- |
| src/core/loaders.ts | Core change: adds large-file detection and skipping for untracked files. countLinesInFile is called unconditionally for all skipped files, including those well above 1 MB, which can still cause significant startup hangs for very large files, the main issue this PR aims to fix. |
| src/ui/diff/codeColumns.ts | Replaces spread-based Math.max(0, ...array) with an explicit loop, eliminating the call-stack overflow for large metadata arrays. Correct and safe. |
| src/ui/diff/renderRows.tsx | Adds isTooLarge branch to diffMessage. The ordering (after isBinary, before metadata.type === "new") is correct since the placeholder metadata carries type: "new". |
| src/core/types.ts | Adds optional isTooLarge field to DiffFile. Straightforward type addition, no issues. |
| src/core/loaders.test.ts | Adds regression test for large untracked file handling, verifying isTooLarge, empty hunks, and accurate addition stats. Test coverage is good. |
| src/ui/diff/codeColumns.test.ts | New test verifying that 100k-line fixtures don't overflow the call stack in maxFileCodeLineWidth. Well-structured fixture generator. |
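The note on branch ordering in renderRows.tsx can be made concrete with a small sketch. The `DiffFileLike` type is a simplified stand-in for `DiffFile`, and the "New file" string is a hypothetical placeholder; the other two message strings are borrowed from elsewhere in this thread. Because the large-file placeholder carries `type: "new"`, checking `isTooLarge` after the new-file branch would never reach it.

```typescript
// Simplified stand-in for DiffFile; only the fields discussed in the review.
interface DiffFileLike {
  isBinary?: boolean;
  isTooLarge?: boolean;
  metadata: { type: "new" | "change" | "delete" };
}

function diffMessage(file: DiffFileLike): string | null {
  if (file.isBinary) return "Binary file skipped";
  // Must come before the "new" check: placeholder metadata has type "new".
  if (file.isTooLarge) return "File too large to render automatically";
  if (file.metadata.type === "new") return "New file"; // hypothetical copy
  return null; // no message; render hunks normally
}

const placeholder: DiffFileLike = { isTooLarge: true, metadata: { type: "new" } };
console.log(diffMessage(placeholder)); // File too large to render automatically
```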

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[buildUntrackedDiffFile] --> B{shouldSkipLargeUntrackedFile?}
    B -->|no| C[runGitUntrackedFileDiffText and parse patch]
    C --> D[buildDiffFile normal path]
    B -->|size over 1MB| E[countLinesInFile reads ENTIRE file]
    B -->|lines in 256KB over 20k| F[countNewlinesInFilePrefix reads 256KB]
    F --> E
    E --> G[createSkippedLargeUntrackedMetadata]
    G --> H[buildDiffFile with isTooLarge=true]
    D --> I[DiffFile]
    H --> I
    I --> J{diffMessage in renderRows}
    J -->|isTooLarge| K[File too large to render message]
    J -->|isBinary| L[Binary file skipped]
    J -->|normal| M[render hunks]
Prompt To Fix All With AI
Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.

---

### Issue 1 of 3
src/core/loaders.ts:434
**`countLinesInFile` reads the entire file unconditionally for all skipped files**

`countLinesInFile` is called for every file that `shouldSkipLargeUntrackedFile` flags — including files that tripped the 1 MB byte-size threshold. A user with a 2 GB log file as an untracked worktree entry will hit `shouldSkipLargeUntrackedFile → true` in milliseconds (stat), then spend tens of seconds blocked in `countLinesInFile` reading all 2 GB synchronously in 64 KB chunks before the UI renders. The original goal of the fix — eliminating startup hangs caused by huge untracked files — is therefore still broken for the most extreme cases. Consider capping the read inside `countLinesInFile` to a reasonable byte limit (e.g. `LARGE_UNTRACKED_FILE_MAX_BYTES`) and returning a `null` / `undefined` stat when the full count can't be determined, or deriving an estimate from `stat.size`.

### Issue 2 of 3
src/core/loaders.ts:423-438
**Double file I/O for line-density-detected large files**

When a file is smaller than `LARGE_UNTRACKED_FILE_MAX_BYTES` (1 MB) but has high line density, the path is: `statSync` → `countNewlinesInFilePrefix` (reads up to 256 KB) → function returns `true` → `countLinesInFile` (re-opens and reads the whole file from byte 0). For the regression test case (200 KB, 100k lines) that means reading the file twice in full. Returning the partial newline count from the sniff pass (or threading an `fd` through) would avoid the redundant second open.

### Issue 3 of 3
src/ui/diff/renderRows.tsx:547
The `isTooLarge` message only describes the `--exclude-untracked` flag as a workaround, but `isTooLarge` applies to untracked files specifically — a user who has no idea their file is untracked may find the hint confusing or inapplicable. Aligning the message with the binary-skipped pattern (which omits workaround hints) would be cleaner, or the hint could mention that the file is untracked.

```suggestion
    return "Untracked file too large to render. Use --exclude-untracked to hide large untracked files.";
```

Reviews (1): Last reviewed commit: "fix(diff): skip huge untracked file rend..." | Re-trigger Greptile

Comment thread src/core/loaders.ts Outdated
isTooLarge: true,
isUntracked: true,
stats: {
additions: countLinesInFile(join(repoRoot, filePath)),
Contributor

P1 countLinesInFile reads the entire file unconditionally for all skipped files

countLinesInFile is called for every file that shouldSkipLargeUntrackedFile flags — including files that tripped the 1 MB byte-size threshold. A user with a 2 GB log file as an untracked worktree entry will hit shouldSkipLargeUntrackedFile → true in milliseconds (stat), then spend tens of seconds blocked in countLinesInFile reading all 2 GB synchronously in 64 KB chunks before the UI renders. The original goal of the fix — eliminating startup hangs caused by huge untracked files — is therefore still broken for the most extreme cases. Consider capping the read inside countLinesInFile to a reasonable byte limit (e.g. LARGE_UNTRACKED_FILE_MAX_BYTES) and returning a null / undefined stat when the full count can't be determined, or deriving an estimate from stat.size.

Member Author

Fixed by bounding skipped untracked-file line counting to the large-file byte limit. When the count is incomplete, Hunk now marks stats as truncated and renders the addition badge with a trailing + instead of synchronously reading the whole file. Added a regression test for the byte-threshold path.

This comment was generated by Pi using OpenAI GPT-5
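The bounded counting described in the reply above might look roughly like the following sketch. It is not the merged implementation: `countLinesBounded`, `additionsBadge`, and the `BoundedLineCount` shape are assumed names, and the exact badge format in Hunk is not shown in this thread. The idea is to stop reading at a byte cap and report whether the count is only a lower bound.

```typescript
import { closeSync, mkdtempSync, openSync, readSync, writeFileSync } from "node:fs";
import { tmpdir } from "node:os";
import { join } from "node:path";

// Assumed shape: `lines` counts newline bytes seen; `truncated` means the
// cap was reached before end of file, so `lines` is a lower bound.
interface BoundedLineCount {
  lines: number;
  truncated: boolean;
}

function countLinesBounded(path: string, maxBytes: number): BoundedLineCount {
  const fd = openSync(path, "r");
  try {
    const chunk = new Uint8Array(64 * 1024);
    let bytesRead = 0;
    let lines = 0;
    while (bytesRead < maxBytes) {
      const want = Math.min(chunk.length, maxBytes - bytesRead);
      const got = readSync(fd, chunk, 0, want, null);
      if (got === 0) return { lines, truncated: false }; // reached end of file
      for (let i = 0; i < got; i++) {
        if (chunk[i] === 0x0a) lines++;
      }
      bytesRead += got;
    }
    // Cap reached: probe a single byte to see whether anything remains.
    const truncated = readSync(fd, chunk, 0, 1, null) > 0;
    return { lines, truncated };
  } finally {
    closeSync(fd);
  }
}

// Trailing "+" marks a lower-bound count (format assumed for illustration).
function additionsBadge(count: BoundedLineCount): string {
  return `+${count.lines}${count.truncated ? "+" : ""}`;
}

const dir = mkdtempSync(join(tmpdir(), "bounded-count-"));
const file = join(dir, "big.log");
writeFileSync(file, "row\n".repeat(1000)); // 4000 bytes, 1000 newlines
const full = countLinesBounded(file, 1024 * 1024);
const capped = countLinesBounded(file, 400);
console.log(additionsBadge(full));   // +1000
console.log(additionsBadge(capped)); // +100+
```

With a cap in place, the worst-case read is proportional to the cap and no longer to the file size, which is what removes the multi-gigabyte stall the review described.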

Comment thread src/core/loaders.ts Outdated
Comment on lines +423 to +438
if (shouldSkipLargeUntrackedFile(repoRoot, filePath)) {
return buildDiffFile(
createSkippedLargeUntrackedMetadata(filePath),
"",
index,
sourcePrefix,
agentContext,
{
isTooLarge: true,
isUntracked: true,
stats: {
additions: countLinesInFile(join(repoRoot, filePath)),
deletions: 0,
},
},
);
Contributor

P2 Double file I/O for line-density-detected large files

When a file is smaller than LARGE_UNTRACKED_FILE_MAX_BYTES (1 MB) but has high line density, the path is: statSync → countNewlinesInFilePrefix (reads up to 256 KB) → function returns true → countLinesInFile (re-opens and reads the whole file from byte 0). For the regression test case (200 KB, 100k lines) that means reading the file twice in full. Returning the partial newline count from the sniff pass (or threading an fd through) would avoid the redundant second open.

Member Author

Fixed by reusing the bounded line-count pass for untracked large-file detection and placeholder stats. Line-density skips no longer reopen and reread the file from byte 0; incomplete counts are surfaced as truncated stats.

This comment was generated by Pi using OpenAI GPT-5

Comment thread src/ui/diff/renderRows.tsx Outdated
}

if (file.isTooLarge) {
return "File too large to render. Use --exclude-untracked to hide large untracked files.";
Contributor

P2 The isTooLarge message only describes the --exclude-untracked flag as a workaround, but isTooLarge applies to untracked files specifically — a user who has no idea their file is untracked may find the hint confusing or inapplicable. Aligning the message with the binary-skipped pattern (which omits workaround hints) would be cleaner, or the hint could mention that the file is untracked.

Suggested change
- return "File too large to render. Use --exclude-untracked to hide large untracked files.";
+ return "Untracked file too large to render. Use --exclude-untracked to hide large untracked files.";
Member Author

Updated the placeholder copy to be generic: File too large to render automatically. The placeholder now applies to both tracked and untracked large diffs, so the previous --exclude-untracked hint was too specific.

This comment was generated by Pi using OpenAI GPT-5

@benvinegar benvinegar force-pushed the bentlegen/fix-large-untracked-width branch from 18b5e32 to 9810196 Compare May 9, 2026 22:46
@benvinegar benvinegar changed the title from "fix(diff): skip huge untracked file rendering" to "fix(diff): skip huge file rendering" May 9, 2026
@benvinegar benvinegar force-pushed the bentlegen/fix-large-untracked-width branch from 9810196 to 35c3f3e Compare May 9, 2026 22:52
@benvinegar benvinegar merged commit 8ed7c8d into main May 9, 2026
3 checks passed
Development

Successfully merging this pull request may close these issues.

Hunk diff in a repo with large untracked files gives "RangeError: Maximum call stack size exceeded"
