fix(diff): skip huge file rendering #266
Conversation
Greptile Summary

This PR guards against large untracked files stalling startup by short-circuiting the expensive `git diff` patch generation for files that exceed the size threshold.
Confidence Score: 3/5

The call-stack fix and placeholder rendering are safe to merge, but the line-counting path can still block the process for a long time on multi-hundred-MB untracked files: for files that exceed the 1 MB byte-size threshold, `src/core/loaders.ts` — specifically `countLinesInFile` — still reads the whole file.

Important Files Changed
Flowchart

```mermaid
%%{init: {'theme': 'neutral'}}%%
flowchart TD
  A[buildUntrackedDiffFile] --> B{shouldSkipLargeUntrackedFile?}
  B -->|no| C[runGitUntrackedFileDiffText and parse patch]
  C --> D[buildDiffFile normal path]
  B -->|size over 1MB| E[countLinesInFile reads ENTIRE file]
  B -->|lines in 256KB over 20k| F[countNewlinesInFilePrefix reads 256KB]
  F --> E
  E --> G[createSkippedLargeUntrackedMetadata]
  G --> H[buildDiffFile with isTooLarge=true]
  D --> I[DiffFile]
  H --> I
  I --> J{diffMessage in renderRows}
  J -->|isTooLarge| K[File too large to render message]
  J -->|isBinary| L[Binary file skipped]
  J -->|normal| M[render hunks]
```
Prompt To Fix All With AI

Fix the following 3 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 3
src/core/loaders.ts:434
**`countLinesInFile` reads the entire file unconditionally for all skipped files**
`countLinesInFile` is called for every file that `shouldSkipLargeUntrackedFile` flags — including files that tripped the 1 MB byte-size threshold. A user with a 2 GB log file as an untracked worktree entry will hit `shouldSkipLargeUntrackedFile → true` in milliseconds (stat), then spend tens of seconds blocked in `countLinesInFile` reading all 2 GB synchronously in 64 KB chunks before the UI renders. The original goal of the fix — eliminating startup hangs caused by huge untracked files — is therefore still broken for the most extreme cases. Consider capping the read inside `countLinesInFile` to a reasonable byte limit (e.g. `LARGE_UNTRACKED_FILE_MAX_BYTES`) and returning a `null` / `undefined` stat when the full count can't be determined, or deriving an estimate from `stat.size`.
### Issue 2 of 3
src/core/loaders.ts:423-438
**Double file I/O for line-density-detected large files**
When a file is smaller than `LARGE_UNTRACKED_FILE_MAX_BYTES` (1 MB) but has high line density, the path is: `statSync` → `countNewlinesInFilePrefix` (reads up to 256 KB) → function returns `true` → `countLinesInFile` (re-opens and reads the whole file from byte 0). For the regression test case (200 KB, 100k lines) that means reading the file twice in full. Returning the partial newline count from the sniff pass (or threading an `fd` through) would avoid the redundant second open.
### Issue 3 of 3
src/ui/diff/renderRows.tsx:547
The `isTooLarge` message only describes the `--exclude-untracked` flag as a workaround, but `isTooLarge` applies to untracked files specifically — a user who has no idea their file is untracked may find the hint confusing or inapplicable. Aligning the message with the binary-skipped pattern (which omits workaround hints) would be cleaner, or the hint could mention that the file is untracked.
```suggestion
return "Untracked file too large to render. Use --exclude-untracked to hide large untracked files.";
```
Reviews (1). Last reviewed commit: "fix(diff): skip huge untracked file rend..."
```ts
isTooLarge: true,
isUntracked: true,
stats: {
  additions: countLinesInFile(join(repoRoot, filePath)),
```
**`countLinesInFile` reads the entire file unconditionally for all skipped files**

`countLinesInFile` is called for every file that `shouldSkipLargeUntrackedFile` flags — including files that tripped the 1 MB byte-size threshold. A user with a 2 GB log file as an untracked worktree entry will hit `shouldSkipLargeUntrackedFile → true` in milliseconds (stat), then spend tens of seconds blocked in `countLinesInFile` reading all 2 GB synchronously in 64 KB chunks before the UI renders. The original goal of the fix — eliminating startup hangs caused by huge untracked files — is therefore still broken for the most extreme cases. Consider capping the read inside `countLinesInFile` to a reasonable byte limit (e.g. `LARGE_UNTRACKED_FILE_MAX_BYTES`) and returning a `null` / `undefined` stat when the full count can't be determined, or deriving an estimate from `stat.size`.
Fixed by bounding skipped untracked-file line counting to the large-file byte limit. When the count is incomplete, Hunk now marks stats as truncated and renders the addition badge with a trailing + instead of synchronously reading the whole file. Added a regression test for the byte-threshold path.
This comment was generated by Pi using OpenAI GPT-5
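A minimal sketch of the "trailing +" badge behavior the reply describes: when the bounded count stopped before EOF, the additions badge is rendered as a lower bound. `formatAdditionsBadge` is a hypothetical helper for illustration, not the component actually changed in this PR.

```typescript
// Renders an additions badge; a truncated (bounded) count gets a trailing "+"
// to signal "at least this many lines".
function formatAdditionsBadge(count: number, truncated: boolean): string {
  const n = count.toLocaleString("en-US"); // e.g. 12345 -> "12,345"
  return truncated ? `+${n}+` : `+${n}`;
}
```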
```ts
if (shouldSkipLargeUntrackedFile(repoRoot, filePath)) {
  return buildDiffFile(
    createSkippedLargeUntrackedMetadata(filePath),
    "",
    index,
    sourcePrefix,
    agentContext,
    {
      isTooLarge: true,
      isUntracked: true,
      stats: {
        additions: countLinesInFile(join(repoRoot, filePath)),
        deletions: 0,
      },
    },
  );
```
**Double file I/O for line-density-detected large files**

When a file is smaller than `LARGE_UNTRACKED_FILE_MAX_BYTES` (1 MB) but has high line density, the path is: `statSync` → `countNewlinesInFilePrefix` (reads up to 256 KB) → function returns `true` → `countLinesInFile` (re-opens and reads the whole file from byte 0). For the regression test case (200 KB, 100k lines) that means reading the file twice in full. Returning the partial newline count from the sniff pass (or threading an `fd` through) would avoid the redundant second open.
Fixed by reusing the bounded line-count pass for untracked large-file detection and placeholder stats. Line-density skips no longer reopen and reread the file from byte 0; incomplete counts are surfaced as truncated stats.
```tsx
if (file.isTooLarge) {
  return "File too large to render. Use --exclude-untracked to hide large untracked files.";
```
The `isTooLarge` message only describes the `--exclude-untracked` flag as a workaround, but `isTooLarge` applies to untracked files specifically — a user who has no idea their file is untracked may find the hint confusing or inapplicable. Aligning the message with the binary-skipped pattern (which omits workaround hints) would be cleaner, or the hint could mention that the file is untracked.

```suggestion
return "Untracked file too large to render. Use --exclude-untracked to hide large untracked files.";
```
Updated the placeholder copy to be generic: "File too large to render automatically." The placeholder now applies to both tracked and untracked large diffs, so the previous `--exclude-untracked` hint was too specific.
Force-pushed: 18b5e32 → 9810196, then 9810196 → 35c3f3e.
Summary

Huge untracked files are now skipped instead of being run through the `git diff` patch, so large changes do not slow startup. Fixes #218.
Testing

```shell
bun run format:check
bun run typecheck
bun run lint
bun test src/core/loaders.test.ts src/ui/diff/codeColumns.test.ts src/ui/lib/ui-lib.test.ts
bun run scripts/test-large-untracked-render.tsx 700000
bun run scripts/test-large-untracked-render.tsx 700000 tracked
```

This PR description was generated by Pi using OpenAI GPT-5