fix(security): sanitize terminal-bound text#371
Conversation
Greptile SummaryThis PR adds a centralized
Confidence Score: 4/5The sanitizer logic is sound and all identified injection paths are covered; the only concerns are redundant sanitization in hot render paths and a duplicated helper function. The four-pass regex sanitizer correctly handles all common terminal injection vectors with a fast-path exit for clean text. Call sites are comprehensive. The structural issue is that sanitizeTerminalLine was embedded inside low-level measureTextWidth and sliceTextByWidth utilities: sliceTextByWidth sanitizes its input once, then calls measureTextWidth(cluster) in a loop for each grapheme, which sanitizes the already-clean cluster again. For render-heavy screens with many Unicode code cells this adds per-character overhead on every repaint. It is not a correctness problem, but it is a latent performance concern in the render hot path. src/ui/lib/text.ts for the redundant per-cluster re-sanitization inside sliceTextByWidth; src/ui/diff/expandCollapsedRows.ts and src/ui/diff/renderRows.tsx for the duplicated sanitizeSpans/sanitizeRenderSpans helper. Important Files Changed
Flowchart%%{init: {'theme': 'neutral'}}%%
flowchart TD
A["Untrusted Input"] --> B["sanitizeTerminalText / sanitizeTerminalLine\nsrc/lib/terminalText.ts"]
B --> B1["Pass 1: sevenBitControlStrings\nESC OSC, DCS, SOS, PM, APC, CSI"]
B1 --> B2["Pass 2: c1ControlStrings\nC1 DCS/SOS/OSC/PM/APC"]
B2 --> B3["Pass 3: c1Csi\nC1 CSI"]
B3 --> B4["Pass 4: controlCharacters\nResidual C0/DEL/C1 bytes"]
B4 --> C["Safe text"]
subgraph "Call sites"
D1["pager.ts"]
D2["main.tsx"]
D3["staticDiffPager.ts"]
D4["renderRows.tsx"]
D5["expandCollapsedRows.ts"]
D6["files.ts"]
D7["AgentInlineNote.tsx / agentPopover.ts"]
D8["text.ts"]
end
A --> D1 & D2 & D3 & D4 & D5 & D6 & D7 & D8
D1 & D2 & D3 & D4 & D5 & D6 & D7 & D8 --> B
Prompt To Fix All With AIFix the following 3 code review issues. Work through them one at a time, proposing concise fixes.
---
### Issue 1 of 3
src/ui/lib/text.ts:20-23
**Redundant re-sanitization on every grapheme cluster**
`sliceTextByWidth` sanitizes its input once to produce `safeText`, then iterates `textClusters(safeText)` and calls `measureTextWidth(cluster)` for each cluster. `measureTextWidth` now calls `sanitizeTerminalLine` on its input, so every grapheme cluster from the already-clean `safeText` is sanitized a second time. For a non-ASCII string with N clusters this means N+1 `sanitizeTerminalLine` invocations per `sliceTextByWidth` call. Because `sliceTextByWidth` is called inside render loops, this overhead accumulates on each repaint of a diff with Unicode content. The single `sanitizeTerminalLine` at the top of `sliceTextByWidth` already guarantees clusters are clean; the inner `measureTextWidth` call could use `stringWidth` directly for the cluster width without re-sanitizing.
### Issue 2 of 3
src/ui/diff/expandCollapsedRows.ts:83-87
**Duplicate `sanitizeSpans` / `sanitizeRenderSpans` across two files**
`sanitizeSpans` here and `sanitizeRenderSpans` in `renderRows.tsx` (lines 647-651) are byte-for-byte identical. Having two copies means a future change to the sanitization logic would need to be applied in two places. This helper belongs in `terminalText.ts` as an exported function so both call sites can share it.
### Issue 3 of 3
src/lib/terminalText.ts:9-12
**Module-level `/g` regexes are safe here, but worth an explicit note**
`sevenBitControlStrings`, `c1ControlStrings`, and `c1Csi` are module-level constants with the `/g` flag. `String.prototype.replace` resets `lastIndex` to 0 before iterating matches, so current usage is safe. However, if any caller were to use `.test()` or `.exec()` on these shared regex objects, `lastIndex` would be left in an intermediate state causing every second call to return the wrong result. A brief comment that these regexes must only be used via `.replace()` would prevent a subtle future footgun.
Reviews (1): Last reviewed commit: "fix(security): sanitize terminal-bound t..." | Re-trigger Greptile |
| export function measureTextWidth(text: string) { | ||
| return printableAsciiRegex.test(text) ? text.length : stringWidth(text); | ||
| const safeText = sanitizeTerminalLine(text); | ||
| return printableAsciiRegex.test(safeText) ? safeText.length : stringWidth(safeText); | ||
| } |
There was a problem hiding this comment.
Redundant re-sanitization on every grapheme cluster
sliceTextByWidth sanitizes its input once to produce safeText, then iterates textClusters(safeText) and calls measureTextWidth(cluster) for each cluster. measureTextWidth now calls sanitizeTerminalLine on its input, so every grapheme cluster from the already-clean safeText is sanitized a second time. For a non-ASCII string with N clusters this means N+1 sanitizeTerminalLine invocations per sliceTextByWidth call. Because sliceTextByWidth is called inside render loops, this overhead accumulates on each repaint of a diff with Unicode content. The single sanitizeTerminalLine at the top of sliceTextByWidth already guarantees clusters are clean; the inner measureTextWidth call could use stringWidth directly for the cluster width without re-sanitizing.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ui/lib/text.ts
Line: 20-23
Comment:
**Redundant re-sanitization on every grapheme cluster**
`sliceTextByWidth` sanitizes its input once to produce `safeText`, then iterates `textClusters(safeText)` and calls `measureTextWidth(cluster)` for each cluster. `measureTextWidth` now calls `sanitizeTerminalLine` on its input, so every grapheme cluster from the already-clean `safeText` is sanitized a second time. For a non-ASCII string with N clusters this means N+1 `sanitizeTerminalLine` invocations per `sliceTextByWidth` call. Because `sliceTextByWidth` is called inside render loops, this overhead accumulates on each repaint of a diff with Unicode content. The single `sanitizeTerminalLine` at the top of `sliceTextByWidth` already guarantees clusters are clean; the inner `measureTextWidth` call could use `stringWidth` directly for the cluster width without re-sanitizing.
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Fixed. sliceTextByWidth now measures already-sanitized grapheme clusters with an internal width helper instead of routing every cluster back through measureTextWidth.
This comment was generated by Pi using OpenAI GPT-5
| function sanitizeSpans(spans: RenderSpan[]): RenderSpan[] { | ||
| return spans | ||
| .map((span) => ({ ...span, text: sanitizeTerminalLine(span.text) })) | ||
| .filter((span) => span.text.length > 0); | ||
| } |
There was a problem hiding this comment.
Duplicate
sanitizeSpans / sanitizeRenderSpans across two files
sanitizeSpans here and sanitizeRenderSpans in renderRows.tsx (lines 647-651) are byte-for-byte identical. Having two copies means a future change to the sanitization logic would need to be applied in two places. This helper belongs in terminalText.ts as an exported function so both call sites can share it.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/ui/diff/expandCollapsedRows.ts
Line: 83-87
Comment:
**Duplicate `sanitizeSpans` / `sanitizeRenderSpans` across two files**
`sanitizeSpans` here and `sanitizeRenderSpans` in `renderRows.tsx` (lines 647-651) are byte-for-byte identical. Having two copies means a future change to the sanitization logic would need to be applied in two places. This helper belongs in `terminalText.ts` as an exported function so both call sites can share it.
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Fixed. I moved the shared span sanitizer into src/lib/terminalText.ts as sanitizeTerminalSpans and updated both call sites to use it.
This comment was generated by Pi using OpenAI GPT-5
| const sevenBitControlStrings = | ||
| /\x1b(?:\][\s\S]*?(?:\x07|\x1b\\|\x9c)|[PX^_][\s\S]*?(?:\x1b\\|\x9c)|\[[0-?]*[ -/]*[@-~])/g; | ||
| const c1ControlStrings = /[\x90\x98\x9d\x9e\x9f][\s\S]*?(?:\x07|\x1b\\|\x9c)/g; | ||
| const c1Csi = /\x9b[0-?]*[ -/]*[@-~]/g; |
There was a problem hiding this comment.
Module-level
/g regexes are safe here, but worth an explicit note
sevenBitControlStrings, c1ControlStrings, and c1Csi are module-level constants with the /g flag. String.prototype.replace resets lastIndex to 0 before iterating matches, so current usage is safe. However, if any caller were to use .test() or .exec() on these shared regex objects, lastIndex would be left in an intermediate state causing every second call to return the wrong result. A brief comment that these regexes must only be used via .replace() would prevent a subtle future footgun.
Prompt To Fix With AI
This is a comment left during a code review.
Path: src/lib/terminalText.ts
Line: 9-12
Comment:
**Module-level `/g` regexes are safe here, but worth an explicit note**
`sevenBitControlStrings`, `c1ControlStrings`, and `c1Csi` are module-level constants with the `/g` flag. `String.prototype.replace` resets `lastIndex` to 0 before iterating matches, so current usage is safe. However, if any caller were to use `.test()` or `.exec()` on these shared regex objects, `lastIndex` would be left in an intermediate state causing every second call to return the wrong result. A brief comment that these regexes must only be used via `.replace()` would prevent a subtle future footgun.
How can I resolve this? If you propose a fix, please make it concise.There was a problem hiding this comment.
Added the note. The global regexes now have a comment explaining they are private and only used with String#replace, avoiding test/exec lastIndex footguns.
This comment was generated by Pi using OpenAI GPT-5
c8f3292 to
55b39f9
Compare
55b39f9 to
d26dcb6
Compare
Summary
Test plan
bun testbun run typecheckbun run lintThis PR description was generated by Pi using OpenAI GPT-5