feat: input hardening against agent hallucinations #370
Semver Impact of This PR
🟡 Minor (new features)

📋 Changelog Preview
This is how your changes will appear in the changelog.
New Features ✨
Bug Fixes 🐛

🤖 This preview updates automatically when you update the PR.
Codecov Results 📊
✅ 104 passed | Total: 104 | Pass Rate: 100% | Execution Time: 0ms

📊 Comparison with Base Branch
✨ No test changes detected. All tests are passing successfully.
✅ Patch coverage is 81.52%. Project has 4108 uncovered lines.
Files with missing lines (3)

Coverage diff
@@ Coverage Diff @@
## main #PR +/-##
==========================================
+ Coverage 80.78% 80.79% +0.01%
==========================================
Files 131 132 +1
Lines 21291 21382 +91
Branches 0 0 —
==========================================
+ Hits 17199 17274 +75
- Misses 4092 4108 +16
- Partials 0 0 —

Generated by Codecov Action
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Exported `rejectPreEncoded` is unused in production code
  - Removed the unused `rejectPreEncoded` function and `PRE_ENCODED_PATTERN` constant since `validateResourceId` already rejects `%` characters via `RESOURCE_ID_FORBIDDEN`.
Or push these changes by commenting:
@cursor push 1092590404
Preview (1092590404)
diff --git a/src/lib/input-validation.ts b/src/lib/input-validation.ts
--- a/src/lib/input-validation.ts
+++ b/src/lib/input-validation.ts
@@ -29,12 +29,6 @@
const RESOURCE_ID_FORBIDDEN = /[?#%\s]/;
/**
- * Matches `%XX` hex-encoded sequences (e.g., `%2F`, `%20`, `%3A`).
- * Used to detect pre-encoded strings that would get double-encoded.
- */
-const PRE_ENCODED_PATTERN = /%[0-9a-fA-F]{2}/;
-
-/**
* Matches ASCII control characters (code points 0x00–0x1F).
* These are invisible and have no valid use in CLI string inputs.
*
@@ -111,27 +105,6 @@
}
/**
- * Reject pre-URL-encoded sequences (`%XX`) in resource identifiers.
- *
- * Resource IDs (slugs, issue IDs) should always be plain strings. If they
- * contain `%XX` patterns, they were likely pre-encoded by an agent and
- * would get double-encoded when interpolated into API URLs.
- *
- * @param input - String to validate
- * @param label - Human-readable name for error messages (e.g., "project slug")
- * @throws {ValidationError} When input contains percent-encoded sequences
- */
-export function rejectPreEncoded(input: string, label: string): void {
- const match = PRE_ENCODED_PATTERN.exec(input);
- if (match) {
- throw new ValidationError(
- `Invalid ${label}: contains URL-encoded sequence "${match[0]}".\n` +
- ` Use plain text instead of percent-encoding (e.g., "my project" not "my%20project").`
- );
- }
-}
-
-/**
* Validate a resource identifier (org slug, project slug, or issue ID component).
*
* Rejects characters that would cause URL injection when the identifier
diff --git a/test/lib/input-validation.property.test.ts b/test/lib/input-validation.property.test.ts
--- a/test/lib/input-validation.property.test.ts
+++ b/test/lib/input-validation.property.test.ts
@@ -20,7 +20,6 @@
} from "fast-check";
import {
rejectControlChars,
- rejectPreEncoded,
validateEndpoint,
validateResourceId,
} from "../../src/lib/input-validation.js";
@@ -42,19 +41,6 @@
/** Characters that should be rejected in resource IDs */
const injectionCharArb = constantFrom("?", "#", "%20", " ", "\t", "\n");
-/** Pre-encoded sequences that should be rejected */
-const preEncodedArb = constantFrom(
- "%2F",
- "%20",
- "%3A",
- "%3F",
- "%23",
- "%00",
- "%0A",
- "%7E",
- "%41"
-);
-
/** Control characters (ASCII 0x00-0x1F) as strings */
const controlCharArb = constantFrom(
"\x00",
@@ -122,35 +108,6 @@
});
});
-describe("rejectPreEncoded properties", () => {
- test("valid slugs without percent signs always pass", async () => {
- await fcAssert(
- property(validSlugArb, (input) => {
- rejectPreEncoded(input, "test");
- }),
- { numRuns: DEFAULT_NUM_RUNS }
- );
- });
-
- test("any string with %XX hex pattern always throws", async () => {
- await fcAssert(
- property(tuple(validSlugArb, preEncodedArb), ([prefix, encoded]) => {
- const input = `${prefix}${encoded}`;
- expect(() => rejectPreEncoded(input, "test")).toThrow(
- /URL-encoded sequence/
- );
- }),
- { numRuns: DEFAULT_NUM_RUNS }
- );
- });
-
- test("percent sign followed by non-hex does not throw", () => {
- // "%ZZ" is not a valid encoding — we only reject real %XX patterns
- rejectPreEncoded("my-org%ZZstuff", "test");
- rejectPreEncoded("my-org%Gxstuff", "test");
- });
-});
-
describe("validateResourceId properties", () => {
test("valid slugs always pass", async () => {
await fcAssert(
Exported rejectPreEncoded is unused in production code
Low Severity
`rejectPreEncoded` is exported but never imported or called from any production code in `src/`. `validateResourceId` already rejects `%` broadly via `RESOURCE_ID_FORBIDDEN` (`/[?#%\s]/`), which makes the narrower `%XX` check in `rejectPreEncoded` redundant at all current integration points. The function is imported only in the test file, for isolated unit testing of itself.
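The redundancy claim can be checked directly with the two patterns from the diff. The snippet below is only an illustrative sketch: the sample inputs and the `classify` helper are hypothetical and not part of the PR.

```typescript
// Patterns copied from the diff in this PR.
const RESOURCE_ID_FORBIDDEN = /[?#%\s]/;
const PRE_ENCODED_PATTERN = /%[0-9a-fA-F]{2}/;

// Illustrative inputs (hypothetical, not taken from the test suite).
const samples: string[] = [
  "my-org",
  "my%20project",
  "CLI-G%2F",
  "my-org%ZZstuff",
  "a?b",
];

/** Classify one input against both patterns (hypothetical helper). */
function classify(input: string): {
  input: string;
  preEncoded: boolean;
  forbidden: boolean;
} {
  return {
    input,
    preEncoded: PRE_ENCODED_PATTERN.test(input),
    forbidden: RESOURCE_ID_FORBIDDEN.test(input),
  };
}

const results = samples.map(classify);

// Every input flagged by the narrow %XX pattern is also flagged by the
// broad pattern, because any %XX sequence contains a literal "%".
const narrowImpliesBroad = results.every((r) => !r.preEncoded || r.forbidden);
```

The reverse does not hold: `"my-org%ZZstuff"` is rejected by `RESOURCE_ID_FORBIDDEN` but not matched by `PRE_ENCODED_PATTERN`, which is the case the removed "non-hex does not throw" test covered.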
BYK left a comment
Re: BugBot comment on rejectPreEncoded being unused in production code:
This is intentional. rejectPreEncoded is part of the public validation API described in #350 — it's a composable building block designed to be used independently by future callers that need finer-grained % validation (e.g., endpoints where % in query params is valid but %XX in path segments is not).
Currently, validateResourceId catches % broadly, which is the right behavior for slugs and IDs. But rejectPreEncoded gives a more specific error message about double-encoding vs. a generic "contains %", and it's exported so that future validation points (like the planned --dry-run request preview in #349) can compose it independently.
The test coverage is intentional — it validates the specific %XX detection logic in isolation.
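As a rough illustration of the composition described above, a future caller could apply `rejectPreEncoded` to path segments while leaving the query string untouched. This is a sketch under assumptions: `rejectPreEncodedPath` is a hypothetical name that does not exist in the PR, and the `ValidationError` class here is a stand-in for the project's own.

```typescript
// Stand-in for the project's ValidationError; the real class may differ.
class ValidationError extends Error {}

// Pattern and function body follow the diff shown earlier in this PR.
const PRE_ENCODED_PATTERN = /%[0-9a-fA-F]{2}/;

function rejectPreEncoded(input: string, label: string): void {
  const match = PRE_ENCODED_PATTERN.exec(input);
  if (match) {
    throw new ValidationError(
      `Invalid ${label}: contains URL-encoded sequence "${match[0]}".`
    );
  }
}

// Hypothetical caller (not in the PR): validate path segments only,
// so that percent-encoding inside the query string stays allowed.
function rejectPreEncodedPath(endpoint: string): void {
  const queryStart = endpoint.indexOf("?");
  const path = queryStart === -1 ? endpoint : endpoint.slice(0, queryStart);
  for (const segment of path.split("/")) {
    rejectPreEncoded(segment, "path segment");
  }
}
```

With this sketch, `rejectPreEncodedPath("issues/?query=is%3Aunresolved")` returns normally, while `rejectPreEncodedPath("projects/my%20org/issues/")` throws with the more specific double-encoding message.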
Add client-side validation that rejects malformed inputs from AI agents before they reach the Sentry API. Agents hallucinate differently than humans: they embed query params in IDs, double-encode URLs, and inject path traversals.

New module `src/lib/input-validation.ts` with four reusable validators:
- `rejectControlChars`: reject ASCII control characters (< 0x20)
- `rejectPreEncoded`: reject %XX hex-encoded sequences
- `validateResourceId`: reject ?, #, %, whitespace in slugs and IDs
- `validateEndpoint`: reject .. path traversal in API endpoints

Applied at three integration points:
- `parseOrgProjectArg` / `parseIssueArg` in arg-parsing.ts
- `normalizeEndpoint` in api.ts

Attacks prevented:
- `sentry issue list "my-org?query=foo"` (query injection)
- `sentry project view "my-project#anchor"` (fragment injection)
- `sentry issue view "CLI-G%20extra"` (double encoding)
- `sentry api "../../admin/settings/"` (path traversal)

Fixes #350
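The diff on this page does not show the bodies of `rejectControlChars` or `validateEndpoint`, so the following is only a guess at their shape based on the descriptions above. The `...Sketch` function names, the error wording, and the `ValidationError` stand-in are all assumptions.

```typescript
// Stand-in for the project's ValidationError; the real class may differ.
class ValidationError extends Error {}

// Control characters 0x00-0x1F, per the doc comment visible in the diff.
const CONTROL_CHARS = /[\x00-\x1f]/;

// Assumed body: the PR only states that control characters are rejected.
function rejectControlCharsSketch(input: string, label: string): void {
  const match = CONTROL_CHARS.exec(input);
  if (match) {
    const code = input.charCodeAt(match.index);
    throw new ValidationError(
      `Invalid ${label}: contains control character (code ${code}).`
    );
  }
}

// Assumed body: the PR only states that `..` path segments are rejected.
// Splitting on "/" means "a..b" is allowed; only a whole ".." segment fails.
function validateEndpointSketch(endpoint: string): void {
  rejectControlCharsSketch(endpoint, "API endpoint");
  const hasTraversal = endpoint
    .split("/")
    .some((segment) => segment === "..");
  if (hasTraversal) {
    throw new ValidationError(
      'Invalid API endpoint: contains ".." path segment.'
    );
  }
}
```

Under these assumptions, `validateEndpointSketch("../../admin/settings/")` throws, while `validateEndpointSketch("organizations/my-org/issues/")` returns normally.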
Force-pushed from 7ff8323 to bb6cb3e



Add client-side validation that rejects malformed inputs from AI agents before they reach the Sentry API. Agents hallucinate differently than humans — they embed query params in IDs, double-encode URLs, and inject path traversals.
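Changes
New module `src/lib/input-validation.ts` with four reusable validators:

| Validator | Rejects |
| --- | --- |
| `rejectControlChars` | ASCII control characters (< 0x20) |
| `rejectPreEncoded` | `%XX` hex patterns |
| `validateResourceId` | `?`, `#`, `%`, whitespace, control chars |
| `validateEndpoint` | `..` path segments |

Applied at three integration points:
- `parseOrgProjectArg` / `parseIssueArg` in `arg-parsing.ts`: validates org slugs, project slugs, and issue identifiers
- `normalizeEndpoint` in `api.ts`: rejects `..` path traversal in `sentry api` endpoints

Attacks prevented
- `sentry issue list "my-org?query=foo"` (query injection)
- `sentry project view "my-project#anchor"` (fragment injection)
- `sentry issue view "CLI-G%20extra"` (double encoding)
- `sentry api "../../admin/settings/"` (path traversal)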
Tests
Fixes #350
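To make the integration point concrete, here is a minimal sketch of how an `org/project` argument parser might call a resource-ID validator before building an API URL. Only `RESOURCE_ID_FORBIDDEN` is taken from the diff; `validateResourceIdSketch`, `parseOrgProjectSketch`, the error wording, and the `ValidationError` stand-in are hypothetical, and the real `parseOrgProjectArg` may handle more input forms.

```typescript
// Stand-in for the project's ValidationError; the real class may differ.
class ValidationError extends Error {}

// Pattern copied from the diff in this PR.
const RESOURCE_ID_FORBIDDEN = /[?#%\s]/;

// Assumed body: reject the first forbidden character found.
function validateResourceIdSketch(input: string, label: string): void {
  const match = RESOURCE_ID_FORBIDDEN.exec(input);
  if (match) {
    throw new ValidationError(
      `Invalid ${label}: contains forbidden character ${JSON.stringify(match[0])}.`
    );
  }
}

// Hypothetical parser: split "org/project" and validate each part
// before either one is interpolated into an API URL.
function parseOrgProjectSketch(arg: string): { org: string; project: string } {
  const slash = arg.indexOf("/");
  if (slash === -1) {
    throw new ValidationError(
      `Expected "org/project", got ${JSON.stringify(arg)}.`
    );
  }
  const org = arg.slice(0, slash);
  const project = arg.slice(slash + 1);
  validateResourceIdSketch(org, "organization slug");
  validateResourceIdSketch(project, "project slug");
  return { org, project };
}
```

In this sketch, `parseOrgProjectSketch("my-org/my-project")` returns both slugs, while `"my-org/my-project#anchor"` and `"my-org?query=foo/proj"` throw before any request is made.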