
fix: classify HTTP 202/5xx as indeterminate in http-status-codes #80

Merged
dacharyc merged 1 commit into main from fix/http-status-codes-false-positives
May 2, 2026

dacharyc commented May 2, 2026

Summary

Closes #79.

  • Adds an indeterminate classification for HTTP 202 (Vercel/Next.js ISR cache-miss/build) and 5xx responses, replacing the previous behavior where 202 was misclassified as soft-404 and 5xx silently passed as correct-error.
  • Reports indeterminate counts separately; excludes them from the soft-404 tally.
  • Returns warn ("Could not determine bad-URL handling") when the entire sample is indeterminate; scoring falls back to the default 0.5 warn coefficient rather than penalizing the site for CDN behavior we can't measure.
  • Updates SCORING.md, scoring-reference.md, and docs/agent-score-calculation.md to reflect that http-status-codes now has a (narrow) warn state.
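
The per-response rule in the first bullet can be sketched as below. This is a minimal illustration, not the code from this PR: the `Classification` type and `classifyStatus` function are hypothetical names, and the sketch looks only at the status code, ignoring any body inspection the real check may perform.

```typescript
// Hypothetical names for illustration; the actual afdocs implementation may differ.
type Classification = "correct-error" | "soft-404" | "indeterminate";

function classifyStatus(status: number): Classification {
  // 202 Accepted: e.g. Vercel/Next.js ISR on a cache miss or build.
  // It says nothing about whether the URL would eventually 404.
  if (status === 202) return "indeterminate";

  // 5xx: a server failure tells us nothing about bad-URL handling.
  if (status >= 500 && status <= 599) return "indeterminate";

  // 4xx: the site reports an error for a nonexistent URL, as it should.
  if (status >= 400 && status <= 499) return "correct-error";

  // Assumption: every remaining status (200, other 2xx, ...) counts as
  // evidence of a soft-404, mirroring the old "any non-4xx" rule.
  return "soft-404";
}
```

Checking 202 and 5xx before the fallthrough is what keeps them out of the soft-404 bucket.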

Test plan

  • New unit tests cover 202, 503, all-indeterminate→warn, and the proportions extractor.
  • npm run lint passes (eslint + tsc --noEmit clean).
  • npm test passes (1272 tests).
  • Verified end-to-end against plaid.com/docs at concurrency 30. Two consecutive runs produced different cache states (281 correct / 19 indeterminate, then 16 / 284) and both correctly pass with the indeterminate count noted in the message.

Vercel/Next.js ISR returns 202 Accepted during cache-miss/build for
fresh URLs. The previous classifier treated any non-4xx as evidence of
soft-404, so afdocs's intentional fan-out of 155+ unique nonexistent
URLs would trip ISR and produce false-positive failures that real
agents (low concurrency, warm cache) never see.

5xx responses tell us nothing about how the site handles bad URLs,
either; they were silently passing as "correct-error" before.

Both now route to a new `indeterminate` classification that's excluded
from the soft-404 tally and reported separately. When the entire sample
is indeterminate, the check returns `warn` and scoring falls back to the
default 0.5 warn coefficient rather than penalizing the site for CDN
behavior the check can't measure.

Verified against plaid.com/docs at concurrency 30: one run returned 281
correct-error / 19 indeterminate, another 16 / 284 — both correctly
pass instead of flunking the site for ISR build behavior.
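
The aggregation described in this commit message can be sketched roughly as follows. The names `summarize` and `Summary` are hypothetical, and the pass/fail rule for samples with determinate results (any soft-404 fails) is an assumption made for illustration; the PR states only the all-indeterminate `warn` case and that indeterminate results are excluded from the soft-404 tally.

```typescript
// Hypothetical names for illustration; not the actual afdocs code.
type Classification = "correct-error" | "soft-404" | "indeterminate";
type CheckStatus = "pass" | "warn" | "fail";

interface Summary {
  status: CheckStatus;
  correctError: number;
  soft404: number;
  indeterminate: number;
  message: string;
}

function summarize(results: Classification[]): Summary {
  const count = (c: Classification): number =>
    results.filter((r) => r === c).length;

  const correctError = count("correct-error");
  const soft404 = count("soft-404");
  const indeterminate = count("indeterminate");

  // Nothing determinate to judge: warn instead of penalizing the site.
  // (An empty sample also lands here in this sketch.)
  if (correctError + soft404 === 0) {
    return {
      status: "warn",
      correctError,
      soft404,
      indeterminate,
      message: "Could not determine bad-URL handling",
    };
  }

  // ASSUMPTION: any soft-404 among the determinate results fails the check.
  // Indeterminate results never count toward the soft-404 tally.
  const status: CheckStatus = soft404 > 0 ? "fail" : "pass";
  const message =
    `${correctError} correct-error, ${soft404} soft-404` +
    (indeterminate > 0 ? `, ${indeterminate} indeterminate` : "");

  return { status, correctError, soft404, indeterminate, message };
}
```

Under these assumptions, both plaid.com/docs runs above (281 / 19 and 16 / 284) come out as `pass` with the indeterminate count carried in the message.
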
dacharyc merged commit 609866b into main May 2, 2026
2 checks passed
dacharyc deleted the fix/http-status-codes-false-positives branch May 2, 2026 03:17
Development

Successfully merging this pull request may close these issues.

http-status-codes: false positives from 5KB body scan + HTTP 202 misclassification