fix: invert WebFetch User-Agent strategy to honest-bot-first #303

Merged
anandgupta42 merged 1 commit into main from fix/webfetch-honest-ua-strategy
Mar 19, 2026

@anandgupta42

What does this PR do?

Inverts the WebFetch User-Agent strategy from spoofed-browser-first to honest-bot-first, fixing the ~30% fetch failure rate caused by anti-bot systems detecting a mismatch between the spoofed browser UA and the client's TLS fingerprint.

Key changes:

  • Default to altimate-code/1.0 honest bot UA instead of spoofed Chrome UA
  • Retry with browser UA on 403/406 (WAF/bot-detection codes only)
  • Cancel response body stream before retry to prevent resource leaks
  • Move clearTimeout() into try/finally to prevent Slowloris-style hangs
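
The honest-first retry flow described in the list above could be sketched roughly as follows. This is a minimal illustration, not the actual `webfetch.ts` code: `ResponseLike`, `FetchLike`, `HONEST_UA`, `BROWSER_UA`, and `fetchHonestFirst` are hypothetical names, and the browser UA string is a placeholder. Only the `altimate-code/1.0` UA and the 403/406 retry set come from this PR. The fetch function is injected so the sketch can be exercised without network access.

```typescript
// Hypothetical names throughout; only the UA "altimate-code/1.0" and the
// 403/406 retry set are taken from the PR description.
interface ResponseLike {
  ok: boolean
  status: number
  body?: { cancel(): Promise<void> } | null
}

type FetchLike = (
  url: string,
  init: { headers: Record<string, string> },
) => Promise<ResponseLike>

const HONEST_UA = "altimate-code/1.0"
// Placeholder: the real browser UA string used by WebFetch is not shown in this PR.
const BROWSER_UA = "Mozilla/5.0 (placeholder browser UA)"

// Statuses treated as "the UA was rejected" (WAF / bot detection).
const RETRYABLE_STATUSES = new Set([403, 406])

async function fetchHonestFirst(url: string, fetchImpl: FetchLike): Promise<ResponseLike> {
  // First attempt identifies the client honestly.
  let response = await fetchImpl(url, { headers: { "User-Agent": HONEST_UA } })

  if (!response.ok && RETRYABLE_STATUSES.has(response.status)) {
    // Release the first response's body stream before retrying; ignore cancel errors.
    await response.body?.cancel().catch(() => {})
    // Second (and last) attempt uses the browser UA.
    response = await fetchImpl(url, { headers: { "User-Agent": BROWSER_UA } })
  }
  return response
}
```

With the global `fetch` passed through a small adapter as `fetchImpl`, a 404, 429, or 500 from the first attempt is returned as-is, since none of those statuses indicate that the User-Agent was the problem.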

Benchmark results:

| Strategy | Success Rate |
| --- | --- |
| Honest UA solo | 20/20 (100%) |
| Browser UA solo | 14/20 (70%) |
| Old retry (browser → Cloudflare-only) | 17/20 (85%) |
| New retry (honest → browser) | 19/20 (95%) |

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Issue for this PR

Closes #302

How did you verify your code works?

  • 8/8 unit tests pass (3 existing + 5 new)
  • Integration benchmark against 20 real URLs with bot detection (Cloudflare, Anubis, etc.)
  • 6-model consensus code review (Claude, GPT 5.2, Gemini 3.1, Kimi K2.5, MiniMax M2.5, GLM-5) — 5/6 approved
  • Typecheck passes

Checklist

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • New and existing unit tests pass locally with my changes
  • I have added tests that prove my fix is effective

🤖 Generated with Claude Code

The fake Chrome UA was being blocked by Cloudflare TLS fingerprint
detection, Anubis, and other anti-bot systems (~30% of sites).
Inverting to an honest bot UA first with browser UA fallback improves
success rate from 70% to 100% (honest UA solo) / 85% to 95% (with
retry strategy).

Changes:
- Default to `altimate-code/1.0` honest bot UA instead of spoofed Chrome UA
- Retry with browser UA on 403/406 (WAF/bot-detection codes only)
- Remove 404/429 from retry set (semantically incorrect for UA retry)
- Cancel response body stream before retry to prevent resource leaks
- Move `clearTimeout()` into `try/finally` after body consumption to
  prevent Slowloris-style hangs on slow-streaming responses
- Add tests: 403 retry, 406 retry, no-retry on success, no-retry on
  500, both-UAs-fail with call tracking
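
The five test cases listed above (403 retry, 406 retry, no retry on success, no retry on 500, both UAs fail) could be expressed with a fetch stub that records its calls, along these lines. Everything here is an assumption made for illustration: `fetchWithFallback` is a compact stand-in for the function under test, and `stubFetch` and `run` are hypothetical helpers. The real tests in `webfetch.test.ts` are not shown in this PR.

```typescript
interface StubResponse {
  ok: boolean
  status: number
}
type Fetcher = (url: string, init: { headers: Record<string, string> }) => Promise<StubResponse>

const RETRYABLE = new Set([403, 406])

// Compact stand-in for the function under test (hypothetical, not the real webfetch.ts).
async function fetchWithFallback(url: string, fetchImpl: Fetcher): Promise<StubResponse> {
  const first = await fetchImpl(url, { headers: { "User-Agent": "altimate-code/1.0" } })
  if (first.ok || !RETRYABLE.has(first.status)) return first
  return fetchImpl(url, { headers: { "User-Agent": "browser-ua-placeholder" } })
}

// Builds a stub that answers with the given statuses in order and records
// the User-Agent of every call it receives.
function stubFetch(statuses: number[]): { fetchImpl: Fetcher; calls: string[] } {
  const calls: string[] = []
  const fetchImpl: Fetcher = async (_url, init) => {
    const status = statuses[Math.min(calls.length, statuses.length - 1)]
    calls.push(init.headers["User-Agent"])
    return { ok: status >= 200 && status < 300, status }
  }
  return { fetchImpl, calls }
}

// Returns [final status, number of fetch calls] for a scripted status sequence.
async function run(statuses: number[]): Promise<[number, number]> {
  const { fetchImpl, calls } = stubFetch(statuses)
  const res = await fetchWithFallback("https://example.com", fetchImpl)
  return [res.status, calls.length]
}
```

Under these assumptions, `run([403, 200])` and `run([406, 200])` both resolve to `[200, 2]`, `run([200])` to `[200, 1]`, `run([500])` to `[500, 1]`, and `run([403, 403])` to `[403, 2]`. The call count is what distinguishes "retried and still failed" from "never retried".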

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

@claude claude bot left a comment

Claude Code Review

This repository is configured for manual code reviews. Comment @claude review to trigger a review.

@coderabbitai

coderabbitai bot commented Mar 19, 2026

Warning

Rate limit exceeded

@anandgupta42 has exceeded the limit for the number of commits that can be reviewed per hour. Please wait 3 minutes and 54 seconds before requesting another review.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Repository UI

Review profile: CHILL

Plan: Pro

Run ID: aecd05c1-bcbd-4605-bc6e-7b900f5010ee

📥 Commits

Reviewing files that changed from the base of the PR and between bb94013 and 78ff7f2.

📒 Files selected for processing (2)
  • packages/opencode/src/tool/webfetch.ts
  • packages/opencode/test/tool/webfetch.test.ts
@anandgupta42 anandgupta42 merged commit 76ec9fe into main Mar 19, 2026
10 checks passed
Comment on lines +73 to +78
  let response = await fetch(params.url, { signal, headers: honestHeaders })

- clearTimeout()
-
- if (!response.ok) {
-   throw new Error(`Request failed with status code: ${response.status}`)
+ // Retry with browser UA if the honest UA was rejected
+ if (!response.ok && RETRYABLE_STATUSES.has(response.status)) {
+   await response.body?.cancel().catch(() => {})
+   response = await fetch(params.url, { signal, headers: browserHeaders })
Bug: If an initial fetch() call throws an error, the finally block is not reached, and clearTimeout() is never called, leaving a timer running unnecessarily.
Severity: MEDIUM

Suggested Fix

The try/finally block should be expanded to wrap all the code that executes after the timer is created with abortAfterAny(), including the initial fetch() calls. This will ensure that clearTimeout() is always called, regardless of whether the fetch operations succeed or fail.

Prompt for AI Agent
Review the code at the location below. A potential bug has been identified by an AI
agent.
Verify if this is a real issue. If it is, propose a fix; if not, explain why it's not
valid.

Location: packages/opencode/src/tool/webfetch.ts#L73-L78

Potential issue: The `try/finally` block that is supposed to guarantee the execution of
`clearTimeout()` only starts after the initial `fetch()` calls. If either of these
`fetch()` operations throws an error, for instance due to a network failure, the
exception will propagate before the `try` block is entered. Consequently, the `finally`
block is never executed, and `clearTimeout()` is not called. This results in the timeout
timer continuing to run for its entire duration (30-120 seconds) even after the
associated fetch operation has failed, causing unnecessary resource consumption.
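
The suggested fix amounts to opening the `try` block before the first `fetch()` call, so that the `finally` clause covers network failures as well as body consumption. A minimal sketch under stated assumptions: `abortAfter` is a hypothetical stand-in for the `abortAfterAny()` helper named above (whose real signature is not shown here), and `fetchText`, `Fetcher`, and `Timer` are illustrative names. The timer factory and fetch function are injected so the behaviour can be checked without real timers or network access.

```typescript
type Timer = { signal: AbortSignal; clearTimeout: () => void }

// Hypothetical stand-in for abortAfterAny(): aborts the signal after `ms`
// milliseconds unless clearTimeout() is called first.
function abortAfter(ms: number): Timer {
  const controller = new AbortController()
  const timer = setTimeout(() => controller.abort(), ms)
  return { signal: controller.signal, clearTimeout: () => clearTimeout(timer) }
}

type Fetcher = (url: string, init: { signal: AbortSignal }) => Promise<{ text(): Promise<string> }>

async function fetchText(
  url: string,
  fetchImpl: Fetcher,
  timeoutMs: number,
  makeTimer: (ms: number) => Timer = abortAfter,
): Promise<string> {
  const { signal, clearTimeout: stopTimer } = makeTimer(timeoutMs)
  try {
    // The timer stays armed across both the request and the body read,
    // so a slow-streaming response is still aborted at the deadline.
    const response = await fetchImpl(url, { signal })
    return await response.text()
  } finally {
    // Runs whether fetchImpl threw, text() threw, or everything succeeded.
    stopTimer()
  }
}
```

Because the timer is created immediately before `try`, nothing that can throw sits between arming the timer and entering the block that guarantees it is cleared.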

Development

Successfully merging this pull request may close these issues.

fix: WebFetch fails ~30% of the time due to TLS fingerprint mismatch with spoofed browser UA