Fix checkLink failures by adding User-Agent headers and timeout#413

Merged: hawkeyexl merged 3 commits into main from copilot/investigate-checklink-failures on Dec 24, 2025

Copilot AI commented Dec 24, 2025

Investigated the 80 checkLink test failures from a user report. Root cause: some websites (Wikipedia, JFrog, and others) reject HTTP requests that lack a browser-like User-Agent header, returning 400/403 errors.

Changes

  • Added browser-like headers to axios requests:

    • User-Agent: Chrome/120 string
    • Accept: text/html,application/xhtml+xml,application/xml
    • Accept-Language: en-US,en
  • Added timeout protection: 10s timeout, 5 max redirects

  • Improved error messages: Show actual HTTP status codes (401, 403) instead of generic "Invalid or unresolvable URL"

  • Updated .gitignore: Exclude PNG test artifacts

Before/After

Before:

let req = await axios.get(step.checkLink.url)
// Wikipedia returns: ERR_BAD_REQUEST
// Error: "Invalid or unresolvable URL: https://en.wikipedia.org/wiki/E.164"

After:

const requestConfig = {
  headers: {
    'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36...',
    'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
    'Accept-Language': 'en-US,en;q=0.9'
  },
  timeout: 10000,
  maxRedirects: 5
};
let req = await axios.get(step.checkLink.url, requestConfig)
// Wikipedia returns: 200
// Success: "Returned 200"

Impact

  • 60% of unique failures now pass (6/10 URLs tested)
  • Wikipedia URLs work (E.164, ISO 3166-1, ISO 639-1)
  • JFrog Artifactory URLs work
  • Estimated ~48 of 80 reported failures will resolve
  • Remaining failures are legitimate (403 bot protection, 401 broken links, ENOTFOUND)
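
The estimate above can be reproduced with a short calculation. This is only a sketch of the arithmetic: the function name `estimateResolved` is hypothetical, and it assumes each unique URL accounts for an equal share of the 80 reported failures, which the report does not confirm.

```javascript
// Figures taken from the Impact list above.
const uniqueUrlsTested = 10;
const uniqueUrlsPassing = 6;
const reportedFailures = 80;

// Assumption: every unique URL contributes equally to the reported failures,
// so the pass rate on unique URLs can be scaled to the full failure count.
function estimateResolved(passing, tested, total) {
  return Math.round((passing / tested) * total);
}

const passRatePercent = Math.round((uniqueUrlsPassing / uniqueUrlsTested) * 100);
const estimate = estimateResolved(uniqueUrlsPassing, uniqueUrlsTested, reportedFailures);

console.log(`${passRatePercent}% pass rate, ~${estimate} of ${reportedFailures} failures`);
```

If a few URLs are responsible for most of the failures, the real number could differ noticeably from this estimate.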

No breaking changes. Fully backward compatible.

Original prompt

Investigate user-reported checkLink failures from the following results: https://cdn.discordapp.com/attachments/1453108489705754624/1453137355673174148/testResults-1766525295537.json?ex=694d03df&is=694bb25f&hm=7fae7efaca8bde9c8d68be26273a025ad937e85a23c59c91dd186c318757cd54&

Identify failure causes and suggest fixes as appropriate.



Summary by CodeRabbit

  • Bug Fixes

    • Improved error handling with enhanced error messaging and request timeout configuration for better diagnostics.
  • Chores

    • Updated test artifact patterns in version control.


coderabbitai Bot commented Dec 24, 2025

Walkthrough

Changes extend .gitignore to ignore PNG test artifacts and enhance error handling in checkLink.js by introducing request configuration with timeout and redirect limits, plus improved error messages that differentiate between HTTP response errors and unresolvable URLs.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| Test Artifact Configuration: .gitignore | Added ignore pattern for PNG files in test/artifacts/ directory alongside existing artifact exclusions |
| Request Configuration & Error Handling: src/tests/checkLink.js | Introduced requestConfig object with custom headers, 10-second timeout, and redirect limits; added catch block for axios errors with enhanced error differentiation between HTTP response errors and unresolvable URLs |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

Poem

🐰 A PNG here, a timeout there,
Better errors caught with care!
Request config hops along,
Stronger links, no more wrong,
Artifacts cleared, code runs bright! ✨

Pre-merge checks and finishing touches

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%. | You can run @coderabbitai generate docstrings to improve docstring coverage. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately summarizes the main change: adding User-Agent headers and timeout to fix checkLink failures, which are the primary improvements in the changeset. |

Copilot AI and others added 2 commits December 24, 2025 14:36
…or messages

Co-authored-by: hawkeyexl <5209367+hawkeyexl@users.noreply.github.com>
Co-authored-by: hawkeyexl <5209367+hawkeyexl@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Investigate user-reported checkLink failures and suggest fixes" to "Fix checkLink failures by adding User-Agent headers and timeout" Dec 24, 2025
Copilot AI requested a review from hawkeyexl December 24, 2025 14:48
@hawkeyexl hawkeyexl marked this pull request as ready for review December 24, 2025 15:49
@hawkeyexl

@coderabbitai review


coderabbitai Bot commented Dec 24, 2025

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 0

🧹 Nitpick comments (1)
src/tests/checkLink.js (1)

55-64: Consider consistent quote style.

The requestConfig object uses single quotes for string values, while the rest of the file consistently uses double quotes (see lines 36, 42, 82, etc.). For consistency, consider updating to double quotes.

🔎 Proposed fix for quote consistency
-  const requestConfig = {
-    headers: {
-      'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36',
-      'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
-      'Accept-Language': 'en-US,en;q=0.9'
-    },
-    timeout: 10000, // 10 second timeout
-    maxRedirects: 5
-  };
+  const requestConfig = {
+    headers: {
+      "User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/120.0.0.0 Safari/537.36",
+      "Accept": "text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8",
+      "Accept-Language": "en-US,en;q=0.9"
+    },
+    timeout: 10000, // 10 second timeout
+    maxRedirects: 5
+  };
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 07d595d and d618aa0.

📒 Files selected for processing (2)
  • .gitignore
  • src/tests/checkLink.js
🧰 Additional context used
📓 Path-based instructions (2)
src/tests/*.js

📄 CodeRabbit inference engine (AGENTS.md)

src/tests/*.js: Create new step type handlers as async functions exported from src/tests/[actionName].js
Step handlers must return object with { status: "PASS"|"FAIL"|"WARNING", description: string, outputs: {} }
Validate step schema before resolving to object and setting defaults in new step handlers
Always handle driver cleanup in try/finally blocks in browser automation code
Set step variables via step.variables = { MY_VAR: "$$response.body.token" } to store data as environment variables
Use setViewportSize() to calculate inner dimensions rather than directly setting window size in browser automation
Mark unsafe steps with step.unsafe = true and require config.allowUnsafeSteps = true to execute

Files:

  • src/tests/checkLink.js
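
As an illustration of the return shape required by the instructions above, here is a minimal sketch. The helper name `buildResult` and the `acceptedCodes` parameter are hypothetical and are not taken from src/tests/checkLink.js; only the `{ status, description, outputs }` shape and the "Returned 200" wording come from this PR.

```javascript
// Hypothetical helper: maps an HTTP status code to the step-result shape
// { status, description, outputs } described in AGENTS.md.
function buildResult(statusCode, acceptedCodes = [200]) {
  const passed = acceptedCodes.includes(statusCode);
  return {
    status: passed ? "PASS" : "FAIL",
    description: passed
      ? `Returned ${statusCode}`
      : `Returned ${statusCode}. Expected one of ${JSON.stringify(acceptedCodes)}`,
    outputs: {}, // Left empty in this sketch; real handlers may populate it.
  };
}

console.log(buildResult(200));
console.log(buildResult(403));
```

Keeping result construction in one place makes it harder for a handler to return a partial object, such as one missing `outputs`.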
src/**/*.js

📄 CodeRabbit inference engine (AGENTS.md)

src/**/*.js: Use log(config, level, message) for all logging, where level = "debug"|"info"|"warning"|"error"
Use getAvailableApps() to detect installed browsers instead of hardcoding browser paths

Files:

  • src/tests/checkLink.js
🧠 Learnings (2)
📚 Learning: 2025-12-03T00:21:10.834Z
Learnt from: CR
Repo: doc-detective/core PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-03T00:21:10.834Z
Learning: Applies to src/tests/httpRequest.js : Extract request/response examples from OpenAPI spec and use AJV for schema validation in HTTP request handling

Applied to files:

  • src/tests/checkLink.js
📚 Learning: 2025-12-03T00:21:10.834Z
Learnt from: CR
Repo: doc-detective/core PR: 0
File: AGENTS.md:0-0
Timestamp: 2025-12-03T00:21:10.834Z
Learning: Applies to src/tests/*.js : Step handlers must return object with `{ status: "PASS"|"FAIL"|"WARNING", description: string, outputs: {} }`

Applied to files:

  • src/tests/checkLink.js
🧬 Code graph analysis (1)
src/tests/checkLink.js (3)
src/utils.js (1)
  • axios (4-4)
src/tests/httpRequest.js (3)
  • axios (2-2)
  • step (680-708)
  • result (12-12)
src/tests.js (1)
  • axios (5-5)
🔇 Additional comments (4)
.gitignore (1)

115-115: LGTM!

The addition of test/artifacts/*.png is appropriate for preventing test artifacts from being committed while remaining consistent with existing artifact patterns.

src/tests/checkLink.js (3)

66-73: LGTM!

The axios request correctly applies the requestConfig with headers, timeout, and redirect limits. The implementation properly handles both success and error cases, returning a normalized response structure.


76-84: Excellent improvement in error handling.

The enhanced error messages now differentiate between HTTP response errors (e.g., 403, 401) and network/resolution failures (e.g., ENOTFOUND, timeouts), providing much clearer feedback for debugging. The logic correctly accesses the axios error structure and is consistent with the existing status code comparison pattern.
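
The distinction described above can be sketched as follows. This is not the code merged in this PR: the helper name `describeLinkError` and the exact message wording are assumptions. It relies only on the documented axios error shape, where `error.response` is set when the server replied with an error status and `error.code` carries values such as ENOTFOUND when no response arrived.

```javascript
// Hypothetical helper: turns an axios-style error into a readable description.
function describeLinkError(error, url) {
  if (error && error.response) {
    // The server answered, but with a status axios treats as an error
    // (for example 401 or 403).
    return `Returned ${error.response.status} for ${url}`;
  }
  // No response at all: DNS failure, timeout, refused connection, and so on.
  const code = error && error.code ? ` (${error.code})` : "";
  return `Invalid or unresolvable URL: ${url}${code}`;
}

// Usage with hand-built error objects standing in for real axios errors.
console.log(describeLinkError({ response: { status: 403 } }, "https://example.com"));
console.log(describeLinkError({ code: "ENOTFOUND" }, "https://nope.invalid"));
```

Checking `error.response` first matters because an HTTP error status still proves the URL resolved, which is a different failure from a hostname that does not exist.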


55-67: The axios configuration is correct. Axios 1.13.2 fully supports headers, timeout, and maxRedirects options as implemented.


promptless Bot commented Dec 24, 2025

📝 Documentation updates detected!

New suggestion: Document checkLink HTTP status code errors in troubleshooting

@hawkeyexl hawkeyexl merged commit e433358 into main Dec 24, 2025
16 checks passed
@hawkeyexl hawkeyexl deleted the copilot/investigate-checklink-failures branch December 24, 2025 20:52

promptless Bot commented Dec 24, 2025

📝 Documentation updates detected!

Updated existing suggestion: Document checkLink HTTP status code errors in troubleshooting
