@steven10a (Collaborator) commented Nov 20, 2025

Making URL detection more robust in response to #39

  • Normalized allowed_schemes entries for matching
  • Reworked _is_url_allowed to handle host-only entries by using allowed_schemes
  • Added blocking of URLs that contain a password when block_userinfo=true
  • Improved port matching and added port validation
  • Added support for scheme-less inputs for improved usability: example.com is not blocked when https://example.com is in the allow list, but http://example.com is
  • Expanded test coverage
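
As a rough illustration of the scheme-less behavior described above, here is a minimal sketch. The names splitSchemeAndHost and isAllowedSketch are hypothetical and are not the code in src/checks/urls.ts; the sketch ignores ports, paths, userinfo, and IPv6, and only shows how a scheme is compared when both sides state one.

```typescript
// Hypothetical sketch, not the implementation from this PR.
interface SplitUrl {
  scheme: string | null; // lowercased scheme without "://", or null when absent
  host: string; // lowercased host without port or path
}

function splitSchemeAndHost(input: string): SplitUrl {
  const trimmed = input.trim();
  const match = /^([a-zA-Z][a-zA-Z0-9+.-]*):\/\/(.*)$/.exec(trimmed);
  const scheme = match ? (match[1] ?? "").toLowerCase() : null;
  const rest = match ? match[2] ?? "" : trimmed;
  // The host ends at the first "/", "?", "#", or ":" (port separator).
  const host = (rest.split(/[\/?#:]/)[0] ?? "").toLowerCase();
  return { scheme, host };
}

function isAllowedSketch(input: string, allowEntry: string): boolean {
  const url = splitSchemeAndHost(input);
  const entry = splitSchemeAndHost(allowEntry);
  if (url.host !== entry.host) return false;
  // Schemes are compared only when both the input and the entry state one.
  if (url.scheme !== null && entry.scheme !== null) {
    return url.scheme === entry.scheme;
  }
  return true;
}
```

With an allow list entry of https://example.com, this sketch accepts example.com and https://example.com and rejects http://example.com, which matches the behavior stated in the description.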

Copilot AI review requested due to automatic review settings November 20, 2025 18:54
@steven10a steven10a changed the title from "Dev/steven/url v2" to "Enhance URL allow list matching" Nov 20, 2025
Copilot finished reviewing on behalf of steven10a November 20, 2025 19:00
Copilot AI left a comment

Pull Request Overview

This PR significantly enhances the URL filtering guardrail's robustness and usability by improving URL validation, normalization, and matching logic. The changes address JS issue #39 with better handling of edge cases and more flexible URL matching.

Key improvements include:

  • Scheme normalization: Handles various scheme formats (HTTPS://, http:, https ) uniformly
  • Enhanced security: Now blocks URLs containing passwords in addition to usernames when block_userinfo=true
  • Scheme-less URL support: Enables more natural allow list entries (e.g., example.com) that match URLs regardless of scheme, while still enforcing explicit scheme restrictions when specified
  • Improved validation: Better port validation, path segment boundary checking, and fallback host extraction for malformed URLs
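
To make the scheme normalization point concrete, here is a minimal sketch of how the listed formats (HTTPS://, http:, and https with trailing whitespace) could collapse to one canonical form. The function name normalizeSchemesSketch is hypothetical; the PR's own helper is normalizeAllowedSchemes, and its actual logic is not shown in this conversation.

```typescript
// Hypothetical sketch, not the PR's normalizeAllowedSchemes.
function normalizeSchemesSketch(schemes: readonly string[]): Set<string> {
  const normalized = new Set<string>();
  for (const raw of schemes) {
    // Trim whitespace, lowercase, then drop a trailing ":" or "://".
    const scheme = raw.trim().toLowerCase().replace(/:(\/\/)?$/, "");
    if (scheme.length > 0) {
      normalized.add(scheme);
    }
  }
  return normalized;
}
```

Under these assumptions, the input ["HTTPS://", "http:", "https "] yields a set containing only "https" and "http".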

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| src/checks/urls.ts | Core implementation updates including new helper functions (normalizeAllowedSchemes, extractHostCandidate, safeGetPort, isIpv4Address), refactored validateUrlSecurity to track scheme presence, and completely reworked isUrlAllowed with comprehensive matching logic for schemes, ports, paths, queries, and fragments |
| src/__tests__/unit/checks/keywords-urls.test.ts | Extensive test additions covering scheme normalization, path boundaries, scheme-less matching, CIDR ranges, port matching, query/fragment requirements, password-only userinfo blocking, malformed port handling, and trailing slash edge cases |
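
The helpers named in the table (safeGetPort, isIpv4Address) are not shown in this conversation. As a hedged illustration of the kind of validation they suggest, the sketch below uses hypothetical names (parsePortSketch, isIpv4Sketch) and assumes a port is valid when it is 1 to 5 decimal digits in the range 0 to 65535, and an IPv4 address is four decimal octets each in the range 0 to 255.

```typescript
// Hypothetical sketches; the PR's safeGetPort and isIpv4Address may differ.
function parsePortSketch(raw: string): number | null {
  // Reject empty strings, signs, whitespace, and non-digit characters.
  if (!/^\d{1,5}$/.test(raw)) return null;
  const port = Number(raw);
  return port <= 65535 ? port : null;
}

function isIpv4Sketch(host: string): boolean {
  const parts = host.split(".");
  if (parts.length !== 4) return false;
  // Each octet must be 1 to 3 digits and no larger than 255.
  return parts.every((part) => /^\d{1,3}$/.test(part) && Number(part) <= 255);
}
```

Returning null for a malformed port, rather than throwing, lets a caller treat the URL as unmatched instead of failing the whole check; whether the PR takes that approach is an assumption here.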


@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

Comment on lines +285 to +289
// Only check allowed_schemes if the URL explicitly had a scheme
const normalizedScheme = originalScheme.toLowerCase();

if (hadScheme && !config.allowed_schemes.has(normalizedScheme)) {
  return { parsedUrl: null, reason: `Blocked scheme: ${normalizedScheme}`, hadScheme };

P1: Enforce allowed_schemes for scheme-less URLs

Scheme validation is skipped whenever the input omits an explicit scheme (hadScheme is false), so a plain example.com now bypasses allowed_schemes. For configs that deliberately disallow http (e.g., allowed_schemes: ['https']) but still have an HTTP entry in the allow list, users can simply drop the http:// prefix and the URL will be allowed because both the security check here and the later allow-list scheme check are gated on hadScheme. This reintroduces disallowed schemes via scheme-less inputs and weakens the scheme restriction.
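
The gating described in this comment can be reproduced with a small standalone sketch. The function checkSchemeSketch and its return shape are hypothetical; only the hadScheme condition and the "Blocked scheme" reason string mirror the excerpt quoted above.

```typescript
// Hypothetical sketch of the scheme gate, modeled on the quoted excerpt.
interface SchemeCheck {
  blocked: boolean;
  reason: string | null;
}

function checkSchemeSketch(input: string, allowedSchemes: ReadonlySet<string>): SchemeCheck {
  const match = /^([a-zA-Z][a-zA-Z0-9+.-]*):\/\//.exec(input);
  const hadScheme = match !== null;
  const normalizedScheme = (match?.[1] ?? "").toLowerCase();
  // allowed_schemes is consulted only when the input stated a scheme.
  if (hadScheme && !allowedSchemes.has(normalizedScheme)) {
    return { blocked: true, reason: `Blocked scheme: ${normalizedScheme}` };
  }
  return { blocked: false, reason: null };
}
```

With allowedSchemes set to new Set(["https"]), the input http://example.com is blocked while the scheme-less example.com passes this gate, which is the case the comment raises.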

@steven10a (Collaborator, Author) replied:

Working as intended (WAI). This was implemented for usability: a typical user will type example.com rather than https://example.com. A scheme that is not allowed is still caught when it is given explicitly (e.g., http://example.com).

@steven10a steven10a requested a review from Copilot November 20, 2025 19:28
@steven10a (Collaborator, Author) commented:

@codex review

Copilot finished reviewing on behalf of steven10a November 20, 2025 19:33
Copilot AI left a comment

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 11 comments.


@chatgpt-codex-connector
Codex Review: Didn't find any major issues. Delightful!

Copilot finished reviewing on behalf of steven10a November 20, 2025 20:07
Copilot AI left a comment

Pull Request Overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.


@gabor-openai gabor-openai self-requested a review November 21, 2025 01:10
Collaborator

@gabor-openai gabor-openai left a comment

TY

@gabor-openai gabor-openai merged commit 368b40e into main Nov 21, 2025
1 check passed
@steven10a steven10a deleted the dev/steven/url_v2 branch December 2, 2025 21:27
3 participants