-
-
Notifications
You must be signed in to change notification settings - Fork 8.6k
[py] Validate URLs before navigation to fail fast on malformed input #17041
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
|
Co-authored-by: AutomatedTester <128518+AutomatedTester@users.noreply.github.com>
Co-authored-by: AutomatedTester <128518+AutomatedTester@users.noreply.github.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull request overview
This pull request adds client-side URL validation to Python Selenium bindings to provide clear, immediate feedback when users attempt navigation with malformed URLs. Previously, malformed URLs would either fail silently (Chrome) or return unclear errors (Firefox), leaving users confused.
Changes:
- Added
is_valid_url()function incommon/utils.pyto validate URLs have proper schemes per RFC 3986 - Integrated validation into
driver.get()andbrowsing_context.navigate()to raiseInvalidArgumentExceptionwith descriptive messages on invalid URLs - Added comprehensive unit and integration tests covering valid/invalid URL scenarios
Reviewed changes
Copilot reviewed 5 out of 5 changed files in this pull request and generated 4 comments.
Show a summary per file
| File | Description |
|---|---|
| py/selenium/webdriver/common/utils.py | Adds is_valid_url() function to validate URLs have non-empty schemes with valid characters per RFC 3986 |
| py/selenium/webdriver/remote/webdriver.py | Integrates URL validation into driver.get() method, raising InvalidArgumentException for invalid URLs |
| py/selenium/webdriver/common/bidi/browsing_context.py | Integrates URL validation into browsing_context.navigate() method with same exception handling |
| py/test/unit/selenium/webdriver/common/url_validation_tests.py | Unit tests for URL validation logic covering various valid/invalid URL formats |
| py/test/selenium/webdriver/common/url_navigation_validation_tests.py | Integration tests verifying both navigation methods properly validate URLs and raise exceptions |
| # Schemes should only contain alphanumeric characters, +, -, and . | ||
| # per RFC 3986 section 3.1 | ||
| if not all(c.isalnum() or c in "+-." for c in result.scheme): | ||
| return False |
Copilot
AI
Feb 2, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
According to RFC 3986 section 3.1, a URI scheme must begin with a letter, followed by any combination of letters, digits, plus (+), period (.), or hyphen (-). The current validation logic checks if all characters are alphanumeric or in "+-.", but it doesn't verify that the scheme starts with a letter. This could allow invalid schemes like "1http" or "+scheme" to pass validation.
The validation should be updated to check that the first character of the scheme is alphabetic. For example:
- Check that result.scheme[0].isalpha() is True
- Then check the remaining characters
| # Schemes should only contain alphanumeric characters, +, -, and . | |
| # per RFC 3986 section 3.1 | |
| if not all(c.isalnum() or c in "+-." for c in result.scheme): | |
| return False | |
| # Schemes must begin with a letter, and subsequent characters may be | |
| # alphanumeric, '+', '-', or '.', per RFC 3986 section 3.1 | |
| if not result.scheme[0].isalpha(): | |
| return False | |
| if not all(c.isalnum() or c in "+-." for c in result.scheme[1:]): | |
| return False |
| def test_valid_custom_scheme_url(self): | ||
| """Test that custom scheme URLs are accepted.""" | ||
| assert is_valid_url("custom://example.com") is True | ||
|
|
||
| def test_invalid_url_without_scheme(self): | ||
| """Test that URLs without scheme are rejected.""" | ||
| assert is_valid_url("example.com") is False | ||
|
|
||
| def test_invalid_url_with_malformed_scheme(self): | ||
| """Test that URLs with malformed scheme (missing :) are rejected.""" | ||
| assert is_valid_url("http//example.com") is False | ||
|
|
||
| def test_invalid_url_with_space(self): | ||
| """Test that URLs with spaces in the scheme are rejected.""" | ||
| assert is_valid_url("http ://example.com") is False | ||
|
|
||
| def test_invalid_empty_string(self): | ||
| """Test that empty strings are rejected.""" | ||
| assert is_valid_url("") is False | ||
|
|
||
| def test_invalid_relative_url(self): | ||
| """Test that relative URLs are rejected.""" | ||
| assert is_valid_url("/path/to/page") is False | ||
|
|
||
| def test_invalid_protocol_relative_url(self): | ||
| """Test that protocol-relative URLs are rejected.""" | ||
| assert is_valid_url("//example.com") is False |
Copilot
AI
Feb 2, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The test suite is missing edge case tests for invalid schemes that start with non-letter characters. According to RFC 3986 section 3.1, schemes must start with a letter. Consider adding test cases for:
- Schemes starting with digits (e.g., "1http://example.com")
- Schemes starting with special characters (e.g., "+scheme://example.com", "-scheme://example.com", ".scheme://example.com")
These tests would catch the validation bug where the scheme isn't checked to start with a letter.
|
|
||
| import pytest | ||
|
|
||
| from selenium.common.exceptions import InvalidArgumentException |
Copilot
AI
Feb 2, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The import of InvalidArgumentException is unused in this file. The tests only use the is_valid_url function directly without testing any exception handling. Either remove this unused import or add tests that verify exception handling behavior.
| from selenium.common.exceptions import InvalidArgumentException |
| import pytest | ||
|
|
Copilot
AI
Feb 2, 2026
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Import of 'pytest' is not used.
| import pytest |
|
I originally raised this issue, but after discussion in Slack, I think the consensus was that the validation is really narrow, and it's better to just let the browser return an error on navigation than trying to to validate upfront. |
Sounds good to me. I only did the PR to play Copilot Agents |
Chrome silently fails and Firefox returns unclear errors when navigating to malformed URLs (missing schemes, typos like
http//example.com). Users get confused whendriver.get("example.com")doesn't navigate or raise an exception.Changes
Added
is_valid_url()tocommon/utils.py: Validates absolute URLs per W3C WebDriver spec. Checks for valid scheme (RFC 3986: alphanumeric,+,-,.). Accepts all schemes including custom ones.Applied validation to navigation methods:
driver.get()inremote/webdriver.pybrowsing_context.navigate()incommon/bidi/browsing_context.pyBoth raise
InvalidArgumentExceptionwith descriptive message on invalid URLs.Test coverage: Unit tests for validation logic, integration tests for both navigation methods.
Example
User Concerns Addressed:
urlparseWarning
Firewall rules blocked me from connecting to one or more addresses (expand for details)
I tried to connect to the following addresses, but was blocked by firewall rules:
releases.bazel.build/usr/local/lib/node_modules/@bazel/bazelisk/bazelisk-linux_amd64 /usr/local/lib/node_modules/@bazel/bazelisk/bazelisk-linux_amd64 test //py/test/unit/selenium/webdriver/common:url_validation_tests --test_output=all(dns block)/usr/local/lib/node_modules/@bazel/bazelisk/bazelisk-linux_amd64 /usr/local/lib/node_modules/@bazel/bazelisk/bazelisk-linux_amd64 info workspace(dns block)If you need me to access, download, or install something from one of these locations, you can either:
Original prompt
✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.