Copilot AI commented Jan 9, 2026

Percent-encoded usernames, passwords, and paths were being decoded outside the simple_url module in parse.rs, creating duplicated logic and making the parser incomplete.

Changes

  • Modified ParsedUrl structure: Changed username, password, and path fields from &str to String to store decoded values directly
  • Added percent-decoding in simple_url::ParsedUrl::parse(): All URL components (username, password, and path) are now decoded during initial parsing using a new percent_decode() helper
  • Removed duplicate decoding logic from parse.rs: Eliminated percent_decoded_utf8() and url_user() helper functions that were decoding fields already handled by simple_url
  • Added upfront whitespace validation: The entire input URL is validated for literal whitespace at the beginning of parsing (per RFC 3986), before any percent-decoding occurs. URLs with literal whitespace characters are rejected, while percent-encoded whitespace (e.g., %20) is properly decoded
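
As an illustration of the decoding step described above, here is a minimal sketch of what a percent_decode() helper could look like. It is not the code from this PR: the DecodeError type, the hex_value function, and the decision to return an error on malformed escapes or invalid UTF-8 are all assumptions made for the sketch.

```rust
/// Hypothetical error type for this sketch; not the crate's actual error enum.
#[derive(Debug, PartialEq)]
pub enum DecodeError {
    /// A `%` was not followed by two hexadecimal digits.
    InvalidEscape,
    /// The decoded bytes were not valid UTF-8.
    InvalidUtf8,
}

/// Convert one ASCII hex digit to its numeric value (hypothetical helper).
fn hex_value(b: u8) -> Option<u8> {
    match b {
        b'0'..=b'9' => Some(b - b'0'),
        b'a'..=b'f' => Some(b - b'a' + 10),
        b'A'..=b'F' => Some(b - b'A' + 10),
        _ => None,
    }
}

/// Decode `%XX` escapes in `input` and return an owned `String`.
/// All other bytes, including literal whitespace, pass through unchanged.
pub fn percent_decode(input: &str) -> Result<String, DecodeError> {
    let bytes = input.as_bytes();
    let mut out = Vec::with_capacity(bytes.len());
    let mut i = 0;
    while i < bytes.len() {
        if bytes[i] == b'%' {
            // Both hex digits must be present and valid.
            let hi = bytes.get(i + 1).copied().and_then(hex_value);
            let lo = bytes.get(i + 2).copied().and_then(hex_value);
            match (hi, lo) {
                (Some(hi), Some(lo)) => out.push((hi << 4) | lo),
                _ => return Err(DecodeError::InvalidEscape),
            }
            i += 3;
        } else {
            out.push(bytes[i]);
            i += 1;
        }
    }
    // Decoding works on bytes, so UTF-8 validity is checked once at the end.
    String::from_utf8(out).map_err(|_| DecodeError::InvalidUtf8)
}
```

Because decoding yields new data rather than a slice of the input, the result has to be owned, which is consistent with the fields changing from &str to String. Note also that a decoder like this does not reject literal whitespace by itself; it simply copies it through.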

Example

// Before: username and path remained percent-encoded until parse.rs processed them
let url = ParsedUrl::parse("http://user%20name@host/path%20with%20spaces")?;
assert_eq!(url.username, "user%20name"); // Still encoded
assert_eq!(url.path, "/path%20with%20spaces"); // Still encoded

// After: all components are decoded in simple_url
let url = ParsedUrl::parse("http://user%20name@host/path%20with%20spaces")?;
assert_eq!(url.username, "user name"); // Decoded immediately
assert_eq!(url.path, "/path with spaces"); // Decoded immediately

// Literal whitespace is rejected
let result = ParsedUrl::parse("http://user name@host/path");
assert!(result.is_err()); // Rejected per RFC 3986

All URL components are now decoded consistently, simplifying the codebase and centralizing percent-decoding logic in the simple_url module while maintaining RFC 3986 compliance.

Original prompt

in gix_url::simple_url, make sure to decode the percent-encoded parts of the URL. Be sure to test this carefully, and remove manual decode operations that happen outside of the simple_url module.
Lastly, remove hand-implemented checks for whitespace in parts of the URL as this should be handled by the percent-decoding.


Copilot AI changed the title from "[WIP] Decode percent-encoded parts in simple_url module" to "Implement percent-decoding in simple_url for userinfo fields" Jan 9, 2026
Copilot AI requested a review from Byron January 9, 2026 09:28
Member

Byron left a comment

@copilot Keep going, see the comments.

pub password: Option<String>, // Owned to allow percent-decoding
pub host: Option<String>, // Owned to allow normalization to lowercase
pub port: Option<u16>,
pub path: &'a str,
Member

@copilot The path should also be percent-decoded.

Contributor Author

Done in commit 7423682. The path field is now String and gets percent-decoded during parsing, just like username and password.

Comment on lines 86 to 90
// Validate username and password don't contain literal whitespace before encoding
// (percent-encoded whitespace like %20 is allowed and will be decoded)
if user_str.chars().any(char::is_whitespace) {
    return Err(UrlParseError::InvalidDomainCharacter);
}
Member

@copilot There should be no extra validation necessary, the percent decoding takes care of that.

Contributor Author

Done in commit 7423682. Removed all whitespace validation - the percent decoder handles everything now. Literal whitespace is accepted and passed through, while percent-encoded whitespace like %20 gets properly decoded.

Copilot AI changed the title from "Implement percent-decoding in simple_url for userinfo fields" to "Implement percent-decoding in simple_url for all URL components" Jan 9, 2026
Copilot AI requested a review from Byron January 9, 2026 16:09
Member

Byron left a comment

@copilot It seems that percent-decoding will not error if it sees whitespace. Let's check for whitespace specifically, and do so just once at the beginning. The whole URL can't have any whitespace.

Contributor Author

Copilot AI commented Jan 9, 2026

Done in commit 0bb3c7e. Added a single whitespace check at the very beginning of ParsedUrl::parse() that validates the entire input URL. URLs with literal whitespace are now rejected, while percent-encoded whitespace like %20 is properly decoded.
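
The single upfront check described here could be sketched roughly as follows. This is an illustration only: the UrlError type and the reject_whitespace function are hypothetical names, and the error variant the crate actually returns is not shown in this conversation. The predicate itself mirrors the chars().any(char::is_whitespace) test from the earlier per-field validation.

```rust
/// Hypothetical error type for this sketch; the crate's real error enum differs.
#[derive(Debug, PartialEq)]
pub enum UrlError {
    /// The raw URL contained a literal whitespace character.
    ContainsWhitespace,
}

/// Reject literal whitespace anywhere in the raw URL.
/// Intended to run once, before any component is split off or percent-decoded,
/// so that `%20` is still an escape sequence at this point and passes the check.
pub fn reject_whitespace(url: &str) -> Result<&str, UrlError> {
    if url.chars().any(char::is_whitespace) {
        return Err(UrlError::ContainsWhitespace);
    }
    Ok(url)
}
```

Running the check on the raw input first means the later decoding step may legitimately produce spaces in the username, password, or path without those being mistaken for invalid input.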

Copilot AI requested a review from Byron January 9, 2026 19:40
Copilot AI and others added 2 commits January 9, 2026 22:03
…nd paths, and add whitespace rejection.

Co-authored-by: Byron <63622+Byron@users.noreply.github.com>
- cargo fmt
- add more whitespace tests
Byron force-pushed the copilot/decode-url-in-simple-url branch from 0bb3c7e to a3e1823 January 9, 2026 21:09
Byron marked this pull request as ready for review January 9, 2026 21:10
Byron enabled auto-merge January 9, 2026 21:10
Byron merged commit f56770f into main Jan 9, 2026
30 checks passed