Systematically ignore tabs and newlines, per spec change #190
The test changes correspond to bugs in rust-url before this change: the value of pathname/search/hash setters does go through the URL parser (though with a state override), which includes removing trailing and leading controls and spaces.
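As a rough illustration of the trimming step mentioned above, here is a minimal sketch. The helper name `trim_c0_control_or_space` is hypothetical and is not taken from rust-url; it only assumes the URL Standard's definition of "C0 control or space" as U+0000 through U+0020.

```rust
/// Hypothetical helper (not rust-url's actual code): strip leading and
/// trailing C0 controls and spaces, i.e. code points U+0000..=U+0020.
fn trim_c0_control_or_space(input: &str) -> &str {
    // ' ' is U+0020, so `c <= ' '` covers every C0 control plus the space.
    input.trim_matches(|c: char| c <= ' ')
}
```

A setter that forwards its value to the parser would see the trimmed slice, so a value such as `" \t/path\n "` would be handled as `/path` under this assumption.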
|
This seems like a lot of extra complexity to handle an uncommon case (URLs with tabs or spaces). It seems like there should be a simpler way to do this, something like making […]

Am I right that one of the goals is to keep this as a one-pass algorithm, which is why the obvious brute-force scan of the input for tabs and spacing before parsing (and filtering/collecting it if any are found) isn't acceptable?

Reviewed 6 of 6 files at r1.
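For reference, the brute-force pre-scan described in this comment could look roughly like the sketch below. The function names are hypothetical and this is not what the PR does; it assumes the characters in question are the URL Standard's "ASCII tab or newline" (U+0009, U+000A, U+000D).

```rust
use std::borrow::Cow;

/// True for the characters the URL Standard calls "ASCII tab or newline".
fn is_tab_or_newline(c: char) -> bool {
    matches!(c, '\t' | '\n' | '\r')
}

/// Hypothetical pre-pass: scan once, and only allocate a filtered copy
/// when the input actually contains a tab or newline.
fn strip_tabs_and_newlines(input: &str) -> Cow<'_, str> {
    if input.contains(is_tab_or_newline) {
        // Second pass over the input: copy everything except tabs/newlines.
        Cow::Owned(input.chars().filter(|&c| !is_tab_or_newline(c)).collect())
    } else {
        // Common case: hand back the original slice without allocating.
        Cow::Borrowed(input)
    }
}
```

The trade-off is the one raised in the question: the common case costs one extra scan, and the uncommon case costs a scan plus an allocation before parsing starts.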
|
The complexity was already there, just (inconsistently) spread across various parts of the parser that each used to have code to ignore these characters. But with this spec change I think it would be unreasonably complex to ignore them manually everywhere, for example between the slashes of
That’s kinda what this PR is all about? (
Yes. Or maybe filtering and collecting into a […]

Review status: 5 of 6 files reviewed at latest revision, 8 unresolved discussions.

src/parser.rs, line 678 [r1]: This PR is not just an internal refactoring. It’s about implementing a change (linked from the PR message) in the specification’s requirements. If you look at the individual commits, the first one, "Parse based on iterators", is a refactoring that doesn’t change the behavior; it only exists to enable the behavior change in the second commit, "Systematically ignore tabs and newlines".
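To make the two-commit structure concrete, here is a minimal sketch of the general idea of parsing from an iterator that skips tabs and newlines in one place. The type `SkipTabsAndNewlines` and its `eat` method are hypothetical names chosen for illustration; they are not claimed to match the code in this PR.

```rust
use std::str::Chars;

/// Hypothetical wrapper: iterates over the input's chars while silently
/// dropping ASCII tabs and newlines (U+0009, U+000A, U+000D).
#[derive(Clone)]
struct SkipTabsAndNewlines<'a> {
    chars: Chars<'a>,
}

impl<'a> SkipTabsAndNewlines<'a> {
    fn new(input: &'a str) -> Self {
        SkipTabsAndNewlines { chars: input.chars() }
    }

    /// Consume `prefix` if the remaining (filtered) input starts with it.
    /// On a mismatch the iterator is left where it was.
    fn eat(&mut self, prefix: &str) -> bool {
        let mut lookahead = self.clone();
        for expected in prefix.chars() {
            if lookahead.next() != Some(expected) {
                return false;
            }
        }
        *self = lookahead;
        true
    }
}

impl<'a> Iterator for SkipTabsAndNewlines<'a> {
    type Item = char;

    fn next(&mut self) -> Option<char> {
        // Advance the underlying iterator to the next char worth keeping.
        self.chars.find(|&c| !matches!(c, '\t' | '\n' | '\r'))
    }
}
```

With a wrapper like this, a parser state that calls `eat("http://")` would not need its own special case for a tab sitting between the two slashes; the skipping happens once, inside `next`.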
|
Archiving an IRC conversation: http://logs.glob.uno/?c=mozilla%23servo&s=26+Apr+2016&e=26+Apr+2016#c415136 |
|
I think at this point we're into style differences. You can r=me. Squash the commits?

Reviewed 1 of 1 files at r2.
|
Commits are as intended. @bors-servo r+ |
|
Systematically ignore tabs and newlines, per spec change whatwg/url@7b40216 whatwg/url#101
|
SimonSapin commented Apr 25, 2016
whatwg/url@7b40216
whatwg/url#101