[idna] Update data to Unicode 10.0 and fix logic #351
|
Filed unicode-rs/unicode-normalization#16 for |
a3ea472
to
a48ab35
|
Now that |
|
Sorry for the delays and repeated conflicts. If you’d prefer, I can take over and do the minor changes I requested below.

Reviewed 1 of 1 files at r1, 2 of 2 files at r2, 1 of 1 files at r3, 2 of 2 files at r4, 2 of 2 files at r5.

Cargo.toml, line 5 at r5 (raw file):
This change is not necessary, please remove it. Since the new version of idna is semver-compatible, end-users can update to it independently of url.

Cargo.toml, line 45 at r5 (raw file):
This is also unnecessary. "0.1.0" means ">=0.1.0,<0.2.0", same as "0.1".

idna/src/uts46.rs, line 416 at r4 (raw file):

Comments from Reviewable |
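The point about "0.1.0" and "0.1" can be illustrated with a small sketch of how a bare (caret) version requirement expands to a half-open range. This is not Cargo's implementation; `caret_range` and `matches` are hypothetical helpers that handle only plain numeric `MAJOR[.MINOR[.PATCH]]` strings, with no pre-release tags or operators.

```rust
/// Expand a bare requirement such as "0.1" into a half-open range
/// `[lower, upper)`. Hypothetical helper, not part of Cargo or any crate.
fn caret_range(req: &str) -> Option<([u64; 3], [u64; 3])> {
    // Parse the dot-separated components; any non-numeric part is rejected.
    let parts = req
        .split('.')
        .map(|p| p.parse::<u64>().ok())
        .collect::<Option<Vec<u64>>>()?;
    if parts.is_empty() || parts.len() > 3 {
        return None;
    }

    // Missing components default to zero in the lower bound.
    let mut lower = [0u64; 3];
    lower[..parts.len()].copy_from_slice(&parts);

    // Bump the left-most non-zero component that was written out; if all
    // written components are zero, bump the last one that was written.
    let bump = parts.iter().position(|&n| n != 0).unwrap_or(parts.len() - 1);
    let mut upper = [0u64; 3];
    upper[..bump].copy_from_slice(&lower[..bump]);
    upper[bump] = lower[bump] + 1;

    Some((lower, upper))
}

/// True if `version` falls inside the range implied by `req`.
fn matches(req: &str, version: [u64; 3]) -> bool {
    match caret_range(req) {
        Some((lower, upper)) => lower <= version && version < upper,
        None => false,
    }
}
```

Under these assumptions both "0.1.0" and "0.1" expand to the same range, so switching between the two spellings in Cargo.toml changes nothing for dependents.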
A retry of #171

This diff changes the behavior of the ToASCII step to match the spec and to prevent failures in some cases where a domain name starts with leading dots (FULL STOPs), as requested in #166.

The code change causes a few failures in the test cases of the Conformance Testing data provided with UTS #46. But, as the header of the test data file (IdnaTest.txt) says: "If the file does not indicate an error, then the implementation must either have an error, or must have a matching result." Therefore, failing on those test cases does not break conformance with UTS #46 and is, to some extent, anticipated.

As mentioned in #166, feedback has been submitted about this inconsistency, and the test logic can be improved later if the data file addresses the comments. Until then, with this diff we can throw fewer errors and still pass the conformance tests.

To keep the side effects of ignoring errors during test runs to a minimum, I have separated the `TooShortForDns` error from `TooLongForDns`. The `Error` struct has been kept private, so the change won't affect any library users.

Fix #166
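To make the `TooShortForDns` / `TooLongForDns` split concrete, here is a minimal sketch of a DNS length check in the spirit of the UTS #46 ToASCII length verification (1 to 63 per label, 1 to 253 for the whole name, excluding a trailing root dot). Only the two variant names come from the commit message; the enum name `DnsLengthError` and the function `verify_dns_length` are assumptions for illustration, not the crate's actual code.

```rust
/// Hypothetical error type; only the variant names appear in the commit message.
#[derive(Debug, PartialEq)]
enum DnsLengthError {
    TooShortForDns,
    TooLongForDns,
}

/// Check an already-ASCII domain name against the DNS length limits.
/// Returns every violation found, so callers can filter by variant.
fn verify_dns_length(domain: &str) -> Vec<DnsLengthError> {
    let mut errors = Vec::new();

    // The root label and its dot are excluded from the length checks.
    let domain = domain.strip_suffix('.').unwrap_or(domain);

    if domain.is_empty() {
        errors.push(DnsLengthError::TooShortForDns);
        return errors;
    }
    // Input is assumed to be ASCII, so byte length equals character count.
    if domain.len() > 253 {
        errors.push(DnsLengthError::TooLongForDns);
    }

    for label in domain.split('.') {
        if label.is_empty() {
            // A leading dot or two consecutive dots produce an empty label.
            errors.push(DnsLengthError::TooShortForDns);
        } else if label.len() > 63 {
            errors.push(DnsLengthError::TooLongForDns);
        }
    }
    errors
}
```

With separate variants, a test harness could choose to ignore only the "too short" failures caused by leading dots while still treating over-long labels as errors.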
* The code was disabled to allow the tests to pass. Now that `IdnaTest.txt` is fixed for this failure, we can re-enable the code.
* As the first paragraph of The Bidi Rules section explains, the rules need to be ignored if there are no Bidi labels present in the domain name. So, add `is_bidi_domain` evaluation to `processing()`, and pass it down to `passes_bidi()` to act on.
* Add unit tests for the bidi rules, making the feature faster and easier to maintain.
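The gating described in the commit message above can be sketched as follows. This is a deliberately crude approximation: the real check depends on Unicode bidi classes (R, AL, AN), whereas the hypothetical `is_rtl_letter` below only recognizes the basic Hebrew and Arabic letter ranges, and `passes_bidi` applies only a simplified stand-in for the first rule (a label must start with a letter). The signatures are assumptions and do not mirror the crate's functions.

```rust
/// Rough stand-in for "bidi class R or AL": basic Hebrew letters
/// (U+05D0..=U+05EA) and basic Arabic letters (U+0620..=U+064A) only.
fn is_rtl_letter(c: char) -> bool {
    matches!(c, '\u{05D0}'..='\u{05EA}' | '\u{0620}'..='\u{064A}')
}

/// A domain is treated as a bidi domain if any label contains an RTL letter.
fn is_bidi_domain(domain: &str) -> bool {
    domain.chars().any(is_rtl_letter)
}

/// Apply the (simplified) bidi check to one label, but only when the
/// whole domain is a bidi domain; otherwise the rules are skipped.
fn passes_bidi(label: &str, is_bidi_domain: bool) -> bool {
    if !is_bidi_domain {
        return true;
    }
    // Simplified first rule: the label must start with a left-to-right
    // ASCII letter or one of the RTL letters recognized above.
    match label.chars().next() {
        Some(c) => c.is_ascii_alphabetic() || is_rtl_letter(c),
        None => true,
    }
}
```

Computing the flag once per domain and passing it to each per-label check keeps labels such as "0a" valid in ordinary left-to-right names while still rejecting them when the name contains right-to-left labels.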
|
Thanks for the review, @SimonSapin. I've addressed all the comments and rebased, so this should be good to land.

Review status: 1 of 5 files reviewed at latest revision, 3 unresolved discussions.

idna/src/uts46.rs, line 416 at r4 (raw file):
Previously, SimonSapin (Simon Sapin) wrote…
Done.

Cargo.toml, line 5 at r5 (raw file):
Previously, SimonSapin (Simon Sapin) wrote…
Done.

Cargo.toml, line 45 at r5 (raw file):
Previously, SimonSapin (Simon Sapin) wrote…
Done.

Comments from Reviewable |
|
Reviewed 1 of 2 files at r7, 2 of 3 files at r10, 1 of 1 files at r11.

idna/src/uts46.rs, line 346 at r10 (raw file):
Can we not make this public? That would mean that adding a new variant (like you did in another commit of this PR) would be a breaking change. It looks like unit tests don’t need it.

Comments from Reviewable |
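The concern about public enums can be illustrated with a small sketch: if the enum were public, downstream code could `match` on it exhaustively, and any new variant would break that code. Keeping the enum private behind an opaque public wrapper avoids this. The module name `idna_sketch`, the wrapper `Errors`, and `check_label` are hypothetical names chosen for illustration only.

```rust
mod idna_sketch {
    // Private: downstream crates cannot name or match on these variants,
    // so adding a variant later is not a breaking change.
    #[derive(Debug)]
    enum Error {
        TooShortForDns,
        TooLongForDns,
    }

    /// Opaque public wrapper; callers can inspect it only through methods.
    #[derive(Debug)]
    pub struct Errors(Vec<Error>);

    impl Errors {
        /// Number of problems found.
        pub fn len(&self) -> usize {
            self.0.len()
        }
    }

    /// Check a single ASCII label against the DNS label length limits.
    pub fn check_label(label: &str) -> Result<(), Errors> {
        let mut errors = Vec::new();
        if label.is_empty() {
            errors.push(Error::TooShortForDns);
        }
        if label.len() > 63 {
            errors.push(Error::TooLongForDns);
        }
        if errors.is_empty() {
            Ok(())
        } else {
            Err(Errors(errors))
        }
    }
}
```

Callers still get a `Debug`-printable error value, but they cannot depend on the exact set of variants.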
|
Review status: 3 of 5 files reviewed at latest revision, 1 unresolved discussion.

idna/src/uts46.rs, line 346 at r10 (raw file):
Previously, SimonSapin (Simon Sapin) wrote…
Right, there was no need for it anymore. Reverted it.

Comments from Reviewable |
|
Also updated the |
|
Btw, updating |
|
Looks great, thanks! @bors-servo r+
Please do. (I’m a bit surprised the new tests pass without the new data. Maybe the tests don’t cover the normative differences.)

Reviewed 1 of 4 files at r6, 1 of 2 files at r12, 1 of 1 files at r13, 1 of 1 files at r14.

Comments from Reviewable |
|
|
[idna] Update data to Unicode 10.0 and fix logic

* Change the behavior of the ToASCII step to match the spec and prevent failures in some cases where a domain name starts with leading dots (FULL STOPs), as requested in #166. (Another attempt on #337 and #171)
* Update the `IdnaTest.txt` file to UCD 10.0 and fix Validation Rules, especially Bidi Rules, for the tests to pass.
* Add TODO marks for new flags introduced in the Unicode 10.0 version of UTS#46. (http://www.unicode.org/reports/tr46/proposed.html)
* Add an integration test for the `rust-url` crate for the new behavior.

Fix #166

Reviewable: https://reviewable.io/reviews/servo/rust-url/351
|
|
behnam commented May 29, 2017 • edited by larsbergstrom

* Change the behavior of the ToASCII step to match the spec and prevent failures in some cases where a domain name starts with leading dots (FULL STOPs), as requested in #166. (Another attempt on #337 and #171)
* Update the `IdnaTest.txt` file to UCD 10.0 and fix Validation Rules, especially Bidi Rules, for the tests to pass.
* Add TODO marks for new flags introduced in the Unicode 10.0 version of UTS#46. (http://www.unicode.org/reports/tr46/proposed.html)
* Add an integration test for the `rust-url` crate for the new behavior.

Fix #166