Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

URL: run IdnaTestV2.txt in WPT #38080

Merged
merged 8 commits into from
Jan 24, 2023
Merged

URL: run IdnaTestV2.txt in WPT #38080

merged 8 commits into from
Jan 24, 2023

Conversation

annevk
Copy link
Member

@annevk annevk commented Jan 20, 2023

For whatwg/url#341.

These tests work and can run in browsers. The upstream tests have issues as indicated in IdnaTestV2-feedback.txt. If folks would be willing to take a look at this that would be appreciated. Especially if there are further oversights in the upstream tests we'd want to call out to the Unicode Consortium.

(Design feedback is also welcome of course. The code isn't great and still has some print() statements. Happy to clean it up, especially given some direction.)

cc @ricea @TimothyGu @achristensen07 @domenic @valenting @karwa

url/resources/IdnaTestV2-parser.py Outdated Show resolved Hide resolved
url/resources/IdnaTestV2-parser.py Outdated Show resolved Hide resolved
@annevk
Copy link
Member Author

annevk commented Jan 23, 2023

Analyzing the remaining failures in WebKit they are all about https://www.unicode.org/reports/tr46/#UseSTD3ASCIIRules. Not just the code points listed there, but their xn-- equivalents or simply ASCII <, >, and =. I haven't found a way to be able to filter all of them in a robust enough way given the tests so they are kept in for now.

I was able to filter IPV4-like inputs in a way that captures all of the issues with them. I suppose instead of excluding them I could also change the expectation to failure. Thoughts welcome.

@annevk annevk requested a review from valenting January 23, 2023 17:16
@annevk
Copy link
Member Author

annevk commented Jan 23, 2023

I think this is ready for inclusion now. I have removed all tests that have outstanding UTS46 feedback. WebKit has 0 failures, Gecko 52, and Chromium 712.

Unfortunately this is only about a third of the test suite, but we can improve this over time as UTS46 considers our feedback.

Copy link
Contributor

@valenting valenting left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Thank you for adding this!

@karwa
Copy link
Contributor

karwa commented Jan 23, 2023

Looks good! But why disable the Bidi tests?

@annevk
Copy link
Member Author

annevk commented Jan 24, 2023

CheckBidi has at least one UB issue (empty trailing label) and also has outstanding feedback that it might be too strict: whatwg/url#543. I have submitted the former to Unicode and plan on submitting the latter either once I've gotten a better idea of what to recommend or once I've given up on getting a better idea and decide to at the very least raise the issue.

WebKit does currently pass all the CheckBidi tests, but other browsers and jsdom/whatwg-url do not, and I think the feedback around it being too strict has merit and deserves further review before we enshrine the status quo.

Copy link
Member

@TimothyGu TimothyGu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for picking up this work again!

url/tools/IdnaTestV2-parser.py Show resolved Hide resolved
url/IdnaTestV2.window.js Outdated Show resolved Hide resolved
@annevk annevk enabled auto-merge (squash) January 24, 2023 10:08
@annevk annevk merged commit 9216115 into master Jan 24, 2023
@annevk annevk deleted the annevk/IdnaTestV2 branch January 24, 2023 10:11
}
}
return output;
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be slightly nicer for other consumers like jsdom if this preprocessing, as well as the prepending of https:// and appending of /x, was done by the Python script that output the JSON file.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did not want to do that as I want URLHost (it'll happen one day!) to be able to consume the actual test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants