Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

HTML API: Add test suite from html5lib #5794

Closed
wants to merge 70 commits into from

Conversation

sirreal
Copy link
Member

@sirreal sirreal commented Dec 19, 2023

Add a suite of tests for the HTML API taken from https://github.com/html5lib/html5lib-tests/tree/a9f44960a9fedf265093d22b2aa3c7ca123727b9

That repository is MIT licensed.

Only the tree construction tests are taken. This suite of tests is also used by the servo project to test its html5ever package.

In this PR we extract relevant test data from html5lib-tests and use it to test the HTML Processor. We transform the HTML Processor output to match the expected tree shape from html5lib-tests. We also explicitly skip some tests based on their test data names. Test data names look like tests20/line0497 where the form is:

test_file "/line" test_line_number

where test_line_number is padded to 4 digits.

Tests can be skipped by adding them to the SKIP_TEST array. Tests are easy to filter by running e.g. phpunit --filter=tests1/line0001.

TODO

  • Move test data to tests/phpunit/data
  • Fix tests or skip tests with expected failures
  • Add instructions for how to update test data from source repo

Trac ticket: https://core.trac.wordpress.org/ticket/60227


This Pull Request is for code review only. Please keep all other discussion in the Trac ticket. Do not merge this Pull Request. See GitHub Pull Requests for Code Review in the Core Handbook for more details.

Copy link

Test using WordPress Playground

The changes in this pull request can previewed and tested using a WordPress Playground instance.

WordPress Playground is an experimental project that creates a full WordPress instance entirely within the browser.

Some things to be aware of

  • The Plugin and Theme Directories cannot be accessed within Playground.
  • All changes will be lost when closing a tab with a Playground instance.
  • All changes will be lost when refreshing the page.
  • A fresh instance is created each time the link below is clicked.
  • Every time this pull request is updated, a new ZIP file containing all changes is created. If changes are not reflected in the Playground instance,
    it's possible that the most recent build failed, or has not completed. Check the list of workflow runs to be sure.

For more details about these limitations and more, check out the Limitations page in the WordPress Playground documentation.

Test this pull request with WordPress Playground.

@sirreal
Copy link
Member Author

sirreal commented Dec 22, 2023

Rebased. I started skipping tests that don't expect empty head.

@sirreal

This comment was marked as resolved.

@sirreal sirreal force-pushed the add-html5lib-tests branch 2 times, most recently from 5b434ef to 968a381 Compare January 10, 2024 16:29
@sirreal
Copy link
Member Author

sirreal commented Jan 16, 2024

  • The suite is MIT licensed. Is that OK to use?

I'm confident at this point that this license is compatible. The GNU page on compatible license mentions it (although they call it Expat).

Additionally, jQuery and React are both MIT licensed, and WordPress uses them.

@sirreal sirreal force-pushed the add-html5lib-tests branch 2 times, most recently from 43af3a4 to c23cda9 Compare January 16, 2024 12:53
@sirreal sirreal marked this pull request as ready for review January 16, 2024 14:08
@sirreal
Copy link
Member Author

sirreal commented Jan 16, 2024

This is ready for review.

@sirreal
Copy link
Member Author

sirreal commented Jan 17, 2024

naming feels a bit cumbersome. Tests_HtmlApi_WpHtmlProcessorHtml5lib::test_external_html5lib just feels long.

I've renamed the class and test method to Tests_HtmlApi_Html5lib::test_parse.

@sirreal
Copy link
Member Author

sirreal commented Jan 17, 2024

I've also skipped the remaining failing test, I clearly marked it as a bug in its skip message.

Copy link
Contributor

@dmsnell dmsnell left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not sure when I left this comment, but I apparently forgot to hit submit review and thus it never posted

tests/phpunit/tests/html-api/wpHtmlProcessorHtml5lib.php Outdated Show resolved Hide resolved
@sirreal

This comment was marked as outdated.

@sirreal
Copy link
Member Author

sirreal commented Jan 31, 2024

@dmsnell and I looked at a test case where we fail to reach a text node at the end of the document in these tests. The text node seems to be reached, but its token type is null. I think if we get that fixed most of the test failures will be fixed 🎉

@costdev
Copy link
Contributor

costdev commented Apr 15, 2024

A couple of the test data files are being detected as binary by Git. The html5lib repo uses this .gitattributes file to correct it.

To ensure the files are always human-readable whether locally or in PRs, can we add the following file to this PR?

tests/phpunit/data/html5lib-tests/.gitattributes

*.dat		-text diff

@dmsnell
Copy link
Contributor

dmsnell commented Apr 16, 2024

Thanks @costdev - good catch!

@dmsnell
Copy link
Contributor

dmsnell commented Apr 16, 2024

Merged in [58010]
3b9aefb

@dmsnell dmsnell closed this Apr 16, 2024
@dmsnell dmsnell deleted the add-html5lib-tests branch April 16, 2024 12:10
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
5 participants