Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug in assert_node_equals hides test failures #202

Open
evilpie opened this issue Nov 17, 2023 · 11 comments
Open

Bug in assert_node_equals hides test failures #202

evilpie opened this issue Nov 17, 2023 · 11 comments
Assignees
Milestone

Comments

@evilpie
Copy link

evilpie commented Nov 17, 2023

While updating the tests for the new WebIDL syntax I realized that assert_node_equals doesn't correctly work for <template>, because isNodeEqual doesn't look at the DocumentFragement in content at all.

After running the setHTML tests with a <div> instead, I've noticed some test failures with tests like this:

var html = "<div>test<div>p</div>tt<p>div</p></div>";
a.setHTML(html, {sanitizer: { elements: ["p"] }});
b.innerHTML = "testptt<p>div</p>";
a.isNodeEqual(b)

This is caused by "flattening" turning the first into three text nodes like [test][p][tt] instead of one long text node [testptt]. I am not sure if there is way to keep the current way we write tests combined with isNodeEqual. A simple a.innerHTML == b.innerHTML wouldn't have this issue.

Aside: Should the Sanitizer algorithm maybe concatenate those nodes?
/cc @otherdaniel

@otherdaniel
Copy link
Collaborator

While updating the tests for the new WebIDL syntax I realized that assert_node_equals doesn't correctly work for <template>, because isNodeEqual doesn't look at the DocumentFragement in content at all.

After running the setHTML tests with a <div> instead, I've noticed some test failures with tests like this:

var html = "<div>test<div>p</div>tt<p>div</p></div>";
a.setHTML(html, {sanitizer: { elements: ["p"] }});
b.innerHTML = "testptt<p>div</p>";
a.isNodeEqual(b)

This is caused by "flattening" turning the first into three text nodes like [test][p][tt] instead of one long text node [testptt]. I am not sure if there is way to keep the current way we write tests combined with isNodeEqual.

I had solved this problem at some point with a more complex version of assert_node_equals, that calls Node.normalize() on both nodes and recurses ( assert_node_equals(a.content, b.content)) if it encounters a DocumentFragment.

A simple a.innerHTML == b.innerHTML wouldn't have this issue.

Hmm. Maybe that's the simpler way. But I believe .innerHTML and HTML fragment serialization in general doesn't provide some of the necessary guarantees, e.g. order of attributes. As I read the spec, it guarantees a consistent order for multiple invocations on the same node, but not for different nodes. (No idea how that looks in practice, though.)

Aside: Should the Sanitizer algorithm maybe concatenate those nodes?

My "reading" of the group is that there's a strong preference to build things on top of existing DOM algorithms. I suspect some sort of append-and-join-text-nodes wouldn't be well received. Maybe one could spec it to always .normalize() nodes after modifying them. In either case... I'll be happy to go with the group on this; but off-hand I'm skeptical.

@mozfreddyb
Copy link
Collaborator

I'm not sure we if we require neighboring text nodes to be merged. I agree with @otherdaniel, that it doesn't sound so great about using innerHTML to compare a serialization.

Looking at our HTML experts, e.g., @annevk and @zcorpan to see if they have had this issue before.

@annevk
Copy link
Collaborator

annevk commented Nov 20, 2023

There's not many APIs that manipulate the node tree directly in this way. I suggest you look at what the various methods on ranges do. They might normalize, not sure.

@securityMB
Copy link

Would it be viable to just call normalize on nodes before the comparison?

@mozfreddyb
Copy link
Collaborator

Mh, would it make sense to normalize as part of the Sanitizer API?

@evilpie
Copy link
Author

evilpie commented Nov 20, 2023

We apparently have three assert_node_equals implementations, and one uses normalize already, like Daniel said. I am definitely going to look into unifying those after I get the tests working.

@mozfreddyb
Copy link
Collaborator

Can you take a stab at fixing the (tentative) tests and closing this one out, @evilpie?

@mozfreddyb mozfreddyb added the v1 label Jan 23, 2024
@zcorpan
Copy link

zcorpan commented Jan 23, 2024

We should switch to html5lib's treebuilder tests .dat format for these tests, after I've moved it to wpt.
web-platform-tests/wpt#27868

@evilpie
Copy link
Author

evilpie commented Feb 5, 2024

I have updated the tests, I can try to figure out how to this via WPT directly or we sync them via Mozilla. @otherdaniel does this look good to you: https://phabricator.services.mozilla.com/D200627

@annevk
Copy link
Collaborator

annevk commented Feb 5, 2024

@evilpie this does not allow us to test the exact structure, which does not seem ideal. We'd then have to test separately that normalization does not happen. I think Simon's idea was more promising.

@mozfreddyb
Copy link
Collaborator

There's likely value in solving this with a function that is upstreamed to wpt in order to make use of this here and in Trusted Types.

@mozfreddyb mozfreddyb added this to the v1 milestone Apr 17, 2024
@mozfreddyb mozfreddyb removed the v1 label Apr 17, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

6 participants