Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[WIP] 15270 Parameterize DOM Node Traversal Methods #15503

Closed
wants to merge 1 commit into from

Conversation

@DominoTree
Copy link
Contributor

DominoTree commented Feb 11, 2017

Think this is getting pretty close now.

Made some ugly edits to try and ensure exact functionality is preserved with iterators. Once I identify the issue I'm seeing, I'll clean them up.

I've also temporarily made all of the places calling iterators with the new type parameters only use Node or Element - this way they should have identical functionality as they did before, which should make debugging the iterators themselves easier.

Once the iterators are done, we can clean up where they're being called from as well.


  • ./mach build -d does not report any errors
  • ./mach test-tidy does not report any errors
  • These changes fix #15270 (github issue number if applicable).
  • There are tests for these changes OR
  • These changes do not require tests because _____

This change is Reviewable

@highfive
Copy link

highfive commented Feb 11, 2017

Heads up! This PR modifies the following files:

  • @fitzgen: components/script/dom/document.rs, components/script/dom/nodeiterator.rs, components/script/dom/servoparser/mod.rs, components/script/dom/htmlcollection.rs, components/script/dom/node.rs, components/script/dom/documentfragment.rs, components/script/dom/htmlfieldsetelement.rs, components/script/dom/text.rs, components/script/dom/characterdata.rs, components/script/dom/servoparser/html.rs, components/script/dom/range.rs, components/script/dom/htmloptgroupelement.rs, components/script/dom/htmlselectelement.rs, components/script/dom/htmltableelement.rs, components/script/dom/htmlimageelement.rs, components/script/dom/element.rs, components/script/dom/htmltablecellelement.rs, components/script/dom/htmloptionelement.rs, components/script/lib.rs, components/script/dom/htmlbodyelement.rs, components/script/devtools.rs
  • @KiChjang: components/script/dom/document.rs, components/script/dom/nodeiterator.rs, components/script/dom/servoparser/mod.rs, components/script/dom/htmlcollection.rs, components/script/dom/node.rs, components/script/dom/documentfragment.rs, components/script/dom/htmlfieldsetelement.rs, components/script/dom/text.rs, components/script/dom/characterdata.rs, components/script/dom/servoparser/html.rs, components/script/dom/range.rs, components/script/dom/htmloptgroupelement.rs, components/script/dom/htmlselectelement.rs, components/script/dom/htmltableelement.rs, components/script/dom/htmlimageelement.rs, components/script/dom/element.rs, components/script/dom/htmltablecellelement.rs, components/script/dom/htmloptionelement.rs, components/script/lib.rs, components/script/dom/htmlbodyelement.rs, components/script/devtools.rs
@highfive
Copy link

highfive commented Feb 11, 2017

warning Warning warning

  • These commits modify script code, but no tests are modified. Please consider adding a test!
@KiChjang
Copy link
Member

KiChjang commented Feb 16, 2017

r? @jdm

@highfive highfive assigned jdm and unassigned KiChjang Feb 16, 2017
@DominoTree
Copy link
Contributor Author

DominoTree commented Feb 16, 2017

Status -

At least one of my iterators is currently broken and I'm trying to figure out which one(s) are incorrect. I've rewritten them several times but I'm not currently sure which one to focus on - I think I may have been staring at this too long and need a new set of eyes.

Once the iterators are working, I'd like to add while() loops to all of them, as in the first two examples from #14451 so the parameter can be used for filtering if necessary.

Finally, I'll take advantage of that where the iterators are currently being called from.

@nox
Copy link
Member

nox commented Feb 27, 2017

Did you manage to progress on this, @DominoTree?

@jdm
Copy link
Member

jdm commented Feb 27, 2017

I meant to take a look at this last week but got swamped by other work. If I don't get a chance to do so today or tomorrow, I'll hand it off to someone else.

@jdm
Copy link
Member

jdm commented Feb 27, 2017

Actually @nox graciously offered to take this off my plate.
r? @nox

@highfive highfive assigned nox and unassigned jdm Feb 27, 2017
@DominoTree
Copy link
Contributor Author

DominoTree commented Feb 27, 2017

Haven't had a chance to dig into this much further yet, but any guidance would be appreciated. I should have some time this week.

The first commit was my first attempt, and the second was mostly cleaning things up to isolate issues a bit more. I think there were a few other minor changes I made to my local copy to fix some bugs that I haven't pushed up, but I'll have to take a look at it and see where things ended up.

@nox
Copy link
Member

nox commented Mar 1, 2017

@bors-servo
Copy link
Contributor

bors-servo commented Mar 1, 2017

Trying commit da60bb3 with merge 810a139...

bors-servo added a commit that referenced this pull request Mar 1, 2017
[WIP] 15270 Parameterize DOM Node Traversal Methods

<!-- Please describe your changes on the following line: -->

Think this is getting pretty close now.

Made some ugly edits to try and ensure exact functionality is preserved with iterators.  Once I identify the issue I'm seeing, I'll clean them up.

I've also temporarily made all of the places calling iterators with the new type parameters only use `Node` or `Element` - this way they should have identical functionality as they did before, which should make debugging the iterators themselves easier.

Once the iterators are done, we can clean up where they're being called from as well.

---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `__` with appropriate data: -->
- [x] `./mach build -d` does not report any errors
- [x] `./mach test-tidy` does not report any errors
- [ ] These changes fix #15270 (github issue number if applicable).

<!-- Either: -->
- [ ] There are tests for these changes OR
- [ ] These changes do not require tests because _____

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/15503)
<!-- Reviewable:end -->
@nox
Copy link
Member

nox commented Mar 1, 2017

@bors-servo try-

Never mind, didn't notice the second commit. Will experiment locally.

@DominoTree
Copy link
Contributor Author

DominoTree commented Mar 1, 2017

I believe all of the iterators should have while let loops instead of simple if let to allow them to act as filters. I tried it locally and it didn't cause any further issues, but the page rendering was still broken.

@nox
Copy link
Member

nox commented Mar 1, 2017

@DominoTree What loop do you mean?

@DominoTree
Copy link
Contributor Author

DominoTree commented Mar 1, 2017

Sorry, in the first commit, if you look at the impls for SiblingIterator ReverseSiblingIterator and FollowingIterator, they use while loops to continue over subsequent elements if they fail to cast the first element they encounter. I believe all of the iterators should probably behave this way.

@nox
Copy link
Member

nox commented Mar 1, 2017

@DominoTree Please remove the second commit, I don't want the simplifications that undo the improvements from the first in the tree, and I don't want to review it either. I think I found the issue with your refactoring and will soon submit the review.

@DominoTree
Copy link
Contributor Author

DominoTree commented Mar 1, 2017

@nox I will get that reverted shortly.

Copy link
Member

nox left a comment

A few logic errors in the iterators and many nits.

@@ -225,12 +225,12 @@ impl CharacterDataMethods for CharacterData {

// https://dom.spec.whatwg.org/#dom-nondocumenttypechildnode-previouselementsibling
fn GetPreviousElementSibling(&self) -> Option<Root<Element>> {
self.upcast::<Node>().preceding_siblings().filter_map(Root::downcast).next()
self.upcast::<Node>().preceding_siblings::<Element>().next()

This comment has been minimized.

Copy link
@nox

nox Mar 1, 2017

Member

No need for a type annotation here.

}

// https://dom.spec.whatwg.org/#dom-nondocumenttypechildnode-nextelementsibling
fn GetNextElementSibling(&self) -> Option<Root<Element>> {
self.upcast::<Node>().following_siblings().filter_map(Root::downcast).next()
self.upcast::<Node>().following_siblings::<Element>().next()

This comment has been minimized.

Copy link
@nox

nox Mar 1, 2017

Member

No need for a type annotation here.

@@ -482,7 +482,7 @@ impl Document {
/// https://github.com/w3c/web-platform-tests/issues/2122
pub fn refresh_base_element(&self) {
let base = self.upcast::<Node>()
.traverse_preorder()
.traverse_preorder::<Node>()

This comment has been minimized.

Copy link
@nox

nox Mar 1, 2017

Member
self.upcast::<Node>().traverse_preorder::<HTMLBaseElement>()
@@ -679,7 +679,7 @@ impl Document {
.map_or(false, |attr| &**attr.value() == name)
};
let doc_node = self.upcast::<Node>();
doc_node.traverse_preorder()
doc_node.traverse_preorder::<Node>()

This comment has been minimized.

Copy link
@nox

nox Mar 1, 2017

Member

I think you can remove both the type annotation and the .filter_map(Root::downcast) part. If it doesn't compile, you can still do:

doc_node.traverse_preorder::<HTMLAnchorElement>()
@@ -1809,8 +1809,7 @@ impl Document {
/// Iterate over all iframes in the document.
pub fn iter_iframes(&self) -> impl Iterator<Item=Root<HTMLIFrameElement>> {
self.upcast::<Node>()
.traverse_preorder()
.filter_map(Root::downcast::<HTMLIFrameElement>)
.traverse_preorder::<HTMLIFrameElement>()

This comment has been minimized.

Copy link
@nox

nox Mar 1, 2017

Member

I'm not sure this type annotation is needed.

self.current = Some(next_sibling);
return Some(current);
self.current = Some(next_sibling.clone());
return Root::downcast::<T>(current);

This comment has been minimized.

Copy link
@nox

nox Mar 1, 2017

Member

Same remark about None, the iterator should never return None until it's done with the whole tree.

}
self.depth -= 1;
}
debug_assert!(self.depth == 0);
self.current = None;
Some(current)
//TODO: dominotree - not sure how to handle failures here - is returning None okay if it can't cast to T?

This comment has been minimized.

Copy link
@nox

nox Mar 1, 2017

Member

Same remark about downcasting and unwrap.

self.current = Some(first_child.clone());
if let Some(first_child) = Root::downcast::<T>(first_child) {
return Some(first_child);
}

This comment has been minimized.

Copy link
@nox

nox Mar 1, 2017

Member

That does not seem correct to me.

if self.child_elements().any(|c| c.upcast::<Node>() != child) {
return Err(Error::HierarchyRequest);
if self.children::<Element>().any(|c| c.upcast::<Node>() != child) {
return Err(Error::HierarchyRequest);

This comment has been minimized.

Copy link
@nox

nox Mar 1, 2017

Member

Nit: indentation changed.

@@ -19,6 +19,8 @@

#![deny(unsafe_code)]
#![allow(non_snake_case)]
// TODO: dominotree remove this
#![allow(unrooted_must_root)]

This comment has been minimized.

Copy link
@nox

nox Mar 1, 2017

Member

No.

@DominoTree
Copy link
Contributor Author

DominoTree commented Mar 1, 2017

@nox I hadn't reverted that commit yet and I just sat down to do it, should I go ahead and do it now? Or should I add your changes here that are applicable and then rebase?

@nox
Copy link
Member

nox commented Mar 1, 2017

Yes, but I want it removed, not reverted as in what git revert does.

@DominoTree
Copy link
Contributor Author

DominoTree commented Mar 1, 2017

Okay, the commit is removed.

@nox
Copy link
Member

nox commented Mar 1, 2017

Ok! Now all my comments appear on the Files tab, address them and I'll make another review pass. Thanks for your contribution and keeping up with it!

Copy link
Member

nox left a comment

The iterators are still broken and an additional nit.

@@ -91,7 +91,7 @@ pub trait LayoutNode: Debug + GetLayoutData + TNode {
current: self.last_child(),
})
}

This comment has been minimized.

Copy link
@nox

nox Mar 2, 2017

Member

Nit: broken indentation.

@@ -221,7 +221,7 @@ impl HTMLCollection {
pub fn elements_iter_after<'a>(&'a self, after: &'a Node) -> impl Iterator<Item=Root<Element>> + 'a {
// Iterate forwards from a node.
HTMLCollectionElementsIter {
node_iter: after.following_nodes(&self.root),
node_iter: after.following_nodes::<Node>(&self.root),

This comment has been minimized.

Copy link
@nox

nox Mar 2, 2017

Member

This type annotation is useless, given node_iter must be an iterator that yields Root<Node> values.

@@ -235,7 +235,7 @@ impl HTMLCollection {
pub fn elements_iter_before<'a>(&'a self, before: &'a Node) -> impl Iterator<Item=Root<Element>> + 'a {
// Iterate backwards from a node.
HTMLCollectionElementsIter {
node_iter: before.preceding_nodes(&self.root),
node_iter: before.preceding_nodes::<Node>(&self.root),

This comment has been minimized.

Copy link
@nox

nox Mar 2, 2017

Member

This type annotation is useless, given node_iter must be an iterator that yields Root<Node> values.

@@ -138,7 +138,7 @@ impl HTMLTableElement {
fn get_rows(&self) -> TableRowFilter {
TableRowFilter {
sections: self.upcast::<Node>()
.children()
.children::<Node>()

This comment has been minimized.

Copy link
@nox

nox Mar 2, 2017

Member

You forgot to simplify this one.

if let Some(first_child) = Root::downcast::<T>(first_child) {
self.current = Some(Root::upcast::<Node>(first_child.clone()));
return Some(first_child);
}

This comment has been minimized.

Copy link
@nox

nox Mar 2, 2017

Member

I don't understand what you are trying to do by repeating the loop, and current is still never yielded again.

@DominoTree
Copy link
Contributor Author

DominoTree commented Mar 2, 2017

@nox As far as the iterators go, I was working based on the proof-of-concept linked to in #15270 and the POC iterators seemed to work properly in some brief testing.

https://github.com/frewsxcv/servo/blob/04a3efd4f2e2d843c72b55fc6767b15c7bd3e26a/components/script/dom/node.rs#L1136

Would a better approach be to simply attempt to upcast/downcast the values which would be returned from the existing iterators as they exist in the master branch? If so, how do we handle cases where we encounter elements which cannot be cast to the desired type? Do we loop until we find an element which can be successfully cast to that type? Do we act as if there are no more elements to iterate over?

This is really the part where I'm having trouble figuring out what the correct approach is.

@DominoTree
Copy link
Contributor Author

DominoTree commented Mar 24, 2017

@nox updated those methods, and it's still exhibiting the same behavior. I'll try to dig into it some more today - let me know if anything jumps out at you.

Copy link
Member

nox left a comment

I think I found the issue in your current code.

pub fn inclusively_following_siblings<T>(&self) -> SiblingIterator<T>
where T: DerivedFrom<Node>
{
SiblingIterator::<T> {

This comment has been minimized.

Copy link
@nox

nox Apr 4, 2017

Member

Nit: the type annotation isn't required here.

where T: DerivedFrom<Node>
{
SiblingIterator::<T> {
current: Root::downcast::<T>(Root::from_ref(self)),

This comment has been minimized.

Copy link
@nox

nox Apr 4, 2017

Member

This is incorrect, the node following self may be a T, but if self is not a T, current will be set to None.

You probably want:

SiblingIterator {
    current: Root::downcast(Root::from_ref(self)).or_else(|| get_next_sibling(self)),
    phantom: PhantomData,
}
where T: DerivedFrom<Node>
{
ReverseSiblingIterator::<T> {
current: Root::downcast::<T>(Root::from_ref(self)),

This comment has been minimized.

Copy link
@nox

nox Apr 4, 2017

Member

Same remark here:

Root::downcast(Root::from_ref(self)).or_else(|| get_previous_sibling(self))
@DominoTree
Copy link
Contributor Author

DominoTree commented Apr 5, 2017

@nox Thanks, that makes sense - I made those changes and was able to debug a bit more.

On servo.org, when it's inserting the element for <h1 id='page-title' class='page-header'>, the call to self.GetDocumentElement() at document.rs:616 returns None

The code in that method is simply self.upcast::<Node>().children().next() - so it would make sense that it would return None as that h1 element has no children, but I'm confused as to how this is supposed to work. If we're getting the document element, shouldn't it be traversing the DOM upwards?

DEBUG:html5ever::tree_builder: processing TagToken(Tag { kind: StartTag, name: Atom(\'h1\' type=static), self_closing: false, attrs: [Attribute { name: QualName { ns: Atom(\'\' type=static), local: Atom(\'id\' type=static) }, value: Tendril<UTF8>(owned: \"page-title\") }, Attribute { name: Qual
Name { ns: Atom(\'\' type=static), local: Atom(\'class\' type=static) }, value: Tendril<UTF8>(owned: \"page-header\") }] }) in insertion mode InBody
DEBUG:script::dom::document: Adding named element to document 0x7f14daaa9600: 0x7f14dab00480 id=page-title
WARN:servo: Panic hook called.
DEBUG:constellation::constellation: Sending log entry Warn("Panic hook called.").
The element is in the document, so there must be a document element. (thread ScriptThread PipelineId { namespace_id: PipelineNamespaceId(0), index: PipelineIndex(1) }, at /checkout/src/libcore/option.rs:785)
ERROR:servo: The element is in the document, so there must be a document element.
DEBUG:constellation::constellation: Received log entry Warn("Panic hook called.").
@jdm
Copy link
Member

jdm commented Apr 5, 2017

The document element should be the first child of the Document. self is the document, not the h1 element.

@DominoTree
Copy link
Contributor Author

DominoTree commented Apr 5, 2017

I suppose I could've scrolled up until I got to impl DocumentMethods for Document { 😃

@bors-servo
Copy link
Contributor

bors-servo commented Apr 5, 2017

The latest upstream changes (presumably #15904) made this pull request unmergeable. Please resolve the merge conflicts.

where T: DerivedFrom<Node> {
let mut current = Root::from_ref(start);
//Should while loop here should probably be an if block
//or should we go through all the parents?

This comment has been minimized.

Copy link
@nox

nox Apr 16, 2017

Member

Remove.

where T: DerivedFrom<Node>
{
SiblingIterator {
current: Root::downcast(Root::from_ref(self)).or_else(||get_next_sibling(self)),

This comment has been minimized.

Copy link
@nox

nox Apr 16, 2017

Member

Nit: space after ||.

ReverseSiblingIterator {
current: Some(Root::from_ref(self)),
current: Root::downcast(Root::from_ref(self)).or_else(||get_previous_sibling(self)),

This comment has been minimized.

Copy link
@nox

nox Apr 16, 2017

Member

Nit: space after ||.

current: Some(Root::from_ref(self)),
pub fn following_nodes<T>(&self, root: &Node) -> FollowingIterator<T> where T: DerivedFrom<Node> {
FollowingIterator::<T> {
current: Root::downcast::<T>(Root::from_ref(self)),

This comment has been minimized.

Copy link
@nox

nox Apr 16, 2017

Member

This isn't correct.

current: Some(Root::from_ref(self)),
pub fn preceding_nodes<T>(&self, root: &Node) -> PrecedingIterator<T> where T: DerivedFrom<Node> {
PrecedingIterator::<T> {
current: Root::downcast::<T>(Root::from_ref(self)),

This comment has been minimized.

Copy link
@nox

nox Apr 16, 2017

Member

This isn't correct.

@@ -729,15 +743,17 @@ impl Node {
Ok(NodeList::new_simple_list(&window, iter))
}

pub fn ancestors(&self) -> AncestorIterator {
pub fn ancestors(&self) -> AncestorIterator<Node> {

This comment has been minimized.

Copy link
@nox

nox Apr 16, 2017

Member

Why isn't this one parameterised?

}
}

pub fn inclusive_ancestors(&self) -> AncestorIterator {
pub fn inclusive_ancestors(&self) -> AncestorIterator<Node> {

This comment has been minimized.

Copy link
@nox

nox Apr 16, 2017

Member

Why isn't this one parameterised?

ReverseSiblingIterator {
current: self.GetLastChild(),
current: Root::downcast(Root::from_ref(self)).or_else(||get_last_child(self)),

This comment has been minimized.

Copy link
@nox

nox Apr 16, 2017

Member

Nit: space after ||.

fn get_first_child<T>(start: &Node) -> Option<Root<T>>
where T: DerivedFrom<Node> {
let mut current = Root::from_ref(start);
//The while loop here should probably just be an if block

This comment has been minimized.

Copy link
@nox

nox Apr 16, 2017

Member

Remove.

fn get_last_child<T>(start: &Node) -> Option<Root<T>>
where T: DerivedFrom<Node> {
let mut current = Root::from_ref(start);
//The while loop here should probably just be an if block

This comment has been minimized.

Copy link
@nox

nox Apr 16, 2017

Member

Remove.

@bors-servo
Copy link
Contributor

bors-servo commented Apr 19, 2017

The latest upstream changes (presumably #16521) made this pull request unmergeable. Please resolve the merge conflicts.

@KiChjang
Copy link
Member

KiChjang commented Jun 9, 2017

Is this PR wanted anymore?

@DominoTree
Copy link
Contributor Author

DominoTree commented Jun 9, 2017

@KiChjang Not sure what to do at this point - when I opened this PR, there were some DOM parsing issues that had been introduced by my changes. After fixing nits and trying a few different approaches (as well as several rounds of updates to make things merge cleanly), the same DOM parsing issues still exist, and I have been unable to figure out how to resolve them.

I'd be glad to continue putting effort into this PR, but until the fundamental DOM issues can be resolved, there doesn't seem to be much value here, and I'm really not sure how to move forward.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

7 participants
You can’t perform that action at this time.