Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Convert node serialization to a purely iterative algorithm. #17520

Merged
merged 1 commit into from Jul 17, 2017

Conversation

@camlorn
Copy link
Contributor

camlorn commented Jun 25, 2017

We maintain a stack of open element nodes with their children count, popping from the top of the stack and closing when the count reaches zero.

Contrary to my comment in #16696, this is a purely iterative algorithm. I just wasn't feeling sufficiently clever with respect to finding a relatively clean way until later.


  • ./mach build -d does not report any errors
  • ./mach test-tidy does not report any errors
  • These changes fix #16696 (github issue number if applicable).
  • There are tests for these changes.

This change is Reviewable

@highfive
Copy link

highfive commented Jun 25, 2017

Heads up! This PR modifies the following files:

  • @fitzgen: components/script/dom/servoparser/html.rs
  • @KiChjang: components/script/dom/servoparser/html.rs
@camlorn
Copy link
Contributor Author

camlorn commented Jun 26, 2017

Is the Travis failure spurious? I'm not sure what this is about a manifest change.

@camlorn
Copy link
Contributor Author

camlorn commented Jun 26, 2017

Never mind, I didn't realize the manifests contained hashes that need to be updated after changing tests.

@camlorn camlorn force-pushed the camlorn:master branch from 18b4f74 to 8f92da1 Jun 26, 2017
@camlorn
Copy link
Contributor Author

camlorn commented Jul 3, 2017

Just bumping this; can we get it merged?

@bors-servo
Copy link
Contributor

bors-servo commented Jul 4, 2017

The latest upstream changes (presumably #17599) made this pull request unmergeable. Please resolve the merge conflicts.

@asajeffrey
Copy link
Member

asajeffrey commented Jul 4, 2017

Sorry about the delay, the Mozilla all-hands was last week, we've been a bit swamped!

Would it be possible to simplify this, by using an iterator whose Item is something like:

enum BeginOrEndNode {
   Begin(Root<Node>),
   End(Root<Node>),
}

this would avoid having to use the stack in serialize, since the stack would be kept in the iterator. The iterator can store a stack of BeginOrEndNode, initialized as vec![Begin(root)]. When next is called, we pop from the stack: if it's an End(node), we just return it, but if it's a Begin(node), we push End(node), then Begin(child) for each child.

Copy link
Member

asajeffrey left a comment

Could possibly be simplified a bit.

node.children()
};
fn start_element<S: Serializer>(node: &Node, serializer: &mut S) -> io::Result<()> {
let elem = node.downcast::<Element>().expect("Node should be an Element");

This comment has been minimized.

Copy link
@asajeffrey

asajeffrey Jul 4, 2017

Member

Take an &Element parameter rather than a &Node?

// As we encounter elements, we push them to this stack with a count of their children.
// Every time through the following loop, we decrement the child count of the topmost element.
// When it gets to 0, it's time to close the node.
let mut open_stack: Vec<(Root<Node>, u32)> = vec![];

This comment has been minimized.

Copy link
@asajeffrey

asajeffrey Jul 4, 2017

Member

It would be nice if we could roll this stack into the iterator, rather than have it explicit.

@camlorn
Copy link
Contributor Author

camlorn commented Jul 4, 2017

@asajeffrey
I'll have to think about the stack. I'm not entirely sure how this makes things better. It seems like the code is going to be just as complicated or worse with your alternative.

Is the intent to increase reuse? I can refactor for that in mind if that's the goal.

@asajeffrey
Copy link
Member

asajeffrey commented Jul 4, 2017

I suspect the code will be more readable with a clean separation between iterating over the nodes and serializing them. I don't think there's much scope for reuse here.

@camlorn
Copy link
Contributor Author

camlorn commented Jul 4, 2017

@asajeffrey
I'm going to try to do it tomorrow and keep the commit separate; I'm either being insufficiently clever at the moment or we might end up wanting to revert my refactor. I think it'll make the serialization function conceptually simpler, but it's going to need nested matches and I think it's going to make the code more complicated overall.

This is one of those places where coroutines would be a godsend.

@camlorn camlorn force-pushed the camlorn:master branch from 8f92da1 to d3e6c38 Jul 6, 2017
@camlorn camlorn force-pushed the camlorn:master branch from d3e6c38 to be18210 Jul 6, 2017
@camlorn
Copy link
Contributor Author

camlorn commented Jul 6, 2017

That's the nicest I can make it. I need to rebase these commits together, if we like it.

@camlorn
Copy link
Contributor Author

camlorn commented Jul 7, 2017

This Travis failure looks spurious to me, or at least unrelated to this PR. Can someone rerun it?

@jdm
Copy link
Member

jdm commented Jul 7, 2017

That's #17594. Most PRs are showing it.

@asajeffrey
Copy link
Member

asajeffrey commented Jul 7, 2017

This is looking a lot nicer! I think it can be simplified even further by something like:

struct SerializationIterator {
    stack: Vec<SerializationCommand>,
}

then the iterator is something like (ignoring templates for the moment):

impl Iterator for SerializationIterator {
    type Item = SerializationCommand;
    fn next(&mut self) -> Option<SerializationCommand> {
        let result = self.stack.pop();
        if let Some(SerializationCommand::OpenElement(element)) {
            self.stack.push(SerializationCommand::CloseElement(element));
            for child in element.children().iter().rev() {
                match child.downcast::<Element>() {
                    Some(child) => self.stack.push(SerializationCommand::OpenElement(child)),
                    None => self.stack.push(SerializationCommand::SerializeNonelement(child)),
                }
            }
        }
        result
    }
}
@camlorn
Copy link
Contributor Author

camlorn commented Jul 8, 2017

Before I try to rewrite this again, what counts as satisfactory? I feel like we're chasing perfection. It's complicated no matter how you slice it.

I'm not trying to be disagreeable and if you say rewrite it I'll go rewrite it, but this feels like the sort of thing where we're both making equally valid judgement calls and where a third party would find both our judgement calls to be equally difficult to grasp without explanation.

@camlorn camlorn force-pushed the camlorn:master branch from be18210 to ac3d170 Jul 9, 2017
@KiChjang
Copy link
Member

KiChjang commented Jul 10, 2017

I'm really not ok with storing a u32 since it means nothing to me while I'm reading the code.

@camlorn
Copy link
Contributor Author

camlorn commented Jul 10, 2017

@asajeffrey
Since we have multiple people now, I'll rewrite it.

I don't like the i32 approach either, but I have an easier time convincing myself that nodes are getting visited in the right order with mine as compared to yours. It's not that I have a preference between them, so much as it is that I've already spent rather a lot of time on this and the solution works.

The only non-aesthetic argument here is if performance matters, in which case we should profile it.

@camlorn camlorn force-pushed the camlorn:master branch from b5c6756 to f267f04 Jul 11, 2017
@camlorn
Copy link
Contributor Author

camlorn commented Jul 11, 2017

This might still be able to be made cleaner. I'm going to give it another pass in the morning.

@asajeffrey
Copy link
Member

asajeffrey commented Jul 11, 2017

Thanks!

@camlorn
Copy link
Contributor Author

camlorn commented Jul 14, 2017

I just pushed the cleanup work. If this is good, I'll rebase them together. I don't think it can be made cleaner.

Sorry about the turnaround time. Lots of things going on, etc.

@asajeffrey
Copy link
Member

asajeffrey commented Jul 17, 2017

Yay, this looks really nice now, thanks for all your work!

./mach test-tidy is failing, apart from that this looks good to merge!

@asajeffrey
Copy link
Member

asajeffrey commented Jul 17, 2017

Oh yes, and the commits need squashed too.

We maintain a stack of open element nodes and non-node elements and use it to determine when to close them.
@camlorn camlorn force-pushed the camlorn:master branch from fb27db1 to 033b319 Jul 17, 2017
@camlorn
Copy link
Contributor Author

camlorn commented Jul 17, 2017

Think that does it.

@asajeffrey
Copy link
Member

asajeffrey commented Jul 17, 2017

LGTM! @bors-servo r+

@bors-servo
Copy link
Contributor

bors-servo commented Jul 17, 2017

📌 Commit 033b319 has been approved by asajeffrey

@bors-servo
Copy link
Contributor

bors-servo commented Jul 17, 2017

Testing commit 033b319 with merge 66b9544...

bors-servo added a commit that referenced this pull request Jul 17, 2017
Convert node serialization to a purely iterative algorithm.

We maintain a stack of open element nodes with their children count, popping from the top of the stack and closing when the count reaches zero.

Contrary to my comment in #16696, this is a purely iterative algorithm.  I just wasn't feeling sufficiently clever with respect to finding a relatively clean way until later.

<!-- Please describe your changes on the following line: -->

---
<!-- Thank you for contributing to Servo! Please replace each `[ ]` by `[X]` when the step is complete, and replace `__` with appropriate data: -->
- [x] `./mach build -d` does not report any errors
- [x] `./mach test-tidy` does not report any errors
- [x] These changes fix #16696 (github issue number if applicable).

<!-- Either: -->
- [x] There are tests for these changes.

<!-- Also, please make sure that "Allow edits from maintainers" checkbox is checked, so that we can help you if you get stuck somewhere along the way.-->

<!-- Pull requests that do not address these steps are welcome, but they will require additional verification as part of the review process. -->

<!-- Reviewable:start -->
---
This change is [<img src="https://reviewable.io/review_button.svg" height="34" align="absmiddle" alt="Reviewable"/>](https://reviewable.io/reviews/servo/servo/17520)
<!-- Reviewable:end -->
@bors-servo
Copy link
Contributor

bors-servo commented Jul 17, 2017

💔 Test failed - linux-rel-wpt

@jdm
Copy link
Member

jdm commented Jul 17, 2017

@bors-servo
Copy link
Contributor

bors-servo commented Jul 17, 2017

@bors-servo
Copy link
Contributor

bors-servo commented Jul 17, 2017

@bors-servo bors-servo merged commit 033b319 into servo:master Jul 17, 2017
3 checks passed
3 checks passed
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
homu Test successful
Details
@asajeffrey
Copy link
Member

asajeffrey commented Jul 17, 2017

Yay!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

6 participants
You can’t perform that action at this time.