Should we avoid using a Vec to store a Node's children? #26

iwburns · 2016-12-27T06:57:55Z

When I first wrote this library, I got a large amount of inspiration from SimonSapin's idtree (which can be found here). However, (for several reasons that I can't remember right now) I decided to go with a Node like this:

struct Node<T> {
    data: T,
    parent: Option<NodeId>,
    children: Vec<NodeId>,
}

instead of something more like this: (which is essentially what is used in idtree)

struct Node<T> {
    data: T,
    parent: Option<NodeId>,
    first_child: Option<NodeId>,
    last_child: Option<NodeId>,
    prev_sibling: Option<NodeId>,
    next_sibling: Option<NodeId>,
}

Now I'm wondering if it would be easier to implement some features (Iterators is one example, but there could be other features that fall into this category as well) if I had gone this route.

Also, I want to do at least some initial work on this before 1.0.0 because if we end up going this route there's no need for a NodeBuilder anymore, so this would be a pretty large breaking change. If it ends up looking like tons of work (but still worth the effort in the long run), then I might put it off and just release 2.0.0 when this is ready instead of holding 1.0.0 back.

I'll probably make a branch for this soon to explore what all will be involved here, but I wanted to get your thoughts on this @Drakulix .

The text was updated successfully, but these errors were encountered:

Drakulix · 2016-12-27T13:20:05Z

This would of course avoid heap allocations.

But I don't think it is very convenient to use. I use indexing on the children array quite a lot and the run time would get worse for that case, even if we provide utility functions to index, so we do not break the API. It would also make the code more unreadable.

I would not do this without an actual benchmark showing a great advantage.

iwburns · 2016-12-27T16:22:37Z

I think (haven't tried anything yet, so I could be wrong) it would probably make certain things much easier to write, but you're correct; if we were to not break the API (apart from not having a NodeBuilder) some of the things that we would have to write would be quite odd and possibly just downright confusing.

I would not do this without an actual benchmark showing a great advantage.

You're absolutely right, this needs to happen regardless of when this gets written (pre/post 1.0.0).

The biggest case in which I can see this helping performance, is when the caller doesn't actually know how many children a Node may have (for example, parsing externally provided html structures). If this change isn't made, they'll just have to make (wildly inaccurate) guesses about what to pass to with_child_capacity when building a Node. In that situation, you may be better off finding another Tree library with how many heap allocations and Vec re-sizes you're likely to get.

With that said though, the more I think about this, the more I think it should wait until after 1.0.0 (if it even gets implemented/merged at all). Thoughts?

iwburns added the question label Dec 27, 2016

iwburns changed the title ~~Should we avoid using a Vec to store a Node's children?~~ Should we avoid using a Vec to store a Node's children? Dec 27, 2016

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Should we avoid using a Vec to store a Node's children? #26

Should we avoid using a Vec to store a Node's children? #26

iwburns commented Dec 27, 2016

Drakulix commented Dec 27, 2016

iwburns commented Dec 27, 2016

Should we avoid using a Vec to store a Node's children? #26

Should we avoid using a Vec to store a Node's children? #26

Comments

iwburns commented Dec 27, 2016

Drakulix commented Dec 27, 2016

iwburns commented Dec 27, 2016