Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Should we avoid using a Vec to store a Node's children? #26

Open
iwburns opened this issue Dec 27, 2016 · 2 comments
Open

Should we avoid using a Vec to store a Node's children? #26

iwburns opened this issue Dec 27, 2016 · 2 comments
Labels

Comments

@iwburns
Copy link
Owner

iwburns commented Dec 27, 2016

When I first wrote this library, I got a large amount of inspiration from SimonSapin's idtree (which can be found here). However, (for several reasons that I can't remember right now) I decided to go with a Node like this:

struct Node<T> {
    data: T,
    parent: Option<NodeId>,
    children: Vec<NodeId>,
}

instead of something more like this: (which is essentially what is used in idtree)

struct Node<T> {
    data: T,
    parent: Option<NodeId>,
    first_child: Option<NodeId>,
    last_child: Option<NodeId>,
    prev_sibling: Option<NodeId>,
    next_sibling: Option<NodeId>,
}

Now I'm wondering if it would be easier to implement some features (Iterators is one example, but there could be other features that fall into this category as well) if I had gone this route.

Also, I want to do at least some initial work on this before 1.0.0 because if we end up going this route there's no need for a NodeBuilder anymore, so this would be a pretty large breaking change. If it ends up looking like tons of work (but still worth the effort in the long run), then I might put it off and just release 2.0.0 when this is ready instead of holding 1.0.0 back.

I'll probably make a branch for this soon to explore what all will be involved here, but I wanted to get your thoughts on this @Drakulix .

@iwburns iwburns changed the title Should we avoid using a Vec to store a Node's children? Should we avoid using a Vec to store a Node's children? Dec 27, 2016
@Drakulix
Copy link
Contributor

This would of course avoid heap allocations.

But I don't think it is very convenient to use. I use indexing on the children array quite a lot and the run time would get worse for that case, even if we provide utility functions to index, so we do not break the API. It would also make the code more unreadable.

I would not do this without an actual benchmark showing a great advantage.

@iwburns
Copy link
Owner Author

iwburns commented Dec 27, 2016

I think (haven't tried anything yet, so I could be wrong) it would probably make certain things much easier to write, but you're correct; if we were to not break the API (apart from not having a NodeBuilder) some of the things that we would have to write would be quite odd and possibly just downright confusing.

I would not do this without an actual benchmark showing a great advantage.

You're absolutely right, this needs to happen regardless of when this gets written (pre/post 1.0.0).

The biggest case in which I can see this helping performance, is when the caller doesn't actually know how many children a Node may have (for example, parsing externally provided html structures). If this change isn't made, they'll just have to make (wildly inaccurate) guesses about what to pass to with_child_capacity when building a Node. In that situation, you may be better off finding another Tree library with how many heap allocations and Vec re-sizes you're likely to get.

With that said though, the more I think about this, the more I think it should wait until after 1.0.0 (if it even gets implemented/merged at all). Thoughts?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants