Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Do Not Merge] PoC of cyclic previous node technique #38

Closed

Conversation

lo48576
Copy link
Contributor

@lo48576 lo48576 commented Aug 4, 2019

This is a PoC implementation of #12.

I noticed:

  • Updating neighbors are O(1) cost, if all the nodes have parents.
    • indextree does not provide implicit root nodes, and toplevel nodes can have siblings without parents.
      For such nodes, updating neighbors are O(num_of_siblings).
      // NOTE: These are not O(1) operations.
      // This is because nodes are allowed to having no children.
      // If the node has no children, it is impossible to get the first
      // and the last siblings.
      Some(id) => (
      id.preceding_siblings(arena).last(),
      id.following_siblings(arena).last(),
      ),
  • Now all neighors getters (such as Node::parent()) should receive &Arena<T>.
    This is not intrinsic difficutly, because when people wanted to get NodeId from Node<T>, then they would have &Arena<T>.
    (However, I feel it is just annoying to pass &arena as an argument all the time).

If this change is worth merging, then I think indextree should support some mechanism for pseudo-parent node or implicit root node to make updates constant time.

@lo48576 lo48576 mentioned this pull request Aug 4, 2019
@lo48576 lo48576 force-pushed the feature/prev-cyclic-sibling branch 2 times, most recently from 0e7fd1c to b44a404 Compare August 4, 2019 11:07
@saschagrunert
Copy link
Owner

If this change is worth merging, then I think indextree should support some mechanism for pseudo-parent node or implicit root node to make updates constant time.

Do you see any drawback in adding an implicit root node? I would suggest moving forward with that approach... :)

@lo48576
Copy link
Contributor Author

lo48576 commented Aug 4, 2019

If implicit root node exists, top-level nodes (created by the user) may have siblings, though they were not told to have.
To treat siblings relations of top-level nodes correctly, the arena should have multiple implicit nodes (I'm not sure they can be called "root" nodes :-))

let mut arena = Arena::new();
let n1a = arena.new_node("1a");
let n1b = arena.new_node("1b");
let n2 = arena.new_node("2");
// arena
// `-- (implicit 1)
//     `-- 1a
// `-- (implicit 2)
//     `-- 1b
// `-- (implicit 3)
//     `-- 2
n1a.insert_after(n1b, &mut arena);
// arena
// `-- (implicit 1)
//     |-- 1a
//     `-- 1b
// `-- (implicit 3)
//     `-- 2

Mainly there are two problems:

  1. I don't want users to manipulate implicit nodes manually, because they are implicit and non-usual node.
    To achive this, NodeId should not be created from index number directly (or users may accidentally access that node).
  2. To refer implicit nodes by NodeId, they should be in Arena::nodes field, but it contains non-optional pub data: T field, and implicit nodes should have no values.

I think they (implicit nodes) bring some complexity and exceptional rule, and it may make the code hard to read and maintain...

@lo48576 lo48576 force-pushed the feature/prev-cyclic-sibling branch from b44a404 to bb0fe3f Compare August 4, 2019 15:42
@lo48576
Copy link
Contributor Author

lo48576 commented Aug 4, 2019

Required breaking change for this feature is additional parameter for neighbors accessors (such as Node::previous_sibling(self, &Arena<T>) instead of Node::previous_sibling(self)).
(Current state of this branch is already working, except that updates are not always O(1).
I think no more interface changes are necessary.)

If you really want this, you can change the interface before implementing internals.
Then, later internal changes can be non-breaking and new release with this feature will be safely done with revision bump.

See <http://www.aosabook.org/en/posa/parsing-xml-at-the-speed-of-light.html>.
Note that, by this implementation, node insersions and removals are not
O(1) for top-level nodes.

Fixes saschagrunert#12.
Currently `Node::previous_sibling()` and `Node::last_child()` really
requires the arena, but others (such as `Node::parent()`) does not.
However, it is implementation details and should be hidden from the
users.
So they should have the same interface.
@lo48576 lo48576 force-pushed the feature/prev-cyclic-sibling branch from bb0fe3f to 03cf106 Compare August 4, 2019 16:24
@phrohdoh
Copy link
Contributor

In one of my uses of indextree I do want all top-level/parentless nodes to be siblings.
How is that use-case affected by these changes?

@lo48576
Copy link
Contributor Author

lo48576 commented Sep 16, 2019

TL;DR: I don't think this change is worth doing with non-obvious breaking change...


The problem of the current PR is that insertion and detach of top-level nodes are not O(N).

Possible solutions (currently I'm aware of) are:

  1. Allow O(N) operations for top-level siblings.
    • In this case, your code will get slower when adding and detaching top-level sibling nodes.
  2. Add virtual "root" node internally to arena.
    • NodeId value generation or internal structure of NodeId may be changed (so that it can represent virtual node).
      • This may also require data access API to be changed (because the virtual root node has no data).
    • NodeId might (or might not) lose some good properties, such as PartialOrd and/or predictability.
      • I think this won't affect usual use case, but I'm not sure.
    • The maximum number of nodes in an arena might be reduced.
  3. Force users to have a "root" node for each arena.
    • This hugely breaks compatibility and affects many users, so I think this won't happen...
  4. Give up using cyclic previous node, i.e. make no change.

Choice 1 will cause performance degration.
Choice 2 and 3 requires non-obvious breaking change, i.e. migration will be non-straightforward and difficult.
Choice 4 is default (current) situation when there are no progress.

This change reduces memory consumption a little, but also make the implementation more complex and may require semantics to change.
I'm not willing to push this PR forward.

@lo48576
Copy link
Contributor Author

lo48576 commented Sep 3, 2021

Closing as the branch is outdated and I won't maintain this experimental PoC anymore.

@lo48576 lo48576 closed this Sep 3, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants