Should .children be a dictionary instead of a tuple? #3

TomNicholas · 2021-08-18T20:57:59Z

A tree-like structure must have nodes, each of which can contain multiple children, and those children have to be selectable via some kind of name. However, those names can either be keys to access the child objects, or inherent properties of the child objects.

If names are inherent properties of the children, we would have node.children = (child1, child2), where child1.name = 'steve', child2.name = 'mary' etc. If names are keys, we would have node.children = {'steve': child1, 'mary': child2}, where each child need not have a name. It's not clear to me which of these approaches is better in our case.

It's easy to ensure that all nodes have names (and if we make nodes inherit from Dataset they will inherit a name), but storing children in tuples leads to annoying code like child_we_want = next(c for c in node.children if c.name == name_we_want), instead of just child_we_want = node[name_we_want]. A DataTree is also quite intuitively represented by a nested dictionary where keys are parts of a path and values are either datasets or child nodes, and in that description we would not say that the name key is an inherent property of the value.
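To make the difference in lookup code concrete, here is a minimal sketch. NamedNode and KeyedNode are hypothetical stand-ins invented for this illustration; they are not classes from this project or from anytree.

```python
from dataclasses import dataclass, field


@dataclass
class NamedNode:
    """Hypothetical node whose name is a property of the node itself."""
    name: str
    children: tuple = ()


@dataclass
class KeyedNode:
    """Hypothetical node with no name; the parent's dict key identifies it."""
    children: dict = field(default_factory=dict)


# Tuple storage: finding a child means scanning for a matching name.
steve, mary = NamedNode("steve"), NamedNode("mary")
tuple_parent = NamedNode("root", children=(steve, mary))
found_by_scan = next(c for c in tuple_parent.children if c.name == "mary")

# Dict storage: the key does the work, and the child carries no name.
dict_parent = KeyedNode(children={"steve": KeyedNode(), "mary": KeyedNode()})
found_by_key = dict_parent.children["mary"]
```

The scan is linear in the number of children and raises StopIteration when nothing matches, whereas the dict lookup raises KeyError, which is arguably the more natural error for a missing child.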

Using a dictionary also means that the path to an object is distinct from the name of that object.

This also means that a node doesn't need a name at all, and becomes defined only in terms of its parent and children. In effect, the name of the node would be the key for which self.parent.children[key] returns self. Parentless nodes would be nameless.
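One way to picture a node whose name is derived rather than stored is the sketch below. The Node class and its add_child method are assumptions made for illustration, not the project's actual API.

```python
from typing import Dict, Optional


class Node:
    """Hypothetical node defined only by its parent and its children."""

    def __init__(self) -> None:
        self.parent: Optional["Node"] = None
        self.children: Dict[str, "Node"] = {}

    def add_child(self, key: str, child: "Node") -> None:
        # The key lives on the parent, not on the child.
        child.parent = self
        self.children[key] = child

    @property
    def name(self) -> Optional[str]:
        # Parentless nodes are nameless.
        if self.parent is None:
            return None
        # The name is the key for which parent.children[key] is this node.
        for key, child in self.parent.children.items():
            if child is self:
                return key
        raise RuntimeError("node is not registered in its parent's children")


root = Node()
steve = Node()
root.add_child("steve", steve)
```

Here root.name is None and steve.name is "steve", but the lookup costs a scan of the parent's children, which is one practical argument for caching the key on the child anyway.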

A disadvantage of this is that a stored Dataset object has no idea who its parent is.

None of the tree implementations I've seen work like this, and it appears to deviate from the way that a "tree" is defined mathematically.

The anytree library uses named nodes and stores children in a tuple, so to use dictionaries we would need to reimplement its NodeMixin class.


TomNicholas · 2021-08-26T05:09:00Z

A related question is whether the set of child nodes is ordered or not. In the mathematical definition of a tree it is unordered, but I'm not sure whether the order of nodes matters for certain file types. By using a tuple or list to store children we are implicitly ordering the tree, compared to using a set (or a pre-Python 3.6 dict).

Even if we stick with an ordered type for storing the children we still have to decide if our trees are ordered or not, because it matters when checking equivalence between trees.

It might make sense to just choose the more general option (i.e. ordered), and then have flags to treat trees as unordered when it matters.
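As a rough sketch of what such a flag could look like, the function below compares trees represented as plain nested dicts. The trees_equal name and its ordered parameter are hypothetical, and the string leaves stand in for datasets.

```python
def trees_equal(a, b, ordered=True):
    """Compare two nested-dict trees, optionally ignoring child order."""
    a_is_tree, b_is_tree = isinstance(a, dict), isinstance(b, dict)
    if a_is_tree and b_is_tree:
        # Ordered comparison: children must appear in the same sequence.
        if ordered and list(a) != list(b):
            return False
        # Either way, both nodes must have the same set of child keys.
        if a.keys() != b.keys():
            return False
        return all(trees_equal(a[k], b[k], ordered) for k in a)
    if a_is_tree or b_is_tree:
        # A subtree never equals a leaf.
        return False
    return a == b


tree_1 = {"steve": {"x": "ds_x"}, "mary": "ds_mary"}
tree_2 = {"mary": "ds_mary", "steve": {"x": "ds_x"}}

same_unordered = trees_equal(tree_1, tree_2, ordered=False)
same_ordered = trees_equal(tree_1, tree_2, ordered=True)
```

Note that Python's built-in dict equality ignores insertion order, so tree_1 == tree_2 is True here; an ordered comparison has to check the key sequence explicitly, as above.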

TomNicholas · 2022-04-27T21:59:29Z

Closed by #76

TomNicholas added the design question label Aug 18, 2021

TomNicholas mentioned this issue Aug 18, 2021: Should data nodes and tree nodes be unrelated classes? #4 (Closed)

TomNicholas mentioned this issue Aug 26, 2021: map over multiple subtrees #29 (Closed)

TomNicholas mentioned this issue Sep 21, 2021: Store variables in DataTree instead of storing Dataset #2 (Closed)

TomNicholas mentioned this issue Apr 14, 2022: Track DataTree progress arviz-devs/arviz#2015 (Open)

TomNicholas mentioned this issue Apr 21, 2022: Child dict refactor (also removes anytree dependency) #76 (Merged, 7 tasks)

TomNicholas added this to the 0.1 milestone Apr 27, 2022

TomNicholas closed this as completed Apr 27, 2022
