Setting node name keeps tree linkage #310
datatree/treenode.py
parent = self._parent
self.orphan()
self._name = name
self._set_parent(parent, name)
Instead of this orphaning then un-orphaning logic, couldn't we just change the key in `.children` directly, via some logic more like that in `_post_attach`?
This doesn't seem straightforward to me right now (I tried a few things, without success so far). I still need to remove the node from the parent's children and then add it again at some point 🤔
In every case I end up with messier code than the current solution. For instance, `__delitem__` itself uses `orphan` under the hood.
Finally, I went for a `.children =` assignment. I ensured the order of children is preserved by re-creating an `OrderedDict` from a generator over the existing parent's children. The code is more verbose, but I think this is unavoidable if the order is to be preserved: orphaning then re-attaching would have lost the ordering information.
Note: `_post_attach` alone won't rename the node's `_name`, so two renamings still occur:
- One on the parent's children (preserving order)
- One on the node's `_name` itself
Edit: if the node has a parent, there is no need to set `._name` manually. Reassigning the updated children to its parent renames the node along the way.
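To illustrate the ordering point above, here is a minimal sketch using plain `OrderedDict`s rather than datatree's own classes. The helper `rename_key_preserving_order` is a hypothetical name introduced for this example; it is not part of datatree or of this PR.

```python
from collections import OrderedDict


def rename_key_preserving_order(children, old, new):
    """Return a new OrderedDict where key `old` is replaced by `new` in place.

    Hypothetical helper for illustration only.
    """
    if new != old and new in children:
        raise KeyError(f"a child named {new!r} already exists")
    # Rebuild the mapping from a generator, swapping only the matching key.
    return OrderedDict(
        (new if key == old else key, child) for key, child in children.items()
    )


children = OrderedDict(a="node_a", b="node_b", c="node_c")

# Pop-and-reinsert (roughly what orphaning then re-attaching amounts to)
# moves the renamed entry to the end.
moved = OrderedDict(children)
moved["x"] = moved.pop("a")

# Rebuilding from a generator keeps the renamed entry in its original slot.
renamed = rename_key_preserving_order(children, "a", "x")

print(list(moved))    # ['b', 'c', 'x']
print(list(renamed))  # ['x', 'b', 'c']
```

The rebuild costs a full pass over the siblings on every rename, which is the verbosity trade-off mentioned above.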
I remained conservative and forbade renaming a node to `None` if it has a parent.
Edit: according to a comment in the datatree design document mentioned in pydata/xarray#8747 (section "4) Are nodes named?"), in practice only the root node would remain anonymous. Hence it makes sense to only allow a node without a parent (a root) to be renamed to `None`? See my comment #309 (comment)
Edit: the build is failing because of the recent change in how the sizes of DataArrays are printed: pydata/xarray#8702 Whether
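The rule described in this comment (only a node without a parent may be renamed to `None`) could look roughly like the sketch below. The `Node` class, its `add_child` method, and the error message are assumptions made for illustration; they are not datatree's actual `TreeNode` implementation.

```python
from collections import OrderedDict


class Node:
    """Hypothetical stand-in for a tree node; not datatree's TreeNode."""

    def __init__(self, name=None):
        self._name = name
        self._parent = None
        self._children = OrderedDict()

    def add_child(self, child):
        child._parent = self
        self._children[child._name] = child

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, new_name):
        parent = self._parent
        if parent is None:
            # A root node may be named or anonymous.
            self._name = new_name
            return
        if new_name is None:
            raise ValueError("only a root node (no parent) can be renamed to None")
        # Re-key the parent's children so sibling order is kept.
        # Sibling-name collisions are not checked in this sketch.
        parent._children = OrderedDict(
            (new_name if child is self else key, child)
            for key, child in parent._children.items()
        )
        self._name = new_name


root = Node("root")
root.add_child(Node("a"))
root.add_child(Node("b"))

child = root._children["a"]
child.name = "x"       # allowed: the parent's key is updated in place
root.name = None       # allowed: root has no parent

try:
    child.name = None  # rejected: child still has a parent
    rejected = False
except ValueError:
    rejected = True
```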
I see I got to this pretty late, so it may be solved already (if so, please ignore this), but I couldn't quite tell from this conversation whether everything was working or not. If not, I have a solution which passes all the pytest tests (except for a format error that was already failing before my changes) at https://github.com/marcel-goldschen-ohm/datatree/tree/namelinkbugfix. Changes are in datatree.py in
Small bugfix. I added a test reproducing the example in #309, and the existing tests still pass.
Note: In `_post_attach`, I set the private `_name` instead of setting `name`, since using a property setter from inside the class can lead to infinite recursion. I assume the `_post_attach` method acts like a "runtime assertion".
- `pre-commit run --all-files`
- New functions/methods are listed in `api.rst`
- `docs/source/whats-new.rst`
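The recursion risk mentioned in the note can be reproduced with a minimal sketch. The classes below are hypothetical and only borrow the names `name`, `_name`, and `_post_attach` from the discussion; the assumption is that the `name` setter ends up calling `_post_attach`, as it would when renaming re-assigns the parent's children.

```python
class Node:
    """Hypothetical node; this is not datatree's code."""

    def __init__(self, name=None):
        self._name = name

    @property
    def name(self):
        return self._name

    @name.setter
    def name(self, value):
        # Assumed flow: the setter triggers the attach hook.
        self._post_attach(value)

    def _post_attach(self, name):
        # Writing to the private attribute bypasses the setter.
        self._name = name


class RecursiveNode(Node):
    def _post_attach(self, name):
        # Writing to the public property calls the setter, which calls
        # _post_attach again, and so on.
        self.name = name


node = Node("a")
node.name = "b"  # terminates: the hook only touches _name

try:
    RecursiveNode("a").name = "b"
    recursed = False
except RecursionError:
    recursed = True
```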