Feature Request - Node Tree from list of delimited strings #153

Open
als0052 opened this issue Jan 14, 2021 · 4 comments
als0052 commented Jan 14, 2021

I've got some files I'm trying to parse into a tree (a FEM assembly tree, actually) and need to take a list of lists and create a tree from it. Because of the way I get the raw input that I'm parsing (a TCL script), every list contains the full path, much of which is repeated from one list to the next. I'm sure I could eventually get this working without a 'batch node creator', but I wanted to throw this out there anyway for consideration that such a batch node creator be added in the future.

Below is a (hopefully understandable) minimal example, copied and pasted from a Markdown export of a Jupyter notebook. I think this request may be similar to others made previously; if so, feel free to close this one and/or link it to the other issues.

Sorry for the book-length post!!


from anytree import Node, RenderTree
from pathlib import Path

Make the Tree Manually w/ anytree

a1 = Node('Assembly 1', parent=None)
a1_sa1 = Node('Sub Assembly 1', parent=a1)
a1_sa1_ssa1 = Node('Sub Sub Assembly 1', parent=a1_sa1)
a1_sa1_ssa1_sssa1 = Node('Sub Sub Sub Assembly 1', parent=a1_sa1_ssa1)
a1_sa1_ssa1_sssa1.children = [Node('Component 1'), Node('Component 2')]

a1_sa2 = Node('Sub Assembly 2', parent=a1)
a1_sa2.children = [Node('Component 1'), Node('Component 2'), Node('Component 3'),
                   Node('Component 4'), Node('Component 5'), Node('Component 6'),
                   Node('Component 7'), Node('Component 8'), Node('Component 9'),
                   Node('Component 10'), Node('Component 11'), Node('Component 12'),
                   Node('Component 13'), Node('Component 14'), Node('Component 15'),
                   Node('Component 16')]

a1_sa3 = Node('Sub Assembly 3', parent=a1)
a1_sa3_ssa1 = Node('Sub Sub Assembly 1', parent=a1_sa3)
a1_sa3_ssa1_c1 = Node('Component 1', parent=a1_sa3_ssa1)

a1_sa3_ssa2 = Node('Sub Sub Assembly 2', parent=a1_sa3)
a1_sa3_ssa2_c1 = Node('Component 1', parent=a1_sa3_ssa2)

a1_sa3_ssa3 = Node('Sub Sub Assembly 3', parent=a1_sa3)
a1_sa3_ssa3_c1 = Node('Component 1', parent=a1_sa3_ssa3)
a1_sa3_c1 = Node('Component 1', parent=a1_sa3)
a1_sa3_c2 = Node('Component 2', parent=a1_sa3)
a1_sa3_c3 = Node('Component 3', parent=a1_sa3)
a1_sa3_c4 = Node('Component 4', parent=a1_sa3)

a1_sa3_ssa4 = Node('Sub Sub Assembly 4', parent=a1_sa3)
a1_sa3_ssa4_c1 = Node('Component 1', parent=a1_sa3_ssa4)

a1_sa3_ssa5 = Node('Sub Sub Assembly 5', parent=a1_sa3)
a1_sa3_ssa5_c1 = Node('Component 1', parent=a1_sa3_ssa5)

a1_sa3_ssa6 = Node('Sub Sub Assembly 6', parent=a1_sa3)
a1_sa3_ssa6_c1 = Node('Component 1', parent=a1_sa3_ssa6)

a1_sa3_ssa7 = Node('Sub Sub Assembly 7', parent=a1_sa3)
a1_sa3_ssa7_c1 = Node('Component 1', parent=a1_sa3_ssa7)

a1_sa3_ssa8 = Node('Sub Sub Assembly 8', parent=a1_sa3)
a1_sa3_ssa8_c1 = Node('Component 1', parent=a1_sa3_ssa8)

a1_sa3_ssa9 = Node('Sub Sub Assembly 9', parent=a1_sa3)
a1_sa3_ssa9_c1 = Node('Component 1', parent=a1_sa3_ssa9)

a1_sa3_ssa10 = Node('Sub Sub Assembly 10', parent=a1_sa3)
a1_sa3_ssa10_c1 = Node('Component 1', parent=a1_sa3_ssa10)
for pre, fill, node in RenderTree(a1):
    print(f'{pre}{node.name}')
Assembly 1
├── Sub Assembly 1
│   └── Sub Sub Assembly 1
│       └── Sub Sub Sub Assembly 1
│           ├── Component 1
│           └── Component 2
├── Sub Assembly 2
│   ├── Component 1
│   ├── Component 2
│   ├── Component 3
│   ├── Component 4
│   ├── Component 5
│   ├── Component 6
│   ├── Component 7
│   ├── Component 8
│   ├── Component 9
│   ├── Component 10
│   ├── Component 11
│   ├── Component 12
│   ├── Component 13
│   ├── Component 14
│   ├── Component 15
│   └── Component 16
└── Sub Assembly 3
    ├── Sub Sub Assembly 1
    │   └── Component 1
    ├── Sub Sub Assembly 2
    │   └── Component 1
    ├── Sub Sub Assembly 3
    │   └── Component 1
    ├── Component 1
    ├── Component 2
    ├── Component 3
    ├── Component 4
    ├── Sub Sub Assembly 4
    │   └── Component 1
    ├── Sub Sub Assembly 5
    │   └── Component 1
    ├── Sub Sub Assembly 6
    │   └── Component 1
    ├── Sub Sub Assembly 7
    │   └── Component 1
    ├── Sub Sub Assembly 8
    │   └── Component 1
    ├── Sub Sub Assembly 9
    │   └── Component 1
    └── Sub Sub Assembly 10
        └── Component 1

Parse the TCL Script Output File

Looks something like this (~ delimited; contents pasted here so as to not include ExampleTree_featureRequest_raw.txt):

~Assembly1~SubAssembly1~SubSubAssembly1~SubSubSubAssembly1~Component1
~Assembly1~SubAssembly1~SubSubAssembly1~SubSubSubAssembly1~Component2
~Assembly1~SubAssembly2~Component1
~Assembly1~SubAssembly2~Component2
~Assembly1~SubAssembly2~Component3
~Assembly1~SubAssembly2~Component4
~Assembly1~SubAssembly2~Component5
~Assembly1~SubAssembly2~Component6
~Assembly1~SubAssembly2~Component7
~Assembly1~SubAssembly2~Component8
~Assembly1~SubAssembly2~Component9
~Assembly1~SubAssembly2~Component10
~Assembly1~SubAssembly2~Component11
~Assembly1~SubAssembly2~Component12
~Assembly1~SubAssembly2~Component13
~Assembly1~SubAssembly2~Component14
~Assembly1~SubAssembly2~Component15
~Assembly1~SubAssembly2~Component16
~Assembly1~SubAssembly3~SubSubAssembly1~SubSubSubAssembly1~Component1
~Assembly1~SubAssembly3~SubSubAssembly2~SubSubSubAssembly1~Component1
~Assembly1~SubAssembly3~SubSubAssembly3~SubSubSubAssembly1~Component1
~Assembly1~SubAssembly3~SubSubAssembly3~SubSubSubAssembly2~Component1
~Assembly1~SubAssembly3~SubSubAssembly4~Component1
~Assembly1~SubAssembly3~SubSubAssembly4~Component2
~Assembly1~SubAssembly3~SubSubAssembly4~Component3
~Assembly1~SubAssembly3~SubSubAssembly4~Component4
~Assembly1~SubAssembly3~SubSubAssembly4~Component1
~Assembly1~SubAssembly3~SubSubAssembly5~Component1
~Assembly1~SubAssembly3~SubSubAssembly6~Component1
~Assembly1~SubAssembly3~SubSubAssembly7~Component1
~Assembly1~SubAssembly3~SubSubAssembly8~Component1
~Assembly1~SubAssembly3~SubSubAssembly9~Component1
~Assembly1~SubAssembly3~SubSubAssembly10~Component1

# Output file from the TCL script
raw_tcl_output = Path().cwd().joinpath('ExampleTree_featureRequest_raw.txt')

Read in the raw TCL Output File

with open(raw_tcl_output, 'r') as fin:
    content = fin.readlines()

Parse the content into list of lists

# Strip newlines, split into lists
content = [c.strip('\n') for c in content]
content = [c.split('~') for c in content]
display(content[0])

# Take all but the first after splitting to remove blank at beginning
# I guess that blank (i.e. ``content[0][0]``) is like the root?
content = [c[1:] for c in content]
['',
 'Assembly1',
 'SubAssembly1',
 'SubSubAssembly1',
 'SubSubSubAssembly1',
 'Component1']
content[0]
['Assembly1',
 'SubAssembly1',
 'SubSubAssembly1',
 'SubSubSubAssembly1',
 'Component1']
content
[['Assembly1',
  'SubAssembly1',
  'SubSubAssembly1',
  'SubSubSubAssembly1',
  'Component1'],
 ['Assembly1',
  'SubAssembly1',
  'SubSubAssembly1',
  'SubSubSubAssembly1',
  'Component2'],
 ['Assembly1', 'SubAssembly2', 'Component1'],
 ['Assembly1', 'SubAssembly2', 'Component2'],
 ['Assembly1', 'SubAssembly2', 'Component3'],
 ['Assembly1', 'SubAssembly2', 'Component4'],
 ['Assembly1', 'SubAssembly2', 'Component5'],
 ['Assembly1', 'SubAssembly2', 'Component6'],
 ['Assembly1', 'SubAssembly2', 'Component7'],
 ['Assembly1', 'SubAssembly2', 'Component8'],
 ['Assembly1', 'SubAssembly2', 'Component9'],
 ['Assembly1', 'SubAssembly2', 'Component10'],
 ['Assembly1', 'SubAssembly2', 'Component11'],
 ['Assembly1', 'SubAssembly2', 'Component12'],
 ['Assembly1', 'SubAssembly2', 'Component13'],
 ['Assembly1', 'SubAssembly2', 'Component14'],
 ['Assembly1', 'SubAssembly2', 'Component15'],
 ['Assembly1', 'SubAssembly2', 'Component16'],
 ['Assembly1',
  'SubAssembly3',
  'SubSubAssembly1',
  'SubSubSubAssembly1',
  'Component1'],
 ['Assembly1',
  'SubAssembly3',
  'SubSubAssembly2',
  'SubSubSubAssembly1',
  'Component1'],
 ['Assembly1',
  'SubAssembly3',
  'SubSubAssembly3',
  'SubSubSubAssembly1',
  'Component1'],
 ['Assembly1',
  'SubAssembly3',
  'SubSubAssembly3',
  'SubSubSubAssembly2',
  'Component1'],
 ['Assembly1', 'SubAssembly3', 'SubSubAssembly4', 'Component1'],
 ['Assembly1', 'SubAssembly3', 'SubSubAssembly4', 'Component2'],
 ['Assembly1', 'SubAssembly3', 'SubSubAssembly4', 'Component3'],
 ['Assembly1', 'SubAssembly3', 'SubSubAssembly4', 'Component4'],
 ['Assembly1', 'SubAssembly3', 'SubSubAssembly4', 'Component1'],
 ['Assembly1', 'SubAssembly3', 'SubSubAssembly5', 'Component1'],
 ['Assembly1', 'SubAssembly3', 'SubSubAssembly6', 'Component1'],
 ['Assembly1', 'SubAssembly3', 'SubSubAssembly7', 'Component1'],
 ['Assembly1', 'SubAssembly3', 'SubSubAssembly8', 'Component1'],
 ['Assembly1', 'SubAssembly3', 'SubSubAssembly9', 'Component1'],
 ['Assembly1', 'SubAssembly3', 'SubSubAssembly10', 'Component1']]

Desired Feature

A way to 'batch create' a node tree. Some command that will take in a list of delimited node-childNodes-etc. and create a valid Node object from it.

Example:

>>> list1 = ['Assembly1', 'SubAssembly1', 'SubSubAssembly1', 'SubSubSubAssembly1', 'Component1']
>>> node_from_list1 = SomeNewBatchCreateNodeFunction(input_list=list1)

The above should produce the same result as doing it by hand:

a1 = Node('Assembly 1', parent=None)
a1_sa1 = Node('Sub Assembly 1', parent=a1)
a1_sa1_ssa1 = Node('Sub Sub Assembly 1', parent=a1_sa1)
a1_sa1_ssa1_sssa1 = Node('Sub Sub Sub Assembly 1', parent=a1_sa1_ssa1)
a1_sa1_ssa1_sssa1.children = [Node('Component 1')]
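
The merging step behind such a batch creator can be sketched without anytree at all. The snippet below is only an illustration under stated assumptions: `paths_to_nested_dict` and `render` are hypothetical helper names, the nested dict stands in for real Node objects, and the sample paths are shortened. Each path is walked from the top, reusing a branch when the name already exists at that level and creating it otherwise, so the repeated prefixes collapse into one tree.

```python
def paths_to_nested_dict(paths):
    """Merge lists of names into one nested dict (hypothetical helper)."""
    tree = {}
    for path in paths:
        branch = tree
        for name in path:
            # Reuse the existing branch for this name, or start a new one.
            branch = branch.setdefault(name, {})
    return tree


def render(tree, indent=0):
    """Return the nested dict as indented lines, one name per line."""
    lines = []
    for name, children in tree.items():
        lines.append(' ' * indent + name)
        lines.extend(render(children, indent + 2))
    return lines


# Shortened sample paths, in the same shape as ``content`` above.
content = [
    ['Assembly1', 'SubAssembly1', 'SubSubAssembly1', 'Component1'],
    ['Assembly1', 'SubAssembly1', 'SubSubAssembly1', 'Component2'],
    ['Assembly1', 'SubAssembly2', 'Component1'],
]
tree = paths_to_nested_dict(content)
print('\n'.join(render(tree)))
```

Because the lookup happens one level at a time, a 'Component1' under 'SubAssembly2' stays separate from a 'Component1' under 'SubSubAssembly1'.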



jkbgbr commented Jun 28, 2022

You mean like this?

from anytree import Node, findall_by_attr, RenderTree

lines = ['~Assembly1~SubAssembly1~SubSubAssembly1~SubSubSubAssembly1~Component1',
         '~Assembly1~SubAssembly1~SubSubAssembly1~SubSubSubAssembly1~Component2',
         '~Assembly1~SubAssembly2~Component1',
         '~Assembly1~SubAssembly2~Component2',
         '~Assembly1~SubAssembly2~Component3',
         '~Assembly1~SubAssembly2~Component4',
         '~Assembly1~SubAssembly2~Component5',
         '~Assembly1~SubAssembly2~Component6',
         '~Assembly1~SubAssembly2~Component7',
         '~Assembly1~SubAssembly2~Component8',
         '~Assembly1~SubAssembly2~Component9',
         '~Assembly1~SubAssembly2~Component10',
         '~Assembly1~SubAssembly2~Component11',
         '~Assembly1~SubAssembly2~Component12',
         '~Assembly1~SubAssembly2~Component13',
         '~Assembly1~SubAssembly2~Component14',
         '~Assembly1~SubAssembly2~Component15',
         '~Assembly1~SubAssembly2~Component16',
         '~Assembly1~SubAssembly3~SubSubAssembly1~SubSubSubAssembly1~Component1',
         '~Assembly1~SubAssembly3~SubSubAssembly2~SubSubSubAssembly1~Component1',
         '~Assembly1~SubAssembly3~SubSubAssembly3~SubSubSubAssembly1~Component1',
         '~Assembly1~SubAssembly3~SubSubAssembly3~SubSubSubAssembly2~Component1',
         '~Assembly1~SubAssembly3~SubSubAssembly4~Component1',
         '~Assembly1~SubAssembly3~SubSubAssembly4~Component2',
         '~Assembly1~SubAssembly3~SubSubAssembly4~Component3',
         '~Assembly1~SubAssembly3~SubSubAssembly4~Component4',
         '~Assembly1~SubAssembly3~SubSubAssembly4~Component1',
         '~Assembly1~SubAssembly3~SubSubAssembly5~Component1',
         '~Assembly1~SubAssembly3~SubSubAssembly6~Component1',
         '~Assembly1~SubAssembly3~SubSubAssembly7~Component1',
         '~Assembly1~SubAssembly3~SubSubAssembly8~Component1',
         '~Assembly1~SubAssembly3~SubSubAssembly9~Component1',
         '~Assembly1~SubAssembly3~SubSubAssembly10~Component1', ]


def from_assembly_line(root: Node = None, line: str = ''):

    nodenames = [x for x in line.split('~') if x]  # removing empty items

    # root node
    if root is None:
        root = Node(nodenames[0], parent=None)

    # walk down from the root one level at a time; looking only at the
    # direct children keeps equal names in different branches separate
    parent = root
    for nodename in nodenames[1:]:
        child = next((c for c in parent.children if c.name == nodename), None)
        if child is None:
            child = Node(nodename, parent=parent)
        parent = child

    return root


if __name__ == '__main__':
    _root = None
    for _line in lines:
        _root = from_assembly_line(root=_root, line=_line)

    print(RenderTree(_root))


als0052 commented Jun 28, 2022

That looks like it'll work. I think I found a workaround for my issue above long ago, but I was hoping this could become a more easily used feature in a future release. That way you don't have to write your own function to do it, even if it is (in hindsight) a pretty simple function.


lverweijen commented Jul 4, 2023

Something more general that I would like would be a function to turn a list like this:

l = [("Europe", "Italy", "Rome"),
     ("Europe", "Italy", "Milan"),
     ("Europe", "France", "Paris")]

into a tree:

Europe
 Italy
  Rome
  Milan
 France
  Paris

The reason is that I have many hierarchies stored in csv files and with pandas and such a function I can easily convert them to trees.
It would also solve your problem, because you can just split each of your strings.
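
As a rough illustration of the input side, using only the standard library rather than pandas: the sketch below turns CSV text and `~`-delimited strings into the kind of tuples shown above. The helper names `read_hierarchy_rows` and `split_delimited` are hypothetical, and the in-memory CSV is made-up sample data matching the list.

```python
import csv
import io


def read_hierarchy_rows(text):
    """Parse CSV text into a list of tuples, one hierarchy path per row."""
    reader = csv.reader(io.StringIO(text))
    # Skip blank lines and drop empty cells left by shorter paths.
    return [tuple(cell for cell in row if cell) for row in reader if row]


def split_delimited(line, sep='~'):
    """Split one delimited path string into a tuple, dropping empty parts."""
    return tuple(part for part in line.split(sep) if part)


sample = "Europe,Italy,Rome\nEurope,Italy,Milan\nEurope,France,Paris\n"
rows = read_hierarchy_rows(sample)
print(rows)
print(split_delimited('~Assembly1~SubAssembly2~Component1'))
```

Either result has the row shape described above, so both sources could feed the same tree-building function.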


lverweijen commented Jul 4, 2023

Here are sample implementations for my ideas above:

import anytree


def from_rows(rows, node_factory=anytree.Node, root_name="root"):
    created_nodes = {}

    root = node_factory(root_name)
    for row in rows:
        parent_node = root
        for depth, col in enumerate(row):
            # Key on the full path prefix so that equal names under
            # different parents are kept as separate nodes.
            key = tuple(row[:depth + 1])
            if key in created_nodes:
                node = created_nodes[key]
            else:
                node = node_factory(col)
                node.parent = parent_node
                created_nodes[key] = node

            parent_node = node

    return root


def to_rows(root, str_factory=str, skip_root=True):
    index = 1 if skip_root else 0
    for leaf in root.leaves:
        yield [str_factory(node) for node in leaf.path[index:]]

Update 2024-01-06:
I implemented this in littletree using functions Node.from_rows, Node.to_rows.
