Why IndentedStringImporter is gone ? #105

Open
regexgit opened this issue Nov 5, 2019 · 8 comments

@regexgit

regexgit commented Nov 5, 2019

About one year ago I had IndentedStringImporter installed with anytree. Now after a reinstallation of the OS and all my tools I realize that it is no longer present.

I use it a lot. For example, biological taxonomies and the hierarchical keywords that photo tools import and export ("controlled vocabulary") are indented text files.

Fortunately I kept a backup of the code, but I would prefer an official installation.

@c0fec0de
Owner

It was never an official implementation, just code on a branch. I will double-check.

@als0052

als0052 commented Jan 14, 2021

Just to add to the convo here: I'm also interested in seeing the IndentedStringImporter (perhaps also an IndentedStringExporter?) added. I read the initial feature request and tried looking for the source code, but my GitHub-fu is not very good.

@LionKimbro

I was just thinking about implementing an indented string importer, something that would read:

Foo
  Bar
  Baz
    Boz
    Bitz
  Blah

...and construct a tree with just that.

If I implement such a thing in a branch, is there any chance that it would be accepted?
Is the project taking contributions?

@angely-dev

Any updates on this?

@regexgit
Author

It seems not.
If it can help you while waiting for an official version: I still use the original version (file indentedstringimporter.py of 2019) which I carefully kept.
No warranty of course but for my needs it's enough.

# -*- coding: utf-8 -*-
from anytree import AnyNode

#---------------------------------------
def _get_indentation(line):
	# Return (number of leading spaces, text without that indentation).
	# Only spaces count as indentation. Measuring the length difference also
	# works for blank lines, where line.split('') would raise ValueError.
	content = line.lstrip(' ')
	indentation_length = len(line) - len(content)
	return indentation_length, content

#*******************************************************************************
class IndentedStringImporter(object):

	def __init__(self, nodecls=AnyNode):
		r"""
		Import Tree from a single string (with all the lines) or list of strings
		(lines) with indentation.
		
		Every indented line is converted to an instance of `nodecls`. The string
		(without indentation) found on each line is set as the respective node name.
		
		This importer does not constrain indented data to a fixed number of
		whitespaces (a multiple of any number). A node is considered a child of
		a parent simply if its indentation is bigger than its parent's.
		
		This means that the tree can have siblings with different indentations,
		as long as the siblings' indentations are bigger than the respective
		parent's (but not necessarily equal to each other).
		
		Keyword Args:
		    nodecls: class used for nodes.
		
		Example using a string list:
		>>> from anytree.importer import IndentedStringImporter
		>>> from anytree import RenderTree
		>>> importer = IndentedStringImporter()
		>>> lines = [
		...    'Node1',
		...    'Node2',
		...    '    Node3',
		...    'Node5',
		...    '    Node6',
		...    '        Node7',
		...    '    Node8',
		...    '        Node9',
		...    '      Node10',
		...    '    Node11',
		...    '  Node12',
		...    'Node13',
		... ]
		>>> root = importer.import_(lines)
		>>> print(RenderTree(root))
		AnyNode(name='root')
		├── AnyNode(name='Node1')
		├── AnyNode(name='Node2')
		│   └── AnyNode(name='Node3')
		├── AnyNode(name='Node5')
		│   ├── AnyNode(name='Node6')
		│   │   └── AnyNode(name='Node7')
		│   ├── AnyNode(name='Node8')
		│   │   ├── AnyNode(name='Node9')
		│   │   └── AnyNode(name='Node10')
		│   ├── AnyNode(name='Node11')
		│   └── AnyNode(name='Node12')
		└── AnyNode(name='Node13')
		
		Example using a string:
		>>> string = "Node1\n  Node2\n  Node3\n    Node4"
		>>> root = importer.import_(string)
		>>> print(RenderTree(root))
		AnyNode(name='root')
		└── AnyNode(name='Node1')
		    ├── AnyNode(name='Node2')
		    └── AnyNode(name='Node3')
		        └── AnyNode(name='Node4')
		"""
		
		self.nodecls = nodecls
	
	#------------------------------------
	def _tree_from_indented_str(self, data):
		if isinstance(data, str):
			lines = data.splitlines()
		else:
			lines = data
		root = self.nodecls(name="root")
		indentations = {}
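		for line in lines:
			if not line.strip():
				# skip blank lines; they carry no node name
				continue
			cur_indent, name = _get_indentation(line)

			if len(indentations) == 0:
				parent = root
			elif cur_indent not in indentations:
				# parent is the next lower indentation, or root if this
				# line is less indented than every line still open
				keys = [key for key in indentations.keys()
						  if key < cur_indent]
				parent = indentations[max(keys)] if keys else root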
			else:
				# current line uses the parent of the last line
				# with same indentation
				# and replaces it as the last line with this given indentation
				parent = indentations[cur_indent].parent

			indentations[cur_indent] = self.nodecls(name=name, parent=parent)

			# delete all higher indentations
			keys = [key for key in indentations.keys() if key > cur_indent]
			for key in keys:
				indentations.pop(key)
		return root
	
	#------------------------------------
	def import_(self, data):
		# data: single string or a list of lines
		return self._tree_from_indented_str(data)

@angely-dev

angely-dev commented Mar 1, 2023

Thanks @regexgit for pointing out the original version, but I ended up writing my own lightweight implementation. It converts an indented config (not text, strictly speaking, since I assume each line to be unique within its indented block) to an n-ary tree using raw nested dicts.

The goal was to compare (and merge) two config files whilst being aware of the indented blocks scope. Unlike anytree, it won't meet everyone's requirements but if anyone is interested: text to tree conversion in 10 lines of code and an example. I also published a simple gist.
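
For readers who cannot follow the links, the nested-dict idea can be sketched roughly as follows. This is not angely-dev's actual code: the function name `indented_text_to_dict` and the stack-based approach are assumptions made for illustration, and like the description above it assumes each line is unique within its block (a repeated line is merged into the existing key).

```python
def indented_text_to_dict(text):
    """Convert space-indented text to nested dicts (hypothetical helper).

    Each non-blank line becomes a key; its value is a dict of its children.
    """
    root = {}
    # Stack of (indentation, children-dict) pairs for the currently open
    # blocks. The sentinel indentation -1 makes every line a descendant of root.
    stack = [(-1, root)]
    for line in text.splitlines():
        if not line.strip():
            continue  # ignore blank lines
        stripped = line.lstrip(" ")
        indent = len(line) - len(stripped)
        # Close every block that is indented at least as much as this line;
        # what remains on top of the stack is the parent.
        while stack[-1][0] >= indent:
            stack.pop()
        children = stack[-1][1].setdefault(stripped.rstrip(), {})
        stack.append((indent, children))
    return root


tree = indented_text_to_dict("Foo\n  Bar\n  Baz\n    Boz\n    Bitz\n  Blah")
print(tree)
# {'Foo': {'Bar': {}, 'Baz': {'Boz': {}, 'Bitz': {}}, 'Blah': {}}}
```

Because plain dicts keep insertion order, the sibling order of the input is preserved, and two such trees can be compared or merged key by key.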

@lverweijen

lverweijen commented Jul 4, 2023

I would also be interested in this.

I actually created my own version. It wasn't written for anytree (but can probably easily be changed) and it may not be very flexible or fault-tolerant, but it should be reasonably fast for correct input:

    import re

    def from_indented_file(file, indent='@'):  # Change to "    " if 4 spaces are desired
        # Each line consists of indent and code
        pattern = re.compile(rf"^(?P<prefix>({re.escape(indent)})*)(?P<code>.*)")

        root = Node()
        stack = [root]

        for line in file:
            match = pattern.match(line)
            prefix, code = match['prefix'], match['code']
            depth = len(prefix) // len(indent)
            parent_node = stack[depth]
            node = parent_node.add(code)  # Should probably change to node = Node(parent=parent_node)

            # Place node as last item on index depth + 1
            del stack[depth + 1:]
            stack.append(node)

        return root

If a pull request is accepted, maybe the best parts of all three implementations can be combined.
I would also like to have an export to an indented file with the same options.
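
An exporter with a matching `indent` option could be sketched along these lines. Nothing here is part of anytree: `to_indented_lines` is a hypothetical name, and `SimpleNode` is a stand-in used only to keep the example self-contained. The function assumes only that nodes expose `name` and `children`, which anytree's `Node` also does.

```python
class SimpleNode:
    """Hypothetical stand-in node with the `name` and `children` attributes."""

    def __init__(self, name, parent=None):
        self.name = name
        self.children = []
        if parent is not None:
            parent.children.append(self)


def to_indented_lines(node, indent="  ", depth=0):
    """Yield one line per node, depth-first, prefixed by `indent` * depth."""
    yield indent * depth + str(node.name)
    for child in node.children:
        yield from to_indented_lines(child, indent, depth + 1)


foo = SimpleNode("Foo")
SimpleNode("Bar", parent=foo)
baz = SimpleNode("Baz", parent=foo)
SimpleNode("Boz", parent=baz)
SimpleNode("Bitz", parent=baz)
SimpleNode("Blah", parent=foo)

exported = "\n".join(to_indented_lines(foo))
print(exported)
```

With the default two-space indent this prints the same Foo/Bar/Baz text shown earlier in the thread. Passing `indent='@'` would produce the prefix style used by `from_indented_file` above.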
