Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Path.from_parts() or similar to allow tuple constructor like pathlib.Path #157

Open
Tracked by #156
MestreLion opened this issue Nov 2, 2021 · 8 comments
Open
Tracked by #156

Comments

@MestreLion
Copy link
Contributor

MestreLion commented Nov 2, 2021

I have this NBT Explorer-like printing and when re-implementing it using my generic tree walker I noticed a missing feature in Path that forced me to use a really inefficient constructor:

The generic tree walker yields a generic tuple containing each path component, similar to the pathlib.Path() API. And to create an nbtlib.Path from this tuple I had to resort to an expensive that iteratively (or recursively) creates a new Path for each component by concatenating the next component (def get_element(root, keys): return get_element(root[keys[0]], keys[1:]). (by the way, this concatenation is what led me to #146 , as each part can be an int (for List) or str (for Compound)

I'm sure there are better, or at least easier ways of doing this. Something like Path.from_accessors(), but skipping the parser by assuming each part is either an int or a str representing a single key.

@MestreLion
Copy link
Contributor Author

I suggest 2 possible approaches:

  • Path.from_parts(), classmethod that takes a plain tuple* and builds the accessors without invoking the parser, assuming each part is a single, non-empty component
  • Allow the main constructor Path() to take such plain tuples (possibly invoking .from_parts(), right after the isinstance(path, Path) check (as Path is a tuple itself)

*: By "tuple" it should actually mean "any sequence which is not an string", so lists and similar iterables are allowed too. anyway, that's a minor detail.

@vberlier vberlier mentioned this issue Nov 3, 2021
9 tasks
@vberlier
Copy link
Owner

vberlier commented Nov 3, 2021

Right, so now I see why you wanted to concatenate ints too. But in this case subscripting is the operation you would want anyway. Assuming __add__ allows ints, you could implement _tree.get_element() using concatenation like this root + keys[0], but _tree.get_element(Path(), ["foo.bar", 2, "spam"]) would result in foo.bar[2].spam instead of "foo.bar"[2].spam. It's the kind of subtle error I was thinking about the other day. The problem isn't really that supporting ints in __add__ is ambiguous by itself, but that it behaves the same for operations that are semantically different. To me the bug would be more time-consuming to figure out than a clear TypeError.

Anyway, I agree that from_parts() would be pretty useful. You could probably currently implement it yourself with something like this (untested):

path = tuple.__new__(Path, (NamedKey(k) if isinstance(k, str) else ListIndex(k) for k in data.keys[:-1]))

Also a walk() method would definitely belong in nbtlib.

@MestreLion
Copy link
Contributor Author

but _tree.get_element(Path(), ["foo.bar", 2, "spam"]) would result in foo.bar[2].spam instead of "foo.bar"[2].spam. It's the kind of subtle error I was thinking about the other day.

Yeah, but since it came from a tuple, using a method called .from_parts(), it's pretty self-evident that "foo.bar" must indeed be a single component. The tuple/list format makes it clear that each element is a single key. If you want to parse each element, then use Path's constructor or concatenation. .from_parts() is for adults, it does have a contract/assumption embedded on it for the sake of performance.

The problem isn't really that supporting ints in __add__ is ambiguous by itself, but that it behaves the same for operations that are semantically different.

The operations semantically are different, I agree, but int only behaves the same because because of a limitation on integers: they can't "express" more than a single component. Strings and Paths can, so they do when the operation allows.

For example, let's say we hypothetically allow tuples/lists in __add__ too, it could (and maybe it should) mean "parse every element", so Path() + ["foo.bar", 2, "spam"] == Path("foo.bar[2].spam"). Probably not useful, but still consistent:

  • "High-level" operations, such as Path() and +, do some parsing/processing on its input. A single string may become more than one key, so a tuple may become more than its len(). An int would too, if it could, and Path keeps its len()
  • "Low-level" operations, such as .from_accessors() of .from_parts(), are internal tools exposed to the API for the sake of performance. They make assumptions and are more restrictive about their data. They skip the parser.

@MestreLion
Copy link
Contributor Author

in this scenario, [] indexing (not slicing) is a low-level tool too, as it also skips the parser by contract, assuming the index is always a single element for leaf values such as strings and integers. For a collection such as another Path, it merely loops and adds each part without any processing (the same for a hipotetical tuple. But I'm not suggesting you allow tuples to + or [] or even (), ok? Just using it as a mental exercise for testing API consistency. I'm more than happy with just .from_parts()

@MestreLion
Copy link
Contributor Author

MestreLion commented Nov 3, 2021

Anyway, I agree that from_parts() would be pretty useful. You could probably currently implement it yourself with something like this (untested):

path = tuple.__new__(Path, (NamedKey(k) if isinstance(k, str) else ListIndex(k) for k in data.keys[:-1]))

Hum, didn't know about Path's internals to realize it would be so simple. I'll take a look at it and test to craft an actual PR.

Also a walk() method would definitely belong in nbtlib.

Feel free to grab my tree.py, or some of its usage in my nbt (a wrapper for nbtlib). It's a new addition that came out of some experiments in generic crawlers I'm particularly proud of. Still WIP, soon I'll add a useless os.walk() wrapper using it just to test how flexible it is. print_tree is cute, take a look at its usage in nbt.nbt_explorer()

@vberlier
Copy link
Owner

vberlier commented Nov 3, 2021

Yup sorry if this wasn't really clear, the first paragraph wasn't really related to from_parts() or anything. It was just a tangent about the error you could have run into if you tried implementing your get_element function with concatenation rather than indexing. I was just trying to say that the canonical operations for extending paths are path + path and path[key], and that for concatenation the operands can be subjected to auto-conversion that can make it look like they do the same thing.

I'm completely on board with from_parts. I'm gonna look into your patches in more details but this looks really neat!

@MestreLion
Copy link
Contributor Author

Yup sorry if this wasn't really clear, the first paragraph wasn't really related to from_parts() or anything. It was just a tangent [...]

Nah, you were clear enough I could infer you approved .from_parts() and we were just discussing a side-track.

for concatenation the operands can be subjected to auto-conversion that can make it look like they do the same thing.

Yeah, I carefully chose to use indexing instead of concatenation in get_element because of that difference. I was aware since you told me in #146 . The need for .from_parts() is just to avoid the looping/recursion (or perform it in Path), as I was sure Path had a better way to craft itself out of a tuple.

I'm completely on board with from_parts. I'm gonna look into your patches in more details but this looks really neat!

Well... I didn't PR this one yet... will do so in the next days

@vberlier
Copy link
Owner

vberlier commented Nov 4, 2021

Right, makes sense!

Well... I didn't PR this one yet... will do so in the next days

By "patches" I was actually referring to your nbt.py file containing your extra utilities and where you monkey-match a few nbtlib things.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants