Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Investigate schema improvements/autocast #158

Open
Tracked by #156
vberlier opened this issue Nov 3, 2021 · 7 comments
Open
Tracked by #156

Investigate schema improvements/autocast #158

vberlier opened this issue Nov 3, 2021 · 7 comments

Comments

@vberlier
Copy link
Owner

vberlier commented Nov 3, 2021

Original conversation:

@vberlier vberlier mentioned this issue Nov 3, 2021
9 tasks
@MestreLion
Copy link
Contributor

I totally forgot about the recursive nature of auto_cast. Well, that was only a pseudocode I've though while writing the comment, it's more of an algorithm draft than actual code.

Need to think more about this, but my initial feeling is that yes, auto_cast should be recursive. This would be set by the parent's __setitem__ (or, rather, by the .autocast_setitem() that was assigned as __setitem__), so it may require a new auto_cast keyword in Base's __init__/__new__. And currently, IIRC, Base has no init at all, so adding one would set a new precedent and a open potential can of worms.

OTOH, not sure how useful a recursive auto cast would be. I initially wanted it to make it easy to assign values to leaf keys, so I could write mycomp["mykey"] = "Bla Bla bla" instead of the verbose mycomp["somekey"] = mc.String("Bla Bla bla"). Even for numbers, it could tel apart an Int from Byte or Long by using the currently assigned value's type. For leaves it's perfect, I'm replacing a Tag so I know it's type can can use it to cast the new value.

But in a recursion, we lose that information. In {"a": {"b": {"c": 1}}}, there's no "current value" to cast the 1 to. Unless you're planning on following a whole tree to see if it matches the key of the current value. Man, that's ambitious. And surely a task for schema, not auto-cast. My initial idea of autocast was merely a convenience, a safeguard against mistakes (like mycomp[key] = 1 instead of mycomp[key] = mc.Byte(1), or a convenience to be used by adults ("I know what I'm doing" kind of things)

@vberlier
Copy link
Owner Author

vberlier commented Nov 3, 2021

I agree that the scope for this is a bit fuzzy right now. Auto-casting could be just a minimal quality-of-life thing, or affect the entire hierarchy you're working with, or be some sort of automatic deduction and application of schemas. I'll wait until lower-level issues are fleshed out before messing around with it.

@MestreLion
Copy link
Contributor

Just a self-note: there are cases where I do not want auto-cast not only because of performance. I intend to intentionally set some keys to something else, like a custom class. for example world.level['Data']['Player'] = Player(...). (and Player['inventory'] = Inventory(...))

Well, of course both Player and Inventory would still inherit from Compound, so they would still be actual tags (and that's why it currently works). But the auto casting would have to be smart enough not to cast them to plain Compounds. But even if they didn't inherit from Compound or Base (i.e. they're not tags at all), if they have .parse() and .write() then it shouldn't matter and auto-cast should not bother.

@MestreLion
Copy link
Contributor

I think its time I paste that pseudo-almost-actual-code from the "main" issue as a reference:

def __init__(*args, *, auto_cast=False, **kwargs):  # or __new__
    self.auto_cast = auto_cast

@property
def auto_cast(self, value: bool) -> bool:
    return self.autocast_setitem == super().__setitem__

@autocast.setter
def auto_cast(self, value: bool) -> None:
    # should we recurse that too? slippery slope...
    self.__setitem__ = self._autocast if value else super().__setitem__

def _autocast(self, key, value) -> None:
    if not isinstance(value, Base) and key in self:  # and isinstance(self[key], Base) too?
        # Overwriting existing key with a non-tag value. Cast to old value type (if old is a tag)
        super()[key] = self[key].__class__(value)  # non-recursive version
        return
    super()[key] = value

This would default to False (so load/parse gets non-casting, fast behavior), and File.load() could set it to True after fully creating/parsing the instance and before returning self to the caller (the same way it sets filename)
...

Or maybe File.load() should not even bother to turn it on, or at least not by default. Auto-cast is for users, so clients can manually set chunk = File.load(...); chunk.auto_cast = True (or maybe chunk.set_autocast() to make it explicit that this is an expensive operation what will change the way this tag will behave in the future?)

@MestreLion
Copy link
Contributor

But even if they didn't inherit from Compound or Base (i.e. they're not tags at all), if they have .parse() and .write() then it shouldn't matter and auto-cast should not bother.

Speaking of this... I think this is right issue to tackle on a more profound subject: What is a tag afterall?

  • Inherit from Base?
  • Have .tag_id?
  • Implement .write() and parse()? (so is tag a ABC "protocol"? what else must it implement?)

Not sure if there's any point in defining this, but sometimes I wonder what should be the "canonical" way of checking if something is a tag. Btw, currently I'm using isinstance(tag, Base) (and wishing Base was named Tag, lol)

@vberlier
Copy link
Owner Author

vberlier commented Nov 4, 2021

The way I intended it is that something is a concrete tag if it inherits from Base and its tag_id is not None. I agree that Tag would've probably been a better name though, I wonder if it will break a lot of code if we change it in 2.0.

@MestreLion
Copy link
Contributor

I agree that Tag would've probably been a better name though, I wonder if it will break a lot of code if we change it in 2.0.

You can always add a Base = Tag alias for backward compatibility, and do a phased deprecation:

  • 2.1: just add the alias, keep both __all__, just to see how it goes
  • 2.x: remove Base (alias) from __all__
  • 3.0: remove the alias. Or don't, and keep it forever.
    In any case, rename Base => Tag in all your code, so users will know that one is the the "proper" name

That's a lot of trouble for just a nitpick, btw. Tag might be better, but Base is fine too. And I love the fact that its subclasses are not prefixed with Tag (or TAG_). We all know Compound is a tag, thank you pymclevel.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants