Define a new __key__ protocol #64831

Closed
ncoghlan opened this issue Feb 15, 2014 · 11 comments
Labels
3.8 (only security fixes), interpreter-core (Objects, Python, Grammar, and Parser dirs), type-feature (a feature request or enhancement)

Comments

@ncoghlan
Contributor

BPO 20632
Nosy @rhettinger, @ncoghlan, @pitrou, @scoder, @vadmium, @MojoVampire, @csabella

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2018-02-13.02:48:16.324>
created_at = <Date 2014-02-15.00:17:55.243>
labels = ['interpreter-core', 'type-feature', '3.8']
title = 'Define a new __key__ protocol'
updated_at = <Date 2018-02-14.18:08:28.883>
user = 'https://github.com/ncoghlan'

bugs.python.org fields:

activity = <Date 2018-02-14.18:08:28.883>
actor = 'ncoghlan'
assignee = 'none'
closed = True
closed_date = <Date 2018-02-13.02:48:16.324>
closer = 'ncoghlan'
components = ['Interpreter Core']
creation = <Date 2014-02-15.00:17:55.243>
creator = 'ncoghlan'
dependencies = []
files = []
hgrepos = []
issue_num = 20632
keywords = []
message_count = 11.0
messages = ['211253', '211254', '211890', '211926', '311567', '312096', '312100', '312101', '312106', '312130', '312178']
nosy_count = 8.0
nosy_names = ['rhettinger', 'ncoghlan', 'pitrou', 'scoder', 'cvrebert', 'martin.panter', 'josh.r', 'cheryl.sabella']
pr_nums = []
priority = 'normal'
resolution = 'out of date'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue20632'
versions = ['Python 3.8']

@ncoghlan
Contributor Author

This is an idea that would require a PEP; I'm just writing it down here as a permanent record in case someone else wants to run with it.

Currently, the *simplest* way to define a non-identity total ordering on an immutable object is to define __hash__, __eq__ and __lt__ appropriately, and then use functools.total_ordering to add the other comparison methods.

However, many such implementations follow a very similar pattern:

    def __hash__(self):
        return hash(self._calculate_key())
    def __eq__(self, other):
        if isinstance(other, __class__):
            return self._calculate_key() == other._calculate_key()
        return NotImplemented
    def __lt__(self, other):
        if isinstance(other, __class__):
            return self._calculate_key() < other._calculate_key()
        return NotImplemented
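
For concreteness, here is a minimal sketch of that pattern in a complete class. The `Version` class and its `(major, minor)` key are hypothetical, chosen only for illustration; `functools.total_ordering` fills in the remaining comparison methods from `__eq__` and `__lt__`.

```python
from functools import total_ordering


@total_ordering
class Version:
    """Hypothetical immutable value type ordered by (major, minor)."""

    def __init__(self, major, minor):
        self._major = major
        self._minor = minor

    def _calculate_key(self):
        # The single source of truth for hashing and comparison.
        return (self._major, self._minor)

    def __hash__(self):
        return hash(self._calculate_key())

    def __eq__(self, other):
        if isinstance(other, __class__):
            return self._calculate_key() == other._calculate_key()
        return NotImplemented

    def __lt__(self, other):
        if isinstance(other, __class__):
            return self._calculate_key() < other._calculate_key()
        return NotImplemented
```

Everything except `_calculate_key` is boilerplate that would be identical in any class following this pattern.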

A "__key__" protocol as an inherent part of the type system could greatly simplify that:

    def __key__(self):
        return self._calculate_key()

The interpreter would then derive appropriate implementations for __hash__ and all the rich comparison methods based on that key calculation and install them when the type object was created.

If the type is mutable (and hence orderable but not hashable), then setting "__hash__ = None" would disable the implicit hashing support (just as it can already be used to explicitly disable hash inheritance).
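
The existing behaviour that this would build on can be shown without any new protocol. In the sketch below, the `Base` and `MutableChild` classes are hypothetical; only the `__hash__ = None` mechanism is standard Python.

```python
class Base:
    """Hypothetical hashable type keyed on a single attribute."""

    def __init__(self, key):
        self.key = key

    def __eq__(self, other):
        if isinstance(other, Base):
            return self.key == other.key
        return NotImplemented

    def __hash__(self):
        return hash(self.key)


class MutableChild(Base):
    # Explicitly disable the inherited hash; instances stay comparable
    # but can no longer be used as dict keys or set members.
    __hash__ = None
```

Calling `hash()` on a `MutableChild` instance raises `TypeError`, while `==` keeps working through the inherited `__eq__`.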

(Inspired by Chris Withers's python-dev thread: https://mail.python.org/pipermail/python-dev/2014-February/132332.html)

@ncoghlan added the interpreter-core (Objects, Python, Grammar, and Parser dirs) and type-feature (a feature request or enhancement) labels Feb 15, 2014
@ncoghlan
Contributor Author

Note: in conjunction with a class decorator (along the lines of functools.total_ordering), this idea is amenable to experimentation as a third party module. However, any such third party module shouldn't use a reserved name like __key__ - a public name like "calculate_key" would be more appropriate.
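
A rough sketch of what such a decorator might look like follows. The name `key_ordering` and the example `Card` class are assumptions made up for illustration; this is not an existing library API, just one possible way to prototype the idea.

```python
import operator


def key_ordering(cls):
    """Hypothetical decorator: derive comparisons and hashing from calculate_key()."""

    def make_method(name):
        op = getattr(operator, name)

        def method(self, other):
            if isinstance(other, cls):
                return op(self.calculate_key(), other.calculate_key())
            return NotImplemented

        method.__name__ = f"__{name}__"
        return method

    for name in ("eq", "ne", "lt", "le", "gt", "ge"):
        setattr(cls, f"__{name}__", make_method(name))

    # Leave an explicit "__hash__ = None" (or a custom __hash__) untouched.
    if "__hash__" not in cls.__dict__:
        cls.__hash__ = lambda self: hash(self.calculate_key())
    return cls


@key_ordering
class Card:
    """Hypothetical example type."""

    def __init__(self, rank, suit):
        self.rank = rank
        self.suit = suit

    def calculate_key(self):
        return (self.rank, self.suit)
```

Because the decorator only checks the class's own `__dict__` for `__hash__`, a mutable type can still opt out of hashing by setting `__hash__ = None` in its body.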

@pitrou
Member

pitrou commented Feb 21, 2014

This is a very nice idea, but does it have to be part of the interpreter core, or could it simply be supplied by a decorator in the functools module?

(the main advantage of having it in the interpreter is speed)

@ncoghlan
Contributor Author

I suspect it could just be a class decorator (along the lines of
total_ordering), and it should certainly be prototyped on PyPI as such a
decorator (using a different name for the key calculating method). If it
eventually happened, elevation to a core protocol would really be about
defining this as being *preferred* in the cases where it applies, and
that's a fairly weak basis for changing the type constructor.

@csabella
Contributor

csabella commented Feb 3, 2018

I wonder if this would make sense as a parameter to dataclass now.

@csabella added the 3.8 (only security fixes) label Feb 3, 2018
@ncoghlan
Contributor Author

For now, I'm going to close this as "out of date", with the guidance being "Define a data class instead" (since that gets rid of the historical boilerplate a different way: auto-generating suitable methods based on the field declarations).
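
As a minimal sketch of that guidance, the hypothetical `Version` class below gets hashing, equality and ordering from its field declarations alone (`frozen=True` makes it hashable, `order=True` adds the ordering methods).

```python
from dataclasses import dataclass


@dataclass(frozen=True, order=True)
class Version:
    """Hypothetical value type; fields are compared in declaration order."""

    major: int
    minor: int
```

Instances compare like the tuple `(major, minor)`, which is the same effect the key-based boilerplate was written to achieve.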

If somebody comes up with a use case for this protocol idea that isn't adequately covered by data classes, then they can bring it up on python-ideas, and we can look at revisiting the question.

@MojoVampire
Mannequin

MojoVampire mannequin commented Feb 13, 2018

Do data classes let you define some fields as being excluded from the equality/ordering/hashing? I got the impression that if a field existed, it was part of the "key" no matter what, which isn't necessarily correct in the general case. Simple examples would be attributes that equivalent C++ would tag with the mutable keyword; they're not part of the logical state of the instance (e.g. debugging counters or whatever), so they shouldn't be included in the "key".

@MojoVampire
Mannequin

MojoVampire mannequin commented Feb 13, 2018

Ah, never mind. Looks like dataclasses.InitVar fields seem to be the answer to excluding a field from the auto-generated methods.

@ncoghlan
Contributor Author

It isn't InitVar that you want for that use case (that's just for passing extra information to __post_init__).

Instead, you want:

    extra_field: int = field(compare=False)  # Excluded from __hash__, __eq__, etc

You can also exclude a field from __hash__, but keep it in the comparison methods:

    unhashed_field: int = field(hash=False)  # Excluded from __hash__ only
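
Put together in a complete class, the two options might be used as follows. The `Record` class and its field names are hypothetical; `field(compare=False)` and `field(hash=False)` are the documented `dataclasses` parameters.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True, order=True)
class Record:
    """Hypothetical record with one unhashed and one uncompared field."""

    name: str
    # Takes part in ==, <, etc., but not in __hash__.
    revision: int = field(default=0, hash=False)
    # Ignored by the comparison methods and by __hash__.
    debug_hits: int = field(default=0, compare=False)
```

Two records that differ only in `debug_hits` compare equal, and two records that differ only in `revision` compare unequal but share a hash.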

@csabella
Contributor

Thanks, Nick.

When I first came across this issue, I thought that dataclasses would take care of what you wrote below, but after looking at the original discussion on python-dev, I thought the problem was how to order None within a comparison, given that None is a valid value in SQLite.

For example,
    >>> a = [1, None, 'a']
    >>> b = [1, 5, 'b']
    >>> a == b
    False
    >>> a < b
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
    TypeError: '<' not supported between instances of 'NoneType' and 'int'

@ncoghlan
Contributor Author

Allowing for None-first and None-last ordering is a fair use case, but I'm not sure a __key__ protocol is the right answer to that - as your own example shows, it gets tricky when dealing with nested containers.

It may make sense to raise the question on python-ideas for Python 3.8+, though, with Python-side ordering of database records as the main motivating use case.
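
For flat records, None-first and None-last ordering can already be approximated with a sort key, without any new protocol. The helper names `none_first` and `none_last` below are made up for this sketch, and it assumes each row is a flat sequence whose non-None values in a given column are mutually comparable.

```python
def none_first(row):
    """Sort key: None sorts before any other value in the same column."""
    # False < True, so (False, None) sorts ahead of (True, value),
    # and None itself is never compared with "<".
    return tuple((value is not None, value) for value in row)


def none_last(row):
    """Sort key: None sorts after any other value in the same column."""
    return tuple((value is None, value) for value in row)


rows = [[1, 5, 'b'], [1, None, 'a']]
rows_none_first = sorted(rows, key=none_first)
rows_none_last = sorted(rows, key=none_last)
```

This only handles one level of nesting; a None buried inside a nested container would still raise `TypeError`, which is the difficulty noted above.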

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022