Commit 26c5931

Add unsafe_hash alias for class-wide hash
Fixes #1003
1 parent 2812fab commit 26c5931

6 files changed: +113 -92 lines changed


docs/conf.py

Lines changed: 5 additions & 0 deletions

@@ -45,6 +45,11 @@
     "sphinxcontrib.towncrier",
 ]
 
+myst_enable_extensions = [
+    "colon_fence",
+    "smartquotes",
+    "deflist",
+]
 
 # Add any paths that contain templates here, relative to this directory.
 templates_path = ["_templates"]

docs/hashing.md

Lines changed: 79 additions & 0 deletions

@@ -0,0 +1,79 @@
# Hashing

## Hash Method Generation

:::{warning}
The overarching theme is to never set the `@attrs.define(unsafe_hash=X)` parameter yourself.
Leave it at `None`, which means that *attrs* will do the right thing for you, depending on the other parameters:

- If you want to make objects hashable by value: use `@define(frozen=True)`.
- If you want hashing and equality by object identity: use `@define(eq=False)`.

Setting `unsafe_hash` yourself can have unexpected consequences, so we recommend tinkering with it only if you know exactly what you're doing.
:::

Under certain circumstances, it's necessary for objects to be *hashable*.
For example, you may want to put them into a {class}`set` or use them as keys in a {class}`dict`.

The *hash* of an object is an integer that represents the contents of an object.
It can be obtained by calling `hash` on an object and is implemented by writing a `__hash__` method for your class.

*attrs* will happily write a `__hash__` method for you [^fn1]; however, it will *not* do so by default.
That is because, according to the [definition](https://docs.python.org/3/glossary.html#term-hashable) from the official Python docs, the returned hash has to fulfill certain constraints:

[^fn1]: The hash is computed by hashing a tuple that consists of a unique id for the class plus all attribute values.
26+
1. Two objects that are equal, **must** have the same hash.
27+
This means that if `x == y`, it *must* follow that `hash(x) == hash(y)`.
28+
29+
By default, Python classes are compared *and* hashed by their `id`.
30+
That means that every instance of a class has a different hash, no matter what attributes it carries.
31+
32+
It follows that the moment you (or *attrs*) change the way equality is handled by implementing `__eq__` which is based on attribute values, this constraint is broken.
33+
For that reason Python 3 will make a class that has customized equality unhashable.
34+
Python 2 on the other hand will happily let you shoot your foot off.
35+
Unfortunately, *attrs* still mimics (otherwise unsupported) Python 2's behavior for backward compatibility reasons if you set `hash=False`.
36+
37+
The *correct way* to achieve hashing by id is to set `@define(eq=False)`.
38+
Setting `@define(unsafe_hash=False)` (which implies `eq=True`) is almost certainly a *bug*.
39+
40+
:::{warning}
41+
Be careful when subclassing!
42+
Setting `eq=False` on a class whose base class has a non-default `__hash__` method will *not* make *attrs* remove that `__hash__` for you.
43+
44+
It is part of *attrs*'s philosophy to only *add* to classes so you have the freedom to customize your classes as you wish.
45+
So if you want to *get rid* of methods, you'll have to do it by hand.
46+
47+
The easiest way to reset `__hash__` on a class is adding `__hash__ = object.__hash__` in the class body.
48+
:::
49+
50+
2. If two objects are not equal, their hash **should** be different.
51+
52+
While this isn't a requirement from a standpoint of correctness, sets and dicts become less effective if there are a lot of identical hashes.
53+
The worst case is when all objects have the same hash which turns a set into a list.
54+
55+
3. The hash of an object **must not** change.
56+
57+
If you create a class with `@define(frozen=True)` this is fulfilled by definition, therefore *attrs* will write a `__hash__` function for you automatically.
58+
You can also force it to write one with `hash=True` but then it's *your* responsibility to make sure that the object is not mutated.
59+
60+
This point is the reason why mutable structures like lists, dictionaries, or sets aren't hashable while immutable ones like tuples or `frozenset`s are:
61+
point 1 and 2 require that the hash changes with the contents but point 3 forbids it.
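
The first constraint can be illustrated without *attrs* at all.
The following sketch uses plain Python classes; `ByValue`, `ByValueIdHash`, and `is_hashable` are hypothetical names introduced only for this example.
It shows that Python 3 makes a class with a custom `__eq__` unhashable, and that restoring `object.__hash__` while keeping value-based equality reproduces the Python 2 behavior warned about above.

```python
class ByValue:
    """Defines __eq__ only, so Python 3 sets __hash__ to None."""

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return isinstance(other, ByValue) and self.value == other.value


class ByValueIdHash(ByValue):
    """Hypothetical subclass that opts back into identity hashing by hand."""

    __hash__ = object.__hash__


def is_hashable(obj):
    """Return True if hash(obj) succeeds."""
    try:
        hash(obj)
    except TypeError:
        return False
    return True


plain_is_hashable = is_hashable(ByValue(1))
reset_is_hashable = is_hashable(ByValueIdHash(1))

# Constraint 1 is violated here: the objects are equal but hash differently.
a, b = ByValueIdHash(1), ByValueIdHash(1)
equal_but_hash_differs = a == b and hash(a) != hash(b)
```

This is why identity hashing should be paired with identity equality (`eq=False`) rather than with a value-based `__eq__`.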

For a more thorough explanation of this topic, please refer to this blog post: [*Python Hashes and Equality*](https://hynek.me/articles/hashes-and-equality/).


## Hashing and Mutability

Changing any field involved in hash code computation after the first call to `__hash__` (typically this would be after its insertion into a hash-based collection) can result in silent bugs.
Therefore, it is strongly recommended that hashable classes be `frozen`.
Beware, however, that this is not a complete guarantee of safety:
if a field points to an object and that object is mutated, the hash code may change, but `frozen` will not protect you.


## Hash Code Caching

Some objects have hash codes that are expensive to compute.
If such objects are to be stored in hash-based collections, it can be useful to compute the hash codes only once and then store the result on the object to make future hash code requests fast.
To enable caching of hash codes, use `@define(cache_hash=True)`.
This may only be done if *attrs* is already generating a hash function for the object.

docs/hashing.rst

Lines changed: 0 additions & 86 deletions
This file was deleted.

src/attr/_make.py

Lines changed: 14 additions & 6 deletions

@@ -1217,6 +1217,7 @@ def attrs(
     on_setattr=None,
     field_transformer=None,
     match_args=True,
+    unsafe_hash=None,
 ):
     r"""
     A class decorator that adds :term:`dunder methods` according to the
@@ -1279,8 +1280,8 @@ def attrs(
         *eq*.
     :param Optional[bool] cmp: Setting *cmp* is equivalent to setting *eq*
         and *order* to the same value. Must not be mixed with *eq* or *order*.
-    :param Optional[bool] hash: If ``None`` (default), the ``__hash__`` method
-        is generated according how *eq* and *frozen* are set.
+    :param Optional[bool] unsafe_hash: If ``None`` (default), the ``__hash__``
+        method is generated according to how *eq* and *frozen* are set.
 
         1. If *both* are True, ``attrs`` will generate a ``__hash__`` for you.
         2. If *eq* is True and *frozen* is False, ``__hash__`` will be set to
@@ -1298,6 +1299,8 @@ def attrs(
         `object.__hash__`, and the `GitHub issue that led to the default \
         behavior <https://github.com/python-attrs/attrs/issues/136>`_ for more
         details.
+    :param Optional[bool] hash: Alias for *unsafe_hash*. *unsafe_hash* takes
+        precedence.
     :param bool init: Create a ``__init__`` method that initializes the
         ``attrs`` attributes. Leading underscores are stripped for the argument
         name. If a ``__attrs_pre_init__`` method exists on the class, it will
@@ -1469,9 +1472,14 @@ def attrs(
     .. versionchanged:: 21.1.0 Support for ``__attrs_pre_init__``
     .. versionchanged:: 21.1.0 *cmp* undeprecated
     .. versionadded:: 21.3.0 *match_args*
+    .. versionadded:: 22.2.0
+       *unsafe_hash* as an alias for *hash* (for :pep:`681` compliance).
     """
     eq_, order_ = _determine_attrs_eq_order(cmp, eq, order, None)
-    hash_ = hash  # work around the lack of nonlocal
+
+    # unsafe_hash takes precedence due to PEP 681.
+    if unsafe_hash is not None:
+        hash = unsafe_hash
 
     if isinstance(on_setattr, (list, tuple)):
         on_setattr = setters.pipe(*on_setattr)
@@ -1527,14 +1535,14 @@ def wrap(cls):
 
         builder.add_setattr()
 
+        nonlocal hash
         if (
-            hash_ is None
+            hash is None
             and auto_detect is True
             and _has_own_attribute(cls, "__hash__")
         ):
             hash = False
-        else:
-            hash = hash_
+
         if hash is not True and hash is not False and hash is not None:
             # Can't use `hash in` because 1 == True for example.
             raise TypeError(
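
The resolution order in the hunks above can be sketched as a standalone function.
This is an illustration only, not the library's code: `resolve_hash` and its `has_own_hash` parameter are hypothetical stand-ins (the real check is `_has_own_attribute(cls, "__hash__")` inside `wrap`), and the error message is made up.

```python
def resolve_hash(hash=None, unsafe_hash=None, has_own_hash=False, auto_detect=True):
    """Hypothetical helper mirroring the order of checks in the diff."""
    # unsafe_hash takes precedence over its alias when it is set explicitly.
    if unsafe_hash is not None:
        hash = unsafe_hash

    # With auto-detection, a hand-written __hash__ disables generation.
    if hash is None and auto_detect is True and has_own_hash:
        hash = False

    # Identity checks on purpose: `hash in (True, False, None)` would accept 1.
    if hash is not True and hash is not False and hash is not None:
        raise TypeError("hash must be True, False, or None (illustrative message)")
    return hash


alias_only = resolve_hash(hash=True)
both_given = resolve_hash(hash=True, unsafe_hash=False)
own_hash_detected = resolve_hash(has_own_hash=True)
```

Because `hash` is only overwritten when `unsafe_hash is not None`, existing callers that pass `hash=` alone keep their behavior.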

src/attr/_next_gen.py

Lines changed: 4 additions & 0 deletions

@@ -26,6 +26,7 @@ def define(
     *,
     these=None,
     repr=None,
+    unsafe_hash=None,
     hash=None,
     init=None,
     slots=True,
@@ -81,6 +82,8 @@ def define(
 
     .. versionadded:: 20.1.0
     .. versionchanged:: 21.3.0 Converters are also run ``on_setattr``.
+    .. versionadded:: 22.2.0
+       *unsafe_hash* as an alias for *hash* (for :pep:`681` compliance).
     """
 
     def do_it(cls, auto_attribs):
@@ -89,6 +92,7 @@ def do_it(cls, auto_attribs):
             these=these,
             repr=repr,
             hash=hash,
+            unsafe_hash=unsafe_hash,
             init=init,
             slots=slots,
             frozen=frozen,

tests/test_functional.py

Lines changed: 11 additions & 0 deletions

@@ -739,3 +739,14 @@ class D(C):
         assert "_setattr('x', x)" in src
         assert "_setattr('y', y)" in src
         assert object.__setattr__ != D.__setattr__
+
+    def test_unsafe_hash(self, slots):
+        """
+        attr.s(unsafe_hash=True) makes a class hashable.
+        """
+
+        @attr.s(slots=slots, unsafe_hash=True)
+        class Hashable:
+            pass
+
+        assert hash(Hashable())
