Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bpo-21145: Add cached_property decorator in functools #6982

Merged
merged 12 commits into from Aug 28, 2018

Conversation

Projects
None yet
8 participants
@carljm
Copy link
Contributor

commented May 19, 2018

@carljm

This comment has been minimized.

Copy link
Contributor Author

commented May 19, 2018

(Failing CI builds are asyncio flakiness, not related.)

self.__doc__ = func.__doc__
self.lock = RLock()

def __get__(self, instance, cls=None):

This comment has been minimized.

Copy link
@asvetlov

asvetlov May 19, 2018

Contributor

When cls is None? When __get__ is called without cls parameter?
Btw in docs for descriptors, the third argument is called owner IIRC.

This comment has been minimized.

Copy link
@carljm

carljm May 20, 2018

Author Contributor

AFAIK every call to __get__ generated by attribute lookup machinery will include the second (not including self) argument, but both the descriptor documentation (https://docs.python.org/3/reference/datamodel.html#descriptors) and the descriptor howto guide (https://docs.python.org/3/howto/descriptor.html) explicitly include an example of manually calling a descriptor's __get__ method providing only one argument. Also, all of the code examples in the HOWTO show the second argument as optional. Given that the method can be called directly and cls is unused in this implementation, it seems reasonable to follow that lead and make it optional.

Regarding the argument name, considering both the Data Model documentation and the HOWTO, we see examples in Python's own documentation of descriptors whose __get__ second argument is named owner, type, or objtype. All three of these are found in the standard library, along with cls, klass, ownerclass, and _type. There doesn't seem to be a standard usage. Personally I find cls or klass or objtype clearer, as owner doesn't clarify the key difference from the other argument, which is that it's a type. Perhaps naming the two arguments obj and objtype does the best job of clearly indicating their relationship. That said, I don't feel strongly and will rename to owner if you prefer that.

This comment has been minimized.

Copy link
@asvetlov

asvetlov May 20, 2018

Contributor

I doubt if howto is 100% correct.

datamodel doesn't allow None for owner, AFAIK Python aways pass a class type.

Using default for always specified argument confuses me.

datamodel uses owner, please do it as well.

This comment has been minimized.

Copy link
@asvetlov

asvetlov May 20, 2018

Contributor

If we need to change recommended argument name -- let's discuss it in another issue.

This comment has been minimized.

Copy link
@carljm

carljm May 20, 2018

Author Contributor

Changed signature of __get__ to (self, instance, owner) to match Data Model documentation.

f"instance to cache {attrname!r} property."
)
raise TypeError(msg) from None
with self.lock:

This comment has been minimized.

Copy link
@asvetlov

asvetlov May 19, 2018

Contributor

__getitem__ / setdefault are atomic operations.
The code could be modified in the following way:

  1. Get attribute value without locking.
  2. If cached attr is not found -- acquire the lock, get the value again under the lock and calculate it if not present.

This comment has been minimized.

Copy link
@carljm

carljm May 20, 2018

Author Contributor

Sure, will update. Thanks for the review!

This comment has been minimized.

Copy link
@asvetlov

asvetlov May 20, 2018

Contributor

looks good

@bedevere-bot

This comment has been minimized.

Copy link

commented May 19, 2018

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@carljm

This comment has been minimized.

Copy link
Contributor Author

commented May 20, 2018

I have made the requested changes; please review again.

@bedevere-bot

This comment has been minimized.

Copy link

commented May 20, 2018

Thanks for making the requested changes!

@asvetlov: please review the changes made to this pull request.

@carljm

This comment has been minimized.

Copy link
Contributor Author

commented May 20, 2018

I have made the requested changes; please review again.

@bedevere-bot

This comment has been minimized.

Copy link

commented May 20, 2018

Thanks for making the requested changes!

@asvetlov: please review the changes made to this pull request.

@py_functools.cached_property
def cost(self):
"""The cost of the item."""
return self._cost

This comment has been minimized.

Copy link
@asvetlov

asvetlov May 22, 2018

Contributor

Looks like the method is never called

This comment has been minimized.

Copy link
@carljm

carljm May 22, 2018

Author Contributor

Yes, that’s expected. The point of this class is just to test the error message; cached property on classes with slots is not supported. Would you find it clearer if I used pass as the body? I just wrote it the way someone realistically might, to make it clear that the error is not due to a mistake in the implementation of the method.

This comment has been minimized.

Copy link
@asvetlov

asvetlov May 22, 2018

Contributor

raise RuntimeError('never executed') is even more clear.

@carljm

This comment has been minimized.

Copy link
Contributor Author

commented May 22, 2018

I have made the requested changes; please review again.

@bedevere-bot

This comment has been minimized.

Copy link

commented May 22, 2018

Thanks for making the requested changes!

@asvetlov: please review the changes made to this pull request.

@asvetlov
Copy link
Contributor

left a comment

LGTM
Would be nice to get confirmation from code owners, @rhettinger and @ncoghlan

@asvetlov

This comment has been minimized.

Copy link
Contributor

commented May 30, 2018

Ooop. @carljm please resolve merging conflicts.

@carljm carljm force-pushed the carljm:cached-property branch from 3e53053 to f77ff04 May 30, 2018

@sir-sigurd
Copy link
Contributor

left a comment

It doesn't work properly in this case:

In [75]: class A:
    ...:     def f(self, x=1):
    ...:         return x
    ...: 
    ...:     cached_f = cached_property(f)

In [76]: a = A()

In [77]: a.f(2)
Out[77]: 2

In [78]: a.cached_f
Out[78]: 1

In [79]: a.f(2)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-79-3226073480ca> in <module>()
----> 1 a.f(2)

TypeError: 'int' object is not callable

So __set_name__() should be use to determine attrname.



class CachedCostItemWait:
_cost = 1

This comment has been minimized.

Copy link
@sir-sigurd

sir-sigurd Jun 17, 2018

Contributor

Is this needed? It seems to be always shadowed in __init__().


self.assertEqual(item.cost, 2)


This comment has been minimized.

Copy link
@sir-sigurd

sir-sigurd Jun 17, 2018

Contributor

Redundant line?


class DataSet:
def __init__(self, sequence_of_numbers):
self.data = sequence_of_numbers

This comment has been minimized.

Copy link
@methane

methane Jun 28, 2018

Member

I prefer _data to imply dataset.data = newdata is not supported.

@carljm

This comment has been minimized.

Copy link
Contributor Author

commented Jun 29, 2018

Thanks for the comments! Will update next week.

@methane

This comment has been minimized.

Copy link
Member

commented Jul 27, 2018

@carljm ping?

carljm added some commits May 19, 2018

@carljm

This comment has been minimized.

Copy link
Contributor Author

commented Jul 28, 2018

I have made the requested changes; please review again.

@bedevere-bot

This comment has been minimized.

Copy link

commented Jul 28, 2018

Thanks for making the requested changes!

@asvetlov, @methane: please review the changes made to this pull request.

class cached_property:
def __init__(self, func):
self.func = func
self.attrname = func.__name__

This comment has been minimized.

Copy link
@sir-sigurd

sir-sigurd Jul 29, 2018

Contributor

This looks redundant since attrname is set in __set_name__()

This comment has been minimized.

Copy link
@carljm

carljm Jul 29, 2018

Author Contributor

What do you suggest to do instead? This is the most reasonable fallback in case __set_name__ is not called in some unusual use of this class, and it avoids the object ever being in an incompletely initialized state.

self.lock = RLock()

def __set_name__(self, owner, name):
self.attrname = name

This comment has been minimized.

Copy link
@sir-sigurd

sir-sigurd Jul 29, 2018

Contributor

Might make sense to raise error in this case, since a will never be cached:

class C:
    @cached_property
    def a(self):
        pass
    b = a

This comment has been minimized.

Copy link
@carljm

carljm Jul 29, 2018

Author Contributor

OK. I guess that also answers the above question; we will raise error in case __set_name__ is ever called twice, and also I guess raise error if __get__ is called and __set_name__ was not.

@carljm

This comment has been minimized.

Copy link
Contributor Author

commented Jul 29, 2018

@sir-sigurd How does that look?

@methane

This comment has been minimized.

Copy link
Member

commented Jul 30, 2018

@rhettinger Would you review this?

else:
raise TypeError(
"Cannot assign the same cached_property to two names "
f"({self.attrname!r} and {name!r}) on the same class."

This comment has been minimized.

Copy link
@sir-sigurd

sir-sigurd Jul 30, 2018

Contributor

This message is inaccurate because the exception is also raised when it's assigned on the different classes.
I'm not sure that we should disallow to assign it on the different classes under the same name.

This comment has been minimized.

Copy link
@carljm

carljm Jul 30, 2018

Author Contributor

Good catch. I'd like to avoid tracking owners, but it should be sufficient to just allow repeated __set_name__ as long as the name remains the same.

@carljm

This comment has been minimized.

Copy link
Contributor Author

commented Jul 30, 2018

@sir-sigurd Better?

@sir-sigurd

This comment has been minimized.

Copy link
Contributor

commented Aug 3, 2018

@carljm 👍

@ncoghlan
Copy link
Contributor

left a comment

A couple of minor comments and error checking suggestion related to the technicalities of why __dict__ may not exist, and why it may not support __setitem__, but otherwise this looks good to me.


.. note::

This decorator does not support classes which define ``__slots__``.

This comment has been minimized.

Copy link
@ncoghlan

ncoghlan Aug 4, 2018

Contributor

I'd word this note slightly differently, as there are multiple ways to end up with instances that don't have a mutable mapping as their __dict__ attribute, and you can also include __dict__ in a __slots__ list if you want the fast C level access for selected fields, but aren't worried about the per-instance dict memory overhead.

Something like:

This decorator requires that the __dict__ attribute on each instance be a mutable mapping. This means it will not work with some types, such as metaclasses (since the __dict__ attributes on type instances are read-only proxies for the class namespace), and those that specify __slots__ without including __dict__ as one of the defined slots (as such classes don't provide a __dict__ attribute at all).

"Cannot use cached_property instance without calling __set_name__ on it.")
try:
cache = instance.__dict__
except AttributeError: # objects with __slots__ have no __dict__

This comment has been minimized.

Copy link
@ncoghlan

ncoghlan Aug 4, 2018

Contributor

Tweak the comment here along similar lines to the note in the docs

# check if another thread filled cache while we awaited lock
val = cache.get(self.attrname, _NOT_FOUND)
if val is _NOT_FOUND:
val = cache[self.attrname] = self.func(instance)

This comment has been minimized.

Copy link
@ncoghlan

ncoghlan Aug 4, 2018

Contributor

Similar to the AttributeError above, it's likely worth wrapping TypeError here to account for cases like:

>>> str.__dict__["x"] = 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: 'mappingproxy' object does not support item assignment

carljm added some commits Aug 19, 2018

Merge branch 'master' into cached-property
* master: (107 commits)
  bpo-22057: Clarify eval() documentation (GH-8812)
  bpo-34318: Convert deprecation warnings to errors in assertRaises() etc. (GH-8623)
  bpo-22602: Raise an exception in the UTF-7 decoder for ill-formed sequences starting with "+". (GH-8741)
  bpo-34415: Updated logging.Formatter docstring. (GH-8811)
  bpo-34432: doc Mention complex and decimal.Decimal on str.format not about locales (GH-8808)
  bpo-34381: refer to 'Running & Writing Tests' in README.rst (GH-8797)
  Improve error message when mock.assert_has_calls fails (GH-8205)
  Warn not to set SIGPIPE to SIG_DFL (#6773)
  bpo-34419: selectmodule.c does not compile on HP-UX due to bpo-31938 (GH-8796)
  bpo-34418: Fix HTTPErrorProcessor documentation (GH-8793)
  bpo-34391: Fix ftplib test for TLS 1.3 (GH-8787)
  bpo-34217: Use lowercase for windows headers (GH-8472)
  bpo-34395: Fix memory leaks caused by incautious usage of PyMem_Resize(). (GH-8756)
  bpo-34405: Updated to OpenSSL 1.1.0i for Windows builds. (GH-8775)
  bpo-34384: Fix os.readlink() on Windows (GH-8740)
  closes bpo-34400: Fix undefined behavior in parsetok(). (GH-4439)
  bpo-34399: 2048 bits RSA keys and DH params (#8762)
  Make regular expressions in test_tasks.py raw strings. (GH-8759)
  smtplib documentation fixes (GH-8708)
  Fix misindented yaml in logging how to example (GH-8604)
  ...
@carljm

This comment has been minimized.

Copy link
Contributor Author

commented Aug 19, 2018

@ncoghlan Thanks for the comments! I think they are all addressed in the latest update.

@ncoghlan ncoghlan merged commit d658dea into python:master Aug 28, 2018

9 checks passed

VSTS: Linux-PR Linux-PR_20180819.16 succeeded
Details
VSTS: Linux-PR-Coverage Linux-PR-Coverage_20180819.14 succeeded
Details
VSTS: Windows-PR Windows-PR_20180819.16 succeeded
Details
VSTS: docs docs_20180819.16 succeeded
Details
VSTS: macOS-PR macOS-PR_20180819.16 succeeded
Details
bedevere/issue-number Issue number 21145 found
Details
bedevere/news News entry found in Misc/NEWS.d
continuous-integration/appveyor/pr AppVeyor build succeeded
Details
continuous-integration/travis-ci/pr The Travis CI build passed
Details
@bedevere-bot

This comment has been minimized.

Copy link

commented Aug 28, 2018

@ncoghlan: Please replace # with GH- in the commit message next time. Thanks!

lisroach added a commit to lisroach/cpython that referenced this pull request Sep 12, 2018

bpo-21145: Add cached_property decorator in functools (python#6982)
Robust caching of calculated properties is
harder than it looks at first glance, so add
a solid, well-tested implementation to the
standard library.

yahya-abou-imran added a commit to yahya-abou-imran/cpython that referenced this pull request Nov 2, 2018

bpo-21145: Add cached_property decorator in functools (python#6982)
Robust caching of calculated properties is
harder than it looks at first glance, so add
a solid, well-tested implementation to the
standard library.
@xunnv

This comment has been minimized.

Copy link

commented Jun 8, 2019

haha, why not remove the underscore or rename it into cacheproperty like abstractproperty decorator, or reify like Pyramid?
If its name is neat and beautiful, I think it will be more Pythonic.
Just an opinion, never mind.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.