[Python] Support __sizeof__ protocol for Python objects #23248
Comments
Wes McKinney / @wesm: |
Matthew Rocklin / @mrocklin: https://github.com/dask/dask/blob/539d1e27a8ccce01de5f3d49f1748057c27552f2/dask/sizeof.py#L115-L145 |
Joris Van den Bossche / @jorisvandenbossche: Main question is if we just want to return what
In [38]: a = pa.array([1, 2])
In [39]: import sys
In [40]: sys.getsizeof(a)
Out[40]: 96 but when overriding |
Antoine Pitrou / @pitrou: >>> v = set(range(500))
>>> type(v).__sizeof__(v)
32968
>>> object.__sizeof__(v)
200 |
Joris Van den Bossche / @jorisvandenbossche: In [21]: a = pa.array([1]*10)
In [22]: sys.getsizeof(a)
Out[22]: 96
In [23]: object.__sizeof__(a)
Out[23]: 72
(Not sure how much we care about those small numbers; in reality users will mainly care about big arrays, where nbytes dominates the result.) |
Antoine Pitrou / @pitrou: |
Joris Van den Bossche / @jorisvandenbossche: |
Antoine Pitrou / @pitrou: |
It would be helpful if PyArrow objects implemented the __sizeof__ protocol to give other libraries hints about how much data they have allocated. This helps systems like Dask, which have to make judgements about whether or not something is cheap to move or taking up a large amount of space.
Reporter: Matthew Rocklin / @mrocklin
Assignee: Joris Van den Bossche / @jorisvandenbossche
PRs and other links:
Note: This issue was originally created as ARROW-6926. Please see the migration documentation for further details.
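For readers unfamiliar with the protocol discussed in this issue, the idea can be sketched with the standard library alone. This is not PyArrow's implementation: BufferBackedArray is a hypothetical stand-in for any object that owns a data buffer, and the choice to report object.__sizeof__(self) plus nbytes is an assumption based on the comments above.

```python
import sys


class BufferBackedArray:
    """Hypothetical stand-in for an object that owns a data buffer."""

    __slots__ = ("_buf",)

    def __init__(self, nbytes):
        # A bytearray plays the role of the allocated data buffer.
        self._buf = bytearray(nbytes)

    @property
    def nbytes(self):
        return len(self._buf)

    def __sizeof__(self):
        # Shallow size of the Python wrapper plus the bytes held by the buffer.
        return object.__sizeof__(self) + self.nbytes


small = BufferBackedArray(16)
large = BufferBackedArray(1_000_000)

# sys.getsizeof calls __sizeof__ and may add garbage-collector overhead on top.
print(sys.getsizeof(small), sys.getsizeof(large))
```

With a method like this, a scheduler that calls sys.getsizeof sees a size that grows with the buffer rather than a small constant for the wrapper object.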