Shallow copy is slow, seems to spend a lot of time in repr #1694
There's a long story here:
That is, error messages that used to look like this: >>> ak.Array([[1, 2, 3], [], [4, 5]])[:, 0]
now look like this: >>> ak._v2.Array([[1, 2, 3], [], [4, 5]])[:, 0]
But...What's going on here shouldn't be related to that. Calling So what's going on here? It looks like our awkward/src/awkward/_v2/highlevel.py Lines 1417 to 1427 in 03dd965
Note that the indentation of There are a lot of expensive things in your profile trace, not just the string formatting. It's also calling So to start with, I can fix the indentation of
I think that #1695 will directly address your issue, but the system is too close to the brink of this happening, if only missing a Can you check it? I'll include it in 1.10.0rc4.
@jpivarski we will fix this class of error with e.g. #1657 to add both type hints and
I can try. Do PRs here generate builds I can use?
I don't think this is happening (or at least not exactly) since I wasn't seeing significant allocation during the shallow copy. Definitely not a full copy's worth.
It took a few tries to install, but I think I got it working. Copy speeds are much faster now! However, would you expect
@ivirshup predominantly this is caused by a
The high-level awkward/src/awkward/_v2/highlevel.py Lines 1421 to 1422 in b5545a8
compared to a typical awkward/src/awkward/_v2/contents/listoffsetarray.py Lines 17 to 34 in b5545a8
But the

    def __copy__(self):
        cls = type(self)
        out = cls.__new__(cls)
        out.__dict__.update(self.__dict__)
        return out

It would skip the usual checks that ensure that
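The trade-off in that snippet, copying through the constructor versus bypassing it with `__new__`, can be illustrated with a minimal sketch. The class names `Validated`, `CopyViaInit`, and `CopyViaNew` are hypothetical stand-ins and not Awkward's actual classes; the `isinstance` check is only a placeholder for whatever validation a real constructor performs.

```python
import copy


class Validated:
    """Hypothetical stand-in for a wrapper whose constructor runs checks."""

    init_calls = 0  # how many times the (potentially expensive) constructor ran

    def __init__(self, data):
        Validated.init_calls += 1
        if not isinstance(data, list):  # placeholder for costly validation
            raise TypeError("data must be a list")
        self.data = data


class CopyViaInit(Validated):
    def __copy__(self):
        # Goes through __init__ again, so every check is repeated.
        return type(self)(self.data)


class CopyViaNew(Validated):
    def __copy__(self):
        # Bypasses __init__: allocate a bare instance and share the attributes.
        cls = type(self)
        out = cls.__new__(cls)
        out.__dict__.update(self.__dict__)
        return out


a = CopyViaInit([1, 2, 3])
b = CopyViaNew([1, 2, 3])

before = Validated.init_calls
a_copy = copy.copy(a)
init_runs_via_init = Validated.init_calls - before  # constructor ran again

before = Validated.init_calls
b_copy = copy.copy(b)
init_runs_via_new = Validated.init_calls - before  # constructor was skipped
```

Both copies share the underlying `data` object with the original, so either one is a shallow copy. The `__new__` variant is cheaper because it never re-enters `__init__`, and for the same reason it offers no guarantee that the copied attributes are still valid.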
Is the
Version of Awkward Array
1.10.0rc3
Description and code to reproduce
During the PR adding awkward array support into AnnData, I noticed the test suite was taking much longer than usual. Trying to track down the culprit, I found the line where we make shallow copies of an awkward array. Most of the time under that call seemed to be spent in code related to making repr strings.
Checking if this is just us:
Making a minimal reproducible example:
I would expect making a shallow copy of an awkward array to be quite fast. I also wouldn't expect pretty printing methods to be called.
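One way to check whether string formatting really runs during a copy is to profile the `copy.copy` call and look for `__repr__` among the recorded functions. The sketch below is not the original reproducer, which is not shown here. `EagerWrapper` and `LazyWrapper` are hypothetical classes, and the eager `repr(self)` call in the constructor is an assumption made purely for illustration.

```python
import copy
import cProfile
import pstats


class EagerWrapper:
    """Hypothetical wrapper whose constructor formats a string eagerly."""

    def __init__(self, data):
        self.data = data
        self.summary = repr(self)  # assumed: formatting happens on construction

    def __repr__(self):
        return f"<{type(self).__name__} {self.data!r}>"

    def __copy__(self):
        # Copying through the constructor repeats the formatting work.
        return type(self)(self.data)


class LazyWrapper(EagerWrapper):
    def __init__(self, data):
        self.data = data  # no formatting on construction


def repr_called_during_copy(obj):
    """Profile copy.copy(obj) and report whether any __repr__ ran."""
    profiler = cProfile.Profile()
    profiler.runcall(copy.copy, obj)
    stats = pstats.Stats(profiler)
    # Keys of stats.stats are (filename, line number, function name) tuples.
    return any(name == "__repr__" for _, _, name in stats.stats)


repr_called_eager = repr_called_during_copy(EagerWrapper([1, 2, 3]))
repr_called_lazy = repr_called_during_copy(LazyWrapper([1, 2, 3]))
```

The same helper could be pointed at a real array object to see whether pretty-printing code appears in the profile of a shallow copy.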