Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: ensure that __copy__ and __deepcopy__ are enabled. #1695

Merged
merged 10 commits into from Sep 14, 2022

Conversation

jpivarski
Copy link
Member

No description provided.

jpivarski and others added 4 commits September 13, 2022 18:15
NB. I did think about introducing a `zeros_length` attribute, but I am concerned that we would need to take a great deal of care in handling this parameter for no good cause. Instead, we should make it such that the `length` parameter is the only externally visible one.
@codecov
Copy link

codecov bot commented Sep 14, 2022

Codecov Report

Merging #1695 (894aa3f) into main (e692946) will increase coverage by 0.40%.
The diff coverage is 84.45%.

Additional details and impacted files
Impacted Files Coverage Δ
src/awkward/_v2/_connect/jax/__init__.py 90.47% <ø> (+1.58%) ⬆️
src/awkward/_v2/config.py 0.00% <0.00%> (ø)
src/awkward/_v2/operations/ak_broadcast_arrays.py 100.00% <ø> (+4.34%) ⬆️
src/awkward/_v2/operations/ak_transform.py 65.51% <ø> (+56.89%) ⬆️
src/awkward/_v2/types/uniontype.py 84.61% <ø> (ø)
src/awkward/_v2/behaviors/categorical.py 73.13% <60.00%> (-8.94%) ⬇️
src/awkward/_v2/contents/bitmaskedarray.py 71.42% <60.00%> (-0.11%) ⬇️
src/awkward/_v2/contents/bytemaskedarray.py 88.32% <60.00%> (-0.03%) ⬇️
src/awkward/_v2/contents/listarray.py 91.89% <60.00%> (-0.02%) ⬇️
src/awkward/_v2/contents/unmaskedarray.py 70.76% <60.00%> (-0.13%) ⬇️
... and 53 more

This will ensure we *have* to implement this routine. Although we could use a dynamic deepcopy, it opens us up to more bugs (explicit vs implicit) and hides some of the copying logic.
This avoids handling deep copying in two places
@agoose77
Copy link
Collaborator

We might want to also make proper use of the memo dict here. Right now, a layout which re-uses the same buffers will (I assume) end up with multiple copies of these buffers. Just trying to figure out a nice interface.

@jpivarski
Copy link
Member Author

Thanks for the clean-ups! I checked each one and they are definitely improvements, though d02d927 is not actually related to this PR.

Immutable objects that take children in their constructor arguments (without delayed initialization) can't have cycles, and we've made "layouts are trees" assumptions elsewhere, too. Nevertheless, I saw the code in the copy module that makes use of the memo argument before calling our __deepcopy__, so if we also do it, it would be redundant.

As you can see here:

https://github.com/python/cpython/blob/4781535a5796838fc4ce88e6e669e8907e426685/Lib/copy.py#L128-L178

Python's copy.deepcopy fills memo with id(our_object) (key) pointing to the copy that we return (value), unless it finds id(our_object) already in the memo, in which case it returns that. So all we need to do is make sure that the memo doesn't get lost (replaced with a new empty dict) during the recursive walk.

Since your last commit was 2 hours ago, I can safely merge this.

@jpivarski jpivarski merged commit b5545a8 into main Sep 14, 2022
@jpivarski jpivarski deleted the jpivarski/ensure-copy-deepcopy-are-enabled branch September 14, 2022 12:32
@agoose77
Copy link
Collaborator

agoose77 commented Sep 14, 2022

As you can see here:

Yes, I was working off memory at the time of writing that comment. After checking the spec, I then went on to open #1697 when concerned about reference cycles (via mutable parameters), but subsequently recalled that we don't allow these, so it's a none-starter :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Shallow copy is slow, seems to spend a lot of time in repr
2 participants