-
-
Notifications
You must be signed in to change notification settings - Fork 31.7k
Improve copy.copy speed for built-in types (list/set/dict) #70355
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Comments
copy.copy uses a relatively high overhead approach to copying list, set and dict, using: def _copy_with_constructor(x):
return type(x)(x) This is despite the fact that all three types implement a .copy() method, and there is already a defined method: def _copy_with_copy_method(x):
return x.copy() that (in %timeit tests) runs with substantially less overhead, in percentage terms; for empty lists, sets and dicts, calling _copy_with_constructor or _copy_with_copy_method directly on them, the times on my machine (Python 3.5.0, Linux x86-64) are: empty list: 281 ns (constructor), 137 ns (method) The method costs could be trimmed further if _copy_with_copy_method was changed from a Python implementation to using operator.methodcaller, e.g.
_copy_with_copy_method = methodcaller('copy') This doesn't save a whole lot more (shaves another 9-17 ns off the times for the Python version of _copy_with_copy_method), but it's nice in that it avoids reinventing the wheel in the copy module. Combining the two changes (to use methodcaller for _copy_with_copy_method and to have list, set and dict use _copy_with_copy_method) means we can get rid of both Python defined functions in favor of a single use of operator.methodcaller used by all types that previously used either of them. Obviously not a high priority fix, but I noticed this while trying to figure out a way to fix zlib's lack of support in the copy module ( bpo-26166 which apparently duplicates bpo-25007 ) and how to work around it. |
methodcaller is not needed. We can use just list.copy etc. Proposed patch speeds up copying list, dict, set, bytearray, slice, NotImplemented. It makes deepcopying list, tuple, dict a little faster. It makes the code for copying and deepcopying using reduce protocol cleaner and a little faster. It cleans up the module and adds new tests for builtin types. |
Good point. Though I don't see any attached patches... |
Oh, sorry. Here is a patch. |
Here are results of microbenchmarks (time in microseconds).
() 0.993 1.02 5.25 5.38 tuple(range(10)) 0.975 1.01 26.1 26.6 tuple(range(1000)) 0.967 1.01 1.97e+03 2.07e+03 |
If there are no other comments, I'm going to commit the patch in short time. |
One suggestion: def _deepcopy_list(x, memo, deepcopy=deepcopy):
y = []
memo[id(x)] = y
y[:] = [deepcopy(a, memo) for a in x]
return y This looks nicer and should run faster by taking advantage of the LIST_APPEND opcode. It should be noted that the core concept of the patch is all about the minimizing the calling overhead of the copying operation (the cost to call type(x) and the overhead in the type constructors for parsing the positional and keyword arguments). When it comes to actually copying the data in the containers, there underlying code to do the copying is the same for both the patched and unpatched version (that is why you see a speed-up for copying empty containers and almost no change for containers that have data in them). |
The difference is insignificantly (less than 1%) for large lists (list(range(10000))), but it is 3-4% slower for small lists (list(range(10))) and 20-25% slower for empty lists. On 2.7 your code always has advantage, but on 3.x seems it doesn't have any advantage (thanks to overhead of using a generator). |
Then go ahead with the patch as is. |
New changeset 496878ac412d by Serhiy Storchaka in branch 'default': New changeset 52d7a308e3b4 by Serhiy Storchaka in branch '3.5': New changeset 8554423dd392 by Serhiy Storchaka in branch '2.7': |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: