-
-
Notifications
You must be signed in to change notification settings - Fork 30.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
collections.Counter's += copies the entire object #57330
Comments
I've found some counterintuitive behavior in collections.Counter while hacking on the scikit-learn project [1]. I wanted to use a bunch of Counters to do some simple term counting in a set of documents, roughly as follows: count_total = Counter()
for doc in documents:
count_current = Counter(analyze(doc))
count_total += count_current
count_per_doc.append(count_current) Performance was horrible. After some digging, I found out that Counter [2] does not have __iadd__ and += copies the entire left-hand side in __add__. I've attached a patch that fixes the issue (for += only, and I've not patched the testsuite.) |
This is slightly backwards incompatible, as some people may depend on the old behavior. Otherwise sounds like a good idea to me. |
If this is not implemented because it is backwards incompat, then it might be useful to add a note to update's docstring explaining that it is much more efficient than +=. I was very surprised that it took *minutes* to add a few thousand moderate-sized Counters. |
I'll add the in-place methods including __iadd__, __isub__, __iand__, and __ior__. If speed is your issue, you should continue to use the update() method which will always be faster because it doesn't have a step to strip zeros and negative values from the existing Counter. Also, update() has a fast-path written in C. |
New changeset 5cced40374df by Raymond Hettinger in branch 'default': |
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.
Show more details
GitHub fields:
bugs.python.org fields:
The text was updated successfully, but these errors were encountered: