Add a key parameter (like sorted) to heapq.merge #57951
Comments
Hi, The attached patch adds a 'key' optional parameter to the heapq.merge function that behaves as in sorted(). Related discussion: http://mail.python.org/pipermail/python-ideas/2012-January/013295.html This is my first contribution to CPython. |
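For illustration, here is a minimal sketch of what a key parameter "that behaves as in sorted()" means for a merge. It uses the `key` argument of `heapq.merge` as it exists from Python 3.5 onward (the version this change eventually landed in, per later comments); the sample lists are made up for the example.

```python
import heapq

# Each input must already be sorted according to the same key (here: length).
short_words = ["a", "bbb", "ccccc"]
other_words = ["dd", "eeee"]

# key=len plays the same role as in sorted(): items are compared by key(item).
merged = list(heapq.merge(short_words, other_words, key=len))
print(merged)  # ['a', 'dd', 'bbb', 'eeee', 'ccccc']
```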
The attached script benchmarks the baseline (current implementation) against three new implementations, as suggested on http://mail.python.org/pipermail/python-ideas/2012-January/013296.html On my machine, the output is:
On this particular input, merge_2 adds 6% of overhead when the key parameter is not used. While merge_3 only adds 1% of overhead, it almost doubles the amount of code. (Which was admittedly not that long to begin with.) The patch in the previous message is with the merge_2 implementation, which seemed like the best compromise to me. |
Oops, the patch to the documentation would also need 'New in 3.3: the key parameter', with the right Sphinx directive. But that depends on whether this change ends up in 3.3 or 3.4. Does 3.3 still get new features? |
Yes, 3.3 is still in the early development stage, and new features will be accepted until the first beta (in June, see PEP-398). “.. versionadded:: 3.3 The *key* parameter” will do. |
Simon, please keep the original version fast by creating two code paths:
    if key is None:
        original_code
    else:
        new_code using the key_function |
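A rough sketch of the two-code-path idea follows. This is not the attached patch or the committed implementation: `merge_with_key` is a hypothetical name, and the fast path simply delegates to the existing `heapq.merge` rather than inlining the original code. The key path decorates each item once, so the key function is called once per item.

```python
import heapq

def merge_with_key(*iterables, key=None):
    """Hypothetical two-path merge; inputs must be sorted by the same key."""
    if key is None:
        # Fast path: the original, undecorated merge is left untouched.
        yield from heapq.merge(*iterables)
        return

    # Key path: heap entries are (key(item), stream_index, item, iterator).
    # stream_index is unique per live stream, so ties on the key are broken
    # by input order and the items themselves are never compared.
    heap = []
    for index, iterable in enumerate(iterables):
        it = iter(iterable)
        for item in it:
            heap.append((key(item), index, item, it))
            break  # only the first item of each stream is needed here
    heapq.heapify(heap)

    while heap:
        _, index, item, it = heap[0]
        yield item
        for nxt in it:
            # Replace the smallest entry with the next item from its stream.
            heapq.heapreplace(heap, (key(nxt), index, nxt, it))
            break
        else:
            # That stream is exhausted; drop its entry.
            heapq.heappop(heap)
```

The duplication Simon asks about is visible even in this sketch: the key path repeats the merge loop with decorated entries instead of sharing it with the fast path.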
Raymond, please have a look at merge_3 in benchmark_heapq_merge.py. It is implemented as you say. Do you think the speed is worth the code duplication? |
heapq_merge_key_duplicate.patch is a new patch with two code paths. It also updates the function’s docstring (which the previous patch did not). Raymond, do you think the speed is worth the DRY violation? |
I'll look at this in the next couple of weeks. Hang tight :-) |
I just remembered about this. I suppose it is too late for 3.3? |
Yes, 3.3 is already in beta. |
Attaching a rough draft implementation for a fully encapsulated Heap() class that is thread-safe, supports minheaps and maxheaps, and efficiently implements key-functions (called no more than once per key). |
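To make the listed properties concrete, here is a minimal sketch of such a class. It is not the attached draft: the class name `Heap`, the `maxheap` parameter, the `_Reversed` helper, and the method names are all assumptions chosen for illustration. The key is computed once when an item is pushed, a lock guards every mutation, and a max-heap is obtained by inverting key comparisons.

```python
import heapq
import threading

class _Reversed:
    """Hypothetical helper: inverts ordering so a min-heap acts as a max-heap."""
    __slots__ = ("value",)

    def __init__(self, value):
        self.value = value

    def __eq__(self, other):
        return self.value == other.value

    def __lt__(self, other):
        return other.value < self.value

class Heap:
    """Illustrative encapsulated heap; not the implementation from the patch."""

    def __init__(self, iterable=(), key=None, maxheap=False):
        self._key = key if key is not None else (lambda item: item)
        self._wrap = _Reversed if maxheap else (lambda k: k)
        self._lock = threading.Lock()
        self._count = 0  # insertion counter used as a tie-breaker
        self._data = [self._entry(item) for item in iterable]
        heapq.heapify(self._data)

    def _entry(self, item):
        # The key function is called exactly once per item, here.
        self._count += 1
        return (self._wrap(self._key(item)), self._count, item)

    def push(self, item):
        with self._lock:
            heapq.heappush(self._data, self._entry(item))

    def pop(self):
        with self._lock:
            return heapq.heappop(self._data)[2]

    def peek(self):
        with self._lock:
            return self._data[0][2]

    def __len__(self):
        with self._lock:
            return len(self._data)

longest_first = Heap(["aa", "b", "cccc"], key=len, maxheap=True)
print(longest_first.pop())  # cccc
```

Storing `(key, counter, item)` tuples means the items themselves are never compared, which is what allows the key to be computed only once.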
heap2.diff contains only a single line's change. Wrong file attached? |
Ah, I see the new file now (I'd failed to refresh my browser); sorry for the noise. |
Looks pretty good to me.
|
There is already one heap class in the stdlib: queue.PriorityQueue. Why create a duplicate instead of extending queue.PriorityQueue with the desired features? Maybe name the maxheap parameter reverse? |
New changeset f5521f5dec4a by Raymond Hettinger in branch 'default': |
I noticed 3.5 alpha1 is not released until February 1st. Is there any way I can get my hands on this new functionality? |
Hi Tommy, the patch is already committed to Python 3.5. See https://docs.python.org/3.5/library/heapq.html#heapq.merge |
Yes, but 3.5 has not been pre-released yet. |
You can set up Mercurial on your machine, make a read-only clone of the cpython repository, and compile it just as other people do, whether core developers or otherwise. See docs.python.org/devguide for details. |