Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support heapq on typed arrays? #77774

Closed
da mannequin opened this issue May 21, 2018 · 6 comments
Closed

Support heapq on typed arrays? #77774

da mannequin opened this issue May 21, 2018 · 6 comments
Labels
3.8 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@da
Copy link
Mannequin

da mannequin commented May 21, 2018

BPO 33593
Nosy @rhettinger, @serhiy-storchaka

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2018-05-22.19:09:53.137>
created_at = <Date 2018-05-21.21:11:04.036>
labels = ['3.8', 'type-feature', 'library']
title = 'Support heapq on typed arrays?'
updated_at = <Date 2018-05-22.19:09:53.135>
user = 'https://bugs.python.org/da'

bugs.python.org fields:

activity = <Date 2018-05-22.19:09:53.135>
actor = 'rhettinger'
assignee = 'none'
closed = True
closed_date = <Date 2018-05-22.19:09:53.137>
closer = 'rhettinger'
components = ['Library (Lib)']
creation = <Date 2018-05-21.21:11:04.036>
creator = 'da'
dependencies = []
files = []
hgrepos = []
issue_num = 33593
keywords = []
message_count = 6.0
messages = ['317250', '317303', '317311', '317315', '317317', '317327']
nosy_count = 3.0
nosy_names = ['rhettinger', 'serhiy.storchaka', 'da']
pr_nums = []
priority = 'normal'
resolution = 'rejected'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue33593'
versions = ['Python 3.8']

@da
Copy link
Mannequin Author

da mannequin commented May 21, 2018

It'd be really great if we could have support for using the heapq module on typed arrays from array. For example:

import array
import heapq
import random

a = array.array('I', (random.randrange(10) for _ in range(10)))
heapq.heapify(a)

Right now this code throws a TypeError:

TypeError: heap argument must be a list

I suppose I could use bisect to insert items one by one but I imagine a single call to heapify() would be more efficient, especially if I'm loading the array from a byte string.

From what I can tell the problem lies in the C implementation, since removing the _heapq imports at the end of the heapq module (in 3.6) makes it work.

@da da mannequin added stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels May 21, 2018
@rhettinger
Copy link
Contributor

I don't think we should go down this path. The efficiency of the C implementation depends on it being tightly coupled to lists. This tool is used in the schedulers of various async tools (such as Tornando), used for merge(), nsmallest(), and nlargest() all of which depend on this foundational tool being very fast.

Also, I question whether it makes sense at all to be heapifying numpy arrays using standard library tooling. It numpy arrays actually needed this and needed for it to be efficient, it would need to be implemented natively in numpy.

@rhettinger rhettinger added the 3.8 only security fixes label May 22, 2018
@da
Copy link
Mannequin Author

da mannequin commented May 22, 2018

I was referring to the C arrays in the Python standard library: https://docs.python.org/3/library/array.html

@da
Copy link
Mannequin Author

da mannequin commented May 22, 2018

However I do see your point about the speed.

@serhiy-storchaka
Copy link
Member

Workaround:

alist = list(a)
heapq.heapify(alist)
a[:] = alist

And it should be not much slower than using heapq.heapify() directly if it could support general sequences. Using it with array.array would add significant overhead due to boxing.

@rhettinger
Copy link
Contributor

As noted by Serhiy, the interaction with the Array type would incur significant overhead. Your fastest approach will be to follow his suggest to first convert to a list and then perform heap manipulations.

Marking this as closed. Thank you for the suggestion.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.8 only security fixes stdlib Python modules in the Lib dir type-feature A feature request or enhancement
Projects
None yet
Development

No branches or pull requests

2 participants