Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify that bisect(a, x, key=...) does not call key(x) #91966

Closed
sean8223 opened this issue Apr 26, 2022 · 5 comments
Closed

Clarify that bisect(a, x, key=...) does not call key(x) #91966

sean8223 opened this issue Apr 26, 2022 · 5 comments
Assignees
Labels
docs Documentation in the Doc dir stdlib Python modules in the Lib dir

Comments

@sean8223
Copy link

The key functions generated by the functools.cmp_to_key function don't work with bisect.bisect_right or bisect.bisect_left, raising a TypeError with the message "TypeError: other argument must be K instance" when used.

Consider the following example derived from the Sorting HOWTO page (https://docs.python.org/3/howto/sorting.html#the-old-way-using-the-cmp-parameter)

Python 3.10.4 (main, Apr 25 2022, 16:19:06) [GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from functools import cmp_to_key
>>> from bisect import bisect_right, bisect_left
>>> ns=[1,2,3,4,5]
>>> def numeric_compare(x, y):
...     return x - y
... 
>>> bisect_right(ns, 2, key=cmp_to_key(numeric_compare))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: other argument must be K instance
>>> bisect_left(ns, 2, key=cmp_to_key(numeric_compare))
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: other argument must be K instance

The methods bisect.insort_left and bisect.insort_right functions do seem work correctly with these key functions:

Python 3.10.4 (main, Apr 25 2022, 16:19:06) [GCC 9.4.0] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> from bisect import insort_right, insort_left
>>> from functools import cmp_to_key
>>> def numeric_compare(x, y):
...     return x - y
... 
>>> ns=[1, 2, 3, 4, 5]
>>> insort_right(ns, 2, key=cmp_to_key(numeric_compare))
>>> ns
[1, 2, 2, 3, 4, 5]

Your environment

  • Python version 3.10.4
  • Linux x86
@sean8223 sean8223 added the type-bug An unexpected behavior, bug, or error label Apr 26, 2022
@AlexWaygood AlexWaygood added the stdlib Python modules in the Lib dir label Apr 26, 2022
@sweeneyde
Copy link
Member

bisect_left/bisect_right functions take in a key with the function already applied; they do not apply the key function to the x parameter. IIRC it was decided this way because this is slightly more general (you can apply the function when you need to apply the function, but sometimes you only have a key).

>>> ns=[1,2,3,4,5]
>>> key_func = cmp_to_key(numeric_compare)
>>> bisect_right(ns, key_func(2), key=key_func)
2

Maybe this is worth documenting better?

@sweeneyde sweeneyde added docs Documentation in the Doc dir and removed type-bug An unexpected behavior, bug, or error labels Apr 26, 2022
@sweeneyde
Copy link
Member

Here's an example of how it can be useful, by the way:

>>> def isqrt(n):
...     if n == 0 or n == 1:
...         return n
...     return bisect_left(range(n), True, key=lambda i: i*i>n) - 1

I'm searching for True, trying to figure out the index where that starts. It would be rather awkward to pass in float('inf') or something instead.

@sean8223
Copy link
Author

I agree that this can be fixed with documentation on the bisect functions -- I would never have deduced this usage from the error message or the examples. Thanks!

@sweeneyde sweeneyde changed the title bisect.bisect_right does not work with key functions generated by functools.cmp_to_key Clarify that bisect(a, x, key=...) does not call key(x) Apr 26, 2022
@rhettinger rhettinger self-assigned this Apr 26, 2022
@rhettinger
Copy link
Contributor

rhettinger commented Apr 27, 2022

I'll add another example in the final section showing the motivating use case. Will also amend the function specification to note that the key function is not called on the input value.

from collections import namedtuple
from operator import attrgetter
from bisect import bisect

Movie = namedtuple('Movie', ('name', 'released', 'director'))

movies = [
    Movie('Jaws', 1975, 'Speilberg'),
    Movie('Titanic', 1997, 'Cameron'),
    Movie('The Birds', 1963, 'Hitchcock'),
    Movie('Aliens', 1986, 'Scott')
]

# Find the first movie released on or after 1960
by_year = attrgetter('released') 
movies.sort(key=by_year)
print(movies[bisect(movies, 1960, key=by_year)])

This example shows that for searching records you almost never have the full key ; that is what you're looking for. Instead you have a field value and want to find the matching record.

@wpk-
Copy link

wpk- commented May 3, 2022

The docs state bisect(a, x) will "locate the insertion point for x in a to maintain sorted order." The key argument currently breaks with that definition.

To add to this, insort(a, x, key=fn) does take x and not fn(x), which means the language for insort and bisect is inconsistent.

At the same time I understand the example use case, and the current design certainly works well there. I guess the counter-intuitivity is a necessary byproduct.

miss-islington pushed a commit to miss-islington/cpython that referenced this issue May 11, 2022
… module (pythonGH-92602)

(cherry picked from commit 63794db)

Co-authored-by: Raymond Hettinger <rhettinger@users.noreply.github.com>
miss-islington pushed a commit to miss-islington/cpython that referenced this issue May 11, 2022
… module (pythonGH-92602)

(cherry picked from commit 63794db)

Co-authored-by: Raymond Hettinger <rhettinger@users.noreply.github.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
docs Documentation in the Doc dir stdlib Python modules in the Lib dir
Projects
None yet
Development

No branches or pull requests

5 participants