Add 'find' as build-in method for lists #73697

george-shuklin · 2017-02-09T09:17:01Z

BPO	29511
Nosy	@terryjreedy, @stevendaprano

^{Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.}

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2017-02-10.20:13:25.819>
created_at = <Date 2017-02-09.09:17:00.737>
labels = ['interpreter-core', 'type-feature', '3.7']
title = "Add 'find' as build-in method for lists"
updated_at = <Date 2017-02-11.01:35:38.281>
user = 'https://bugs.python.org/george-shuklin'

bugs.python.org fields:

activity = <Date 2017-02-11.01:35:38.281>
actor = 'terry.reedy'
assignee = 'none'
closed = True
closed_date = <Date 2017-02-10.20:13:25.819>
closer = 'terry.reedy'
components = ['Interpreter Core']
creation = <Date 2017-02-09.09:17:00.737>
creator = 'george-shuklin'
dependencies = []
files = []
hgrepos = []
issue_num = 29511
keywords = []
message_count = 5.0
messages = ['287394', '287397', '287569', '287572', '287584']
nosy_count = 3.0
nosy_names = ['terry.reedy', 'steven.daprano', 'george-shuklin']
pr_nums = []
priority = 'normal'
resolution = 'rejected'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'enhancement'
url = 'https://bugs.python.org/issue29511'
versions = ['Python 3.7']

george-shuklin · 2017-02-09T09:17:00Z

I found that Python provides 'find()' and 'in' methods for strings, but lacking same functionality for lists.

Because strings and lists are very similar, it's reasonable to expect same function available for both.

Here long and rather ugly hack list on stackoverflow about 'reinventing the wheel': http://stackoverflow.com/questions/16579085/python-verifying-if-one-list-is-a-subset-of-the-other

There are short few proposals, each of them imperfect:

Use sets intersection. This looses count and order
Use collections.Count. This looses order
all(x in two for x in one) - looses order

Propsal: adds a normal 'find' method which will behave the same way as find for strings. It should perform normal __cmp__ call on each element, or, may be, asking for optional lambda to perform comparison of elements.

stevendaprano · 2017-02-09T10:25:20Z

Only 3.7 can receive new functionality.

Here is a pure Python implementation of a subsequence test:

https://code.activestate.com/recipes/577850-search-sequences-for-sub-sequence/

It appears to be reasonably popular on Activestate: it has about 7000 views, but a score of only 1. Take of that what you will. I interpret it as meaning it is of moderate but not great interest to people.

terryjreedy · 2017-02-10T20:13:25Z

Lists, tuples, ranges, dicts, and other builtin collection objects already work with 'in'.

>>> 1 in [1,2,3]
True
>>> 4 in range(9)
True

For historical reasons, stings have both 'find' and 'index'. The only difference is returning -1 (a C-ism) versus raising ValueError on failure. They are otherwise redundant.

Lists, tuples, and ranges, and other builtin sequence objects already have 'index'. There is no need to repeat the redundancy of 'find'.

>>> [1,2,3].index(2)
1
>>> [1,2,3].index(4)
Traceback (most recent call last):
  File "<pyshell#7>", line 1, in <module>
    [1,2,3].index(4)
ValueError: 4 is not in list
>>> range(9).index(4)
4

Strings are a special case of collection in that they 'contain' substrings rather than items of a different class. For this reason, 'in' and index/find are special-cased also work with contiguous substrings of length greater than 1.

>>> 'ac' in 'abc'
False
>>> 'ab' in 'abc'
True

Extending this idea to 'subsequence in sequence' or sequence.index(subsequence) has been rejected.

Note: __cmp__ does not exist in 3.x. Collection 'in' and sequence 'index' check object identity before equality to guarantee that an object in a collection (in the common sense of the term) is actually found even if it has a screwy definition of __eq__.

>>> nan = float('nan')
>>> nan == nan
False
>>> nan in [nan]
True
>>> float('nan') in [float('nan')]
False
>>> [nan].index(nan)
0

stevendaprano · 2017-02-10T22:57:09Z

Terry, I'm not sure if you've read this enhancement request correctly or not, because your reply when closing covers over a lot of detail which is not relevant to this feature request.

Extending this idea to 'subsequence in sequence' or sequence.index(subsequence) has been rejected.

And so it should, as that is a major break with backwards compatibility, but that is not what this feature request is about.

Including George's link, I count at least five questions on StackOverflow asking about this functionality: how to do subsequence tests in sequences apart from strings. That, and the interest in the recipes on ActiveState (here's another: http://code.activestate.com/recipes/117214/ ) indicate a reasonable level of interest in this feature.

In Python today, there is no obvious, good, correct way to do this in the standard library, just a bunch of hacks and tricks which solve slightly different problems.

Unless the very idea of subsequence matching has been rejected (which would surprise me greatly) I'm going to re-open this ticket. Any objections?

terryjreedy · 2017-02-11T01:35:35Z

Without any test code (other than my examples) to illustrate the desired new functionality, I may have misunderstood. But I read the George's prose (but not the SO link) and everything I wrote is relevant to what I thought it said. The request appears to be for either what now exists (other than the name and failure signal) or what Guido has specifically rejected for non-strings.

Reasons for rejecting subsequence matching:

Except for strings, practical use cases seem to be rare.
Enhancement could mask bugs.
General sequences with nesting (tuples and lists, but not range) have an ambiguity problem that strings do not.

[1, 2, [1,2]].index([1,2]) currently returns 2, not 0, and this cannot change. Similarly, [1,2] in [1,2,3] should not change from False to True.

Steven, without specific code examples, I do not understand what the 'this' is that you think is different from what you say was properly rejected, The request appears to be for extending the meaning of'in' and 'find/index' for non-strings. (See last sentence of opening post.) As you note, there are several related but different problems.

http://code.activestate.com/recipes/117214/ gives Python code for Knuth-Morris-Pratt string matching. Python uses a C-coded version of either this or an alternative in (str/bytes/bytearray).(index/find) Both methods stop with the first match, but have a 'start' parameter if one wants repeated matches, and one can choose either start as position + 1 or position + len(pattern) to allow overlaps or not.

Every presentation of KMP I have seen is as a string algorithm. In spite of the recipe title and argument name ('text'), the author claims that the Python code is generic. Since the recipe discussion only tested strings, I tried

for i in KnuthMorrisPratt([1,2,3,4,5,1,2], [1,2]):
    print(i)

and it prints 0 and 5, as claimed. Nice! Generic subsequence matching is easily possible. I believe the Python code could be rewritten in C with the Python C-API and remain generic.

If this idea is not to be dropped, I think the next step should be a python-ideas post with a clear function definition and a possible API (which will elicit alternative proposals) that avoids the back compatibility problem, specific positive and negative test examples, and real-life use cases (which I hope might be included in the SO questions).

george-shuklin mannequin added 3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) type-feature A feature request or enhancement labels Feb 9, 2017

terryjreedy closed this as completed Feb 10, 2017

ezio-melotti transferred this issue from another repository Apr 10, 2022

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add 'find' as build-in method for lists #73697

Add 'find' as build-in method for lists #73697

george-shuklin mannequin commented Feb 9, 2017

george-shuklin mannequin commented Feb 9, 2017

stevendaprano commented Feb 9, 2017

terryjreedy commented Feb 10, 2017

stevendaprano commented Feb 10, 2017

terryjreedy commented Feb 11, 2017

Add 'find' as build-in method for lists #73697

Add 'find' as build-in method for lists #73697

Comments

george-shuklin mannequin commented Feb 9, 2017

george-shuklin mannequin commented Feb 9, 2017

stevendaprano commented Feb 9, 2017

terryjreedy commented Feb 10, 2017

stevendaprano commented Feb 10, 2017

terryjreedy commented Feb 11, 2017