gh-118184: Support tuples for `find`, `index`, `rfind` & `rindex` #119501

nineteendo · 2024-05-24T11:04:26Z

One other option is for someone just to submit one or more PRs implementing the proposed feature(s). The PRs will either get accepted or rejected, and then you have your answer. The lack of response might just be because there’s not a lot that’s interesting to say.

I don’t personally think this is worth the effort to implement, and I’m not convinced I’d find it very useful in practice. But I also don’t think it’s such a big deal that it needs a big debate, or community consensus, or a PEP. So if you want to put in the effort, just go for it.

Benchmark for 1,000,000 characters

script

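# find_tuple.py
# Strategies compared (the timings below are listed in this order):
#   0 = per-character loop, 1 = startswith() loop, 2 = compiled regex,
#   3 = repeated single-needle searches, 4 = the tuple argument from this PR.

def find0(p, chars):
    # Per-character scan; only usable when every needle is one character.
    for i, c in enumerate(p):
        if c in chars:
            break
    else:
        i = -1
    return i

def find1(p, subs):
    # Try every start position with startswith(), which already accepts a tuple.
    for i in range(len(p)):
        if p.startswith(subs, i):
            break
    else:
        i = -1
    return i

def find2(p, pattern):
    # `pattern` is a precompiled regex (re for find, regex with (?r) for rfind).
    match = pattern.search(p)
    i = match.start() if match else -1
    return i

def find3(p, subs):
    # Search each needle in turn, ending the window at the best hit so far.
    # Caveat: a needle that starts before i but ends after it is not found.
    i = -1
    for sub in subs:
        new_i = p.find(sub, 0, None if i == -1 else i)
        if new_i != -1:
            i = new_i
    return i

def find4(p, subs):
    # Needs the patched build from this PR: str.find() accepting a tuple.
    i = p.find(subs)
    return i

def rfind0(p, chars):
    # Per-character scan from the end; i reaches -1 when nothing matches.
    i = len(p) - 1
    while i >= 0 and p[i] not in chars:
        i -= 1
    return i

def rfind1(p, subs):
    # Try every start position from the end with startswith().
    for i in range(len(p), -1, -1):
        if p.startswith(subs, i):
            break
    else:
        i = -1
    return i

# The reverse search is selected by the (?r) flag in the compiled pattern.
rfind2 = find2

def rfind3(p, subs):
    # Search each needle in turn, starting the window at the best hit so far.
    i = -1
    for sub in subs:
        new_i = p.rfind(sub, 0 if i == -1 else i)
        if new_i != -1:
            i = new_i
    return i

def rfind4(p, subs):
    # Needs the patched build from this PR: str.rfind() accepting a tuple.
    i = p.rfind(subs)
    return i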

# find_tuple.sh
echo find chars best case
find-tuple/python.exe -m timeit -s "import find_tuple;     string = 'ab' + '_' * 999_998; chars   = 'ab'"               "find_tuple.find0(string, chars)"
find-tuple/python.exe -m timeit -s "import find_tuple;     string = 'ab' + '_' * 999_998; subs    = tuple('ab')"        "find_tuple.find1(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple, re; string = 'ab' + '_' * 999_998; pattern = re.compile('[ab]')" "find_tuple.find2(string, pattern)"
find-tuple/python.exe -m timeit -s "import find_tuple;     string = 'ab' + '_' * 999_998; subs    = 'ab'"               "find_tuple.find3(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple;     string = 'ab' + '_' * 999_998; subs    = tuple('ab')"        "find_tuple.find4(string, subs)"
echo find chars mixed case
find-tuple/python.exe -m timeit -s "import find_tuple;     string = 'b' + '_' * 999_999; chars   = 'ab'"               "find_tuple.find0(string, chars)"
find-tuple/python.exe -m timeit -s "import find_tuple;     string = 'b' + '_' * 999_999; subs    = tuple('ab')"        "find_tuple.find1(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple, re; string = 'b' + '_' * 999_999; pattern = re.compile('[ab]')" "find_tuple.find2(string, pattern)"
find-tuple/python.exe -m timeit -s "import find_tuple;     string = 'b' + '_' * 999_999; subs    = 'ab'"               "find_tuple.find3(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple;     string = 'b' + '_' * 999_999; subs    = tuple('ab')"        "find_tuple.find4(string, subs)"
echo find chars worst case
find-tuple/python.exe -m timeit -s "import find_tuple;     string = '_' * 1_000_000; chars   = 'ab'"               "find_tuple.find0(string, chars)"
find-tuple/python.exe -m timeit -s "import find_tuple;     string = '_' * 1_000_000; subs    = tuple('ab')"        "find_tuple.find1(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple, re; string = '_' * 1_000_000; pattern = re.compile('[ab]')" "find_tuple.find2(string, pattern)"
find-tuple/python.exe -m timeit -s "import find_tuple;     string = '_' * 1_000_000; subs    = 'ab'"               "find_tuple.find3(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple;     string = '_' * 1_000_000; subs    = tuple('ab')"        "find_tuple.find4(string, subs)"
echo find subs best case
find-tuple/python.exe -m timeit -s "import find_tuple;     string = 'abcd' + '_' * 999_996; subs    = 'ab', 'cd'"          "find_tuple.find1(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple, re; string = 'abcd' + '_' * 999_996; pattern = re.compile('ab|cd')" "find_tuple.find2(string, pattern)"
find-tuple/python.exe -m timeit -s "import find_tuple;     string = 'abcd' + '_' * 999_996; subs    = 'ab', 'cd'"          "find_tuple.find3(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple;     string = 'abcd' + '_' * 999_996; subs    = 'ab', 'cd'"          "find_tuple.find4(string, subs)"
echo find subs mixed case
find-tuple/python.exe -m timeit -s "import find_tuple;     string = 'cd' + '_' * 999_998; subs    = 'ab', 'cd'"          "find_tuple.find1(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple, re; string = 'cd' + '_' * 999_998; pattern = re.compile('ab|cd')" "find_tuple.find2(string, pattern)"
find-tuple/python.exe -m timeit -s "import find_tuple;     string = 'cd' + '_' * 999_998; subs    = 'ab', 'cd'"          "find_tuple.find3(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple;     string = 'cd' + '_' * 999_998; subs    = 'ab', 'cd'"          "find_tuple.find4(string, subs)"
echo find subs worst case
find-tuple/python.exe -m timeit -s "import find_tuple;     string = '_' * 1_000_000; subs    = 'ab', 'cd'"          "find_tuple.find1(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple, re; string = '_' * 1_000_000; pattern = re.compile('ab|cd')" "find_tuple.find2(string, pattern)"
find-tuple/python.exe -m timeit -s "import find_tuple;     string = '_' * 1_000_000; subs    = 'ab', 'cd'"          "find_tuple.find3(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple;     string = '_' * 1_000_000; subs    = 'ab', 'cd'"          "find_tuple.find4(string, subs)"
echo find many prefixes
find-tuple/python.exe -m timeit -s "import find_tuple;     string = '_' * 1_000_000; subs    = tuple(f'prefix{i}' for i in range(100))"                "find_tuple.find1(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple, re; string = '_' * 1_000_000; pattern = re.compile('|'.join(f'prefix{i}' for i in range(100)))" "find_tuple.find2(string, pattern)"
find-tuple/python.exe -m timeit -s "import find_tuple;     string = '_' * 1_000_000; subs    = tuple(f'prefix{i}' for i in range(100))"                "find_tuple.find3(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple;     string = '_' * 1_000_000; subs    = tuple(f'prefix{i}' for i in range(100))"                "find_tuple.find4(string, subs)"
echo find many infixes
find-tuple/python.exe -m timeit -s "import find_tuple;     string = '_' * 1_000_000; subs    = tuple(f'{i}infix{i}' for i in range(100))"                "find_tuple.find1(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple, re; string = '_' * 1_000_000; pattern = re.compile('|'.join(f'{i}infix{i}' for i in range(100)))" "find_tuple.find2(string, pattern)"
find-tuple/python.exe -m timeit -s "import find_tuple;     string = '_' * 1_000_000; subs    = tuple(f'{i}infix{i}' for i in range(100))"                "find_tuple.find3(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple;     string = '_' * 1_000_000; subs    = tuple(f'{i}infix{i}' for i in range(100))"                "find_tuple.find4(string, subs)"

echo ---

echo rfind chars best case
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 999_998 + 'ba'; chars   = 'ab'"                      "find_tuple.rfind0(string, chars)"
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 999_998 + 'ba'; subs    = tuple('ab')"               "find_tuple.rfind1(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple, regex; string = '_' * 999_998 + 'ba'; pattern = regex.compile('(?r)[ab]')" "find_tuple.rfind2(string, pattern)"
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 999_998 + 'ba'; subs    = 'ab'"                      "find_tuple.rfind3(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 999_998 + 'ba'; subs    = tuple('ab')"               "find_tuple.rfind4(string, subs)"
echo rfind chars mixed case
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 999_999 + 'b'; chars   = 'ab'"                      "find_tuple.rfind0(string, chars)"
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 999_999 + 'b'; subs    = tuple('ab')"               "find_tuple.rfind1(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple, regex; string = '_' * 999_999 + 'b'; pattern = regex.compile('(?r)[ab]')" "find_tuple.rfind2(string, pattern)"
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 999_999 + 'b'; subs    = 'ab'"                      "find_tuple.rfind3(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 999_999 + 'b'; subs    = tuple('ab')"               "find_tuple.rfind4(string, subs)"
echo rfind chars worst case
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 1_000_000; chars   = 'ab'"                      "find_tuple.rfind0(string, chars)"
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 1_000_000; subs    = tuple('ab')"               "find_tuple.rfind1(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple, regex; string = '_' * 1_000_000; pattern = regex.compile('(?r)[ab]')" "find_tuple.rfind2(string, pattern)"
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 1_000_000; subs    = 'ab'"                      "find_tuple.rfind3(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 1_000_000; subs    = tuple('ab')"               "find_tuple.rfind4(string, subs)"
echo rfind subs best case
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 999_996 + 'cdab'; subs    = 'ab', 'cd'"                 "find_tuple.rfind1(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple, regex; string = '_' * 999_996 + 'cdab'; pattern = regex.compile('(?r)ab|cd')" "find_tuple.rfind2(string, pattern)"
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 999_996 + 'cdab'; subs    = 'ab', 'cd'"                 "find_tuple.rfind3(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 999_996 + 'cdab'; subs    = 'ab', 'cd'"                 "find_tuple.rfind4(string, subs)"
echo rfind subs mixed case
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 999_998 + 'cd'; subs    = 'ab', 'cd'"                 "find_tuple.rfind1(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple, regex; string = '_' * 999_998 + 'cd'; pattern = regex.compile('(?r)ab|cd')" "find_tuple.rfind2(string, pattern)"
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 999_998 + 'cd'; subs    = 'ab', 'cd'"                 "find_tuple.rfind3(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 999_998 + 'cd'; subs    = 'ab', 'cd'"                 "find_tuple.rfind4(string, subs)"
echo rfind subs worst case
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 1_000_000; subs    = 'ab', 'cd'"                 "find_tuple.rfind1(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple, regex; string = '_' * 1_000_000; pattern = regex.compile('(?r)ab|cd')" "find_tuple.rfind2(string, pattern)"
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 1_000_000; subs    = 'ab', 'cd'"                 "find_tuple.rfind3(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 1_000_000; subs    = 'ab', 'cd'"                 "find_tuple.rfind4(string, subs)"
echo rfind many suffixes
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 1_000_000; subs    = tuple(f'{i}suffix' for i in range(100))"                            "find_tuple.rfind1(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple, regex; string = '_' * 1_000_000; pattern = regex.compile(f'(?r){'|'.join(f'{i}suffix' for i in range(100))}')" "find_tuple.rfind2(string, pattern)"
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 1_000_000; subs    = tuple(f'{i}suffix' for i in range(100))"                            "find_tuple.rfind3(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 1_000_000; subs    = tuple(f'{i}suffix' for i in range(100))"                            "find_tuple.rfind4(string, subs)"
echo rfind many infixes
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 1_000_000; subs    = tuple(f'{i}infix{i}' for i in range(100))"                            "find_tuple.rfind1(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple, regex; string = '_' * 1_000_000; pattern = regex.compile(f'(?r){'|'.join(f'{i}infix{i}' for i in range(100))}')" "find_tuple.rfind2(string, pattern)"
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 1_000_000; subs    = tuple(f'{i}infix{i}' for i in range(100))"                            "find_tuple.rfind3(string, subs)"
find-tuple/python.exe -m timeit -s "import find_tuple;        string = '_' * 1_000_000; subs    = tuple(f'{i}infix{i}' for i in range(100))"                            "find_tuple.rfind4(string, subs)"

find chars best case - 2.33x faster

2000000 loops, best of 5: 177 nsec per loop
1000000 loops, best of 5: 213 nsec per loop
1000000 loops, best of 5: 277 nsec per loop
1000000 loops, best of 5: 227 nsec per loop
5000000 loops, best of 5: 76 nsec per loop

find chars mixed case - 1.80x faster

2000000 loops, best of 5: 177 nsec per loop
1000000 loops, best of 5: 219 nsec per loop
1000000 loops, best of 5: 276 nsec per loop
20000 loops, best of 5: 16.2 usec per loop
5000000 loops, best of 5: 98.4 nsec per loop

find chars worst case - 1.69x slower

5 loops, best of 5: 41.7 msec per loop
5 loops, best of 5: 53.8 msec per loop
50 loops, best of 5: 4.32 msec per loop
10000 loops, best of 5: 32 usec per loop
5000 loops, best of 5: 54.1 usec per loop

find subs best case - 2.93x faster

1000000 loops, best of 5: 213 nsec per loop
1000000 loops, best of 5: 306 nsec per loop
1000000 loops, best of 5: 217 nsec per loop
5000000 loops, best of 5: 72.7 nsec per loop

find subs mixed case - 3.75x slower

1000000 loops, best of 5: 220 nsec per loop
1000000 loops, best of 5: 285 nsec per loop
500 loops, best of 5: 733 usec per loop
500000 loops, best of 5: 824 nsec per loop

find subs worst case - 1.04x slower

5 loops, best of 5: 53.2 msec per loop
50 loops, best of 5: 4.79 msec per loop
200 loops, best of 5: 1.46 msec per loop
200 loops, best of 5: 1.52 msec per loop

find many prefixes - 56.3x slower

1 loop, best of 5: 602 msec per loop
500 loops, best of 5: 480 usec per loop
10 loops, best of 5: 36.8 msec per loop
10 loops, best of 5: 27 msec per loop

find many infixes - 5.69x slower

1 loop, best of 5: 603 msec per loop
50 loops, best of 5: 4.32 msec per loop
10 loops, best of 5: 33.2 msec per loop
10 loops, best of 5: 24.6 msec per loop

rfind chars best case - 1.24x faster

2000000 loops, best of 5: 114 nsec per loop
1000000 loops, best of 5: 294 nsec per loop
500000 loops, best of 5: 517 nsec per loop
1000000 loops, best of 5: 221 nsec per loop
5000000 loops, best of 5: 91.9 nsec per loop

rfind chars mixed case - 6.16x slower

2000000 loops, best of 5: 115 nsec per loop
1000000 loops, best of 5: 301 nsec per loop
500000 loops, best of 5: 518 nsec per loop
500 loops, best of 5: 598 usec per loop
500000 loops, best of 5: 708 nsec per loop

rfind chars worst case - 1.06x slower

5 loops, best of 5: 51.3 msec per loop
5 loops, best of 5: 53.3 msec per loop
50 loops, best of 5: 7 msec per loop
200 loops, best of 5: 1.19 msec per loop
200 loops, best of 5: 1.26 msec per loop

rfind subs best case - 2.41x faster

1000000 loops, best of 5: 359 nsec per loop
500000 loops, best of 5: 574 nsec per loop
1000000 loops, best of 5: 229 nsec per loop
2000000 loops, best of 5: 94.9 nsec per loop

rfind subs mixed case - 2.26x slower

1000000 loops, best of 5: 368 nsec per loop
500000 loops, best of 5: 542 nsec per loop
500 loops, best of 5: 724 usec per loop
500000 loops, best of 5: 832 nsec per loop

rfind subs worst case - 1.04x slower

5 loops, best of 5: 53.9 msec per loop
50 loops, best of 5: 7 msec per loop
200 loops, best of 5: 1.44 msec per loop
200 loops, best of 5: 1.5 msec per loop

rfind many suffixes - 54.8x slower

1 loop, best of 5: 605 msec per loop
500 loops, best of 5: 484 usec per loop
10 loops, best of 5: 24.6 msec per loop
10 loops, best of 5: 26.5 msec per loop

rfind many infixes - 2.60x slower

1 loop, best of 5: 603 msec per loop
50 loops, best of 5: 9.37 msec per loop
10 loops, best of 5: 22.5 msec per loop
10 loops, best of 5: 24.4 msec per loop
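
The headline next to each case above appears to compare the fastest of the existing approaches (`find0` to `find3`) with the tuple call (`find4`). That reading is inferred from the figures and is not stated by the author. A minimal sketch of the arithmetic under that assumption, with a hypothetical helper name:

```python
def headline(alternatives_ns, proposed_ns):
    """Compare the fastest existing approach with the proposed tuple call.

    Assumption: this is how the "Nx faster/slower" labels were derived.
    """
    best = min(alternatives_ns)
    if proposed_ns <= best:
        return f"{best / proposed_ns:.2f}x faster"
    return f"{proposed_ns / best:.2f}x slower"


# "find chars best case": find0..find3 in nsec, then find4.
print(headline([177, 213, 277, 227], 76))                # 2.33x faster
# "find chars worst case": all timings converted to nsec.
print(headline([41.7e6, 53.8e6, 4.32e6, 32e3], 54.1e3))  # 1.69x slower
```

Both outputs match the labels in the listing, which supports the assumed reading for those two cases.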

Old benchmark on Ubuntu

find best case - 2.02x faster - regex - 2.77x slower

1000000 loops, best of 5: 303 nsec per loop
1000000 loops, best of 5: 357 nsec per loop
1000000 loops, best of 5: 260 nsec per loop
2000000 loops, best of 5: 129 nsec per loop

find mixed case - 1.25x faster - regex - 1.54x slower

1000000 loops, best of 5: 301 nsec per loop
1000000 loops, best of 5: 370 nsec per loop
10000 loops, best of 5: 23.9 usec per loop
1000000 loops, best of 5: 240 nsec per loop

find worst case - 1.33x faster - regex - 500x slower

5 loops, best of 5: 47.2 msec per loop
20 loops, best of 5: 17.1 msec per loop
5000 loops, best of 5: 45.5 usec per loop
10000 loops, best of 5: 34.2 usec per loop

rfind best case - 1.28x faster - regex - 84,049x slower

1000000 loops, best of 5: 209 nsec per loop
20 loops, best of 5: 13.7 msec per loop
1000000 loops, best of 5: 265 nsec per loop
2000000 loops, best of 5: 163 nsec per loop

rfind mixed case - 1.27x slower - regex - 62,673x slower

1000000 loops, best of 5: 217 nsec per loop
20 loops, best of 5: 13.6 msec per loop
10000 loops, best of 5: 23.8 usec per loop
1000000 loops, best of 5: 275 nsec per loop

rfind worst case - 1.26x faster - regex - 1,068x slower

5 loops, best of 5: 85.9 msec per loop
10 loops, best of 5: 37.8 msec per loop
5000 loops, best of 5: 44.6 usec per loop
10000 loops, best of 5: 35.4 usec per loop

Issue: Support tuples for more stringlike functions #118184

📚 Documentation preview 📚: https://cpython-previews--119501.org.readthedocs.build/

nineteendo

Fix bytes & bytearray test

Lib/test/string_tests.py

nineteendo

Fix space

nineteendo · 2024-05-24T14:17:25Z

Could someone run the benchmark on Linux? I believe it will make it faster in all cases (at least for relatively small strings).

eendebakpt · 2024-05-24T15:21:39Z

> Could someone run the benchmark on Linux? I believe it will make it faster in all cases (at least for relatively small strings).

What about find where the argument is not a tuple but a string? Will that become slower?

nineteendo · 2024-05-24T15:26:36Z

EDIT: 2ns faster:

script

# find_tuple.sh
echo find && main/python.exe -m timeit "'foobar'.find('foo')" && find-tuple/python.exe -m timeit "'foobar'.find('foo')"
echo rfind && main/python.exe -m timeit "'foobar'.rfind('foo')" && find-tuple/python.exe -m timeit "'foobar'.rfind('foo')"

find
10000000 loops, best of 5: 34 nsec per loop
10000000 loops, best of 5: 32.2 nsec per loop
rfind
10000000 loops, best of 5: 36.8 nsec per loop
10000000 loops, best of 5: 35.9 nsec per loop
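
The same single-string check can be reproduced from Python with the standard `timeit` module. This is a sketch only: the helper name `best_of` and the loop counts are assumptions, and it measures whichever interpreter runs it, so it has to be run once per build to get a comparison like the one above.

```python
import timeit


def best_of(stmt, repeat=5, number=200_000):
    """Return the best per-call time in nanoseconds over `repeat` runs."""
    timings = timeit.repeat(stmt, repeat=repeat, number=number)
    return min(timings) / number * 1e9


# Plain string arguments only, so this runs on an unpatched interpreter too.
for stmt in ("'foobar'.find('foo')", "'foobar'.rfind('foo')"):
    print(f"{stmt}: {best_of(stmt):.1f} nsec per loop")
```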

elisbyberi · 2024-05-24T15:33:47Z

@nineteendo There is an issue that has not been addressed in the discussion: https://discuss.python.org/t/add-tuple-support-to-more-str-functions/50628/66

Why? If we provide such APIs, why can we ignore long input + many words use cases?

General-purpose methods in Python are expected to work well for non-small input.
I expect users may do one_mega_string.count(tuple(thousands_of_words)).

nineteendo · 2024-05-24T19:00:05Z

~~While it doesn't short-circuit for long input~~, it performs a lot better than re in the absolute worst case.
For many words you can use re as there will likely be patterns.
~~We can put in the docs that rfind(subs) is equivalent to max(string.rfind(sub1), string.rfind(sub2), ...) and you shouldn't expect a huge improvement.~~
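
The equivalence mentioned above can be written out in pure Python. This is a reference sketch of the intended result, not the PR's C implementation; the function names are hypothetical, and returning -1 for an empty tuple is an assumption.

```python
def rfind_any(string, subs):
    """Highest index at which any of `subs` starts, or -1 if none occurs."""
    # str.rfind() returns -1 on a miss, so max() works directly.
    return max((string.rfind(sub) for sub in subs), default=-1)


def find_any(string, subs):
    """Lowest index at which any of `subs` starts, or -1 if none occurs."""
    # Misses (-1) must be dropped before min(), or they would always win.
    hits = [i for i in (string.find(sub) for sub in subs) if i != -1]
    return min(hits, default=-1)


print(rfind_any("_cd_ab_", ("ab", "cd")))  # 4
print(find_any("_cd_ab_", ("ab", "cd")))   # 1
```

Each needle is searched through the whole string here, which is why this formulation cannot short-circuit on long input.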

nineteendo · 2024-05-24T19:56:10Z

I'm going to try to improve the mixed case.

nineteendo · 2024-05-24T21:00:57Z

Could someone run the benchmark on Linux?

Doc/library/stdtypes.rst

Lib/test/string_tests.py

Objects/unicodeobject.c

Objects/bytes_methods.c

erlend-aasland · 2024-05-25T01:04:54Z

Adding do-not-merge, since the linked issue is closed as wont-implement.

Objects/bytes_methods.c

nineteendo · 2024-06-02T21:06:36Z

@serhiy-storchaka all your issues are now addressed. I don't like the way the heap_subs are cleaned up, but it does work.

if (heap_subs) {
    for (Py_ssize_t i = 0; i < subs_len; i++) {
        PyMem_Free((void *)heap_subs[i]);
    }
}

Could you please review again? If you think the code is too complex, remember you asked for this. I wanted to keep this as simple as possible, but I have been repeatedly asked to further optimise it.

erlend-aasland · 2024-06-02T21:13:16Z

Reminding the reviewers that there are mixed opinions regarding adding this feature at all (hence the linked issue closed as wont-implement); AFAIK, no core dev has expressed immense support of the idea.

erlend-aasland · 2024-06-02T22:00:07Z

Ideally, this PoC should be kept at your fork; the CPython repo is not the place for experimentation. Consider closing this PR and continue working on it on your fork.

serhiy-storchaka

> If you think the code is too complex, remember you asked for this. I wanted to keep this as simple as possible, but I have been repeatedly asked to further optimise it.

I did not ask for this. I only pointed out that the current version was not optimal and repeatedly did unnecessary things. If the simpler version were merged, we would spend months or years optimizing it. I gave some hints about how it could be optimized, but predicted that the result would be complex.

I am not enthusiastic about this feature because, on the one hand, it is not algorithmically optimal (the optimal algorithm needs the costly preparation step: the set of needles should be "compiled" before use), and on the other hand, it is too complex for "practicality beats purity".
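
To illustrate what "compiling" the needles before use could look like, here is a sketch that builds one regular expression from the needles and reuses it. It is a stand-in for the kind of precompiled matcher described above, built on the standard `re` module, and is not claimed to be the optimal algorithm. The helper names are hypothetical.

```python
import re


def compile_needles(needles):
    """Pay the preparation cost once: one pattern matching any needle literally."""
    # re.escape() keeps characters such as "." from acting as metacharacters.
    return re.compile("|".join(re.escape(needle) for needle in needles))


def find_compiled(pattern, string):
    """Index of the leftmost match of the compiled needles, or -1."""
    match = pattern.search(string)
    return match.start() if match else -1


pattern = compile_needles(["ab", "cd"])
print(find_compiled(pattern, "__cd_ab"))  # 2
print(find_compiled(pattern, "______"))   # -1
```

The compiled pattern only pays off when it is reused across many searches, which is the trade-off the comment above points at.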

Objects/bytes_methods.c

nineteendo · 2024-06-03T08:52:00Z

> the optimal algorithm needs the costly preparation step

Costly it is indeed. The only area where it's faster is where the performance wasn't needed (the worst case)!
We can get better performance there by simply setting the CHUNK_SIZE to 10,000 which affects the other cases much less.
I wanted to be cooperative and follow your suggestions, but that was a huge waste of my time. Is it fine to revert this?
How was I supposed to know PyMem_RawMalloc() is very slow?

find chars best case - 3.26x slower

5000000 loops, best of 5: 76 nsec per loop
1000000 loops, best of 5: 248 nsec per loop

find chars mixed case - 2.71x slower

5000000 loops, best of 5: 98.4 nsec per loop
1000000 loops, best of 5: 267 nsec per loop

find chars worst case - 1.22x faster

5000 loops, best of 5: 54.1 usec per loop
5000 loops, best of 5: 44.4 usec per loop

find subs best case - 3.45x slower

5000000 loops, best of 5: 72.7 nsec per loop
1000000 loops, best of 5: 251 nsec per loop

find subs mixed case - 1.21x slower

500000 loops, best of 5: 824 nsec per loop
500000 loops, best of 5: 999 nsec per loop

find subs worst case - no difference

200 loops, best of 5: 1.52 msec per loop
200 loops, best of 5: 1.51 msec per loop

find many prefixes - 1.02x faster

10 loops, best of 5: 27 msec per loop
10 loops, best of 5: 26.4 msec per loop

find many infixes - 1.03x faster

10 loops, best of 5: 24.6 msec per loop
10 loops, best of 5: 24 msec per loop

rfind chars best case - 3x slower

5000000 loops, best of 5: 91.9 nsec per loop
1000000 loops, best of 5: 276 nsec per loop

rfind chars mixed case - 1.25x slower

500000 loops, best of 5: 708 nsec per loop
500000 loops, best of 5: 887 nsec per loop

rfind chars worst case - no difference

200 loops, best of 5: 1.26 msec per loop
200 loops, best of 5: 1.26 msec per loop

rfind subs best case - 2.95x slower

2000000 loops, best of 5: 94.9 nsec per loop
1000000 loops, best of 5: 280 nsec per loop

rfind subs mixed case - 1.23x slower

500000 loops, best of 5: 832 nsec per loop
200000 loops, best of 5: 1.02 usec per loop

rfind subs worst case - no difference

200 loops, best of 5: 1.5 msec per loop
200 loops, best of 5: 1.49 msec per loop

rfind many suffixes - 1.02x faster

10 loops, best of 5: 26.5 msec per loop
10 loops, best of 5: 25.9 msec per loop

rfind many infixes - 1.03x faster

10 loops, best of 5: 24.4 msec per loop
10 loops, best of 5: 23.8 msec per loop
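
The `CHUNK_SIZE` idea mentioned in this comment can be sketched in pure Python: search every needle inside one window of the haystack before moving on, so that an early hit for one needle stops the scan. This is an illustration only. The PR's C code is not shown here and may work differently; the function name is hypothetical, and the default of 10,000 is taken from the comment.

```python
def find_any_chunked(haystack, needles, chunk_size=10_000):
    """Return the lowest index where any needle starts, or -1.

    Illustrative sketch: each pass considers `chunk_size` start positions.
    """
    start = 0
    while start <= len(haystack):
        stop = start + chunk_size
        best = -1
        for needle in needles:
            # A hit must start before `limit`: the end of this window, or
            # the best hit already found in it.
            limit = stop if best == -1 else best
            # The match may extend past `limit` as long as it starts before it.
            idx = haystack.find(needle, start, limit + len(needle) - 1)
            if idx != -1:
                best = idx
        if best != -1:
            return best  # short-circuit: later windows are never scanned
        start = stop
    return -1


# "find subs mixed case": only the first window is scanned for "ab".
print(find_any_chunked("cd" + "_" * 999_998, ("ab", "cd")))  # 0
```

A larger window means fewer passes in the worst case, while a smaller one limits the wasted work when an earlier needle is absent and a later one matches near the start.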

erlend-aasland · 2024-06-03T09:07:07Z

It looks indeed like this is not going anywhere. I suggest you continue your experiments on your own fork. If you manage to get an improved version up and running, try first to gain traction for the feature on Discord, before you create a new issue/PR. So far, no core dev is super enthusiastic about this, which means that this PR (and any other like it) is only a waste of CI and review resources.

serhiy-storchaka · 2024-06-03T10:06:07Z

I am sorry, but yes, without the support of at least one of the core developers it is a waste of your and our time.

I worked on similar code, so I can estimate how much it could cost. I could be wrong, and I'd like to be wrong in this case, but it seems that at this stage the cost/benefit ratio is too high. Your current code likely has bugs, and fixing them can make it even more complicated. It has potential for simplification (the forward and the backward loops can be merged), but you need to understand the code from top to bottom to make it simpler and more efficient, and this may not be enough. I hope you learned something new and can write better code from the beginning next time and better estimate the cost of future changes.

Raw memory management is relatively slow. In this case you can use an array of constant size allocated on the stack for a small number of needles. It is also more efficient to allocate a single buffer in dynamic memory than several buffers.

Keeping a reference to the object providing a buffer is not enough: for example, a bytearray object can be resized once the buffer is released, and its internal buffer can then be allocated in a different place.
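
The bytearray point can be observed from Python itself: while a buffer export is held, CPython refuses to resize the bytearray, and once the export is released the resize goes through. This is a Python-level illustration using `memoryview`; how the PR's C code acquires and releases buffers is not shown here.

```python
data = bytearray(b"needle")
view = memoryview(data)  # acquire the buffer: `data` is now pinned

blocked = False
try:
    data.extend(b" in a haystack")  # a resize is refused while exported
except BufferError as exc:
    blocked = True
    print("resize blocked:", exc)

view.release()                  # release the buffer
data.extend(b" in a haystack")  # the resize now succeeds, possibly moving the data
print(data)
```

Holding only a reference to `data` would not prevent the second `extend()`, which is why the buffer has to stay acquired for as long as a pointer into it is used.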

nineteendo · 2024-06-03T14:22:45Z

> If you manage to get an improved version up and running, try first to gain traction for the feature on Discord, before you create a new issue/PR.

I have an improved version here using dynamic chunk sizes: nineteendo#2. It's now the fastest algorithm in the benchmark for cases where you wouldn't use regex (if you ignore that rfind() is 9% slower because memrchr() doesn't exist on macOS). The code is also a lot more readable now. I've asked dgrigonis to post a message on Discourse.

> In this case you can use an array of constant size allocated on the stack for a small number of needles.

That really didn't feel right to me. It's a lot of code for a very small improvement in the worst case, while the new pull request beats this strategy using a lot less code.

nineteendo added 3 commits May 24, 2024 11:00

Support tuples for find & rfind

3632624

Update docs

e39b040

Add tests

cb905bc

bedevere-app bot mentioned this pull request May 24, 2024

Support tuples for more stringlike functions #118184

Closed

📜🤖 Added by blurb_it.

1807fd8

nineteendo commented May 24, 2024

Lib/test/string_tests.py (outdated, resolved)

Lib/test/string_tests.py (outdated, resolved)

Apply suggestions from code review

cca08fa

nineteendo commented May 24, 2024

nineteendo added 2 commits May 24, 2024 13:38

Apply suggestions from code review

302faa3

Fix signature tests

cb95578

nineteendo marked this pull request as ready for review May 24, 2024 12:32

bedevere-app bot added the awaiting review label May 24, 2024

nineteendo marked this pull request as draft May 24, 2024 19:55

bedevere-app bot removed the awaiting review label May 24, 2024

Short circuit

a35d3ae

nineteendo marked this pull request as ready for review May 24, 2024 20:56

bedevere-app bot added the awaiting review label May 24, 2024

dg-pb reviewed May 24, 2024

erlend-aasland added the DO-NOT-MERGE label May 25, 2024

methane reviewed May 25, 2024

Objects/bytes_methods.c (outdated, resolved)

methane reviewed May 25, 2024

Objects/bytes_methods.c (outdated, resolved)

Fix start for rfind

65c0a9e

nineteendo added 9 commits June 2, 2024 15:32

Rename argument

aada7f5

Decrease diff

090ddee

Decrease diff 2

6b85fd7

Decrease diff 3

41a6c20

Remove continue

460effa

Parentheses

d1c4af6

Store converted needles on the heap

c19ddcf

cleanup

6beae49

Fix uninitialised variable

ff514be

nineteendo added 2 commits June 2, 2024 23:07

Try to prevent segmentation fault

ff6eea2

Fix cast

0cbf03a

nineteendo added 2 commits June 2, 2024 23:55

Revert "Fix cast"

d412046

This reverts commit 0cbf03a.

Revert "Try to prevent segmentation fault"

168fe84

This reverts commit ff6eea2.

nineteendo added 4 commits June 3, 2024 07:03

Uninitialised memory?

41b11e5

More tests

ffe1152

Rename parameter

ac91f79

Unnest

6751992

serhiy-storchaka reviewed Jun 3, 2024

Objects/bytes_methods.c (outdated, resolved)

Keep buffers acquired during search

44aebd1

erlend-aasland added the pending The issue will be closed if no feedback is provided label Jun 3, 2024

Add buffers_len

9a51fd9

erlend-aasland closed this Jun 3, 2024

nineteendo mentioned this pull request Jun 25, 2024

Support tuples for [r]find() & [r]index() nineteendo/cpython#2

Draft
