
In [16]:
s1 = "abcde" * 10000
s2 = "abcde" * 10000
ss = "abcde" * 20000

Compare `s1 + s2` with `ss`. This is the naive approach: build the concatenated string, then compare.

In [17]:
%timeit s1 + s2 == ss

The slowest run took 10.14 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 5: 9.6 µs per loop


In C++, we could gain some performance by not building the new string `s1 + s2` at all. The idea looks like this (Python used as C++ pseudocode):



> s1 == ss[:len(s1)] and s2 == ss[len(s1):]



In [18]:
%timeit s1 == ss[:len(s1)] and s2 == ss[len(s1):]

The slowest run took 5.06 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 5: 9.38 µs per loop


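However, slicing a string in Python makes a copy, so this version saves almost nothing over the concatenation. To take the copying out of the measurement, build the string `s12` ahead of time and time only the comparison. That more than doubles the speed.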

In [19]:
s12 = s1 + s2

In [20]:
%timeit s12 == ss

The slowest run took 7.31 times longer than the fastest. This could mean that an intermediate result is being cached.
100000 loops, best of 5: 3.71 µs per loop
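
The copy made by slicing can be avoided without pre-building anything, because `str.startswith` takes an optional start offset and compares against the string in place. The following is a minimal sketch, not something timed in this notebook; `equals_concat` is a hypothetical helper name, and the claim that no copy is made is an assumption about CPython's implementation.

```python
def equals_concat(s1: str, s2: str, ss: str) -> bool:
    """Return True if s1 + s2 == ss, without building s1 + s2 or slicing ss.

    Hypothetical helper for illustration; not part of the original notebook.
    """
    # A cheap length check rules out most mismatches immediately and makes
    # the second startswith call cover exactly the remainder of ss.
    if len(s1) + len(s2) != len(ss):
        return False
    # str.startswith(prefix, start) matches the prefix at the given offset.
    return ss.startswith(s1) and ss.startswith(s2, len(s1))


s1 = "abcde" * 10000
s2 = "abcde" * 10000
ss = "abcde" * 20000

print(equals_concat(s1, s2, ss))        # True
print(equals_concat(s1, s2 + "x", ss))  # False
```

In a notebook, `%timeit equals_concat(s1, s2, ss)` would show whether this gets close to the pre-built `s12 == ss` case. The function call itself adds some overhead.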


What about controlling the comparison character by character in Python?

In [21]:
from itertools import zip_longest, chain

In [22]:
%timeit all(c1 == c2 for c1, c2 in zip_longest(chain(s1, s2), ss))

100 loops, best of 5: 9.81 ms per loop


It is super slow: roughly 1000 times slower than the naive `s1 + s2 == ss`.

In CPython, a plain string comparison is done by `memcmp` under the hood. Doing the same work myself at the Python level is not a good idea.
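
To reproduce these measurements outside a notebook, where the `%timeit` magic is not available, the standard `timeit` module can be used. The script below is a rough sketch: the `candidates` dictionary and its labels are my own naming, the loop counts are arbitrary, and wrapping each expression in a `lambda` adds a small call overhead, so the absolute numbers will not match the cells above exactly.

```python
import timeit
from itertools import chain, zip_longest

s1 = "abcde" * 10000
s2 = "abcde" * 10000
ss = "abcde" * 20000
s12 = s1 + s2  # pre-built, so only the comparison is timed

# Each entry is one of the approaches tried above.
candidates = {
    "concat": lambda: s1 + s2 == ss,
    "slices": lambda: s1 == ss[:len(s1)] and s2 == ss[len(s1):],
    "prebuilt": lambda: s12 == ss,
    "per-char": lambda: all(
        c1 == c2 for c1, c2 in zip_longest(chain(s1, s2), ss)
    ),
}

# Sanity check: every approach should agree that the strings are equal.
results = {name: fn() for name, fn in candidates.items()}

for name, fn in candidates.items():
    # The per-character version is far slower, so run it fewer times.
    number = 5 if name == "per-char" else 1000
    seconds = timeit.timeit(fn, number=number) / number
    print(f"{name:>9}: {seconds * 1e6:10.2f} µs per call")
```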