@vstinner vstinner commented Sep 24, 2025

Specialize _PyUnicode_EncodeCharmap() for EncodingMapType, which is used by Python codecs such as iso8859_15.
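For context, here is a minimal sketch of the two kinds of mapping the charmap encoder accepts, using only the public `codecs` helpers. It relies on the stdlib `encodings.iso8859_15` module exposing an `encoding_table` built with `codecs.charmap_build()`; the `dict_table` name and its contents are a hypothetical stand-in for the generic mapping path. It illustrates the Python-level behaviour only, not the C change made in this PR.

```python
import codecs
from encodings import iso8859_15

# EncodingMap object built by codecs.charmap_build(decoding_table);
# this is the mapping type the PR description refers to.
fast_table = iso8859_15.encoding_table

# Hypothetical plain-dict mapping (code point -> byte value) covering only
# the characters used below; charmap_encode() also accepts such a dict.
dict_table = {ord("x"): 0x78, ord("€"): 0xA4}

text = "x€"
fast_bytes, fast_len = codecs.charmap_encode(text, "strict", fast_table)
dict_bytes, dict_len = codecs.charmap_encode(text, "strict", dict_table)

# Both mappings should produce the same bytes for the same input.
print(type(fast_table).__name__, fast_bytes, dict_bytes)
```

Both calls should yield the same bytes as `text.encode("iso8859-15")`; the difference is only in which lookup path the encoder takes internally.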


Benchmark:

import pyperf
runner = pyperf.Runner()
sizes = (1, 100, 10_000)
for size in sizes:
    runner.timeit(f'{size:,} ASCII chars',
        setup=f's="x"*{size}',
        stmt='s.encode("iso8859-15")')
for size in sizes:
    runner.timeit(f'{size:,} UCS-2 chars',
        setup=f's="€"*{size}',
        stmt='s.encode("iso8859-15")')

Results using the main branch as the reference:

| Benchmark | ref | change |
|---|---|---|
| 1 ASCII chars | 629 ns | 622 ns: 1.01x faster |
| 100 ASCII chars | 1.13 us | 997 ns: 1.13x faster |
| 10,000 ASCII chars | 49.8 us | 31.8 us: 1.57x faster |
| 1 UCS-2 chars | 630 ns | 622 ns: 1.01x faster |
| 100 UCS-2 chars | 1.18 us | 1.06 us: 1.12x faster |
| 10,000 UCS-2 chars | 54.3 us | 38.2 us: 1.42x faster |
| Geometric mean | (ref) | 1.19x faster |

Results using Python 3.14 as the reference:

| Benchmark | 314 | change |
|---|---|---|
| 1 ASCII chars | 647 ns | 622 ns: 1.04x faster |
| 100 ASCII chars | 1.12 us | 997 ns: 1.12x faster |
| 10,000 ASCII chars | 45.2 us | 31.8 us: 1.42x faster |
| 1 UCS-2 chars | 648 ns | 622 ns: 1.04x faster |
| 100 UCS-2 chars | 1.17 us | 1.06 us: 1.10x faster |
| 10,000 UCS-2 chars | 49.8 us | 38.2 us: 1.30x faster |
| Geometric mean | (ref) | 1.16x faster |
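
As a sanity check on the "Geometric mean" rows, the sketch below recomputes them from the rounded timings shown above. The variable and function names are illustrative only, and because pyperf works from unrounded means, the result is expected to agree only to about two decimal places.

```python
from statistics import geometric_mean

# (reference time, patched time) pairs copied from the tables above, in ns.
vs_main = [(629, 622), (1130, 997), (49_800, 31_800),
           (630, 622), (1180, 1060), (54_300, 38_200)]
vs_314 = [(647, 622), (1120, 997), (45_200, 31_800),
          (648, 622), (1170, 1060), (49_800, 38_200)]


def speedup(pairs):
    """Geometric mean of the per-benchmark reference/patched ratios."""
    return geometric_mean(ref / new for ref, new in pairs)


main_speedup = speedup(vs_main)
py314_speedup = speedup(vs_314)
print(f"vs main: {main_speedup:.2f}x, vs 3.14: {py314_speedup:.2f}x")
```

The geometric mean is used rather than the arithmetic mean because the values being averaged are ratios, so a 2x speedup and a 2x slowdown cancel out.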


cc @serhiy-storchaka

Co-authored-by: Stan Ulbrych <89152624+StanFromIreland@users.noreply.github.com>
@vstinner vstinner merged commit e9c538d into python:main Sep 25, 2025
43 checks passed
@vstinner vstinner deleted the charmap_specialize branch September 25, 2025 09:42