Faster utf7 encode #373

carsonip · 2019-03-25T03:41:26Z

Encoding is ~40% faster for input that mixes Unicode and ASCII characters. The improvement is larger for pure-ASCII input.

This is achieved by removing the .encode() call from the most common path (ASCII characters) and by comparing ordinal values instead of unicode objects.
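The idea can be sketched roughly as follows. This is not the code from this PR: `encode_sketch` and `flush` are hypothetical names, and the sketch is written for Python 3 `str` input. It only illustrates taking the ASCII fast path with an integer range check and deferring `.encode()` to runs of non-ASCII characters.

```python
import binascii

AMPERSAND_ORD = ord("&")
DASH_ORD = ord("-")


def encode_sketch(s):
    """Encode text as IMAP modified UTF-7 (RFC 3501, section 5.1.3).

    Hypothetical illustration, not imapclient's implementation.
    """
    out = bytearray()
    pending = []  # non-ASCII characters waiting to be base64-encoded

    def flush():
        # Slow path: only runs when a block of non-ASCII characters ends.
        if pending:
            raw = "".join(pending).encode("utf-16-be")
            b64 = binascii.b2a_base64(raw).rstrip(b"\n=").replace(b"/", b",")
            out.extend(b"&" + b64 + b"-")
            del pending[:]

    for ch in s:
        o = ord(ch)
        if 0x20 <= o <= 0x7E:
            # Fast path: printable ASCII is appended as an integer,
            # with no per-character .encode() call.
            flush()
            out.append(o)
            if o == AMPERSAND_ORD:
                out.append(DASH_ORD)  # "&" is written as "&-"
        else:
            pending.append(ch)
    flush()
    return bytes(out)
```

The two integer comparisons replace a comparison between string objects, and the only remaining `.encode()` call happens once per non-ASCII run rather than once per character.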

We could actually get about 10% more by eliminating dot notation (attribute lookups), 10-20% more by inlining the consume_b64_buffer function, and a bit more by caching the global ord function, but the code would get quite ugly and I'm not sure that's welcome.
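For readers unfamiliar with these micro-optimizations, here is a minimal sketch of two of them (avoiding dot notation and caching `ord`). `escape_ascii` is a hypothetical helper for illustration only, it is not part of imapclient, and it handles only printable ASCII.

```python
def escape_ascii(s, _ord=ord):
    """Copy printable ASCII to bytes, writing '&' as '&-' (hypothetical helper)."""
    out = bytearray()
    append = out.append  # bound method looked up once, not once per character
    for ch in s:
        o = _ord(ch)  # _ord is a local name, so no global lookup per call
        if not 0x20 <= o <= 0x7E:
            raise ValueError("sketch handles printable ASCII only")
        append(o)
        if o == 0x26:  # "&"
            append(0x2D)  # "-"
    return bytes(out)
```

The trade-off is the one described above: the default-argument trick and the local aliases make the hot loop cheaper in CPython but obscure the intent of the code.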

carsonip · 2019-03-25T04:05:52Z

As a side note, combined with Cython, this can achieve a 10x improvement in total.

mjs

Brilliant, thanks!

carsonip · 2019-04-05T03:28:04Z

Forgot to attach the benchmark script and results:

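# -*- coding: utf-8 -*-
# Benchmark for imap_utf7.encode (Python 2; xrange does not exist in Python 3).
import time
from imapclient.imap_utf7 import encode

# Mixed Unicode/ASCII input, including '&', which has to be escaped.
q = u'你好嗎你好嗎ab 123&123 abc你' * 50

start = time.time()
for i in xrange(5000):
    encode(q)
print(time.time() - start)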

CPython 2.7.12
before: 2.45s
after: 1.48s

PyPy2.7 v7.1
before: 0.284s
after: 0.198s
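To relate these timings to the "~40% faster" figure, the arithmetic can be sketched as below. The helper names `time_reduction` and `speedup` are made up for this illustration; the numbers are the ones reported above. Read as a reduction in run time, the CPython result works out to about 40% and the PyPy result to about 30%.

```python
# (before, after) wall-clock times in seconds, as reported in the benchmark.
timings = {
    "CPython 2.7.12": (2.45, 1.48),
    "PyPy2.7 v7.1": (0.284, 0.198),
}


def time_reduction(before, after):
    """Fraction of the original run time that was saved."""
    return (before - after) / before


def speedup(before, after):
    """How many times more work is done per second after the change."""
    return before / after


for name, (before, after) in timings.items():
    print("%s: %.0f%% less time, %.2fx throughput"
          % (name, 100 * time_reduction(before, after), speedup(before, after)))
```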

Faster utf7 encode

9327862

mjs approved these changes Apr 5, 2019

mjs merged commit 46a3ee6 into mjs:master Apr 5, 2019
