improve performance of binascii.unhexlify() by using conversion table #76328

Closed
sir-sigurd mannequin opened this issue Nov 27, 2017 · 11 comments
Labels
3.8 (only security fixes), performance (Performance or resource usage), stdlib (Python modules in the Lib dir)

Comments

@sir-sigurd
Mannequin

sir-sigurd mannequin commented Nov 27, 2017

BPO 32147
Nosy @pitrou, @meadori, @serhiy-storchaka, @csabella, @sir-sigurd
PRs
  • bpo-32147: Improved perfomance of binascii.unhexlify(). #4586
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2019-03-20.04:02:39.166>
    created_at = <Date 2017-11-27.12:14:29.522>
    labels = ['3.8', 'library', 'performance']
    title = 'improve performance of binascii.unhexlify() by using conversion table'
    updated_at = <Date 2019-03-20.04:02:39.162>
    user = 'https://github.com/sir-sigurd'

    bugs.python.org fields:

    activity = <Date 2019-03-20.04:02:39.162>
    actor = 'serhiy.storchaka'
    assignee = 'none'
    closed = True
    closed_date = <Date 2019-03-20.04:02:39.166>
    closer = 'serhiy.storchaka'
    components = ['Library (Lib)']
    creation = <Date 2017-11-27.12:14:29.522>
    creator = 'sir-sigurd'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 32147
    keywords = ['patch']
    message_count = 11.0
    messages = ['307053', '307368', '307378', '307379', '307381', '307504', '307505', '308491', '308498', '312951', '338413']
    nosy_count = 5.0
    nosy_names = ['pitrou', 'meador.inge', 'serhiy.storchaka', 'cheryl.sabella', 'sir-sigurd']
    pr_nums = ['4586']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue32147'
    versions = ['Python 3.8']

    @sir-sigurd
    Mannequin Author

    sir-sigurd mannequin commented Nov 27, 2017

    Before:

    $ ./python -m timeit -s "from binascii import unhexlify; b = b'aa'*2**20" "unhexlify(b)"
    50 loops, best of 5: 5.68 msec per loop

    After:

    $ ./python -m timeit -s "from binascii import unhexlify; b = b'aa'*2**20" "unhexlify(b)"
    100 loops, best of 5: 2.06 msec per loop
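
The issue title names the technique, but the thread does not show the patch itself. As a rough illustration only, here is a pure-Python sketch of a conversion table for hex decoding. It is not the C change from the PR; `HEX_TABLE`, `_INVALID`, and `unhexlify_table` are hypothetical names chosen for this example. The idea is that each input byte is turned into its digit value with a single array lookup.

```python
import binascii

# Hypothetical 256-entry conversion table (names are illustrative only).
# Index: byte value. Entry: hex digit value 0-15, or _INVALID otherwise.
_INVALID = 255
HEX_TABLE = bytearray([_INVALID]) * 256
for i, ch in enumerate(b"0123456789"):
    HEX_TABLE[ch] = i
for i, ch in enumerate(b"abcdef"):
    HEX_TABLE[ch] = 10 + i        # lowercase digit
    HEX_TABLE[ch - 32] = 10 + i   # matching uppercase digit


def unhexlify_table(data: bytes) -> bytes:
    """Decode hex digits using one table lookup per input byte."""
    if len(data) % 2:
        raise ValueError("Odd-length string")
    out = bytearray(len(data) // 2)
    for i in range(0, len(data), 2):
        hi = HEX_TABLE[data[i]]
        lo = HEX_TABLE[data[i + 1]]
        if hi == _INVALID or lo == _INVALID:
            raise ValueError("Non-hexadecimal digit found")
        out[i // 2] = (hi << 4) | lo
    return bytes(out)


sample = b"aa" * 8
# The sketch should agree with the real function on valid input.
print(unhexlify_table(sample) == binascii.unhexlify(sample))
```

This loop is far slower than `binascii.unhexlify()` itself; it only shows the shape of the lookup, not the performance of the C implementation.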

    sir-sigurd mannequin added the performance and stdlib labels Nov 27, 2017
    @serhiy-storchaka
    Member

    I can't reproduce the performance difference.

    serhiy-storchaka added the 3.7 (EOL) label Dec 1, 2017
    @sir-sigurd
    Mannequin Author

    sir-sigurd mannequin commented Dec 1, 2017

    Serhiy, did you use the same benchmark as mentioned here?

    @serhiy-storchaka
    Member

    Yes. And I can't reproduce a slowdown with a simplified a2b_qp(). Maybe this depends on the compiler and on the CPU. What are your OS, compiler and CPU? Do you build 32- or 64-bit Python? Do you build in a debug or release mode?

    @sir-sigurd
    Mannequin Author

    sir-sigurd mannequin commented Dec 1, 2017

    OS
    x86_64 GNU/Linux

    compiler
    gcc version 7.2.0 (Debian 7.2.0-16)

    CPU
    Architecture: x86_64
    CPU op-mode(s): 32-bit, 64-bit
    Byte Order: Little Endian
    CPU(s): 4
    On-line CPU(s) list: 0-3
    Thread(s) per core: 2
    Core(s) per socket: 2
    Socket(s): 1
    NUMA node(s): 1
    Vendor ID: GenuineIntel
    CPU family: 6
    Model: 58
    Model name: Intel(R) Core(TM) i5-3210M CPU @ 2.50GHz
    Stepping: 9
    CPU MHz: 2494.521
    CPU max MHz: 3100,0000
    CPU min MHz: 1200,0000
    BogoMIPS: 4989.04
    Virtualization: VT-x
    L1d cache: 32K
    L1i cache: 32K
    L2 cache: 256K
    L3 cache: 3072K
    NUMA node0 CPU(s): 0-3
    Flags: fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush dts acpi mmx fxsr sse sse2 ss ht tm pbe syscall nx rdtscp lm constant_tsc arch_perfmon pebs bts rep_good nopl xtopology nonstop_tsc cpuid aperfmperf pni pclmulqdq dtes64 monitor ds_cpl vmx est tm2 ssse3 cx16 xtpr pdcm pcid sse4_1 sse4_2 x2apic popcnt tsc_deadline_timer xsave avx f16c rdrand lahf_lm cpuid_fault epb tpr_shadow vnmi flexpriority ept vpid fsgsbase smep erms xsaveopt dtherm ida arat pln pts

    Do you build 32- or 64-bit Python?
    I'm not sure about that. I guess 64-bit is the default on a 64-bit OS?

    Do you build in a debug or release mode?
    I tried with --enable-optimizations, with --with-pydebug, and without any flags. The numbers differ, but the magnitude of the change is the same.
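
For anyone comparing results across machines, the build questions above (32- or 64-bit, debug or release, which compiler) can be answered from inside the interpreter being benchmarked. The following is a minimal sketch using only standard-library calls; `build_info` is a hypothetical helper name, and `CONFIG_ARGS` may be `None` on platforms that do not build with `configure`.

```python
import platform
import struct
import sys
import sysconfig


def build_info() -> dict:
    """Collect basic facts about the running interpreter's build."""
    return {
        # Pointer size in bits: 32 or 64 for common builds.
        "pointer_bits": struct.calcsize("P") * 8,
        # Compiler string recorded when the interpreter was built.
        "compiler": platform.python_compiler(),
        "machine": platform.machine(),
        # sys.gettotalrefcount only exists in debug builds.
        "debug_build": hasattr(sys, "gettotalrefcount"),
        # configure arguments, if recorded (may be None).
        "config_args": sysconfig.get_config_var("CONFIG_ARGS"),
    }


for key, value in build_info().items():
    print(f"{key}: {value}")
```

Running this with the same `./python` binary used for the timings shows which build produced each number.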

    @pitrou
    Member

    pitrou commented Dec 3, 2017

    Here are my results:

    • Before:
      $ ./python -m timeit -s "from binascii import unhexlify; b = b'aa'*2**20" "unhexlify(b)"
      50 loops, best of 5: 4.37 msec per loop

    • After:
      $ ./python -m timeit -s "from binascii import unhexlify; b = b'aa'*2**20" "unhexlify(b)"
      200 loops, best of 5: 1.16 msec per loop

    @pitrou
    Member

    pitrou commented Dec 3, 2017

    (platform is Ubuntu 16.04, 64-bit, on a Core i5-2500K CPU)

    @sir-sigurd
    Mannequin Author

    sir-sigurd mannequin commented Dec 17, 2017

    Is there anything I can do to push this forward?

    BTW, Serhiy, what are your OS, compiler and CPU?

    @meadori
    Member

    meadori commented Dec 17, 2017

    FWIW, I see a win on OS X 10.12.6:

    λ:master !?=> cc --version
    Apple LLVM version 8.1.0 (clang-802.0.42)
    Target: x86_64-apple-darwin16.7.0
    Thread model: posix
    InstalledDir: /Applications/Xcode.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin
    λ:master !?=> uname -a
    Darwin ripley.attlocal.net 16.7.0 Darwin Kernel Version 16.7.0: Wed Oct 4 00:17:00 PDT 2017; root:xnu-3789.71.6~1/RELEASE_X86_64 x86_64

    • Before:
      λ:master ?=> ./python.exe -m timeit -s "from binascii import unhexlify; b = b'aa'*2**20" "unhexlify(b)"
      20 loops, best of 5: 11.3 msec per loop

    • After:
      λ:master !?=> ./python.exe -m timeit -s "from binascii import unhexlify; b = b'aa'*2**20" "unhexlify(b)"
      50 loops, best of 5: 4.15 msec per loop
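
Three sets of before/after timings have been posted in this thread so far. As a quick arithmetic check, the sketch below computes the speedup ratio for each; the dictionary keys are just labels assembled from details given in the comments, and the numbers are the `timeit` results quoted above.

```python
# (before, after) timings in msec per loop, copied from the comments above.
timings_msec = {
    "sir-sigurd (Linux, gcc 7.2.0, i5-3210M)": (5.68, 2.06),
    "pitrou (Ubuntu 16.04, i5-2500K)": (4.37, 1.16),
    "meadori (OS X 10.12.6, Apple LLVM 8.1.0)": (11.3, 4.15),
}

# Speedup = time before the patch / time after the patch.
speedups = {who: before / after for who, (before, after) in timings_msec.items()}

for who, ratio in speedups.items():
    print(f"{who}: {ratio:.2f}x faster")
```

All three reports land between roughly 2.7x and 3.8x on this one benchmark.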

    ned-deily added the 3.8 label and removed the 3.7 (EOL) label Feb 26, 2018
    @serhiy-storchaka
    Member

    New changeset 6b5df90 by Serhiy Storchaka (Sergey Fedoseev) in branch 'master':
    bpo-32147: Improved perfomance of binascii.unhexlify(). (GH-4586)

    @csabella
    Contributor

    Since this PR was merged, can the issue be closed?

    ezio-melotti transferred this issue from another repository Apr 10, 2022