Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve method cache efficiency #67036

Closed
pitrou opened this issue Nov 11, 2014 · 14 comments
Closed

Improve method cache efficiency #67036

pitrou opened this issue Nov 11, 2014 · 14 comments
Assignees
Labels
interpreter-core Interpreter core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage

Comments

@pitrou
Copy link
Member

pitrou commented Nov 11, 2014

BPO 22847
Nosy @arigo, @rhettinger, @gpshead, @pitrou, @benjaminp, @serhiy-storchaka, @koobs
Files
  • methcache.patch
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/benjaminp'
    closed_at = <Date 2016-02-07.06:36:14.320>
    created_at = <Date 2014-11-11.16:13:46.877>
    labels = ['interpreter-core', 'performance']
    title = 'Improve method cache efficiency'
    updated_at = <Date 2016-02-07.06:36:14.318>
    user = 'https://github.com/pitrou'

    bugs.python.org fields:

    activity = <Date 2016-02-07.06:36:14.318>
    actor = 'python-dev'
    assignee = 'benjamin.peterson'
    closed = True
    closed_date = <Date 2016-02-07.06:36:14.320>
    closer = 'python-dev'
    components = ['Interpreter Core']
    creation = <Date 2014-11-11.16:13:46.877>
    creator = 'pitrou'
    dependencies = []
    files = ['37176']
    hgrepos = []
    issue_num = 22847
    keywords = ['patch']
    message_count = 14.0
    messages = ['231031', '231032', '231034', '231036', '231038', '231040', '231163', '231196', '231197', '259583', '259633', '259634', '259764', '259766']
    nosy_count = 9.0
    nosy_names = ['arigo', 'rhettinger', 'gregory.p.smith', 'pitrou', 'benjamin.peterson', 'Arfrever', 'python-dev', 'serhiy.storchaka', 'koobs']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue22847'
    versions = ['Python 2.7', 'Python 3.5']

    @pitrou
    Copy link
    Member Author

    pitrou commented Nov 11, 2014

    The method cache is currently very small. Attached patch bumps the size a bit and improves the hash computation formula.

    @pitrou pitrou added interpreter-core Interpreter core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage labels Nov 11, 2014
    @pitrou
    Copy link
    Member Author

    pitrou commented Nov 11, 2014

    It's not easy to get stable benchmark runs, but here is an example:

    Report on Linux fsol 3.16.0-24-generic #32-Ubuntu SMP Tue Oct 28 13:07:32 UTC 2014 x86_64 x86_64
    Total CPU cores: 4

    ### 2to3 ###
    7.083762 -> 6.904087: 1.03x faster

    ### formatted_logging ###
    Min: 0.351515 -> 0.330884: 1.06x faster
    Avg: 0.352954 -> 0.332422: 1.06x faster
    Significant (t=62.94)
    Stddev: 0.00120 -> 0.00197: 1.6382x larger

    ### mako_v2 ###
    Min: 0.035797 -> 0.034659: 1.03x faster
    Avg: 0.036427 -> 0.035378: 1.03x faster
    Significant (t=50.65)
    Stddev: 0.00032 -> 0.00034: 1.0668x larger

    ### richards ###
    Min: 0.174242 -> 0.163918: 1.06x faster
    Avg: 0.175643 -> 0.165689: 1.06x faster
    Significant (t=58.69)
    Stddev: 0.00086 -> 0.00084: 1.0168x smaller

    ### simple_logging ###
    Min: 0.300215 -> 0.287112: 1.05x faster
    Avg: 0.301957 -> 0.288785: 1.05x faster
    Significant (t=80.08)
    Stddev: 0.00086 -> 0.00078: 1.1052x smaller

    The following not significant results are hidden, use -v to show them:
    django_v2, silent_logging, tornado_http.

    @serhiy-storchaka
    Copy link
    Member

    serhiy-storchaka commented Nov 11, 2014

    Current hashing algorithm takes middle bits, proposed code takes low bits. Doesn't this make the hash worse?

    @serhiy-storchaka
    Copy link
    Member

    serhiy-storchaka commented Nov 11, 2014

    FYI method cache optimization was added in bpo-1700288.

    @pitrou
    Copy link
    Member Author

    pitrou commented Nov 11, 2014

    The low bits of the unicode hash should be as good as the middle bits.

    @pitrou
    Copy link
    Member Author

    pitrou commented Nov 11, 2014

    In addition, the tp_version_tag evolves incrementally, so the low bits should be better when the same name is looked up on different types.

    @rhettinger
    Copy link
    Contributor

    rhettinger commented Nov 14, 2014

    Thanks for adding the instrumentation. Otherwise, it would have been difficult to tell whether the xor hashing had a significant deleterious effect on collisions.

    Also, +1 on bumping up the cache size.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Nov 15, 2014

    New changeset 97dc64adb6fe by Antoine Pitrou in branch 'default':
    Issue bpo-22847: Improve method cache efficiency.
    https://hg.python.org/cpython/rev/97dc64adb6fe

    @pitrou
    Copy link
    Member Author

    pitrou commented Nov 15, 2014

    Now pushed. Thanks for the comments!

    @pitrou pitrou closed this as completed Nov 15, 2014
    @gpshead
    Copy link
    Member

    gpshead commented Feb 4, 2016

    I'm reopening this and assigning it to benjamin as the 2.7 release manager. This change is valuable to apply to 2.7.x as well. It is very simple and is a clear performance improvement for realistic workloads. No API change.

    When you profile Python 2.7 applications today, the _PyType_Lookup function shows up in the ~3% of all CPU cycles range. This reduces that for a small memory tradeoff.

    We're raising our cache exponent to be even larger than the 12 in this patch at work as we've got some huge applications. Regardless, 12 is a much better default than the existing 9.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Feb 5, 2016

    New changeset 6357d851029d by Antoine Pitrou in branch '2.7':
    Issue bpo-22847: Improve method cache efficiency.
    https://hg.python.org/cpython/rev/6357d851029d

    @benjaminp
    Copy link
    Contributor

    benjaminp commented Feb 5, 2016

    I suppose we've backported scarier things.

    @koobs
    Copy link

    koobs commented Feb 7, 2016

    It appears this change broke all FreeBSD builders (9: gcc, 10/11: clang) for the 2.7 branch with:

    ==== koobs-freebsd-current (clang 3.7.x)

    cc -pthread -c -fno-strict-aliasing -OPT:Olimit=0 -g -O2 -g -O0 -Wall -Wstrict-prototypes  -I. -IInclude -I./Include  -DPy_BUILD_CORE -o Objects/typeobject.o Objects/typeobject.c
    Objects/typeobject.c:2568:44: error: no member named 'hash' in 'PyStringObject'
            assert(((PyStringObject *)(name))->hash != -1);
                   ~~~~~~~~~~~~~~~~~~~~~~~~~~  ^
    /usr/include/assert.h:54:21: note: expanded from macro 'assert'
    #define assert(e)       ((e) ? (void)0 : __assert(__func__, __FILE__, \
                              ^
    1 error generated.
    *** Error code 1

    ==== koobs-freebsd10 (clang 3.4.x)

    cc -pthread -c -fno-strict-aliasing -OPT:Olimit=0 -g -O2 -g -O0 -Wall -Wstrict-prototypes -I. -IInclude -I./Include -fPIC -DPy_BUILD_CORE -o Objects/unicodeobject.o Objects/unicodeobject.c
    --- Objects/typeobject.o ---

    Objects/typeobject.c:2568:18: error: use of undeclared identifier 'PyASCIIObject'
            assert(((PyASCIIObject *)(name))->hash != -1);
                     ^
    /usr/include/assert.h:54:21: note: expanded from macro 'assert'
    #define assert(e)       ((e) ? (void)0 : __assert(__func__, __FILE__, \
                              ^
    Objects/typeobject.c:2568:33: error: expected expression
            assert(((PyASCIIObject *)(name))->hash != -1);
                                    ^
    /usr/include/assert.h:54:21: note: expanded from macro 'assert'
    #define assert(e)       ((e) ? (void)0 : __assert(__func__, __FILE__, \
                              ^
    --- Objects/unicodectype.o 

    cc -pthread -c -fno-strict-aliasing -OPT:Olimit=0 -g -O2 -g -O0 -Wall -Wstrict-prototypes -I. -IInclude -I./Include -fPIC -DPy_BUILD_CORE -o Objects/unicodectype.o Objects/unicodectype.c
    --- Objects/typeobject.o ---
    2 errors generated.
    *** [Objects/typeobject.o] Error code 1

    ==== koobs-freebsd9 (gcc 4.2.1 + patches)

    gcc -pthread -c -fno-strict-aliasing -g -O2 -g -O0 -Wall -Wstrict-prototypes -I. -IInclude -I./Include -DPy_BUILD_CORE -o Objects/unicodeobject.o Objects/unicodeobject.c
    Objects/unicodeobject.c: In function 'PyUnicode_DecodeUTF7Stateful':
    Objects/unicodeobject.c:1694: warning: comparison is always true due to limited range of data type
    gcc -pthread -c -fno-strict-aliasing -g -O2 -g -O0 -Wall -Wstrict-prototypes -I. -IInclude -I./Include -DPy_BUILD_CORE -o Objects/unicodectype.o Objects/unicodectype.c
    Objects/typeobject.c: In function '_PyType_Lookup':
    Objects/typeobject.c:2568: error: 'PyASCIIObject' undeclared (first use in this function)
    Objects/typeobject.c:2568: error: (Each undeclared identifier is reported only once
    Objects/typeobject.c:2568: error: for each function it appears in.)
    Objects/typeobject.c:2568: error: expected expression before ')' token
    *** [Objects/typeobject.o] Error code 1

    @koobs koobs reopened this Feb 7, 2016
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Feb 7, 2016

    New changeset 04424651f76c by Benjamin Peterson in branch '2.7':
    fix hash member name (closes bpo-22847)
    https://hg.python.org/cpython/rev/04424651f76c

    @python-dev python-dev mannequin closed this as completed Feb 7, 2016
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core Interpreter core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    6 participants