Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

_PyFunction_FastCallDict and _PyFunction_FastCallKeywords: fast path not used #76016

Closed
akruis mannequin opened this issue Oct 21, 2017 · 5 comments
Closed

_PyFunction_FastCallDict and _PyFunction_FastCallKeywords: fast path not used #76016

akruis mannequin opened this issue Oct 21, 2017 · 5 comments
Labels
3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage

Comments

@akruis
Copy link
Mannequin

akruis mannequin commented Oct 21, 2017

BPO 31835
Nosy @vstinner, @akruis
PRs
  • bpo-31835: Optimize also FASTCALL using __future__ #4087
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2017-10-25.12:27:25.707>
    created_at = <Date 2017-10-21.11:14:04.491>
    labels = ['interpreter-core', '3.7', 'performance']
    title = '_PyFunction_FastCallDict and _PyFunction_FastCallKeywords: fast path not used'
    updated_at = <Date 2017-10-25.12:27:25.706>
    user = 'https://github.com/akruis'

    bugs.python.org fields:

    activity = <Date 2017-10-25.12:27:25.706>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2017-10-25.12:27:25.707>
    closer = 'vstinner'
    components = ['Interpreter Core']
    creation = <Date 2017-10-21.11:14:04.491>
    creator = 'anselm.kruis'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 31835
    keywords = ['patch']
    message_count = 5.0
    messages = ['304702', '304811', '304812', '304982', '304983']
    nosy_count = 2.0
    nosy_names = ['vstinner', 'anselm.kruis']
    pr_nums = ['4087']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue31835'
    versions = ['Python 3.7']

    @akruis
    Copy link
    Mannequin Author

    akruis mannequin commented Oct 21, 2017

    Just a minor performance issue.

    The C functions _PyFunction_FastCallDict() and _PyFunction_FastCallKeywords() (branch 'master', Objects/call.c) and their predecessors fast_function() and _PyFunction_FastCallDict() in Python/ceval.c all contain the following sub-expression in the "if"-statement for the fast-path. For instance Objects/call.c:318

    co->co_flags == (CO_OPTIMIZED | CO_NEWLOCALS | CO_NOFREE)

    Now, if co_flags has any of the CO_FUTURE_... bits set, the expression is always False and the fast path is not used.

    Currently this affects only Python 3.6 and Python 2.7, because other Python versions do not use the __future__ mechanism.

    The fix is simple. Replace the faulty sub-expression by

    (co->co_flags & (~PyCF_MASK)) == (CO_OPTIMIZED | CO_NEWLOCALS | CO_NOFREE))

    I discovered this issue while debugging reference leaks in Stackless Python a few month ago. It is hard to write a test case, but one can compare C call stacks using a debugger.

    $ ulimit -c unlimited  # enable core dumps
    $ python3.6 -c 'from __future__ import generator_stop; import os; (lambda: os.abort())()'
    $ gdb -batch -ex bt  python3.6 core > trace_with_future
    $ python3.6 -c 'import os; (lambda: os.abort())()'
    $ gdb -batch -ex bt  python3.6 core > trace_without_future

    If you compare the traces, the difference is in stack frame #9. Same for python2.7.

    @akruis akruis mannequin added interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage labels Oct 21, 2017
    @pitrou pitrou added the 3.7 (EOL) end of life label Oct 22, 2017
    @vstinner
    Copy link
    Member

    The fix is simple. Replace the faulty sub-expression by
    (co->co_flags & (~PyCF_MASK)) == (CO_OPTIMIZED | CO_NEWLOCALS | CO_NOFREE))

    I proposed PR 4087 to implement this optimization.

    I wouldn't call it a "fix", since the "co->co_flags == (CO_OPTIMIZED | CO_NEWLOCALS | CO_NOFREE)" check exists since Python 2.7 at least (whereas Python 2.7 also has CO_FUTURE_xxx flags).

    Just a minor performance issue.

    I prefer to call it a performance opportunity :-)

    @vstinner
    Copy link
    Member

    I reset Versions to Python 3.7. I don't consider this issue as a bug, but only as a new optimization. So it can only go into the future Python 3.7.

    @vstinner
    Copy link
    Member

    New changeset 086c3ae by Victor Stinner in branch 'master':
    bpo-31835: Optimize also FASTCALL using __future__ (bpo-4087)
    086c3ae

    @vstinner
    Copy link
    Member

    Thank you Anselm Kruis for spotting this nice optimization opportunity! Sadly, as I wrote, I don't want to backport the optimization to the stable Python 3.6 branch.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs) performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants