Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

scary frame speed hacks #39815

Closed
mwhudson opened this issue Jan 13, 2004 · 16 comments
Closed

scary frame speed hacks #39815

mwhudson opened this issue Jan 13, 2004 · 16 comments
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs)

Comments

@mwhudson
Copy link

BPO 876206
Nosy @mwhudson, @tim-one, @arigo, @birkenfeld, @rhettinger
Files
  • scary-frame-hacks.diff: mwh#s patch Support "bpo-" in Misc/NEWS #1 attempt 2
  • frame_dealloc_clears_more.diff: arigo patch Support "bpo-" in Misc/NEWS #1
  • zombie-frames-2.diff: arigo patch Rename README to README.rst and enhance formatting #2 attempt 2
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2006-05-23.11:16:33.000>
    created_at = <Date 2004-01-13.16:49:51.000>
    labels = ['interpreter-core']
    title = 'scary frame speed hacks'
    updated_at = <Date 2006-05-23.11:16:33.000>
    user = 'https://github.com/mwhudson'

    bugs.python.org fields:

    activity = <Date 2006-05-23.11:16:33.000>
    actor = 'tim.peters'
    assignee = 'none'
    closed = True
    closed_date = None
    closer = None
    components = ['Interpreter Core']
    creation = <Date 2004-01-13.16:49:51.000>
    creator = 'mwh'
    dependencies = []
    files = ['5789', '5790', '5791']
    hgrepos = []
    issue_num = 876206
    keywords = ['patch']
    message_count = 16.0
    messages = ['45295', '45296', '45297', '45298', '45299', '45300', '45301', '45302', '45303', '45304', '45305', '45306', '45307', '45308', '45309', '45310']
    nosy_count = 7.0
    nosy_names = ['mwh', 'tim.peters', 'jhylton', 'arigo', 'richard', 'georg.brandl', 'rhettinger']
    pr_nums = []
    priority = 'normal'
    resolution = 'accepted'
    stage = None
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue876206'
    versions = ['Python 2.5']

    @mwhudson
    Copy link
    Author

    In ceval.c we find

    	/* XXX Perhaps we should create a specialized
    	   PyFrame_New() that doesn't take locals, but does
    	   take builtins without sanity checking them.
    	\*/
    

    This patch takes that idea rather further than you
    might have expected... it creates a "light" subtype of
    frame that assumes certain things about the frame,
    gives this type its own free list (so it can assume
    more about objects on the freelist) and converts light
    frames into "heavy" frames when assumptions stop being
    true.

    Good for a ~5% improvement on "./python -s 'def f():
    pass' 'f()'"; a bit less on pystone. It also conflicts
    slightly with my function reorg patch -- apply that
    first, apply this, ignore the reject and edit
    func_caller_nofrees in funcobject.c to call
    PyFrame_NewLight.

    All three patches I just submitted together get ~6% on
    pystone.

    @mwhudson mwhudson added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Jan 13, 2004
    @mwhudson mwhudson added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Jan 13, 2004
    @jhylton
    Copy link
    Mannequin

    jhylton mannequin commented Jan 13, 2004

    Logged In: YES
    user_id=31392

    I don't see any files attached.

    @mwhudson
    Copy link
    Author

    Logged In: YES
    user_id=6656

    sigh

    @arigo
    Copy link
    Mannequin

    arigo mannequin commented Jan 14, 2004

    Logged In: YES
    user_id=4771

    (Side note first: I'm not sure 'builtins = back->f_builtins'
    is right.)

    Is the whole subclassing complexity worth the effort, given
    that the invariants of light frames only seem to be that
    four specific fields are null? Changing the type of an
    object under Python code's feet is calling for troubles.
    Moreover it is bound to break code that expect
    'type(frame)==FrameType', even if such code can be
    considered bad style.

    Moreover it requires a number of hacks here and there --
    e.g. you turn a light frame into a "heavy" frame when
    f_trace is set; is it on purpose that you don't do it when
    f_locals is set?

    I cannot seem to get reliable performance results on my
    machine, but maybe you want to compare with the attached
    patch which speeds up the regular PyFrame_New by putting
    stronger invariants on all the frames in the free_list.

    @mwhudson
    Copy link
    Author

    Logged In: YES
    user_id=6656

    I'm fairly sure this made a difference on my iBook; haven't
    tried on x86.

    It's possible that the correct response to this patch is to
    add "... nah, not worth it" to the XXX comment in ceval.c...

    @arigo
    Copy link
    Mannequin

    arigo mannequin commented Jan 27, 2004

    Logged In: YES
    user_id=4771

    Here is yet another try, which seems to perform better on my PentiumIII. I get the following speed improvements for this patch alone, for a loop calling an empty function:

    zombie-frames.diff: 11.4% (PyStone 3.8%)
    scary-frame-hacks.diff: 6.4% (PyStone 0.85%)

    The idea is to get rid of the free_list and instead store the most recently finished ("zombie") frame in an internal field of the code object. This saves half of the frame creation overhead because half of the fields are already correct when the frame is reused, e.g. f_code, f_nlocals, f_stacksize, f_valuestack...

    (you might need to cvs up frameobject.c before you can apply the patch)

    @mwhudson
    Copy link
    Author

    mwhudson commented Feb 2, 2004

    Logged In: YES
    user_id=6656

    Did I mention this idea to you or did you come up with it
    independently? I forget...

    I'll try to time stuff on my iBook tomorrow.

    @arigo
    Copy link
    Mannequin

    arigo mannequin commented Feb 2, 2004

    Logged In: YES
    user_id=4771

    I guess the idea was just in the air, after your published attempts.

    Ideally I'd have liked to have the cached frame depend on the globals as well as the code object itself; I considered moving the cache field to function objects. This way you also save the f_globals and f_builtins initialization. There were problems but maybe we should try harder.

    @rhettinger
    Copy link
    Contributor

    Logged In: YES
    user_id=80475

    Armin's second patch gives gives the expected speedups on a
    Pentium3 running WinME, and the test suite runs without
    exception. I recommend accepting and applying this patch as
    is. Further improvements can be considered separately.

    @mwhudson
    Copy link
    Author

    mwhudson commented Mar 1, 2004

    Logged In: YES
    user_id=6656

    It slows recursive functions down a noticeable amount (did
    this get noted anywhere? Maybe Armin & I just talked about
    it on IRC), so that should be considered before this patch
    is applied. But I think it's probably worth it, FWIW.

    @rhettinger
    Copy link
    Contributor

    Logged In: YES
    user_id=80475

    The effect on recursive functions could be mitigated by
    restoring the freelist and falling back to it when
    code->co_zombieframe == NULL.

    I don't know if that is worth it. The current patch is a
    code simplication as well as an optimization. Adding back
    the freelist, adds a lot of clutter. Python is not
    especially friendly to recursive functions anyway.

    @arigo
    Copy link
    Mannequin

    arigo mannequin commented Mar 1, 2004

    Logged In: YES
    user_id=4771

    On a small recursive example it slows down from 2.64s to 3.26s.
    This is a serious difference (20%). Is it bad enough to keep the
    freelist ?

    @rhettinger
    Copy link
    Contributor

    Logged In: YES
    user_id=80475

    I think that's a question for python-dev.

    @birkenfeld
    Copy link
    Member

    Logged In: YES
    user_id=849994

    Can this be reviewed for 2.5? The relevant discussion on
    python-dev is at
    http://mail.python.org/pipermail/python-dev/2004-March/042871.html.

    @richard
    Copy link
    Mannequin

    richard mannequin commented May 22, 2006

    Logged In: YES
    user_id=6405

    Patch modified and applied to python2.5. Mods:

    1. updated to python2.5
    2. reinstated use of free list

    See the "rjones-funccall" branch in SVN.

    @tim-one
    Copy link
    Member

    tim-one commented May 23, 2006

    Logged In: YES
    user_id=31435

    Closed as Accepted. While re-adding the free list removed
    the code simplification benefit, the measurable x-platform
    speedup is well worth getting.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 9, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs)
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants