Improved profiler #42041

Closed
bdrosen mannequin opened this issue Jun 1, 2005 · 11 comments
Labels
extension-modules C modules in the Modules dir

Comments

@bdrosen
Mannequin

bdrosen mannequin commented Jun 1, 2005

BPO 1212837
Nosy @mwhudson, @arigo
Files
  • patch.zip: zip file contains the new module and test script/results
  • profile.c: Updated version of the file
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2007-03-07.07:12:45.000>
    created_at = <Date 2005-06-01.16:05:58.000>
    labels = ['extension-modules']
    title = 'Improved profiler'
    updated_at = <Date 2007-03-07.07:12:45.000>
    user = 'https://bugs.python.org/bdrosen'

    bugs.python.org fields:

    activity = <Date 2007-03-07.07:12:45.000>
    actor = 'arigo'
    assignee = 'none'
    closed = True
    closed_date = None
    closer = None
    components = ['Extension Modules']
    creation = <Date 2005-06-01.16:05:58.000>
    creator = 'bdrosen'
    dependencies = []
    files = ['6674', '6675']
    hgrepos = []
    issue_num = 1212837
    keywords = ['patch']
    message_count = 11.0
    messages = ['48399', '48400', '48401', '48402', '48403', '48404', '48405', '48406', '48407', '48408', '48409']
    nosy_count = 5.0
    nosy_names = ['mwh', 'arigo', 'lcreighton', 'zseil', 'bdrosen']
    pr_nums = []
    priority = 'normal'
    resolution = 'accepted'
    stage = None
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue1212837'
    versions = ['Python 2.4']

    @bdrosen
    Mannequin Author

    bdrosen mannequin commented Jun 1, 2005

    I tried using both hotshot and the old Python profiler
    and found them both to be inadequate.

    The old Python profiler seems to give accurate timing
    results, but it is slow enough that it adds significant
    overhead to what it is measuring (approximately 10 times).
    Furthermore, it cannot give detailed stats about children
    (how much of a function's cumulative time was taken up by
    each function it called).

    The hotshot profiler is much faster while profiling, adding
    only 30% overhead. However, it is extremely slow to load
    the results from the log file. It does not currently
    support detailed child stats, although I imagine that
    it could be made to do so using the information in the
    log file. The biggest problem with it is that the
    timing results seem to be highly inaccurate: they do not
    correspond to actual seconds, although they seem to be
    wrong by a consistent proportion.

    To address these shortcomings, I wrote a new profiling
    module. It adds about the same overhead (30%) as hotshot,
    but is much faster in retrieving results. It supports
    detailed child stats and gives accurate timing information
    in milliseconds. The accompanying .py module could use
    additional work though - because of the child stats, I
    was not able to reuse the stats module like hotshot does.

    I've included a simple test script that runs pystone
    under all 3 profilers (and without a profiler) to give
    a better idea of the differences. I've also included
    a dump of the output of the script running under Windows
    XP with Python 2.4.1.
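For readers on a current Python, the kind of overhead comparison described above can be approximated with cProfile, the module this patch later became (see the final comments in this thread). This is only a sketch: `workload` is a hypothetical stand-in for pystone, it is not the test script attached to the issue, and the measured ratio will vary with the machine and interpreter version.

```python
import cProfile
import time


def workload(n=200_000):
    """CPU-bound loop with many function calls (hypothetical stand-in for pystone)."""
    def step(x):
        return x + 1

    total = 0
    for _ in range(n):
        total = step(total)
    return total


def timed(func):
    """Return (result, elapsed seconds) for a single call of func."""
    start = time.perf_counter()
    result = func()
    return result, time.perf_counter() - start


def profiled_workload():
    profiler = cProfile.Profile()
    # runcall() profiles exactly one call and returns that call's result.
    return profiler.runcall(workload)


plain_result, plain_time = timed(workload)
prof_result, prof_time = timed(profiled_workload)

# Ratio of profiled to unprofiled wall-clock time for the same work.
overhead = prof_time / plain_time
print(f"plain: {plain_time:.3f}s  profiled: {prof_time:.3f}s  ratio: {overhead:.2f}x")
```

A single run like this is noisy; repeating each measurement and taking the minimum gives a steadier ratio.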

    @bdrosen bdrosen mannequin closed this as completed Jun 1, 2005
    @bdrosen bdrosen mannequin added the extension-modules C modules in the Modules dir label Jun 1, 2005
    @bdrosen
    Mannequin Author

    bdrosen mannequin commented Sep 8, 2005

    Logged In: YES
    user_id=1289249

    I've enclosed an updated version of profile.c that fixes
    a problem in the original patch.

    @lcreighton
    Mannequin

    lcreighton mannequin commented Sep 21, 2005

    Logged In: YES
    user_id=376262

    We've been using Hotshot at Strakt for a while now. We
    tried your patch and it worked precisely as advertised.
    Thank you for writing it.

    Laura Creighton

    @arigo
    Mannequin

    arigo mannequin commented Sep 21, 2005

    Logged In: YES
    user_id=4771

    Thanks for your work. I agree that the two existing
    profilers are definitely not satisfactory (I am also
    encountering crashes with profile.py).

    There are a number of details that should be addressed
    before your profiler can be a replacement for the existing
    ones, e.g. providing simple entry points and documentation,
    and having the C code reviewed. I am willing to help with
    all this.

    I hope you won't mind that I have checked your source code
    into a public Subversion repository, where I am working a bit
    on it together with Michael Hudson. (Obviously, the goal
    is to have the code eventually in the CPython CVS.)

    http://codespeak.net/svn/user/arigo/hack/misc/lsprof/

    @bdrosen
    Mannequin Author

    bdrosen mannequin commented Sep 21, 2005

    Logged In: YES
    user_id=1289249

    I welcome any additional changes (and help making those changes)
    that would improve the patch and help get it ready for
    inclusion in CPython.

    The main detail that I knew would need to be addressed
    was lspstats.py. I didn't spend a lot of time working on it
    because I generally use a wx TreeListCtrl object to view/sort
    the results. What other details need to be addressed?

    @mwhudson

    Logged In: YES
    user_id=6656

    Well, you can see what we've done to your baby:

    http://codespeak.net/svn/user/arigo/hack/misc/lsprof/profile.c

    Mostly it's just C style conformance so far, though we've fixed a couple of
    little bugs too.

    @arigo
    Mannequin

    arigo mannequin commented Sep 23, 2005

    Logged In: YES
    user_id=4771

    I replaced the linked lists with some kind of auto-balancing
    trees; the linked lists were creating a huge overhead when
    profiling large programs. The reason for not using plain
    Python dicts instead is that PyCodeObjects are not very
    good at being keys in dicts -- their hash computation takes
    ages, and we'd prefer an identity mapping anyway.
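The idea of an identity mapping can be sketched at the Python level as follows. This only illustrates the concept: the patch implements its lookup in C with trees, and `IdentityMap` is a hypothetical name that does not appear in the patch.

```python
class IdentityMap:
    """Map objects to values by identity (id()), never calling __hash__ or __eq__ on the key."""

    def __init__(self):
        # id(key) -> (key, value). Storing the key keeps it alive, so its
        # id cannot be reused by another object while the entry exists.
        self._entries = {}

    def __setitem__(self, key, value):
        self._entries[id(key)] = (key, value)

    def __getitem__(self, key):
        return self._entries[id(key)][1]

    def __contains__(self, key):
        return id(key) in self._entries

    def __len__(self):
        return len(self._entries)


# Two distinct code objects compiled from the same source text.
code_a = compile("x = 1", "<demo>", "exec")
code_b = compile("x = 1", "<demo>", "exec")

stats = IdentityMap()
stats[code_a] = "stats for a"
stats[code_b] = "stats for b"

# Lookups hash only a small integer (the id), not the code object's contents,
# and the two code objects always get separate entries.
print(len(stats), stats[code_a], stats[code_b])
```

A plain dict keyed by the code objects themselves would hash and compare their contents on every lookup, which is the cost the comment above wants to avoid.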

    The current SVN version is now the first profiler that
    works and gives sensible results when profiling the PyPy
    translation process.

    getstats() is now producing tuples-with-attribute-names
    instead of dicts (similar to os.stat()). This was mostly
    motivated by a lack of motivation to introduce error
    checking everywhere in the dict-building code, but I think
    it's a reasonable change. There are only a couple of
    places left in profile.c still missing checks for error
    results or out-of-memory conditions.
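As a rough illustration of "tuples with attribute names", the sketch below uses the cProfile module that this patch later became (per the closing comments in this thread), on a current Python. The field names `code` and `callcount` are those of today's `cProfile.Profile.getstats()` entries; whether the SVN version at the time used exactly the same names is an assumption. `helper` and `main` are hypothetical example functions.

```python
import cProfile
import os


def helper():
    return sum(range(1000))


def main():
    return [helper() for _ in range(5)]


# os.stat() results behave the same way: a tuple whose items also have names.
st = os.stat(os.curdir)
same_size = st.st_size == st[6]  # st_size is item 6 of the tuple

profiler = cProfile.Profile()
profiler.runcall(main)

# Each entry is a tuple-like record; find the one for helper() by identity
# of its code object (built-in functions have a string in .code instead).
helper_entry = None
for entry in profiler.getstats():
    if entry.code is helper.__code__:
        helper_entry = entry

# Fields can be read by name or by position.
print(helper_entry.callcount, helper_entry[1])
```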

    The lsprof.py module exposes a simple but minimalistic
    interface. I suggest we keep and document it or a similar
    one, but also support -- for compatibility -- the
    convoluted interface of the existing profile.py/pstats.py
    and/or hotshot, with the option to dump the stats to a file
    and reload them. However I don't think it makes sense to
    use exactly the same format as pstats does (as far as I
    can guess it doesn't support per-caller information).
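For comparison, this is how dumping stats to a file and reloading them looks with the cProfile and pstats modules in a current Python. It is a sketch of today's standard library, not of the lsprof.py interface being discussed here, and `work` is a hypothetical example function.

```python
import cProfile
import os
import pstats
import tempfile


def work():
    return sorted(str(i) for i in range(2000))


profiler = cProfile.Profile()
profiler.runcall(work)

# Dump to a temporary file, then reload it as a separate analysis step would.
fd, path = tempfile.mkstemp(suffix=".prof")
os.close(fd)
try:
    profiler.dump_stats(path)
    loaded = pstats.Stats(path)  # reads the whole file during construction
finally:
    os.remove(path)

# Keys of the loaded table are (filename, line number, function name) tuples.
function_names = {name for (_filename, _lineno, name) in loaded.stats}
loaded.sort_stats("cumulative")
print(sorted(function_names))
```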

    @bdrosen
    Mannequin Author

    bdrosen mannequin commented Sep 26, 2005

    Logged In: YES
    user_id=1289249

    I looked over the changes so far and they look reasonable. I
    did have a few questions though:

    1 Do we not need to Increment/Decrement references to
    the code objects? We are using them as keys in the trees as
    well as payload data later on, but I don't see how we are
    guaranteed that they won't be reaped. (although it seems
    unlikely)

    2 Is it deliberate to use lsprof.YYY style names for some
    of the objects (i.e. lsprof.Profiler) instead of _lsprof? (Is
    this the normal convention?)

    3 Do you have a feel for the performance differences
    of using the tree instead of the lists? Doing the simple
    benchmark test they seemed to be comparable, but
    that test is pretty simple. I'm assuming that in a large
    program, the tree approach will be considerably faster?

    4 In lsprof.py, is there a reason that the Stats class
    does not derive from object?

    @arigo
    Mannequin

    arigo mannequin commented Sep 27, 2005

    Logged In: YES
    user_id=4771

    1 The Py_INCREF(_code) at line 86 is the same as in
    your original code; it should guarantee that the
    code object doesn't go away. However, I forgot
    the corresponding Py_DECREF()...

    2 It was a quick hack to have help(lsprof) display
    these types as well. Now I'm no longer sure that
    we need help(lsprof) to display them anyway, so
    let's use the standard '_lsprof.XXX' names.

    3 Profiling a large program took forever. I
    interrupted it after 30 minutes when it showed no
    sign of wanting to go past the initial step that
    normally takes only a few minutes. With rotating
    trees this step is fast again. Maybe they are not
    an optimal structure, though, because it still
    takes something like three times longer to finish
    the whole program (normally takes half an hour).

    4 No.
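The concern in point 1 can be observed from Python: a container that stores a code object holds a strong reference to it, which is what Py_INCREF() provides in the C code. The sketch below is specific to CPython (it relies on `sys.getrefcount`), and `table` is a hypothetical stand-in for the profiler's internal structure.

```python
import sys

code = compile("x = 1", "<demo>", "exec")

before = sys.getrefcount(code)

# Storing the object as a value makes the dict own one extra reference,
# so the code object cannot be freed while the entry exists.
table = {id(code): code}
after = sys.getrefcount(code)

# Dropping the container releases that reference again; this is the
# counterpart of the Py_DECREF() mentioned as missing above.
del table
released = sys.getrefcount(code)

print(after - before, released - before)
```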

    Checked in your proposed changes. What is still
    missing: deciding how similar to the profile.py
    and pstats.py API we need to be, and writing some
    tests (ideally with good test coverage).

    @zseil
    Mannequin

    zseil mannequin commented Mar 6, 2007

    Can this patch be closed? Python 2.5 has a new
    cProfile module, which, AFAIK, was derived from
    this patch.

    @arigo
    Mannequin

    arigo mannequin commented Mar 7, 2007

    Yes, this profiler is now available as the cProfile
    module of Python 2.5.

    For older Python versions, it is not packaged but
    can be checked out from my svn repository at
    http://codespeak.net/svn/user/arigo/hack/misc/lsprof .
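A minimal sketch of using cProfile on a current Python to get the per-child breakdown that motivated this issue. `leaf` and `parent` are hypothetical example functions; the `calls` attribute of each `getstats()` entry lists the functions called from that entry.

```python
import cProfile


def leaf():
    return sum(i * i for i in range(500))


def parent():
    results = []
    for _ in range(3):
        results.append(leaf())
    return results


profiler = cProfile.Profile()
profiler.runcall(parent)

# Locate the entry for parent() by identity of its code object.
parent_entry = None
for entry in profiler.getstats():
    if entry.code is parent.__code__:
        parent_entry = entry

# Each sub-entry describes one callee of parent(). Built-in callees carry a
# descriptive string in .code rather than a code object, so skip those here.
callees = {}
for sub in parent_entry.calls:
    if hasattr(sub.code, "co_name"):
        callees[sub.code.co_name] = sub.callcount

print(callees)
```

For quick interactive use, `cProfile.run("parent()")` prints a summary table instead of returning entries.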

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 9, 2022