Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

splitext performances improvement #36353

Closed
skeim mannequin opened this issue Mar 29, 2002 · 9 comments
Closed

splitext performances improvement #36353

skeim mannequin opened this issue Mar 29, 2002 · 9 comments
Assignees
Labels
stdlib Python modules in the Lib dir

Comments

@skeim
Copy link
Mannequin

skeim mannequin commented Mar 29, 2002

BPO 536661
Nosy @tim-one, @loewis, @arigo
Files
  • posixpath.dif: posixpath.py and test_posixpath.py diff
  • xxxpath.dif
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/loewis'
    closed_at = <Date 2002-12-12.20:31:52.000>
    created_at = <Date 2002-03-29.08:06:22.000>
    labels = ['library']
    title = 'splitext performances improvement'
    updated_at = <Date 2002-12-12.20:31:52.000>
    user = 'https://bugs.python.org/skeim'

    bugs.python.org fields:

    activity = <Date 2002-12-12.20:31:52.000>
    actor = 'loewis'
    assignee = 'loewis'
    closed = True
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2002-03-29.08:06:22.000>
    creator = 's_keim'
    dependencies = []
    files = ['4095', '4096']
    hgrepos = []
    issue_num = 536661
    keywords = ['patch']
    message_count = 9.0
    messages = ['39410', '39411', '39412', '39413', '39414', '39415', '39416', '39417', '39418']
    nosy_count = 4.0
    nosy_names = ['tim.peters', 'loewis', 'arigo', 's_keim']
    pr_nums = []
    priority = 'normal'
    resolution = 'accepted'
    stage = None
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue536661'
    versions = ['Python 2.3']

    @skeim
    Copy link
    Mannequin Author

    skeim mannequin commented Mar 29, 2002

    After more thought, I must admit that the behavior change in splitext, I proposed with patch 536120 is not acceptable. So I would instead propose this one which should only improve performances without modifying behavior.
    The following bench says that patched splitext is between 2x(for l1) and 25x(for l2) faster than the original one.

    The diff patch also test_posixpath.py to check the pitfall described by Tim comments in patch 536120 page.

    def splitext(p):
        root, ext = '', ''
        for c in p:
            if c == '/':
                root, ext = root + ext + c, ''
            elif c == '.':
                if ext:
                    root, ext = root + ext, c
                else:
                    ext = c
            elif ext:
                ext = ext + c
            else:
                root = root + c
        return root, ext
    
    def splitext2(p):
        i = p.rfind('.')
        if i<=p.rfind('/'):
            return p, ''
        else:
            return p[:i], p[i:]
    
    l1 = ('t','.t','a.b/','a.b','/a.b','a.b/.c','a.b/c.d')
    
    l2 = (
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut.tyyttyt',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut.',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/.tyyttyt',
    'usr/tmp.doc/list/home/sebastien/foo/bar/hghgt/yttyutyuyuttyuyut',
    'reeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeeyttyutyuyuttyuyut.tyyttyt',
    '/iuouiiuuoiiuiikhjzekezhjzekejkejkzejkhejkhzejzehjkhjezhjkehzkhjezh.tyyttyt'
        )
    
    for i in l1+l2:
        assert splitext2(i) == splitext(i)
    
    import time
    
    def test(f,args):
        t = time.clock()
        for p in args:
            for i in range(1000):
                f(p)
        return time.clock() - t
    
    def f(p):pass
    
    a=test(splitext, l1)
    b=test(splitext2, l1)
    c=test(f,l1)
    print a,b,c,(a-c)/(b-c)
    
    a=test(splitext, l2)
    b=test(splitext2, l2)
    c=test(f,l2)
    print a,b,c,(a-c)/(b-c)

    @skeim skeim mannequin closed this as completed Mar 29, 2002
    @skeim skeim mannequin assigned loewis Mar 29, 2002
    @skeim skeim mannequin added the stdlib Python modules in the Lib dir label Mar 29, 2002
    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Mar 29, 2002

    Logged In: YES
    user_id=21627

    The patch looks good to me.

    @tim-one
    Copy link
    Member

    tim-one commented Mar 29, 2002

    Logged In: YES
    user_id=31435

    I like it fine so far as it goes, but I'd like it a lot
    more if it also patched the splitext and test
    implementations for other platforms. It's not good that,
    e.g., posixpath.py and ntpath.py get more and more out of
    synch over time, and that their test suites also diverge.

    @skeim
    Copy link
    Mannequin Author

    skeim mannequin commented Apr 2, 2002

    Logged In: YES
    user_id=498191

    I have take a look at macpath, dospath and ntpath. I have found quite a lot of code duplication. What would be your opinion, if I tried to do a little refactoring on this?

    1 similar comment
    @skeim
    Copy link
    Mannequin Author

    skeim mannequin commented Apr 2, 2002

    Logged In: YES
    user_id=498191

    I have take a look at macpath, dospath and ntpath. I have found quite a lot of code duplication. What would be your opinion, if I tried to do a little refactoring on this?

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Apr 2, 2002

    Logged In: YES
    user_id=21627

    Sharing code is a good thing. However, it would be critical
    as to how exactly this is done, since os is such a central
    module. If you start now, and don't get agreement
    immediately, it may well be that you cannot complete until
    Python 2.3.

    @skeim
    Copy link
    Mannequin Author

    skeim mannequin commented Apr 3, 2002

    Logged In: YES
    user_id=498191

    xxxpath.dif contains the splitext patch for posixpath, ntpath, dospath macpath and the corresponding test files (I have added a test file for macpath).

    I have found better to not attempt to modify riscospath.py since I don't know this platform. Anyway, it already use a rfind strategy.

    @arigo
    Copy link
    Mannequin

    arigo mannequin commented Dec 7, 2002

    Logged In: YES
    user_id=4771

    The test_macpath module should probably use

        from test import test_support

    instead of

        import test_support

    Apart from this the patch looks fine.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Dec 12, 2002

    Logged In: YES
    user_id=21627

    Sebastien, thanks for the patch, and Armin, thanks for the
    review. Applied as

    macpath.py 1.41
    ntpath.py 1.52
    posixpath.py 1.55
    test_macpath.py 1.1
    test_ntpath.py 1.17
    test_posixpath.py 1.5

    (dospath has gone meanwhile)

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 9, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    1 participant