Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed-up dict.copy() up to 5.5 times. #75362

Closed
1st1 opened this issue Aug 10, 2017 · 13 comments
Closed

Speed-up dict.copy() up to 5.5 times. #75362

1st1 opened this issue Aug 10, 2017 · 13 comments
Labels
3.7 interpreter-core performance

Comments

@1st1
Copy link
Member

@1st1 1st1 commented Aug 10, 2017

BPO 31179
Nosy @rhettinger, @vstinner, @methane, @serhiy-storchaka, @1st1
PRs
  • #3067
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2018-01-22.16:55:17.220>
    created_at = <Date 2017-08-10.21:31:33.005>
    labels = ['interpreter-core', '3.7', 'performance']
    title = 'Speed-up dict.copy() up to 5.5 times.'
    updated_at = <Date 2018-01-22.16:55:17.219>
    user = 'https://github.com/1st1'

    bugs.python.org fields:

    activity = <Date 2018-01-22.16:55:17.219>
    actor = 'yselivanov'
    assignee = 'none'
    closed = True
    closed_date = <Date 2018-01-22.16:55:17.220>
    closer = 'yselivanov'
    components = ['Interpreter Core']
    creation = <Date 2017-08-10.21:31:33.005>
    creator = 'yselivanov'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 31179
    keywords = []
    message_count = 13.0
    messages = ['300117', '300120', '300123', '300153', '300155', '300159', '300163', '300164', '300423', '310380', '310426', '310428', '310432']
    nosy_count = 5.0
    nosy_names = ['rhettinger', 'vstinner', 'methane', 'serhiy.storchaka', 'yselivanov']
    pr_nums = ['3067']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue31179'
    versions = ['Python 3.7']

    @1st1
    Copy link
    Member Author

    @1st1 1st1 commented Aug 10, 2017

    It's possible to significantly improve performance of shallow dict copy. Currently, PyDict_Copy creates a new empty dict object and then inserts key/values into it one by one.

    My idea is to simply memcpy the whole keys/items region and do the necessary increfs after it. This works just fine for non-key-sharing dicts.

    With the following simple microbenchmark:

        import time
    
        N = 1000000
    
        for size in [0, 1, 10, 20, 50, 100, 500, 1000]:
            d = dict([(str(i), i) for i in range(size)])
    
            t = time.monotonic()
            for i in range(N):
                d.copy()
            e = time.monotonic() - t
        print(f'dict(size={size}).copy() x {N} times:\t {e:.4f}')
    

    Output for 3.7 master:

    dict(size=0).copy() x 1000000 times:     0.1299
    dict(size=1).copy() x 1000000 times:     0.1499
    dict(size=10).copy() x 1000000 times:    0.3758
    dict(size=20).copy() x 1000000 times:    0.7722
    dict(size=50).copy() x 1000000 times:    1.2784
    dict(size=100).copy() x 1000000 times:   2.5128
    dict(size=500).copy() x 1000000 times:   12.8968
    dict(size=1000).copy() x 1000000 times:  25.4276
    

    Output for patched 3.7:

    dict(size=0).copy() x 1000000 times:     0.1352
    dict(size=1).copy() x 1000000 times:     0.1285
    dict(size=10).copy() x 1000000 times:    0.1632
    dict(size=20).copy() x 1000000 times:    0.3076
    dict(size=50).copy() x 1000000 times:    0.3663
    dict(size=100).copy() x 1000000 times:   0.5140
    dict(size=500).copy() x 1000000 times:   2.3419
    dict(size=1000).copy() x 1000000 times:  4.6176
    

    @1st1 1st1 added 3.7 interpreter-core performance labels Aug 10, 2017
    @vstinner
    Copy link
    Member

    @vstinner vstinner commented Aug 10, 2017

    PyDict_Copy creates a new empty dict object and then inserts key/values into it one by one.

    Why not creating a "preallocated" dict in that case? _PyDict_NewPresized()

    @1st1
    Copy link
    Member Author

    @1st1 1st1 commented Aug 10, 2017

    > PyDict_Copy creates a new empty dict object and then inserts key/values into it one by one.

    Why not creating a "preallocated" dict in that case? _PyDict_NewPresized()

    I don't think it's related to the proposed patch. Please take a look at the PR. _PyDict_NewPresized and inserting entries one by one will not be faster than a memcpy.

    @methane
    Copy link
    Member

    @methane methane commented Aug 11, 2017

    I like idea.
    One worrying point is how to deal with dirty dict.
    How about do it only when ma_used == keys->dk_nentries?

    Slightly off topic. Copy on write can be implemented via dk_refcnt.
    Functions just passing **kwargs has temporal copy of dict.
    And CoW will reduce the cost of temporal dict.

    @serhiy-storchaka
    Copy link
    Member

    @serhiy-storchaka serhiy-storchaka commented Aug 11, 2017

    The PR adds over 50 lines of code for optimising not very often used feature. There are two obvious ways of copying, dict(d) and d.copy(), the PR optimises just the latter one, and I'm not sure this is the most used way. The PR duplicates the low-level code, that increases maintainability cost.

    The PR changes the behavior. Currently the effect of copying is compacting the dict.

    >>> import sys
    >>> sys.getsizeof(d)
    41020
    >>> sys.getsizeof(d.copy())
    41020
    >>> sys.getsizeof(dict(d))
    41020
    >>> for i in range(1000): del d[i]
    ... 
    >>> sys.getsizeof(dict(d))
    20544
    >>> sys.getsizeof(d.copy())
    20544
    >>> sys.getsizeof(d)
    41020
    >>> import sys
    >>> d = dict.fromkeys(range(2000))
    >>> sys.getsizeof(d)
    41020
    >>> sys.getsizeof(d.copy())
    41020
    >>> d = dict.fromkeys(range(2000))
    >>> for i in range(1999): del d[i]
    ... 
    >>> sys.getsizeof(d)
    41020
    >>> sys.getsizeof(d.copy())
    136

    The PR preserves non compact layout in the copy.

    @vstinner
    Copy link
    Member

    @vstinner vstinner commented Aug 11, 2017

    >>> d = dict.fromkeys(range(2000))
    >>> for i in range(1999): del d[i]
    ...
    >>> sys.getsizeof(d)
    41020
    >>> sys.getsizeof(d.copy())
    136

    Why "del" doesn't compact the dict?

    @1st1
    Copy link
    Member Author

    @1st1 1st1 commented Aug 11, 2017

    I like idea.
    One worrying point is how to deal with dirty dict.
    How about do it only when ma_used == keys->dk_nentries?

    I've added this check. See the updated PR.

    The PR changes the behavior. Currently the effect of copying is compacting the dict.

    The check that INADA suggested enables compacting on copy, if it is needed.

    The PR adds over 50 lines of code for optimising not very often used feature.

    I started to look into the problem because I need this for my upcoming PEP, so please don't dismiss this idea right away.

    I also think that copying a dict isn't a "not very often used feature", it depends on your frame of references. In some applications you do copy dict a lot. 50 lines of code speeding up one of the core methods 5.5x is a fair price to pay.

    There are two obvious ways of copying, dict(d) and d.copy()

    That can also be easily optimized, btw. I'll see if I can do that without impacting the performance of creating new dicts.

    The PR duplicates the low-level code, that increases maintainability cost.

    FWIW, the PR doesn't duplicate any of the code. It provides a new implementation that is more efficient than the old approach.

    @1st1
    Copy link
    Member Author

    @1st1 1st1 commented Aug 11, 2017

    Why "del" doesn't compact the dict?

    This is a good question, btw.

    @serhiy-storchaka
    Copy link
    Member

    @serhiy-storchaka serhiy-storchaka commented Aug 17, 2017

    The side effect of this patch is making dict.copy() atomic. This is a worthy feature if extent it to dict constructor. For now the only way of making an atomic (or almost atomic) copy of a dict is dict(list(d.itemview())). It isn't very time and memory efficient.

    If you will make dict copying removing holes and extend your patch to dict constructor, it could be more useful.

    Look at the set implementation. It doesn't just use memcpy, but it contains specialized insertion implementation for the case if all items are unique. Fast copying is more important for dicts since the copying is more common for sets. It is a part of set operations and it is common to convert a set to a frozenset.

    @1st1
    Copy link
    Member Author

    @1st1 1st1 commented Jan 21, 2018

    I've pushed a new version of the patch that I intend to merge tomorrow.

    The last version has only one minor change: it uses fast-path for "slightly" non-compact dicts too (dicts don't use *at most* 3 entries). This protects us from pathological cases when a huge dict being almost emptied with pop/del and then gets copied -- we indeed want the copy to be compact.

    Although I believe that the real issue is that del and pop don't compact dicts from time to time, but I don't want that issue to hold off this patch in any way.

    If you will make dict copying removing holes and extend your patch to dict constructor, it could be more useful.

    It's already useful -- I'm supporting a large code base (>0.5M LOC) which uses dict.copy() extensively, and it shows up in profile. I've seen it in many other places (particularly ORMs love to store information in dicts and use dict.copy() to track dirty state/changes). Please don't say that dict.copy() is not a common operation or that dict(other_dict) is more common than other_dict.copy() -- that's simply incorrect.

    The side effect of this patch is making dict.copy() atomic. This is a worthy feature if extent it to dict constructor.

    I agree. I'll work on that later in a follow-up PR. Let's move in small steps.

    @vstinner
    Copy link
    Member

    @vstinner vstinner commented Jan 22, 2018

    Yury: Would you mind to open an issue to investigate why dict are not compatected automatically?

    @1st1
    Copy link
    Member Author

    @1st1 1st1 commented Jan 22, 2018

    @1st1
    Copy link
    Member Author

    @1st1 1st1 commented Jan 22, 2018

    New changeset b0a7a03 by Yury Selivanov in branch 'master':
    bpo-31179: Make dict.copy() up to 5.5 times faster. (bpo-3067)
    b0a7a03

    @1st1 1st1 closed this as completed Jan 22, 2018
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 interpreter-core performance
    Projects
    None yet
    Development

    No branches or pull requests

    4 participants