Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Reduce temporary unicode object while adding descriptors #73569

Closed
methane opened this issue Jan 28, 2017 · 5 comments
Closed

Reduce temporary unicode object while adding descriptors #73569

methane opened this issue Jan 28, 2017 · 5 comments
Assignees
Labels
3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs)

Comments

@methane
Copy link
Member

methane commented Jan 28, 2017

BPO 29383
Nosy @methane, @serhiy-storchaka
Files
  • descr-remove-getitemstring.patch
  • descr-remove-getitemstring-2.patch
  • dict-setitemstring.patch: removes unnecessary comment.
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/methane'
    closed_at = <Date 2017-01-28.07:39:33.947>
    created_at = <Date 2017-01-28.03:04:14.956>
    labels = ['interpreter-core', '3.7']
    title = 'Reduce temporary unicode object while adding descriptors'
    updated_at = <Date 2017-01-28.07:39:33.946>
    user = 'https://github.com/methane'

    bugs.python.org fields:

    activity = <Date 2017-01-28.07:39:33.946>
    actor = 'methane'
    assignee = 'methane'
    closed = True
    closed_date = <Date 2017-01-28.07:39:33.947>
    closer = 'methane'
    components = ['Interpreter Core']
    creation = <Date 2017-01-28.03:04:14.956>
    creator = 'methane'
    dependencies = []
    files = ['46441', '46443', '46444']
    hgrepos = []
    issue_num = 29383
    keywords = ['patch']
    message_count = 5.0
    messages = ['286394', '286399', '286400', '286401', '286404']
    nosy_count = 3.0
    nosy_names = ['methane', 'python-dev', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue29383'
    versions = ['Python 3.7']

    @methane
    Copy link
    Member Author

    methane commented Jan 28, 2017

    add_methods(), add_members(), and add_getset() creates PyUnicode from C string 3 times, and calls PyUnicode_InternInplace 2 times.

    1. PyDict_GetItemString() at first. (PyUnicode_FromString() is called).
    2. In middle, descr_new() calls PyUnicode_InternFromString().
    3. PyDict_SetItemString() at last. (creates unicode and intern it).

    Skipping (2) is require adding new private APIs to pass PyUnicodeObject.
    But I don't think it worth enough. (I'll try it later.)
    So this patch only remove last temporary unicode.
    (3 PyUnicode_FromString + 2 PyUnicode_InternInplace) becomes (2 PyUnicode_FromString + 2 PyUnicode_InternInplace).

    It seems ~1% startup speedup (without site).

      $ ./python -m performance.benchmarks.bm_python_startup --no-site
      default: python_startup_no_site: Median +- std dev: 12.7 ms +- 0.1 ms
      patched: python_startup_no_site: Median +- std dev: 12.6 ms +- 0.1 ms

    While speedup is small, this patch removes time to think "How large this
    overhead of GetItemString + SetItemString pair?" while reading code :)

    Additionally, this patch removes this comment in PyDict_SetItemString:

    • PyUnicode_InternInPlace(&kv); /* XXX Should we really? */

    SetItemString is used to add something to namespace.
    Changing this behavior affects too widely. So we should do it.

    @methane methane added the 3.7 (EOL) end of life label Jan 28, 2017
    @methane methane self-assigned this Jan 28, 2017
    @methane methane added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Jan 28, 2017
    @methane
    Copy link
    Member Author

    methane commented Jan 28, 2017

    I think I found better way.

    Interned string can be get from descripter. Interning can be reduced
    without adding private API.
    Please don't review the first patch.

    @methane
    Copy link
    Member Author

    methane commented Jan 28, 2017

    descr-remove-getitemstring-2.patch is more compact than first patch.
    (3 unicode + 2 intern) becomes (2 unicode + 1 intern).

    python_startup_no_site: Median +- std dev: 12.5 ms +- 0.1 ms

    @serhiy-storchaka
    Copy link
    Member

    descr-remove-getitemstring-2.patch LGTM.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Jan 28, 2017

    New changeset d7ec72c1620c by INADA Naoki in branch 'default':
    Issue bpo-29383: reduce temporary interned unicode
    https://hg.python.org/cpython/rev/d7ec72c1620c

    @methane methane closed this as completed Jan 28, 2017
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 (EOL) end of life interpreter-core (Objects, Python, Grammar, and Parser dirs)
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants