Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Speed up _decimal import #63431

Closed
skrah mannequin opened this issue Oct 12, 2013 · 21 comments
Closed

Speed up _decimal import #63431

skrah mannequin opened this issue Oct 12, 2013 · 21 comments
Assignees
Labels
extension-modules C modules in the Modules dir performance Performance or resource usage

Comments

@skrah
Copy link
Mannequin

skrah mannequin commented Oct 12, 2013

BPO 19232
Nosy @warsaw, @rhettinger, @mdickinson, @pitrou, @vstinner, @ericvsmith, @skrah
Files
  • issue19232.patch
  • issue19232-2.patch: patch without --git option
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/skrah'
    closed_at = <Date 2014-09-12.09:55:05.220>
    created_at = <Date 2013-10-12.12:03:19.594>
    labels = ['extension-modules', 'performance']
    title = 'Speed up _decimal import'
    updated_at = <Date 2014-10-12.11:31:14.065>
    user = 'https://github.com/skrah'

    bugs.python.org fields:

    activity = <Date 2014-10-12.11:31:14.065>
    actor = 'python-dev'
    assignee = 'skrah'
    closed = True
    closed_date = <Date 2014-09-12.09:55:05.220>
    closer = 'skrah'
    components = ['Extension Modules']
    creation = <Date 2013-10-12.12:03:19.594>
    creator = 'skrah'
    dependencies = []
    files = ['32060', '32066']
    hgrepos = []
    issue_num = 19232
    keywords = ['patch']
    message_count = 21.0
    messages = ['199553', '199555', '199558', '199562', '199564', '199567', '199570', '199571', '199612', '199619', '199677', '199691', '199724', '199864', '199872', '216562', '226700', '226701', '226812', '226816', '229141']
    nosy_count = 10.0
    nosy_names = ['barry', 'rhettinger', 'mark.dickinson', 'pitrou', 'vstinner', 'eric.smith', 'Arfrever', 'skrah', 'fijall', 'python-dev']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'performance'
    url = 'https://bugs.python.org/issue19232'
    versions = ['Python 3.5']

    @skrah
    Copy link
    Mannequin Author

    skrah mannequin commented Oct 12, 2013

    As discussed on python-dev, importing _decimal at the bottom of
    decimal.py is about 9x slower than importing _decimal directly.

    @skrah skrah mannequin self-assigned this Oct 12, 2013
    @skrah skrah mannequin added extension-modules C modules in the Modules dir performance Performance or resource usage labels Oct 12, 2013
    @vstinner
    Copy link
    Member

    I proposed something similar for issue bpo-19229.

    @ericvsmith
    Copy link
    Member

    Remember that one reason for importing the C version at the bottom of the python version is so that alternate implementations (PyPy, IronPython, Jython) could provide partial versions of the C (or equivalent) versions. By importing after the Python version, the alternate implementation could continue to use parts of the Python code.

    I think the impact on alternate implementations needs to be considered before we start rearchitecting these imports.

    @skrah
    Copy link
    Mannequin Author

    skrah mannequin commented Oct 12, 2013

    Right, let's start collecting objections. :)

    Mark, Raymond: Would you support the change (name hack and all)?

    Maciej: Is this approach a problem for PyPy?

    @vstinner
    Copy link
    Member

    If the Python implementation is renamed to _pydecimal, I don't expect it to be used in CPython. I never used _pyio in a real application, only for some tests to debug. I don't think that we need the __name__ = 'decimal' "hack". If you really want to keep it, please add at least a comment explaining it.

    @skrah
    Copy link
    Mannequin Author

    skrah mannequin commented Oct 12, 2013

    I guess if some of the pickling stuff get's rewritten, we can drop
    __name__.

    The other thing is that traditionally the types were "decimal.Decimal"
    etc., so I'm not sure if it is good idea to have "_decimal.Decimal" and
    "_pydecimal.Decimal".

    Of course adding __module__ everywhere is another option.

    @vstinner
    Copy link
    Member

    The other thing is that traditionally the types were "decimal.Decimal"
    etc., so I'm not sure if it is good idea to have "_decimal.Decimal" and
    "_pydecimal.Decimal".

    Why not renaming the _decimal module to decimal?

    @skrah
    Copy link
    Mannequin Author

    skrah mannequin commented Oct 12, 2013

    _decimal already lies about its name (for pickling).

    @skrah
    Copy link
    Mannequin Author

    skrah mannequin commented Oct 12, 2013

    I can't apply the patch that was created with diff --git, so here is
    another one that is less readable but applies.

    @pitrou
    Copy link
    Member

    pitrou commented Oct 12, 2013

    I can't apply the patch that was created with diff --git, so here is
    another one that is less readable but applies.

    You can apply it using "hg import --no-commit", I think.

    @mdickinson
    Copy link
    Member

    Mark, Raymond: Would you support the change (name hack and all)?

    No objections here.

    @skrah
    Copy link
    Mannequin Author

    skrah mannequin commented Oct 13, 2013

    Antoine Pitrou <report@bugs.python.org> wrote:

    You can apply it using "hg import --no-commit", I think.

    mercurial 2.1 throws an exception even though the patch was created with
    that version. Now I upgraded to 2.7.2 and it works.

    Rietveld also seems to choke on the first patch (no review link).

    @rhettinger
    Copy link
    Contributor

    I think we should save these sort of tricks only for modules imported during startup.

    Ideally, a user should expect that the code for the decimal module is in decimal.py. Ideally, tools like IDLE's "Open Module" should be able to find the source code using only the module name.

    I'm -0 on this one. I don't think hacking up the source tree is worth it. The import is only done once -- the important part is what decimal does when it is imported.

    @skrah
    Copy link
    Mannequin Author

    skrah mannequin commented Oct 14, 2013

    About IDLE I can't say anything, but I'm not entirely sure if the
    PEP-399 import method is easier to understand for users, see e.g.:

    http://stackoverflow.com/questions/13194384/instantiate-decimal-class

    I get the impression that the posters at first did not even realize
    that there *is* an accelerator in Python 3.3, so they had a hard time
    debugging the issue.

    As for the load time: Personally I don't have any application
    that relies on a very short load time, but I wan't to note again
    that importing decimal at Python startup actually doubles the
    startup time.

    So if there are applications that start a new Python script
    for each request (e.g. via djb's tcpserver), it could really
    matter.

    OTOH I don't like moving code around either, so we can wait until
    there's a demonstrated need for the speedup.

    @pitrou
    Copy link
    Member

    pitrou commented Oct 14, 2013

    OTOH I don't like moving code around either, so we can wait until
    there's a demonstrated need for the speedup.

    Keep in mind that people who have startup speed problems aren't
    likely to open an issue on the tracker about it, let alone
    diagnose it enough to put the blame on decimal.

    @skrah
    Copy link
    Mannequin Author

    skrah mannequin commented Apr 16, 2014

    I would like to go ahead with this. As Antoine mentioned, most
    people don't diagnose import problems, especially when they compare
    Python 2 against Python 3.

    As it turns out, the slowdown is even significant in a simple tcpserver
    application that starts a Python script.

    Another benefit is that it will be possible to import the Python version
    easily, e.g. to check error messages (currently _decimal justs displays
    the signal in tracebacks). I've often wanted that myself instead of
    going through the import_fresh_module() ordeal.

    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Sep 10, 2014

    New changeset 8bf51cf94405 by Stefan Krah in branch 'default':
    Issue bpo-19232: Speed up decimal import. Additionally, since _decimal is
    http://hg.python.org/cpython/rev/8bf51cf94405

    @skrah
    Copy link
    Mannequin Author

    skrah mannequin commented Sep 10, 2014

    We could speed up the import further by not importing collections
    in _decimal. That could be done once structseq fully implements
    the namedtuple protocol (for DecimalTuple).

    @vstinner
    Copy link
    Member

    "We could speed up the import further by not importing collections
    in _decimal. That could be done once structseq fully implements
    the namedtuple protocol (for DecimalTuple)."

    I suggest to close this issue. I guess that importing decimal is already fast enough, and enhance structseq is a completly different issue. (Is there an open issue, just to get the link?)

    @skrah
    Copy link
    Mannequin Author

    skrah mannequin commented Sep 12, 2014

    I'm fine with closing this. The structseq issue is bpo-1820.

    @skrah skrah mannequin closed this as completed Sep 12, 2014
    @python-dev
    Copy link
    Mannequin

    python-dev mannequin commented Oct 12, 2014

    New changeset 75b5617b8dfc by Stefan Krah in branch 'default':
    Issue bpo-19232: Fix sys.modules lookup (--without-threads)
    https://hg.python.org/cpython/rev/75b5617b8dfc

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    extension-modules C modules in the Modules dir performance Performance or resource usage
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants