Fastcall uses more C stack #73044

Closed
vstinner opened this issue Dec 2, 2016 · 10 comments
Labels
3.7 (EOL) end of life

Comments

@vstinner
Member

vstinner commented Dec 2, 2016

BPO 28858
Nosy @vstinner, @serhiy-storchaka
Files
  • stack_overflow.py
  • stack_overflow.py
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2016-12-05.17:45:34.448>
    created_at = <Date 2016-12-02.07:54:27.620>
    labels = ['3.7']
    title = 'Fastcall uses more C stack'
    updated_at = <Date 2016-12-05.17:45:34.447>
    user = 'https://github.com/vstinner'

    bugs.python.org fields:

    activity = <Date 2016-12-05.17:45:34.447>
    actor = 'vstinner'
    assignee = 'none'
    closed = True
    closed_date = <Date 2016-12-05.17:45:34.448>
    closer = 'vstinner'
    components = []
    creation = <Date 2016-12-02.07:54:27.620>
    creator = 'vstinner'
    dependencies = []
    files = ['45731', '45757']
    hgrepos = []
    issue_num = 28858
    keywords = []
    message_count = 10.0
    messages = ['282228', '282247', '282252', '282370', '282371', '282373', '282377', '282385', '282426', '282447']
    nosy_count = 3.0
    nosy_names = ['vstinner', 'python-dev', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue28858'
    versions = ['Python 3.7']

    @vstinner
    Member Author

    vstinner commented Dec 2, 2016

    Serhiy Storchaka reported that Python 3.6 crashes earlier than Python 3.5 when calling json.dumps() after sys.setrecursionlimit() is increased.

    I tested the script he wrote. Results on Python built in release mode:

    Python 3.7:

    ...
    58100 116204
    Segmentation fault (core dumped)

    Python 3.6:

    ...
    74800 149604
    Segmentation fault (core dumped)

    Python 3.5:

    ...
    74700 149404
    Segmentation fault (core dumped)

    Oh, it seems like Python 3.7 does crash earlier.

    But to be clear, it's hard to control the usage of the C stack.
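
Because C stack usage depends on the compiler, the optimization level and the architecture, it is usually easier to measure than to predict. The following is only a rough sketch of one way to estimate the stack consumed per recursion level. It assumes a contiguous stack and an available `uintptr_t`; the helper names are hypothetical, and this is not the stack_overflow.py script attached to the issue.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: recurse `depth` levels and return the address of a
   local variable that lives in the deepest frame. */
static uintptr_t deepest_local_address(int depth)
{
    volatile char marker = 0;
    uintptr_t result = (uintptr_t)&marker;

    assert(depth >= 0);
    if (depth > 0) {
        result = deepest_local_address(depth - 1);
    }
    /* A volatile write after the call discourages the compiler from turning
       the recursion into a tail call that would reuse a single frame. */
    marker = 1;
    return result;
}

/* Hypothetical helper: rough average of stack bytes used per recursion
   level, computed from the distance between a shallow and a deep local. */
static size_t stack_bytes_per_level(int depth)
{
    assert(depth > 0);
    uintptr_t top = deepest_local_address(0);
    uintptr_t bottom = deepest_local_address(depth);
    /* The stack may grow up or down, so take the absolute distance. */
    uintptr_t span = top > bottom ? top - bottom : bottom - top;
    return (size_t)(span / (uintptr_t)depth);
}
```

Calling `stack_bytes_per_level(1000)` gives an approximate figure for this toy function only. The value changes with compiler flags and inlining decisions, which is the same kind of variability the results above show between Python versions.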

    @vstinner vstinner added the 3.7 (EOL) end of life label Dec 2, 2016
    @vstinner
    Member Author

    vstinner commented Dec 2, 2016

    Oh, I hadn't understood that the regression was introduced by revision b9c9691c72c5. The purpose of this revision was to *reduce* the memory usage of the C stack!?

    It seems like _PyObject_CallArg1() uses more stack memory than PyObject_CallFunctionObjArgs(). PyObject_CallFunctionObjArgs() allocates 40 bytes (5*sizeof(PyObject*)) on the stack.

    At least, I can say that when the crash occurs, _PyObject_FastCallDict() is not in the gdb backtrace.

    @serhiy-storchaka
    Member

    Yes, that is why I asked you to revert your changes.

    In addition, they introduced compiler warnings.

    @python-dev
    Mannequin

    python-dev mannequin commented Dec 4, 2016

    New changeset d35fc6e58a70 by Victor Stinner in branch 'default':
    Backed out changeset b9c9691c72c5
    https://hg.python.org/cpython/rev/d35fc6e58a70

    @vstinner
    Member Author

    vstinner commented Dec 4, 2016

    "Serhiy Storchaka reported that Python 3.6 crashes earlier than Python 3.5 when calling json.dumps() after sys.setrecursionlimit() is increased."

    Reference: http://bugs.python.org/issue23507#msg282190 (issue bpo-23507).

    Serhiy Storchaka: "Yes, that is why I asked you to revert your changes."

    Sorry, I misunderstood your comments. So yes, my change b9c9691c72c5 introduced a regression. Sorry, I didn't have time before now to revert my change. I just pushed the change d35fc6e58a70 which reverts b9c9691c72c5.

    The question is how replacing PyObject_CallFunctionObjArgs() with _PyObject_CallArg1() increases the usage of the C stack. I wrote my change to reduce the usage of the C stack.

    PyObject_CallFunctionObjArgs() allocates 5 "PyObject *", so 40 bytes, on the C stack. Maybe using _PyObject_CallArg1() increases the usage of C stack in the *caller*.
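
The 40-byte figure follows from five pointers of 8 bytes each on a 64-bit build. A minimal sketch of that arithmetic, using a hypothetical `DemoObject` type in place of the real `PyObject` (on a 32-bit build the same array would take 20 bytes):

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for CPython's PyObject; only pointers to it are
   needed here, so the type is left incomplete. */
typedef struct demo_object DemoObject;

/* Number of pointer slots in the on-stack array, per the comment above. */
#define SMALL_STACK_SLOTS 5

/* Bytes reserved by an on-stack array of SMALL_STACK_SLOTS object pointers. */
static size_t small_stack_bytes(void)
{
    DemoObject *small_stack[SMALL_STACK_SLOTS];

    assert(sizeof(small_stack) == SMALL_STACK_SLOTS * sizeof(DemoObject *));
    return sizeof(small_stack);
}
```

This only accounts for the array itself. The compiler may reserve additional stack for saved registers, alignment and other locals, so the real frame is larger.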

    "In addition, they introduced compiler warnings."

    This was fixed by Benjamin Peterson in issue bpo-28855 (change 96245d4af0ca).

    @serhiy-storchaka
    Member

    Thanks Victor.

    The following script includes several examples of triggering a stack overflow (most are real-world examples or can be used in real-world examples). It measures the maximum depth for every type of recursion.

    @vstinner
    Member Author

    vstinner commented Dec 4, 2016

    When I wrote _PyObject_CallArg1(), it looked like a cool hack:

    #define _PyObject_CallArg1(func, arg) \
        _PyObject_FastCall((func), (PyObject **)&(arg), 1)

    It avoids declaring an explicit "stack" like:

       PyObject *stack[1];
       stack[0] = arg;
       res = _PyObject_FastCall(func, stack, 1);

    I expected the C compiler to magically compute the memory address of the argument. But it seems that requesting the memory address of an argument allocates something on the C stack.

    On x86_64, the first function arguments are passed in CPU registers. Maybe requesting the memory address of an argument requires allocating a local variable and copying the register into it, in order to take the address of that local variable?
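
The two calling styles discussed here can be put side by side in a small, self-contained sketch. `DemoObject` and `demo_fast_call()` are hypothetical stand-ins for `PyObject` and `_PyObject_FastCall()`, so this shows only the shape of the code and says nothing about what CPython's compiler output actually looks like.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PyObject; the real type is declared in Python.h. */
typedef struct {
    int value;
} DemoObject;

/* Hypothetical stand-in for _PyObject_FastCall(): it receives a C array of
   object pointers plus a count, and here simply sums the stored values. */
static int demo_fast_call(DemoObject **args, size_t nargs)
{
    int total = 0;

    assert(args != NULL);
    for (size_t i = 0; i < nargs; i++) {
        total += args[i]->value;
    }
    return total;
}

/* Style of the macro: pass the address of the parameter itself. The
   parameter may arrive in a register, which has no address, so the compiler
   has to give it a memory slot before &arg can be formed (unless the callee
   is inlined and the address is optimized away). */
static int call_one_by_param_address(DemoObject *arg)
{
    return demo_fast_call(&arg, 1);
}

/* Explicit style: copy the argument into a one-element array on the stack. */
static int call_one_by_explicit_array(DemoObject *arg)
{
    DemoObject *stack[1];

    stack[0] = arg;
    return demo_fast_call(stack, 1);
}
```

Both functions return the same result, so any difference is in how much stack the compiler reserves for each. With GCC, that can be checked by compiling with `-fstack-usage` or by reading the assembly produced by `-S`.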

    So I suggest *removing* the _PyObject_CallArg1() macro and using existing functions like PyObject_CallFunctionObjArgs().

    What do you think, Serhiy?

    @serhiy-storchaka
    Member

    That was my initial preference. Mainly because this doesn't add code churn.

    But I don't understand how PyObject_CallFunctionObjArgs(), which uses _PyObject_CallArg1() and has many local variables, can consume less stack than _PyObject_CallArg1() itself.

    @python-dev
    Mannequin

    python-dev mannequin commented Dec 5, 2016

    New changeset 4171cc26272c by Victor Stinner in branch 'default':
    Issue bpo-28858: Remove _PyObject_CallArg1() macro
    https://hg.python.org/cpython/rev/4171cc26272c

    @vstinner
    Member Author

    vstinner commented Dec 5, 2016

    The changeset 4171cc26272c "Remove _PyObject_CallArg1() macro" fixed the initial bug report, so I now close the issue.

    Serhiy: If you see further enhancements, please open a new issue. Your second stack_overflow.py script is interesting, but I don't see any obvious possible changes to enhance results.

    Thanks Serhiy for digging into this issue ;-)

    @vstinner vstinner closed this as completed Dec 5, 2016
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022