Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Re-enable threading test on macOS #62249

Closed
ronaldoussoren opened this issue May 24, 2013 · 30 comments
Closed

Re-enable threading test on macOS #62249

ronaldoussoren opened this issue May 24, 2013 · 30 comments
Assignees
Labels
3.8 (EOL) end of life 3.9 only security fixes OS-mac type-bug An unexpected behavior, bug, or error

Comments

@ronaldoussoren
Copy link
Contributor

BPO 18049
Nosy @ronaldoussoren, @vstinner, @ned-deily, @zooba, @matrixise, @aixtools, @miss-islington, @tirkarthi, @aeros
PRs
  • bpo-18049: Sync thread stack size to main thread size on macOS #14748
  • [3.8] bpo-18049: Sync thread stack size to main thread size on macOS (GH-14748) #15065
  • bpo-18049: Reintroduce platform guard in recursion limit test #15075
  • bpo-18049: Define THREAD_STACK_SIZE for AIX to pass default recursion limit test #15081
  • [3.8] bpo-18049: Define THREAD_STACK_SIZE for AIX to pass default recursion limit test (GH-15081) #15089
  • [3.8] bpo-18049: Define THREAD_STACK_SIZE for AIX to pass default recursion limit test (GH-15081) #15572
  • Files
  • reenable-threading-test.txt
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/ronaldoussoren'
    closed_at = <Date 2019-08-03.07:20:18.384>
    created_at = <Date 2013-05-24.14:27:46.013>
    labels = ['OS-mac', 'type-bug', '3.8', '3.9']
    title = 'Re-enable threading test on macOS'
    updated_at = <Date 2019-09-06.14:30:57.071>
    user = 'https://github.com/ronaldoussoren'

    bugs.python.org fields:

    activity = <Date 2019-09-06.14:30:57.071>
    actor = 'ronaldoussoren'
    assignee = 'ronaldoussoren'
    closed = True
    closed_date = <Date 2019-08-03.07:20:18.384>
    closer = 'ronaldoussoren'
    components = ['macOS']
    creation = <Date 2013-05-24.14:27:46.013>
    creator = 'ronaldoussoren'
    dependencies = []
    files = ['30359']
    hgrepos = []
    issue_num = 18049
    keywords = ['patch', 'needs review']
    message_count = 30.0
    messages = ['189912', '192389', '270766', '271071', '272116', '323757', '347830', '347868', '347888', '347901', '348839', '348847', '348860', '348875', '348876', '348887', '348891', '348894', '348895', '348896', '348897', '348899', '348900', '348945', '350083', '350718', '351180', '351185', '351195', '351256']
    nosy_count = 9.0
    nosy_names = ['ronaldoussoren', 'vstinner', 'ned.deily', 'steve.dower', 'matrixise', 'Michael.Felt', 'miss-islington', 'xtreak', 'aeros']
    pr_nums = ['14748', '15065', '15075', '15081', '15089', '15572']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue18049'
    versions = ['Python 3.8', 'Python 3.9']

    @ronaldoussoren
    Copy link
    Contributor Author

    I recently notied that test.test_threading.ThreadingExceptionTests.test_recursion_limit is disabled for debug builds.

    The attached patch re-enables this test case and fixes the crash be increasing the stack size for new threads.

    The disadvantage of the patch is that is increases the default stack size for all new threads on OSX from 5 to 8 megabytes (for threads created by Python) and that could affect program behavior as fewer threads can be created before running out of memory.

    The patch could be updated to check for Py_DEBUG and only increase the stack size for --with-pydebug builds (which are the ones using -O0), as those builds need the increased stack size and debug builds already consume more memory that normal builds. That's not 100% reliable though, as users could also build the with CFLAGS set to -O0 without using --with-pydebug.

    I haven't found a way to detect the optimization level with clang or gcc using the preprocessor.

    @ronaldoussoren ronaldoussoren self-assigned this May 24, 2013
    @ronaldoussoren ronaldoussoren added OS-mac type-bug An unexpected behavior, bug, or error labels May 24, 2013
    @ronaldoussoren
    Copy link
    Contributor Author

    I intend to apply this patch to default (and only that branch) after running the testsuite and with a small change: the stack size will be 0x1000000 (double it current value) to sync it with the stack size for the main thread, which was recently set to that value in the default branch.

    @matrixise
    Copy link
    Member

    for my part, I think you should use a #ifdef PY_DEBUG in your code and increase the default size of the stack for the threads, and only if you are on APPLE and PY_DEBUG mode.

    @ronaldoussoren
    Copy link
    Contributor Author

    The patch increases the stack size for threads to slightly less than size as is used for the main thread, see this fragment in configure.ac:

    # Issue bpo-18075: the default maximum stack size (8MBytes) is too
    # small for the default recursion limit. Increase the stack size
    # to ensure that tests don't crash
    LINKFORSHARED="-Wl,-stack_size,1000000 $LINKFORSHARED"

    Maybe the patch should be updated to use the same size?

    The disadvantage of any increase of size is that this reduces the number of threads that can be used, in particular for 32-bit builds (64-bit builds have more than enough address space)

    @matrixise
    Copy link
    Member

    yes, maybe, could you provide a patch ?

    @ronaldoussoren
    Copy link
    Contributor Author

    I have a patch for making the stack sizes the same in bpo-34264.

    @ronaldoussoren
    Copy link
    Contributor Author

    I've converted the patch to a pull request (and closed to other issue because that has a patch that is less complete).

    I'm not sure if it is acceptable to backport this patch to 3.8 at this time.

    @ronaldoussoren ronaldoussoren added 3.8 (EOL) end of life 3.9 only security fixes labels Jul 13, 2019
    @ronaldoussoren ronaldoussoren changed the title Re-enable threading test on OSX Re-enable threading test on macOS Jul 13, 2019
    @zooba
    Copy link
    Member

    zooba commented Jul 14, 2019

    I believe this is acceptable to include in 3.8, provided it makes sense for 3.9, so I added the backport tag on the PR.

    That said, I'm not making any comment on whether it's appropriate at all :)

    For Windows we found some ways to reduce the stack usage of debug builds (IIRC, refactoring some large local variables into their own function) and improved recursion depth that way. Without any data on whether worker threads generally go deeper than the main thread, I would hesitate to say that they ought to be the same, as there is a definite advantage to having a smaller stack by default when you have many worker threads.

    But if someone else confidently approves the change, it can go into 3.8.

    @ronaldoussoren
    Copy link
    Contributor Author

    I've done some more testing, and the thread stack size could be smaller than 16MB, but definitely not as small as the value that's currently in thread_pthread.h.

    I prefer to have a slightly too large value here because IIRC the relation between the recursion limit and C stack usage is a heuristic.

    BTW. It should be possible to implement PyOS_CheckStack() on macOS. There are (undocumented) APIs in the public header files to get the stack base and size for the current thread. These APIs also work for the main thread:

    # ---

    #include <pthread.h>
    #include <stdio.h>
    
    int main(void)
    {
            printf("%#lx\n", (long)pthread_get_stacksize_np(pthread_self()));
            printf("%p\n", pthread_get_stackaddr_np(pthread_self()));
    }

    # ---

    @ronaldoussoren
    Copy link
    Contributor Author

    W.r.t. PyOS_CheckStack: I had that idea last year as well, and someone contributed a pull request, see bpo-33955.

    @ronaldoussoren
    Copy link
    Contributor Author

    New changeset 1a057ba by Ronald Oussoren in branch 'master':
    bpo-18049: Sync thread stack size to main thread size on macOS (GH-14748)
    1a057ba

    @vstinner
    Copy link
    Member

    vstinner commented Aug 1, 2019

    I'm not sure if it's related, but test_threading.test_recursion_limit() started to crash on two AIX buildbot workers after commit 1a057ba:
    https://bugs.python.org/issue36273#msg348845

    @miss-islington
    Copy link
    Contributor

    New changeset 8399641 by Miss Islington (bot) in branch '3.8':
    bpo-18049: Sync thread stack size to main thread size on macOS (GH-14748)
    8399641

    @ronaldoussoren
    Copy link
    Contributor Author

    That is related, my patch removes a guard that should enable this test only on macOS.

    This fragment is the issue:

    • @unittest.skipUnless(sys.platform == 'darwin' and test.support.python_is_optimized(),
    •                        'test macosx problem')
      

    I'll create a pull request for this.

    BTW. This does point to an issue with AIX though, this means it is possible to crash threads by deep recursion on AIX. But that's separate from this issue.

    @ronaldoussoren
    Copy link
    Contributor Author

    The patch to fix this should be:

    diff --git a/Lib/test/test_threading.py b/Lib/test/test_threading.py
    index 1466d25e94..658bc1c7a0 100644
    --- a/Lib/test/test_threading.py
    +++ b/Lib/test/test_threading.py
    @@ -1057,6 +1057,7 @@ class ThreadingExceptionTests(BaseTestCase):
             lock = threading.Lock()
             self.assertRaises(RuntimeError, lock.release)
     
    +    @unittest.skipUnless(sys.platform == 'darwin', 'test macos problem')
         def test_recursion_limit(self):
             # Issue 9670
             # test that excessive recursion within a non-main thread causes

    PR will follow after I've run tests locally, I'll probably create the PR in the morning.

    @ronaldoussoren
    Copy link
    Contributor Author

    PR 15075 should fix this issue. As mentioned in that PR I do have a question w.r.t. the exact shape of that patch: I currently reintroduce a "skipUnless" decorator that only runs this test on macOS, I could also introduce one that just disables the test on AIX as that's where the test fails.

    P.S. According to this old developer works blog AIX has a very small default stack size for threads (like more BSD systems) and also has an API to change the size of those stacks. It's probably worthwhile to add some code to thread_pthread.h to do this.

    The blog: https://www.ibm.com/developerworks/community/blogs/5894415f-be62-4bc0-81c5-3956e82276f3/entry/don_t_forget_about_stack_size_limits_for_multi_threaded_applications_on_aix?lang=en

    @aixtools
    Copy link
    Contributor

    aixtools commented Aug 2, 2019

    @ronald - thanks.

    Gives me something to work from. Would not have found this so easily!

    But - where to put it... :)

    @ronaldoussoren
    Copy link
    Contributor Author

    @michael: The relevant code is in Python/thread_pthread.h:

    /* The default stack size for new threads on OSX and BSD is small enough that

    There are already blocks for macOS and FreeBSD, adding a block for AIX should be easy enough. I don't know what a good size for the stack would be though, the macOS size is large enough to not cause crashes with the default recursion limit in debug builds and was experimentally determined.

    @aixtools
    Copy link
    Contributor

    aixtools commented Aug 2, 2019


    Looking in ./Python/thread_pthread.h"

    +252 #if defined(THREAD_STACK_SIZE)
    +253 PyThreadState *tstate = _PyThreadState_GET();
    +254 size_t stacksize = tstate ? tstate->interp->pythread_stacksize : 0;
    +255 tss = (stacksize != 0) ? stacksize : THREAD_STACK_SIZE;
    +256 if (tss != 0) {
    +257 if (pthread_attr_setstacksize(&attrs, tss) != 0) {
    +258 pthread_attr_destroy(&attrs);
    +259 return PYTHREAD_INVALID_THREAD_ID;
    +260 }
    +261 }
    +262 #endif

    It appears asif the call needed (for AIX) - thread_attr_setstacksize(&attrs, tss) should be occurring.

    Can you help me with a quick program that reports back the actual stack size an AIX thread has?

    What may be at the heart of this (assuming the call above is working as expected) - the default memory size for AIX 32-bit is 256MB. Assuming 16 to 32MB is already in use - if I understand this "recurse" logic - after about 12-14 recursions the 256MB is consumed.

    In short (work calls) - it seems the API mentioned is already in place and what is happening in terms of "exception catching" is different.

    Looking forward to hints on how to dig further.

    @ronaldoussoren
    Copy link
    Contributor Author

    That code is only called if THREAD_STACK_SIZE is defined. The block I mention defines it for macOS and FreeBSD, but not for other platforms. I therefore expect that this code is not used on AIX (which would be easy enough to verify by adding an #error to the block you mention.

    @aixtools
    Copy link
    Contributor

    aixtools commented Aug 2, 2019

    Going to take a stab in the dark - the the issue lies here:

    "./Python/errors.c"
    #ifndef Py_NORMALIZE_RECURSION_LIMIT
    #define Py_NORMALIZE_RECURSION_LIMIT 32
    #endif

    As there is not enough memory for this to run in the default memory model.

    However, 32 does not seem to be a counter limit:

    With this "hack" I get success:

            script = """if True:
                import threading
    
                def recurse(loop=0):
                    loop = loop+1
                    if loop < 128+32+8+3:
                        return recurse(loop)
                    else:
                        return
    
                def outer():
                    try:
                        recurse()
                    except RecursionError:
                        pass
    
                w = threading.Thread(target=outer)
                w.start()
                w.join()
                print('end of main thread')
                """

    So, I hope this helps point at where we need to look to find why RecursionError is not being returned when (I assume) memory runs out.

    @aixtools
    Copy link
    Contributor

    aixtools commented Aug 2, 2019

    On 02/08/2019 11:48, Ronald Oussoren wrote:

    Ronald Oussoren <ronaldoussoren@mac.com> added the comment:

    That code is only called if THREAD_STACK_SIZE is defined. The block I mention defines it for macOS and FreeBSD, but not for other platforms. I therefore expect that this code is not used on AIX (which would be easy enough to verify by adding an #error to the block you mention.
    Should have been watching mail - I'll look at this asap.

    ----------


    Python tracker <report@bugs.python.org>
    <https://bugs.python.org/issue18049\>


    @aixtools
    Copy link
    Contributor

    aixtools commented Aug 2, 2019

    On 02/08/2019 11:57, Michael Felt wrote:

    On 02/08/2019 11:48, Ronald Oussoren wrote:
    > Ronald Oussoren <ronaldoussoren@mac.com> added the comment:
    >
    > That code is only called if THREAD_STACK_SIZE is defined. The block I mention defines it for macOS and FreeBSD, but not for other platforms. I therefore expect that this code is not used on AIX (which would be easy enough to verify by adding an #error to the block you mention.
    Should have been watching mail - I'll look at this asap.

    Actually, it is defined for "most" platforms...

    root@x066:[/data/prj/python/python3-3.9]make
    xlc_r -c  -DNDEBUG -O -I/opt/include -O2 -qmaxmem=-1    
    -I../git/python3-3.9/Include/internal -IObjects -IInclude -IPython -I.
    -I../git/python3-3.9/Include -I/opt/include   -DPy_BUILD_CORE -o
    Python/thread.o ../git/python3-3.9/Python/thread.c
    "../git/python3-3.9/Python/thread_pthread.h", line 255.2: 1506-205 (S)
    #error "Not expected"
    make: *** [Makefile:1706: Python/thread.o] Error 1

    At the top of the file:

    /* The POSIX spec requires that use of pthread_attr_setstacksize
       be conditional on _POSIX_THREAD_ATTR_STACKSIZE being defined. */
    #ifdef _POSIX_THREAD_ATTR_STACKSIZE
    #ifndef THREAD_STACK_SIZE
    #define THREAD_STACK_SIZE       0       /* use default stack size */
    #endif

    /* The default stack size for new threads on OSX and BSD is small enough
    that
     * we'll get hard crashes instead of 'maximum recursion depth exceeded'
     * exceptions.
     *
     * The default stack sizes below are the empirically determined minimal
    stack
     * sizes where a simple recursive function doesn't cause a hard crash.
     */

    So, the block is executed - but the size is 0 aka default.

    As a first attempt - I'll take the FreeBSD size going:

    michael@x071:[/data/prj/python/git/python3-3.9]git diff
    diff --git a/Python/thread_pthread.h b/Python/thread_pthread.h
    index 994e35b..83b7e77 100644
    --- a/Python/thread_pthread.h
    +++ b/Python/thread_pthread.h
    @@ -47,6 +47,10 @@
     #undef  THREAD_STACK_SIZE
     #define THREAD_STACK_SIZE       0x400000
     #endif
    +#if defined(_AIX) && defined(THREAD_STACK_SIZE) && THREAD_STACK_SIZE == 0
    +#undef  THREAD_STACK_SIZE
    +#define THREAD_STACK_SIZE       0x400000
    +#endif
     /* for safety, ensure a viable minimum stacksize */
     #define THREAD_STACK_MIN        0x8000  /* 32 KiB */
     #else  /* !_POSIX_THREAD_ATTR_STACKSIZE */

    And with this - the test passes. :)

    @ronaldoussoren
    Copy link
    Contributor Author

    New changeset 9670ce7 by Ronald Oussoren (Michael Felt) in branch 'master':
    bpo-18049: Define THREAD_STACK_SIZE for AIX to pass default recursion limit test (GH-15081)
    9670ce7

    @vstinner
    Copy link
    Member

    The test does now crash on FreeBSD: see bpo-37906.

    @miss-islington
    Copy link
    Contributor

    New changeset f92bb6e by Miss Islington (bot) in branch '3.8':
    bpo-18049: Define THREAD_STACK_SIZE for AIX to pass default recursion limit test (GH-15081)
    f92bb6e

    @aeros
    Copy link
    Contributor

    aeros commented Sep 5, 2019

    Ronald, is it feasible that the changes made in https://github.com/python/cpython/pull/14748/files to THREAD_STACK_SIZE in Python/thread_pthread.h could be causing intermittent failures for the Azure macOS PR tests?

    In a recent PR (#15651), test_pending_calls_race (test.test_concurrent_futures.ThreadPoolWaitTests) seemed to be timing out. See the build log for more details: https://dev.azure.com/Python/cpython/_build/results?buildId=49786&view=logs&j=18d1a34d-6940-5fc1-f55b-405e2fba32b1.

    It doesn't appear to be causing consistent failures for Azure, but the latest commit to PR-15651 was only involving a formatting change to the news entry, so it should not have had any influence on the test that timed out. Prior to that commit, there were no CI failures.

    @tirkarthi
    Copy link
    Member

    @aeros167 The build log failures due to timeout in the linked Azure pipeline failure seem to be similar to report at bpo-37245

    @vstinner
    Copy link
    Member

    vstinner commented Sep 5, 2019

    Ronald, is it feasible that the changes made in https://github.com/python/cpython/pull/14748/files to THREAD_STACK_SIZE in Python/thread_pthread.h could be causing intermittent failures for the Azure macOS PR tests?

    See also bpo-37245 ("Azure Pipeline 3.8 CI: multiple tests hung and timed out on macOS 10.13").

    @ronaldoussoren
    Copy link
    Contributor Author

    I think it is unlikely that the timeouts on Azure CI are related to this change.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 (EOL) end of life 3.9 only security fixes OS-mac type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    8 participants