test_multiprocessing_fork probabilistic failure #91405

Open
sxt1001 mannequin opened this issue Apr 7, 2022 · 5 comments
Labels
3.9 only security fixes 3.10 only security fixes 3.11 only security fixes tests Tests in the Lib/test dir topic-multiprocessing type-bug An unexpected behavior, bug, or error

Comments

@sxt1001
Mannequin

sxt1001 mannequin commented Apr 7, 2022

BPO 47249
Files
  • test hang on.png
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2022-04-07.13:41:52.675>
    labels = ['type-bug', 'tests', '3.9', '3.10', '3.11']
    title = 'test_multiprocessing_fork probabilistic failure'
    updated_at = <Date 2022-04-07.13:50:37.734>
    user = 'https://bugs.python.org/sxt1001'

    bugs.python.org fields:

    activity = <Date 2022-04-07.13:50:37.734>
    actor = 'sxt1001'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Tests']
    creation = <Date 2022-04-07.13:41:52.675>
    creator = 'sxt1001'
    dependencies = []
    files = ['50726']
    hgrepos = []
    issue_num = 47249
    keywords = []
    message_count = 2.0
    messages = ['416926', '416927']
    nosy_count = 1.0
    nosy_names = ['sxt1001']
    pr_nums = []
    priority = 'normal'
    resolution = None
    stage = None
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue47249'
    versions = ['Python 3.9', 'Python 3.10', 'Python 3.11']

    @sxt1001
    Mannequin Author

    sxt1001 mannequin commented Apr 7, 2022

    This failure is intermittent. I run the full Python 3 test suite on OBS. In recent months, the test run has hung about three times. The version currently affected is 3.10.2; the problem has also appeared in 3.10.0 and 3.9.9.

    @sxt1001 sxt1001 mannequin added 3.9 only security fixes 3.10 only security fixes 3.11 only security fixes tests Tests in the Lib/test dir type-bug An unexpected behavior, bug, or error labels Apr 7, 2022
    @sxt1001
    Mannequin Author

    sxt1001 mannequin commented Apr 7, 2022

    I previously suspected that the patch d0d83a9 had fixed this problem, so I upgraded python3 to 3.10.2, but the problem still occurred.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @xiaoge1001

    The problem has occurred again over the last two days. Forcibly interrupting the task produces a lot of errors. Please refer to the attachment for details. It has appeared several times recently on ARM machines.
    python3_build.log

    @zuhu2195

    zuhu2195 commented Jul 8, 2023

    Hi,
    Can you try adding time.sleep(0) at the start of the _handle_tasks, _handle_workers and _handle_results functions in pool.py?

    @gpshead
    Member

    gpshead commented Jul 10, 2023

    As far as I'm aware, we are not seeing this issue in our CI systems or buildbots. Without the ability to reproduce it and confirm that the issue exists on the main branch, so that we can gather diagnostic data, I lean towards closing this issue as not reproducible / not planned.
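
For anyone who can reproduce the hang, one way to gather that kind of diagnostic data is the standard library's faulthandler module, which can dump every thread's traceback once a timeout expires. The following is only a minimal sketch: the helper names arm_hang_watchdog and disarm_hang_watchdog and the 600-second default are assumptions made for illustration and are not part of the test suite.

```python
import faulthandler
import sys


def arm_hang_watchdog(timeout_seconds=600.0, log=None):
    """Dump all thread tracebacks to `log` if still running after the timeout.

    Hypothetical helper: the name and the default timeout are illustrative.
    `log` must be a real file object (it needs a file descriptor).
    """
    if log is None:
        log = sys.stderr
    # exit=False keeps the process alive so it can still be inspected.
    faulthandler.dump_traceback_later(
        timeout_seconds, repeat=False, file=log, exit=False
    )


def disarm_hang_watchdog():
    """Cancel the pending dump once the guarded code has finished."""
    faulthandler.cancel_dump_traceback_later()
```

Arming the watchdog before the suspect test and disarming it afterwards leaves a traceback in the build log only when the run actually stalls, which shows where each thread was blocked.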

    As for a fix, time.sleep(0) is a bad code smell. It doesn't do anything deterministic, so it can't solve a problem. It can only shift the probability of one happening, by maybe triggering the OS to switch tasks or allowing a different Python thread to get the GIL. The underlying problem, if one exists, would still exist.
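
To illustrate the difference in general terms, here is a small sketch that contrasts yielding with time.sleep(0) against waiting on an explicit threading.Event. It is not taken from pool.py; the function names and the 0.05-second delay are made up for the example.

```python
import threading
import time


def observe_after_sleep0():
    """Yield once with time.sleep(0), then look at the shared state."""
    result = []

    def producer():
        time.sleep(0.05)  # simulated work (illustrative delay)
        result.append("done")

    worker = threading.Thread(target=producer)
    worker.start()
    time.sleep(0)  # may or may not let the producer run; no guarantee
    seen = list(result)  # whatever happens to be there right now
    worker.join()
    return seen


def observe_after_event():
    """Block until the producer explicitly signals that it has finished."""
    result = []
    ready = threading.Event()

    def producer():
        time.sleep(0.05)  # simulated work (illustrative delay)
        result.append("done")
        ready.set()  # explicit signal: the result is now available

    worker = threading.Thread(target=producer)
    worker.start()
    signalled = ready.wait(timeout=5)  # ordering is guaranteed if True
    seen = list(result)
    worker.join()
    return signalled, seen
```

The first function's return value depends on scheduling, so adding or removing the sleep only changes the odds. The second returns the finished result whenever the event was signalled within the timeout, because the wait establishes the ordering rather than hoping for it.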

    (I don't actually doubt that there are such issues in the multiprocessing and concurrent.futures test suites or even within multiprocessing itself, just that they're extremely hard to find and debug at this point)

    4 participants