Timeouts on "x86 Ubuntu Shared 3.x" buildbot #67959

Closed
vstinner opened this issue Mar 25, 2015 · 7 comments

@vstinner
Member

BPO 23771
Nosy @vstinner, @applio

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2015-03-30.20:18:58.811>
created_at = <Date 2015-03-25.02:42:02.819>
labels = []
title = 'Timeouts on "x86 Ubuntu Shared 3.x" buildbot'
updated_at = <Date 2015-03-30.20:18:58.809>
user = 'https://github.com/vstinner'

bugs.python.org fields:

activity = <Date 2015-03-30.20:18:58.809>
actor = 'vstinner'
assignee = 'none'
closed = True
closed_date = <Date 2015-03-30.20:18:58.811>
closer = 'vstinner'
components = []
creation = <Date 2015-03-25.02:42:02.819>
creator = 'vstinner'
dependencies = []
files = []
hgrepos = []
issue_num = 23771
keywords = []
message_count = 7.0
messages = ['239214', '239217', '239346', '239469', '239470', '239540', '239639']
nosy_count = 3.0
nosy_names = ['vstinner', 'Arfrever', 'davin']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = None
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue23771'
versions = ['Python 3.5']

@vstinner
Member Author

First timeout:
http://buildbot.python.org/all/builders/x86%20Ubuntu%20Shared%203.x/builds/11358

This build was only triggered by one changeset 0b99d7043a99: "Issue bpo-23694: Enhance _Py_open(), it now raises exceptions".

  • _Py_open() now raises exceptions on error. If open() fails, it raises an OSError with the filename.
  • _Py_open() now releases the GIL while calling open()
  • Add _Py_open_noraise() when _Py_open() cannot be used because the GIL is not held
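As a rough Python-level illustration of the first bullet (this is not the C implementation of _Py_open()), a failed open surfaces as an OSError that carries the filename. The helper name `open_error` and the path used below are hypothetical and exist only for this sketch.

```python
import errno
import os


def open_error(path):
    """Try to open path read-only; return the OSError raised, or None on success."""
    try:
        fd = os.open(path, os.O_RDONLY)
    except OSError as exc:
        return exc
    os.close(fd)
    return None


# Hypothetical path that is assumed not to exist on this machine.
missing = os.path.join(os.sep, "nonexistent-dir-for-demo", "file.txt")
exc = open_error(missing)
if exc is not None:
    # The exception records both the error code and the offending filename.
    print(type(exc).__name__, exc.errno == errno.ENOENT, exc.filename)
```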

Example of subprocess timeout:
---
Timeout (1:00:00)!
Thread 0x55aaedc0 (most recent call first):
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/subprocess.py", line 1502 in _try_wait
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/subprocess.py", line 1552 in wait
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/test_subprocess.py", line 58 in tearDown
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/case.py", line 580 in run
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/case.py", line 625 in __call__
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/suite.py", line 122 in run
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/suite.py", line 84 in __call__
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/suite.py", line 122 in run
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/suite.py", line 84 in __call__
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/runner.py", line 176 in run
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/support/__init__.py", line 1773 in _run_suite
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/support/__init__.py", line 1807 in run_unittest
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/test_subprocess.py", line 2532 in test_main
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/regrtest.py", line 1284 in runtest_inner
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/regrtest.py", line 967 in runtest
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/regrtest.py", line 763 in main
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/regrtest.py", line 1568 in main_in_temp_cwd
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/__main__.py", line 3 in <module>
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/runpy.py", line 85 in _run_code
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/runpy.py", line 170 in _run_module_as_main
make: *** [buildbottest] Error 1
---
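The "Timeout (...)!" header followed by one stack per thread matches the output format of `faulthandler.dump_traceback_later()`. A minimal sketch of that watchdog pattern follows; the 0.5 second timeout, the child script, and the helper name `run_hanging_child` are assumptions chosen for the demo, not the buildbot configuration.

```python
import subprocess
import sys

# Child program: arm a watchdog, then block far longer than the timeout.
CHILD_SOURCE = """
import faulthandler, time
faulthandler.dump_traceback_later(0.5, exit=True)
time.sleep(30)
"""


def run_hanging_child():
    """Run the child and return (exit code, stderr text)."""
    proc = subprocess.run(
        [sys.executable, "-c", CHILD_SOURCE],
        stderr=subprocess.PIPE,
        universal_newlines=True,
        timeout=20,  # safety net for the parent; the watchdog fires first
    )
    return proc.returncode, proc.stderr


if __name__ == "__main__":
    code, err = run_hanging_child()
    print("exit code:", code)
    print(err)
```

With `exit=True`, the watchdog dumps the tracebacks of all threads to stderr and then terminates the process with a non-zero status, which is why a hung test ends the whole run instead of blocking forever.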

Example of multiprocessing timeout:
---
[240/393] test_multiprocessing_spawn
Timeout (1:00:00)!
Thread 0x68fddb40 (most recent call first):
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/connection.py", line 379 in _recv
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/connection.py", line 407 in _recv_bytes
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/connection.py", line 250 in recv
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/managers.py", line 717 in _callmethod
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/managers.py", line 955 in wait
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/_test_multiprocessing.py", line 834 in f
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/threading.py", line 871 in run
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/threading.py", line 923 in _bootstrap_inner
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/threading.py", line 891 in _bootstrap

Thread 0x58a5ab40 (most recent call first):
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/connection.py", line 379 in _recv
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/connection.py", line 407 in _recv_bytes
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/connection.py", line 250 in recv
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/managers.py", line 717 in _callmethod
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/managers.py", line 955 in wait
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/_test_multiprocessing.py", line 834 in f
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/threading.py", line 871 in run
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/threading.py", line 923 in _bootstrap_inner
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/threading.py", line 891 in _bootstrap

(...)

Thread 0x55aaedc0 (most recent call first):
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/connection.py", line 379 in _recv
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/connection.py", line 407 in _recv_bytes
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/connection.py", line 250 in recv
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/managers.py", line 717 in _callmethod
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/multiprocessing/managers.py", line 943 in acquire
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/_test_multiprocessing.py", line 933 in test_notify_all
...
---

@vstinner
Member Author

test_multiprocessing_spawn.test_notify_all() also hangs on "AMD64 Debian root 3.x" buildbot:

http://buildbot.python.org/all/builders/AMD64%20Debian%20root%203.x/builds/1959/steps/test/logs/stdio
---
[330/393] test_multiprocessing_spawn
Timeout (1:00:00)!
Thread 0x00002b303f269700 (most recent call first):
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/multiprocessing/synchronize.py", line 262 in wait
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/_test_multiprocessing.py", line 834 in f
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/threading.py", line 871 in run
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/threading.py", line 923 in _bootstrap_inner
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/threading.py", line 891 in _bootstrap

Thread 0x00002b303d918700 (most recent call first):
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/multiprocessing/synchronize.py", line 262 in wait
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/_test_multiprocessing.py", line 834 in f
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/threading.py", line 871 in run
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/threading.py", line 923 in _bootstrap_inner
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/threading.py", line 891 in _bootstrap

Thread 0x00002b3060947700 (most recent call first):
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/multiprocessing/synchronize.py", line 262 in wait
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/_test_multiprocessing.py", line 834 in f
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/threading.py", line 871 in run
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/threading.py", line 923 in _bootstrap_inner
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/threading.py", line 891 in _bootstrap

Thread 0x00002b303289db20 (most recent call first):
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/_test_multiprocessing.py", line 933 in test_notify_all
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/unittest/case.py", line 577 in run
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/unittest/case.py", line 625 in __call__
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/unittest/suite.py", line 122 in run
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/unittest/suite.py", line 84 in __call__
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/unittest/suite.py", line 122 in run
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/unittest/suite.py", line 84 in __call__
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/unittest/suite.py", line 122 in run
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/unittest/suite.py", line 84 in __call__
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/unittest/runner.py", line 176 in run
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/support/__init__.py", line 1773 in _run_suite
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/support/__init__.py", line 1807 in run_unittest
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/regrtest.py", line 1283 in test_runner
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/regrtest.py", line 1284 in runtest_inner
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/regrtest.py", line 967 in runtest
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/regrtest.py", line 763 in main
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/regrtest.py", line 1568 in main_in_temp_cwd
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/test/__main__.py", line 3 in <module>
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/runpy.py", line 85 in _run_code
File "/root/buildarea/3.x.angelico-debian-amd64/build/Lib/runpy.py", line 170 in _run_module_as_main
---

@vstinner
Member Author

A new failure, this time in test_subprocess.test_double_close_on_error:

http://buildbot.python.org/all/builders/x86%20Ubuntu%20Shared%203.x/builds/11411/steps/test/logs/stdio
---
Timeout (1:00:00)!
Thread 0x55aafdc0 (most recent call first):
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/subprocess.py", line 1407 in _execute_child
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/subprocess.py", line 855 in __init__
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/test_subprocess.py", line 1074 in test_double_close_on_error
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/case.py", line 577 in run
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/case.py", line 625 in __call__
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/suite.py", line 122 in run
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/suite.py", line 84 in __call__
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/suite.py", line 122 in run
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/suite.py", line 84 in __call__
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/unittest/runner.py", line 176 in run
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/support/__init__.py", line 1773 in _run_suite
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/support/__init__.py", line 1807 in run_unittest
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/test_subprocess.py", line 2532 in test_main
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/regrtest.py", line 1284 in runtest_inner
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/regrtest.py", line 967 in runtest
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/regrtest.py", line 763 in main
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/regrtest.py", line 1568 in main_in_temp_cwd
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/test/__main__.py", line 3 in <module>
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/runpy.py", line 85 in _run_code
File "/srv/buildbot/buildarea/3.x.bolen-ubuntu/build/Lib/runpy.py", line 170 in _run_module_as_main
make: *** [buildbottest] Error 1
---

@applio
Member

applio commented Mar 29, 2015

@Haypo: Could you please take a look at applying my patches from bpo-23713 to address an intermittent multiprocessing test failure? I don't know that it is the sole cause, but it is indeed one potential source of the misbehavior.

@applio
Member

applio commented Mar 29, 2015

@Haypo: I didn't think that one through -- the consequences of that fragile test are not a hang. It's unrelated.

@vstinner
Member Author

This build was only triggered by one changeset 0b99d7043a99: "Issue bpo-23694: Enhance _Py_open(), it now raises exceptions".

I got access to a buildbot and was able to reproduce the issue there. Using gdb, I saw that a process was stuck in _close_open_fds_safe(). The problem is that this function is called after fork(), in the child process, before the new program is run. It's not safe to play with the GIL there.

This bug is the regression that makes some buildbots hang.

I fixed the bug in changeset 2e1234208bab.

I will keep the issue open for a few days to check whether the buildbots are repaired.
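The hazard is easier to see with a Python-level analogue. The sketch below is not the code path from this issue (which lives in C and involves the GIL); it only illustrates the same class of problem: after fork(), only the forking thread exists in the child, so a lock held by another thread at fork time is never released there. The function name `lock_stays_held_in_forked_child` is hypothetical, and the example assumes a POSIX system where `os.fork()` is available.

```python
import os
import threading


def lock_stays_held_in_forked_child():
    """Return True if the forked child cannot acquire a lock held by another thread."""
    lock = threading.Lock()
    held = threading.Event()
    release = threading.Event()

    def holder():
        # Hold the lock until the parent tells us to let go.
        with lock:
            held.set()
            release.wait()

    thread = threading.Thread(target=holder)
    thread.start()
    held.wait()  # make sure the lock is held before forking

    pid = os.fork()
    if pid == 0:
        # Child: the holder thread does not exist here, but the lock is
        # still marked as held, so nobody will ever release it.
        acquired = lock.acquire(timeout=0.5)
        os._exit(0 if acquired else 1)

    # Parent: collect the child's verdict, then clean up the holder thread.
    _, status = os.waitpid(pid, 0)
    release.set()
    thread.join()
    return os.WEXITSTATUS(status) == 1


if __name__ == "__main__":
    print(lock_stays_held_in_forked_child())
```

Without the `timeout` argument, the child would block forever, which is the kind of hang a test-suite watchdog eventually reports as a timeout.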

@vstinner
Member Author

The last 5 builds of the "x86 Ubuntu Shared 3.x" and "AMD64 Debian root 3.x" buildbots are green (success). The sporadic hang is gone! I'm closing the issue.

@ezio-melotti transferred this issue from another repository Apr 10, 2022