test_wait4 error on AIX #55394

Closed
sable mannequin opened this issue Feb 11, 2011 · 10 comments
Labels
type-bug An unexpected behavior, bug, or error

Comments

@sable
Mannequin

sable mannequin commented Feb 11, 2011

BPO 11185
Nosy @pitrou

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2013-07-04.19:30:09.420>
created_at = <Date 2011-02-11.11:39:59.997>
labels = ['type-bug']
title = 'test_wait4 error on AIX'
updated_at = <Date 2013-07-04.19:30:09.406>
user = 'https://bugs.python.org/sable'

bugs.python.org fields:

activity = <Date 2013-07-04.19:30:09.406>
actor = 'pitrou'
assignee = 'neologix'
closed = True
closed_date = <Date 2013-07-04.19:30:09.420>
closer = 'pitrou'
components = []
creation = <Date 2011-02-11.11:39:59.997>
creator = 'sable'
dependencies = []
files = []
hgrepos = []
issue_num = 11185
keywords = []
message_count = 10.0
messages = ['128375', '128727', '130164', '130247', '130291', '130308', '192255', '192305', '192306', '192307']
nosy_count = 5.0
nosy_names = ['pitrou', 'sable', 'neologix', 'python-dev', 'David.Edelsohn']
pr_nums = []
priority = 'normal'
resolution = 'fixed'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue11185'
versions = ['Python 2.7', 'Python 3.3', 'Python 3.4']

@sable
Mannequin Author

sable mannequin commented Feb 11, 2011

I get an error when running test_wait4 with trunk on AIX:

test_wait (__main__.Wait4Test) ... FAIL

======================================================================
FAIL: test_wait (__main__.Wait4Test)
----------------------------------------------------------------------

Traceback (most recent call last):
  File "/san_cis/home/cis/.buildbot/python-aix6/3.x.phenix.xlc/build/Lib/test/fork_wait.py", line 72, in test_wait
    self.wait_impl(cpid)
  File "./Lib/test/test_wait4.py", line 23, in wait_impl
    self.assertEqual(spid, cpid)
AssertionError: 0 != 1486954

Ran 1 test in 12.030s

FAILED (failures=1)
Traceback (most recent call last):
  File "./Lib/test/test_wait4.py", line 32, in <module>
    test_main()
  File "./Lib/test/test_wait4.py", line 28, in test_main
    run_unittest(Wait4Test)
  File "/san_cis/home/cis/.buildbot/python-aix6/3.x.phenix.xlc/build/Lib/test/support.py", line 1145, in run_unittest
    _run_suite(suite)
  File "/san_cis/home/cis/.buildbot/python-aix6/3.x.phenix.xlc/build/Lib/test/support.py", line 1128, in _run_suite
    raise TestFailed(err)
test.support.TestFailed: Traceback (most recent call last):
  File "/san_cis/home/cis/.buildbot/python-aix6/3.x.phenix.xlc/build/Lib/test/fork_wait.py", line 72, in test_wait
    self.wait_impl(cpid)
  File "./Lib/test/test_wait4.py", line 23, in wait_impl
    self.assertEqual(spid, cpid)
AssertionError: 0 != 1486954

Thanks in advance

@sable
Mannequin Author

sable mannequin commented Feb 17, 2011

This issue already existed on Python 2.5.2 with AIX 5.2:

http://www.mail-archive.com/python-list@python.org/msg192219.html

The documentation for WNOHANG says:
http://docs.python.org/library/os.html#os.WNOHANG
"""
The option for waitpid() to return immediately if no child process status is available immediately. The function returns (0, 0) in this case.
"""
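
For illustration, here is a minimal sketch of the documented WNOHANG behaviour on a platform where it works as described. The helper name `demo_wnohang` and the pipe used to hold the child open are assumptions made for this example; they do not come from the test suite.

```python
import os


def demo_wnohang():
    """Poll a child with WNOHANG while it runs, then wait for it to exit."""
    read_fd, write_fd = os.pipe()
    cpid = os.fork()
    if cpid == 0:
        # Child: block until the parent closes its write end, then exit.
        try:
            os.close(write_fd)
            os.read(read_fd, 1)
        finally:
            os._exit(0)
    os.close(read_fd)
    # The child is still blocked in read(), so no status is available yet.
    while_running = os.waitpid(cpid, os.WNOHANG)
    # Closing the write end gives the child EOF, so it exits.
    os.close(write_fd)
    after_exit = os.waitpid(cpid, 0)  # blocking wait
    return cpid, while_running, after_exit


if __name__ == "__main__":
    print(demo_wnohang())
```

While the child is running the non-blocking call yields `(0, 0)`; the blocking call afterwards yields the child's pid and exit status.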

It seems wait4 always returns 0 on AIX when WNOHANG is specified.

Removing WNOHANG will make the test succeed.

waitpid does not have the same limitation.

I suppose this is a bug in AIX, though there is not even a man page describing wait4 on this platform.

Here is a proposed patch that works around this bug:

Index: Lib/test/test_wait4.py
===================================================================

--- Lib/test/test_wait4.py      (revision 88430)
+++ Lib/test/test_wait4.py      (working copy)
@@ -3,6 +3,7 @@
 
 import os
 import time
+import sys
 from test.fork_wait import ForkWait
 from test.support import run_unittest, reap_children, get_attribute
 
@@ -13,10 +14,14 @@
 
 class Wait4Test(ForkWait):
     def wait_impl(self, cpid):
+        option = os.WNOHANG
+        if sys.platform.startswith('aix'):
+            # wait4 is broken on AIX and will always return 0 with WNOHANG
+            option = 0
         for i in range(10):
             # wait4() shouldn't hang, but some of the buildbots seem to hang
             # in the forking tests.  This is an attempt to fix the problem.
-            spid, status, rusage = os.wait4(cpid, os.WNOHANG)
+            spid, status, rusage = os.wait4(cpid, option)
             if spid == cpid:
                 break
             time.sleep(1.0)
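
The same polling logic can be sketched as a standalone script, outside the test framework. The function names `wait4_options` and `poll_child` are hypothetical and chosen only for this example; the platform check is the one used in the patch.

```python
import os
import sys
import time


def wait4_options():
    # As in the patch: drop WNOHANG on AIX, where wait4() is reported
    # to always return 0 when that flag is set.
    return 0 if sys.platform.startswith("aix") else os.WNOHANG


def poll_child(cpid, attempts=10, delay=1.0):
    """Call wait4() up to `attempts` times; return the last (pid, status)."""
    options = wait4_options()
    spid, status = 0, 0
    for _ in range(attempts):
        spid, status, _rusage = os.wait4(cpid, options)
        if spid == cpid:
            break
        time.sleep(delay)
    return spid, status


if __name__ == "__main__":
    child = os.fork()
    if child == 0:
        os._exit(0)  # child exits immediately
    print(poll_child(child, delay=0.1))
```

With `options` set to 0 the first `wait4()` call blocks until the child exits, so the retry loop only matters on platforms where WNOHANG is kept.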

@neologix
Mannequin

neologix mannequin commented Mar 6, 2011

If test_wait3 and test_fork1 pass, then yes, it's probably an issue with AIX's wait4.
See http://fixunix.com/aix/84872-sigchld-recursion.html:

"""
Replace the wait4() call with a waitpid() call...
....like this:
for(n=0;waitpid(-1, &status, WNOHANG) > 0; n++) ;

Or, compile the existing code with the BSD library:
cc -o demo demo.c -D_BSD -lbsd

Both will work...

The current problem is that child process is not "seen" by the wait4() call,
so that when "signal" is rearmed, it immediately goes (recursively) into the
child_handler() function.
"""

So it seems that under AIX, posix_wait4 should be compiled with -D_BSD -lbsd.
Could you try this?

If this doesn't do the trick, then avoiding passing WNOHANG could be the second option.
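
For reference, the waitpid() loop from the quoted post can be written in Python roughly as follows. This is only a sketch: the function name `reap_exited_children` is hypothetical, and the handling of the "no children at all" case is an assumption about what a caller would want.

```python
import os


def reap_exited_children():
    """Reap every child that has already exited; return how many were reaped."""
    reaped = 0
    while True:
        try:
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # this process has no children left
        if pid == 0:
            break  # children exist, but none has exited yet
        reaped += 1
    return reaped
```

Like the C one-liner, the loop stops as soon as waitpid() reports no exited child, so it never blocks.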

@sable
Mannequin Author

sable mannequin commented Mar 7, 2011

I had seen the post you mentioned and had already tested -lbsd without success.

wait4 is not even present in libbsd.

phenix:$ nm /usr/lib/libbsd.a | grep wait
phenix:$

Maybe it was present on older versions of the system. But I couldn't find any documentation mentioning wait4 and -lbsd anywhere.

Actually wait4 is never mentioned in the IBM documentation concerning AIX.

wait4 without WNOHANG works fine. waitpid works fine even with WNOHANG.
I don't know which workaround is better.
I will also try to report this bug to IBM so that a future version of AIX could work correctly.

@neologix
Mannequin

neologix mannequin commented Mar 7, 2011

> wait4 without WNOHANG works fine. waitpid works fine even with WNOHANG.
> I don't know which workaround is the better.

As far as the test is concerned, it's of course better to use wait4
without WNOHANG in a test named test_wait4 (especially since waitpid
is tested elsewhere)...

@sable
Mannequin Author

sable mannequin commented Mar 8, 2011

Yes, for the test, as I put in msg128727, it works fine by removing WNOHANG.

However, I should add a note to the AIX-NOTES file explaining that wait4 is broken with WNOHANG on AIX and suggesting the two workarounds.

@DavidEdelsohn DavidEdelsohn mannequin added the type-bug An unexpected behavior, bug, or error label Jun 19, 2013
@DavidEdelsohn
Mannequin

DavidEdelsohn mannequin commented Jul 3, 2013

The patch in msg128727 is correct for AIX and should be applied.

@neologix neologix mannequin self-assigned this Jul 4, 2013
@python-dev
Mannequin

python-dev mannequin commented Jul 4, 2013

New changeset b3ea1b5a1617 by Antoine Pitrou in branch '3.3':
Issue bpo-11185: Fix test_wait4 under AIX. Patch by Sébastien Sablé.
http://hg.python.org/cpython/rev/b3ea1b5a1617

New changeset 8055521e372f by Antoine Pitrou in branch 'default':
Issue bpo-11185: Fix test_wait4 under AIX. Patch by Sébastien Sablé.
http://hg.python.org/cpython/rev/8055521e372f

@python-dev
Mannequin

python-dev mannequin commented Jul 4, 2013

New changeset e3fd5fc5dc47 by Antoine Pitrou in branch '2.7':
Issue bpo-11185: Fix test_wait4 under AIX. Patch by Sébastien Sablé.
http://hg.python.org/cpython/rev/e3fd5fc5dc47

@pitrou
Member

pitrou commented Jul 4, 2013

Thank you. This should be fixed now. Please reopen if not.

@pitrou pitrou closed this as completed Jul 4, 2013
@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022