-
Notifications
You must be signed in to change notification settings - Fork 555
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test t/io/sem fails on Mac PPC64 #22415
Comments
Can you determine at what commit or release of Perl this test failure first appeared on your machine? |
Do you see these failures in older perls? There hasn't been a substantive change here since November 2020. |
I notice that the OP's system is big-endian, and wonder whether the problem is somehow related to the "64 bit big-endian semctl SETVAL bug" mentioned a few lines earlier (line 53) in t/io/sem.t. |
I had noticed that same remark, but have not yet had time to dig into this any further. Given how long it takes to build stuff on a 2GHz machine from 20 years ago, I'd like to strategically build as few distinct versions of Perl as I can to get the answer to jkeenan's and tonycoz' questions above. Does anyone know offhand where I'd look to find out what version of Perl was being worked on when those last substantive changes were made? And do you want to hear about odd-numbered development releases or just the even-numbered production releases? |
The change I was thinking of was d43c116 which was first in 5.34.
5.even.0 seems reasonable to me. If you're working from a git checkout
I don't think so, with the change that fixed that (64d7628 first in 5.20.0) we always go through the |
I have now verified that 5.34.0 produces the same result. Currently waiting for free time in which to test 5.32.1. |
(A comment on my earlier concern about build times... I found that the actual builds go quite quickly indeed; but the tests, on the other hand, take a very long time simply because there are so many of them.) |
Differing somewhat from 5.34.0, 5.32.1 produces similar results to every subsequent version, but the specifics of the failure are different:
Further investigation reveals that the difference in expected values was merely a change in the constants in the test file, and test 9 didn't fail because it didn't exist yet. Ultimately, there was in fact no change. I will continue testing my way backwards in time to see if this ever does change at any point. My immediately-next tests will be of 5.20.0 and 5.18.0, bracketing when that big-endian-64-bit bug was stated to have been addressed... assuming that the test existed at that time. If it didn't, I will copy the version from 5.34.0 (which is identical to the current one, save for using slightly older syntax at the Config import). At some version, I will not be able to go further backwards, as 64-bit Mac OS was not a thing before a certain point and I can't see how Perl could have been ready to compile for it in advance. “No time machines were harmed in the making of this software”, etc. ...It will be interesting to see just how far it can be pushed back, around or even past that point in history. |
Well, 5.18.0 suffers a wide array of failures in this test, as well as raising numerous compiler warnings that were improved away in later editions of Perl. (I did indeed need to copy the t/io/sem.t from Perl 5.34, as well as editing it to set |
You can run just the
If you've already built perl you can just do:
which is safe for this test (but not all tests) You might be running into the limits of testing older versions. With blead or 5.40, please add the two lines as follows:
and run:
The result should include a dump of
which we can use to see if any semaphore values are coming through at all. If this isn't returning something some sort of sensible we'll need to:
|
Tonycoz, your suspicions were correct -- I tried your variable-dumping recommendation, using the test file from 5.40.0 with the minimum changes needed to make it parse in each case, on numerous versions of perl going back to 5.12.5; and everything since 5.20.0 seems to produce a vector composed entirely of nulls for the semaphore values. I’m pretty sure that’s not what it’s supposed to look like. |
Unsurprisingly, 5.41.2 gave exactly the same results as well. |
I have also verified that 32-bit PPC builds behave correctly in this test. |
Are you building perl from tarballs or from a git clone? |
Apologies, I did not see your reply until now. I had been building from tarballs, but I was recently introduced to the wonders of Porting/bisect.pl so I will be running that tonight. |
I had significant delays due to an ill-advised Git upgrade that went awry, but I finally have a result here. Unsurprisingly, bisection reveals the first commit that returns all zeroes on the semaphore tests on my 64-bit Power Mac to be #64d7628235943ff18939a1ff98ace513aeb5260c, from December 2013. The log message says, “Fixes the case where on 64bit big-endian boxes, calls to semctl(id,semnum,SETVAL,$wantedval) will ignore the passed in $wantedval, and always use 0”. The actual content of the commit is almost entirely contained in the test case. The code change to Perl itself consists of seven lines added to, and two removed from, doio.c. The evidence suggests this fix was not entirely correct, but I do not know what details would explain how or why. I am insufficiently versed in Git to extract what doio.c looked like before vs. after that commit. |
Putting this here to get a fancy github link: 64d7628
To get a diff, try diff --git doio.c doio.c
index 98e2c42024..b39c5876bd 100644
--- doio.c
+++ doio.c
@@ -2155,11 +2155,16 @@ Perl_do_ipcctl(pTHX_ I32 optype, SV **mark, SV **sp)
#ifdef Semctl
union semun unsemds;
+ if(cmd == SETVAL) {
+ unsemds.val = PTR2nat(a);
+ }
+ else {
#ifdef EXTRA_F_IN_SEMUN_BUF
- unsemds.buff = (struct semid_ds *)a;
+ unsemds.buff = (struct semid_ds *)a;
#else
- unsemds.buf = (struct semid_ds *)a;
+ unsemds.buf = (struct semid_ds *)a;
#endif
+ }
ret = Semctl(id, n, cmd, unsemds);
#else
/* diag_listed_as: sem%s not implemented */ Otherwise: git show 64d7628235943ff18939a1ff98ace513aeb5260c:doio.c >doio-after-commit.c
git show 64d7628235943ff18939a1ff98ace513aeb5260c^:doio.c >doio-before-commit.c |
The change itself looks correct to me, though it's technically undefined behaviour (making a pointer from a random integer, even if we don't dereference it). Before the change the integer (like the test
on PPC64, but because SETVAL expects the 4 byte/32-bit After the change it would convert the integer to a pointer and then back to and integer and store that value in I suspect a compiler bug. You might try:
|
I copied the function’s code into a new file, looked up all of the preprocessor macros and equates, and produced four copies of the function body, revealing what the C compiler actually sees for each of the four command values used by Perl. I haven't finished yet – for one thing, I have yet to produce a fifth copy following the logic if any other value is used – but already I've found two “showstopper” bugs in that commit that make it only work by accident, if it works at all. I have yet to check whether they are still present in blead. I will report further when I get it all finished. |
Yup, both problems are still present in blead. First, on doio.c line 3194, the command tested for is SETVAL, rather than the expected IPC_SET. Second, on line 3195, the pointer to the SV is cast to an integer instead of an IV being extracted from the SV. From what I can see, that's not how it's supposed to work at this level of abstraction. (Compare with the similar operation at lines 3173-3176, carried out for unrecognized commands.) |
Fixing both of those things caused the semaphore test to pass without a hitch. An attempt to copy the patch by hand follows:
|
I should add that my patch-and-verify was carried out against blead. Is there a way to turn that post into a pull request without having to start over from scratch? |
d'oh! That should have been |
If you step through the original code for the mis-behaving call, does it go via the From my understanding of the code
If this code is being executed on your system it should be setting unsemds.val correctly. If this code isn't being executed on your system we have another problem. Note that this code has been tested on other 64-bit big-endian hardware. As written your suggested change invokes magic twice on the parameter (changing your suggestion to SvIV_nomg() could still invoke overload magic a second time). |
I had examined my changes in light of the way the code looked at the time of the old edit we were discussing, and in my enthusiasm did not notice just what else had changed in the rest of the function between then and now. Worse, I now see I completely misunderstood the usage of this function; the test for SETVAL was correct. I still maintain that the value in the SV was the intended operand and not its raw address, but I missed the bit you point out where it had already been extracted (a peril of getting the revisions muddled while flipping between files). Back to the drawing board. You are completely correct that the changes I proposed should not have fixed the problem as comprehensively as is the case. I am currently investigating further. |
There seems no way for me to debug this properly, as GDB does not appear to have ever been built to understand 64-bit PowerPC under Darwin – only under Linux, which apparently has different calling conventions – and I can't find mention of any other applicable debugger, but as far as I can tell, you're almost certainly right about it being a deficiency in the compiler. The only reason my "fix" seemed to work is that it covered up the compiler bug with a control-flow bug, making everything be treated as pointers regardless of what was supposed to be done, and therefore not suffering alignment issues because the sizes matched. In short, this is not a Perl bug. Marking it closed. |
Description
It was easy enough, with minor tweaks to CFLAGS, to build Perl in 64-bit mode on my Power Mac G5 – but something is not quite right in the semaphore handling. Running the test suite produces failures with no obvious cause at steps 6, 7, and 9 of t/io/sem.t. The exact output is:
Steps to Reproduce
From the top-level directory,
Expected behavior
All three sub-tests are apparently supposed to return the value 192.
Perl configuration
The text was updated successfully, but these errors were encountered: