is-prime test failure in rakudo-star-2013.02 on OS X 10.8.2 #3111

Closed
p6rt opened this issue Apr 23, 2013 · 10 comments

Comments

@p6rt p6rt commented Apr 23, 2013

Migrated from rt.perl.org#117731 (status was 'resolved')

Searchable as RT117731$

@p6rt p6rt commented Apr 23, 2013

From mykle@mykle.com

Hi. On the perl6 IRC channel, it was suggested that I open a bug for this test failure in is-prime.t:

mykle@K9-2 ~/Documents/netxposure/tmp/rakudo-star-2013.02/rakudo » make t/spec/S32-num/is-prime.t
t/spec/S32-num/is-prime.t ..
1..35
not ok 1 - 45724385972894572891 is not prime
not ok 2 - 45724385972894572891 is still not prime
not ok 3 - 45724385972894572891 is still not prime
not ok 4 - 45724385972894572891 is still not prime
not ok 5 - Method form gets primes < 100 correct
# got: '2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 92 93 94 95 96 97 98 99 100'
# expected: '2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97'
not ok 6 - Sub form gets primes < 100 correct
# got: '2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 97'
# expected: '2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83 89 97'
ok 7 - 45724385972894572891 is still not prime
ok 8 - 45724385972894572891 is still not prime
ok 9 - 45724385972894572891 is still not prime
ok 10 - 45724385972894572891 is still not prime
ok 11 - 2801 is a prime (method)
ok 12 - 2801 is a prime (sub)
ok 13 - 104743 is a prime (method)
ok 14 - 104743 is a prime (sub)
ok 15 - 105517 is a prime (method)
ok 16 - 105517 is a prime (sub)
ok 17 - 1300129 is a prime (method)
ok 18 - 1300129 is a prime (sub)
ok 19 - 15485867 is a prime (method)
ok 20 - 15485867 is a prime (sub)
ok 21 - 179424691 is a prime (method)
ok 22 - 179424691 is a prime (sub)
ok 23 - 32416187773 is a prime (method)
ok 24 - 32416187773 is a prime (sub)
ok 25 - 0 is not a prime (method)
ok 26 - 0 is not a prime (sub)
ok 27 - 32416187771 is not a prime (method)
ok 28 - 32416187771 is not a prime (sub)
ok 29 - 32416187772 is not a prime (method)
ok 30 - 32416187772 is not a prime (sub)
ok 31 - 32416187775 is not a prime (method)
ok 32 - 32416187775 is not a prime (sub)
ok 33 - 170141183460469231731687303715884105727 is prime
ok 34 - 170141183460469231731687303715884105725 is not prime
ok 35 - M13 is prime
# Looks like you failed 6 tests of 35
Failed 6/35 subtests

Test Summary Report
-------------------
t/spec/S32-num/is-prime.t (Wstat: 0 Tests: 35 Failed: 6)
Failed tests: 1-6
Files=1, Tests=35, 1 wallclock secs ( 0.03 usr 0.01 sys + 1.03 cusr 0.12 csys = 1.19 CPU)
Result: FAIL

@p6rt p6rt commented Apr 28, 2013

From @Util

On Tue Apr 23 13:48:44 2013, mykle@mykle.com wrote:

> Hi. on the perl6 IRC channel, it was suggested i open a bug on this
> test failure, in is-prime.t :

--snip--

> t/spec/S32-num/is-prime.t (Wstat: 0 Tests: 35 Failed: 6)
> Failed tests: 1-6

I have experienced the same bug on OS X 10.8.3.
I have found the problem in NQP, and am testing a solution.
( nqp_bigint.ops / nqp_bigint_is_prime )

--
Bruce Gray (Util on IRC and PerlMonks)

@p6rt p6rt commented Apr 28, 2013

The RT System itself - Status changed from 'new' to 'open'

@p6rt p6rt commented Apr 30, 2013

From @Util

Here is more info on the symptom. When I run the code below, it always
gives incorrect output. The output is usually as shown below in Run#1,
but sometimes as in Run#2. The two runs below really did happen with
exactly the same code, back-to-back, run within 1 second of each other.

(Lines wrapped for clarity)
### Run#1
  $ ./perl6 -e 'my @a = (1..200).grep({ .is-prime }); say @a;'
  2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83
  89 97 101 103 107 109 113 127 131 137 139 149 151 157 163 167
  171
  172
  173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188
  189 190 191 192 193 194 195 196 197 198 199 200

### Run#2
  $ ./perl6 -e 'my @a = (1..200).grep({ .is-prime }); say @a;'
  2 3 5 7 11 13 17 19 23 29 31 37 41 43 47 53 59 61 67 71 73 79 83
  89 97 101 103 107 109 113 127 131 137 139 149 151 157 163 167
  172
  173 174 175 176 177 178 179 180 181 182 183 184 185 186 187 188
  189 190 191 192 193 194 195 196 197 198 199 200

Note that at 172, *even* numbers start creeping in. Definitely not prime!
Also, the second run omitted 171, even though the code was identical.

### Vitals:
  OS X 10.8.3
  Rakudo ef73eb9 (Apr 22)
  NQP bfb3669 (Apr 20)
  Parrot 8874a86 (Apr 16) RELEASE_5_3_0
  GCC i686-apple-darwin11-llvm-gcc-4.2 (GCC) 4.2.1 (Based on Apple
Inc. build 5658) (LLVM build 2336.11.00)

@p6rt p6rt commented Apr 30, 2013

From @Util

This is the NQP program I was using to probe the problem.

@p6rt p6rt commented Apr 30, 2013

@p6rt p6rt commented Apr 30, 2013

From @Util

The fix is in commit 16fa719.
I would appreciate testing by anyone who has experienced the problem.

The problem was in src/vm/parrot/ops/nqp_bigint.ops.
Original line:
- mp_prime_is_prime(a, $3, (int *) &$1);
Replacement lines:
+ int result;
+ mp_prime_is_prime(a, $3, &result);
+ $1 = result;

My take/guess on what was happening:

On my system, C's (int)s are 32-bit, but Parrot's I-registers are
64-bit. When we pass "(int *) &$1" as an argument to be populated by
mp_prime_is_prime, we pass the address of something larger (64 bits)
than the function expects to receive (32 bits), and the cast lies by
saying that it is in fact the expected size. mp_prime_is_prime then
only writes over the low half of the register.

If the high half of the register happens to be 0, then everything works
as expected. When the program runs long enough for the register to be
2**32 or higher, then only the bottom half is set to 0 or 1, and you
get a register value like 0xcafe0001 or 0xcafe0000.

If anyone thinks my analysis is in error, even after this ticket is
closed, please let me know. It would enhance my understanding of 32/64
bit issues.

@p6rt p6rt commented May 3, 2013

From @nwc10

On Tue Apr 30 15:44:27 2013, util wrote:

> The fix is in commit 16fa719.
> I would appreciate testing by anyone who has experienced the problem.

Yes, fixes the problem for me on FreeBSD and OS X.
I applied the patch to the earliest revisions that I saw failing on
both platforms and it resolves it.

> The problem was in src/vm/parrot/ops/nqp_bigint.ops .
> Original line:
> - mp_prime_is_prime(a, $3, (int *) &$1);
> Replacement lines:
> + int result;
> + mp_prime_is_prime(a, $3, &result);
> + $1 = result;
>
> My take/guess on what was happening:
>
> On my system, C's (int)s are 32-bit, but Parrot's I-registers are
> 64-bit. When we pass "(int *) &$1" as a argument to be populated by
> mp_prime_is_prime, we pass an address of something larger (64) than the
> function is expecting to receive (32), and lying to say that it is in
> fact the expected size (32). mp_prime_is_prime then only writes over
> the low half of the register.
>
> If the high half of the register happens to be 0, then everything works
> as expected. When the program runs long enough for the register to be
> 2**32 or higher, then only the bottom half is set to 0 or 1, and you
> get a register value like 0xcafe0001 or 0xcafe0000.
>
> If anyone thinks my analysis is in error, even after this ticket is
> closed, please let me know. It would enhance my understanding of 32/64
> bit issues.

I think your analysis is spot on. We had quite a few problems like
this in the perl 5 core code. I think that it was mostly Jarkko
hitting them first and fixing them, because he was one of the first
people building on Alpha, when everyone else was still on 32 bit.

Any cast to a pointer to a type of a different size (or potentially a
different size) is a red flag and a potential bug. If the cast-to type
is smaller (as in this case), writing won't completely change the
pointed-to variable. If it's larger, then writing will scribble on
something else. IIRC the bugs tend to show up more easily on
big-endian systems. On a little-endian system, the least significant
4 bytes of an 8-byte value start at the same address as the 8-byte
value itself. On a big-endian system, the address of the least
significant 4 bytes differs by 4, so passing a pointer to the
wrong-sized thing on a big-endian system will often read 0 (from the
more significant 4 bytes) where the programmer was expecting something
non-zero, resulting in immediately visible buggy behaviour.

It's also strange that the bug had all the symptoms of a GC bug
(in that when I ran parrot with -G it went away), but with hindsight
I suspect that that is due to whether heap memory is being re-used, or
fresh zeroed memory is being allocated by the OS.

Nicholas Clark

@p6rt p6rt commented Nov 27, 2014

From @usev6

As this was resolved back in 2013, I'm closing the ticket.

@p6rt p6rt commented Nov 27, 2014

@usev6 - Status changed from 'open' to 'resolved'

@p6rt p6rt closed this Nov 27, 2014