-
Notifications
You must be signed in to change notification settings - Fork 560
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
sort{$a<=>$b} fails to sort occasionally #15768
Comments
From tails.saito@gmail.comThis is a bug report for perl from tails.saito@gmail.com, Greetings, sort{$a<=>$b} fails to sort a list of strings numerically in some cases. Code: Output: However, if I use {0;$a<=>$b} instead of {$a<=>$b}, Code: Output: I confirmed the above matter happens with: I noticed the above matter does NOT happen with: Thank you for your attention. Flags: Site configuration information for perl 5.22.2: Configured by ASSI at Sat Apr 30 15:59:30 CEST 2016. Summary of my perl5 (revision 5 version 22 subversion 2) configuration: Platform: @INC for perl 5.22.2: Environment for perl 5.22.2: |
From @jkeenanOn Tue, 13 Dec 2016 03:33:59 GMT, tails.saito@gmail.com wrote:
Here's my understanding of the problem. When I saw these two strings: ##### ... my suspicions were aroused. I wondered if, when perl performed its string-to-number conversion, the resulting number was "too big". In 'perldoc perlnumber' (for perl-5.24.0), we have: ##### On my machine, at least, your strings, when converted to numbers have 17 decimal digits and at least 53 binary digits. ##### Now, as the attachment demonstrates, it appears that the numeric comparison ("spaceship") operator, '<=>', can *by itself* handle comparisons of numbers in that range okay. But when used inside a sort block, it fails with big numbers. Here's the output of the attachment when run on my machine: ##### # Failed test 'strings holding numbers sorted as numbers (returning numbers) (some numbers in bigint range)' Test Summary Report 130335-sort-string-numerically.t (Wstat: 256 Tests: 10 Failed: 1) I do not know enough about the built-in 'sort' function to say why this is so. Thank you very much. -- |
From @jkeenan# perl my @intstrings; @intstrings = ( @numsorteds = sort { $a <=> $b } @numstrings; ##### @numstrings = ( @sorteds = sort @numstrings; @numsorteds = sort { $a <=> $b } @numstrings; is( ('300000' <=> '300001'), -1, "strings holding numbers numerically compared: -1"); is( ('20000000000000001' <=> '20000000000000002'), -1, done_testing(); |
The RT System itself - Status changed from 'new' to 'open' |
From @tonycozOn Tue, 13 Dec 2016 14:38:05 -0800, jkeenan wrote:
You've got the right idea here. It looks like the issue is in S_sv_ncmp in pp_sort.c When the compiler sees code like: @x = sort { $a <=> $b } @y it marks the sort op to perform the comparison internally rather than using the <=> op. Unfortunately that internal comparison is doing a simple SvNV() <=> SvNV() comparison rather than the more careful comparison that the real <=> op does. Working on a patch. Tony |
From @hvdsOn Mon, 12 Dec 2016 19:33:59 -0800, tails.saito@gmail.com wrote:
There is an optimization that detects specific common sort subs such as { $a <=> $b } and substitutes a faster equivalent written in C; in this case that uses the function S_sv_ncmp() in pp_sort.c. A quick glance at that shows that it is using its internally defined macro SvNSIV() to extract the numeric value, which always prefers an NV (floating point value) to an IV (integer value); this appears to be the opposite way round to the implementation of the spaceship operator (Perl_do_ncmp() in pp.c), but I'm not sure why. In any case it does explain the results you get - your NV clearly doesn't have enough precision to distinguish the two values - and also why your workaround (dodging the optimization by inserting "0;" in the sort sub) avoids the problem. It's been working this way since 2005; before that it used NV only. That suggests it might be tricky to change. I'd class it as clearly a bug though. Hugo |
From zefram@fysh.orgJames E Keenan via RT wrote:
The main built-in numeric comparison for sorting, S_sv_ncmp() in Bizarrely, it seems that the "0" element is the one that's not The logic for whether IV comparisons are OK on a specific value for But regardless of that per-value logic, pp_sort() errs in falling back -zefram |
From @tonycozOn Tue, 13 Dec 2016 15:13:22 -0800, tonyc wrote:
Attached. Tony |
From @tonycoz0001-perl-130335-fix-numeric-comparison-for-sort-s-built-.patchFrom f1ec0e92b0c10828bda3762df4706fa4d01c6573 Mon Sep 17 00:00:00 2001
From: Tony Cook <tony@develop-help.com>
Date: Wed, 14 Dec 2016 14:24:08 +1100
Subject: (perl #130335) fix numeric comparison for sort's built-in compare
For non-'use integer' this would always compare as NVs, but with
64-bit integers and non-long doubles, integers can have more
significant digits, making the sort <=> replacement less precise
than the <=> operator.
Use the same code to perform the comparison that <=> does, which
happens to be handily broken out into Perl_do_ncmp().
---
pp_sort.c | 12 ++++--------
t/lib/warnings/9uninit | 2 +-
t/op/sort.t | 15 ++++++++++++++-
3 files changed, 19 insertions(+), 10 deletions(-)
diff --git a/pp_sort.c b/pp_sort.c
index 68e65f9..4ffe224 100644
--- a/pp_sort.c
+++ b/pp_sort.c
@@ -1888,20 +1888,16 @@ S_sortcv_xsub(pTHX_ SV *const a, SV *const b)
static I32
S_sv_ncmp(pTHX_ SV *const a, SV *const b)
{
- const NV nv1 = SvNSIV(a);
- const NV nv2 = SvNSIV(b);
+ I32 cmp = do_ncmp(a, b);
PERL_ARGS_ASSERT_SV_NCMP;
-#if defined(NAN_COMPARE_BROKEN) && defined(Perl_isnan)
- if (Perl_isnan(nv1) || Perl_isnan(nv2)) {
-#else
- if (nv1 != nv1 || nv2 != nv2) {
-#endif
+ if (cmp == 2) {
if (ckWARN(WARN_UNINITIALIZED)) report_uninit(NULL);
return 0;
}
- return nv1 < nv2 ? -1 : nv1 > nv2 ? 1 : 0;
+
+ return cmp;
}
static I32
diff --git a/t/lib/warnings/9uninit b/t/lib/warnings/9uninit
index c8b843f..77a93ce 100644
--- a/t/lib/warnings/9uninit
+++ b/t/lib/warnings/9uninit
@@ -651,8 +651,8 @@ Use of uninitialized value $m1 in sort at - line 6.
Use of uninitialized value $g1 in sort at - line 6.
Use of uninitialized value $m1 in sort at - line 7.
Use of uninitialized value $g1 in sort at - line 7.
-Use of uninitialized value $m1 in sort at - line 7.
Use of uninitialized value $g1 in sort at - line 7.
+Use of uninitialized value $m1 in sort at - line 7.
Use of uninitialized value $a in subtraction (-) at - line 8.
Use of uninitialized value $b in subtraction (-) at - line 8.
Use of uninitialized value $m1 in sort at - line 9.
diff --git a/t/op/sort.t b/t/op/sort.t
index cd1c6eb..96fad1c 100644
--- a/t/op/sort.t
+++ b/t/op/sort.t
@@ -7,7 +7,7 @@ BEGIN {
set_up_inc('../lib');
}
use warnings;
-plan(tests => 196);
+plan(tests => 197);
# these shouldn't hang
{
@@ -1147,3 +1147,16 @@ pass "no crash when sort block deletes *a and *b";
@a = sort { *a = sub { 1 }; $a <=> $b } 0 .. 1;
ok(a(), "*a wasn't localized inadvertantly");
}
+
+SKIP:
+{
+ eval { require Config; 1 }
+ or skip "Cannot load Config", 1;
+ $Config::Config{ivsize} == 8
+ or skip "this test can only fail with 64-bit integers", 1;
+ # sort's built-in numeric comparison wasn't careful enough in a world
+ # of integers with more significant digits than NVs
+ my @in = ( "0", "20000000000000001", "20000000000000000" );
+ my @out = sort { $a <=> $b } @in;
+ is($out[1], "20000000000000000", "check sort order");
+}
--
2.1.4
|
From @jkeenanOn Wed, 14 Dec 2016 03:25:24 GMT, tonyc wrote:
Created local branch with your patch. PASS on my test program. Please apply unless Hugo or Zefram finds something amiss in the C code. Thank you very much. -- |
From takeshi@ix.netcom.comI saw a similar issue. On Mon, 12 Dec 2016 19:33:59 -0800, tails.saito@gmail.com wrote:
-- |
From tails.saito@gmail.comHi, 2016-12-16 14:16 GMT+09:00 Takeshi Kitahara via RT <perlbug-followup@perl.org>:
Where "my @in = (3,1,4,1,5,9,2); I guess this is not a bug because we should not lose the access to the |
From @jkeenanOn Wed, 14 Dec 2016 12:52:55 GMT, jkeenan wrote:
No objection was raised, so I applied Tony's patch in commit 427fbfe. Resolving ticket. Thank you very much. -- |
@jkeenan - Status changed from 'open' to 'resolved' |
Migrated from rt.perl.org#130335 (status was 'resolved')
Searchable as RT130335$
The text was updated successfully, but these errors were encountered: