-
-
Notifications
You must be signed in to change notification settings - Fork 706
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix issue 12183 : Remove use of floating point in std.algorithm.quickSortImpl #1946
Conversation
|
Copy pasting what I put in the bug report: You can't expect library code to work around such an issue. BTW, I think your fix is wrong. For starters, changing default behavior is unacceptable. It would be much better to simply change the cast to "double": In this case, there is absolutely nothing that warrants using "real" level precision: "depth" is nothing more than a rough estimate to begin with. If anything, I'd think passing around an 80 bit real to be less efficient than a standard double. Why don't you go with that fix? |
|
Thanks for your feedback and the better solution, I updated the PR accordingly. I don't expect phobos to work around that, I'm just trying to limit the impact of this where it's possible/convenient. |
|
Shouldn't you change private void quickSortImpl(alias less, Range)(Range r, real depth)Having it still accept a real in such conditions seems... odd. |
|
I'm trying to wrap my head around this. Are we changing library code because a tool fails to work? If there's an issue with Valgrind then someone should a file a bug in Valgrind's bug tracker. |
I mostly agree, but it was also my understanding that If I'm not mistaken, then the change would be legit, valgrind or no valgrind. Working around the valgrind issue would the cherry on the cake. |
|
Duh, I forgot about the prototype, thanks @monarchdodra . @AndrejMitrovic : I went to the KDE bugtracker, only to find that bug already reported, and developpers saying "tons of work / little benefit, not happening". |
|
Both the limitations page and that bug report talk about valgrind's implementation of 80 bit reals as using 64 bit doubles. Neither of them talk about failure to translate the instruction. |
|
|
|
The PR in question is #1735 . Xinok says that implicit cast from long to double are not allowed. However, neither dmd or gdc seems to have a problem with it, and the tester pass on the 10 platform without the cast. |
|
@Geod24 The idea behind real must have been that 64 bit integer (length) can't exactly fit in double's 52bits of mantissa. Considering that arrays of such sizes are at best a very distant possibility I see no problem with it. |
|
I have an issue with why quicksortImpl is using floating point to begin with. As a general rule, floating point should not be used in library code unless the user is actually wanting a floating point result. The reason is that FP can be awfully slow on some machines. It looks to me like |
| @@ -9606,7 +9606,7 @@ void swapAt(R)(R r, size_t i1, size_t i2) | |||
| } | |||
| } | |||
|
|
|||
| private void quickSortImpl(alias less, Range)(Range r, real depth) | |||
| private void quickSortImpl(alias less, Range)(Range r, double depth) | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
size_t depth
depth < 1.0 replace with depth == 0
depth *= (2.0/3.0) replace with depth = (depth * 2) / 3
will do the same thing, without need for floating point.
|
Oops, silly me pushed the wrong button. |
|
@WalterBright Thanks for your feedback, I updated the code & title accordingly. |
|
The commit message still says 'use double' - shouldn't that reflect the pull request's name? |
| { | ||
| HeapSortImpl!(less, Range).heapSort(r); | ||
| return; | ||
| } | ||
| depth *= (2.0/3.0); | ||
| depth = (depth * 2) / 3; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, do we need to care about possible overflow here on the *2 ? We could write it as:
depth = (cast(ptrdiff_t)depth < 0) ? (depth / 3) * 2 : (depth * 2) / 3;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Simpler:
depth = depth >= depth.max / 2 ? (depth / 3) * 2 : (depth * 2) / 3;
The impact is very likely to be immeasurable because the actual work involved is a lot more than the branch itself. @Geod24 please make this change and let's get this in. Thanks!
|
I didn't consider a 2^31 elements array to be a valid use case actually (and I'm not even speaking of 64-bits). So that would introduce a branching for no benefit IMHO. |
I agree that array with 2^31 elements on 32bit system is only theoretically possible with T.sizeof == 1. Even in this case the app has to run in a mode where > 2G of address space is available, namely as 32bit app on a 64bit OS. In this specific case it then still has to be extremely lucky in getting 2Gb of contiguous address space range in a single mmap call. And most importantly even if it did happen in this incredibly unlikely corner case - it just falls back to heap sort, no formatting the hard drive etc. |
|
... let's keep in mind the relative costs of using a floating point operation here. that's one multiplication for every pivot-partition. I'll be damned if that multiplication has any kind of performance repercussion. Also, the "2.0 / 3.0" is just a "seems to work good" formula. If I think we are way over-thinking this. We simply change real to double, and call it a day. If we do change it to
Such a change can easily change the "depth balance" leading to either premature, or late, algorithm strategy change. In this case, there can be some very serious performance implication. |
- Prevent valgrind from aborting (some x87 FP operation are not supported). - Can be slower on machines that have a soft floating point ABI.
|
@andralex : Updated according to your comments. @monarchdodra : I understand your concern, but as you say it looks more like a "seems to work good formula". According to the comments, @Xinok 's changes were introduced to guarantee an upper-bound and didn't affect standard input. I however tested Vladimir Panteleev inputs: And both version exhibited time same result (Debian x86_64). There is still room for improvement (precise formula / benchmarking), but IMHO this is outside the topic of that PR. |
|
This is ready. What's with the test failure? |
|
Auto-merge toggled on |
|
Looks like master is broken. |
Fix issue 12183 : Remove use of floating point in std.algorithm.quickSortImpl
https://d.puremagic.com/issues/show_bug.cgi?id=12183