-
Notifications
You must be signed in to change notification settings - Fork 15.5k
Description
| Bugzilla Link | 20837 |
| Resolution | FIXED |
| Resolved on | Nov 23, 2021 04:42 |
| Version | unspecified |
| OS | All |
| Attachments | Evidence for quadratic worst case., A patch that fixes the quadratic behaviour by implementing introsort. |
| Reporter | LLVM Bugzilla Contributor |
| CC | @Dushistov,@hfinkel,@hiraditya,@danlark1,@mclow,@nico,@sebpop,@zamazan4ik |
Extended Description
The C++ standard mandates that std::sort has complexity O(N log(N)), but libc++'s implementation takes O(N^2) in the worst case.
As proof I've attached a program that constructs a worst case input for several sizes. It also instruments the number of comparisons used to sort these worst cases and prints the results. The technique used is described in the 1999 paper "A Killer Adversary for Quicksort" by M. D. McIlroy.
Compiling and running:
$ clang++ -O2 -m64 -march=native -std=c++11 -stdlib=libc++ main.cpp -nodefaultlibs -lc++ -lc++abi -lm -lc -lgcc_s -lgcc && ./a.out
N: comparisons
100: 2479
200: 10229
400: 40729
800: 161729
1600: 448698
3200: 1413898
6400: 5264298
This is clearly quadratic. To illustrate this defect more, these are the comparison counts given by compiling using libstdc++:
$ clang++ -O2 -m64 -march=native -std=c++11 main.cpp && ./a.out
N: comparisons
100: 1742
200: 4260
400: 10035
800: 22784
1600: 51049
3200: 112386
6400: 244850
Inspecting the source of shows the cause of the issue: there is no introsort implemented, only quicksort (albeit with non-trivial optimizations, but nothing preventing the worst case). I've written a patch that augments the current implementation to make it work as an introspective sort, switching to heapsort if the recursion depth exceeds 2*floor(log(N)). This post can only have one attachment, so I'll post it as an attachment to a comment.
Having not contributed to libc++ before I'm not 100% familiar with all coding styles/(un)written rules. My changes are rather minimal though, so I've just followed what style could be found in context. And of course I knowingly submit my code under the libc++ licenses, the MIT license and the UIUC License.