The Go sort library implements quicksort, which has a worst-case runtime of
O(n^2), rather than introsort, which has a worst-case runtime of O(n*log(n))
and is used, for example, by GNU's libstdc++. Introsort runs the same
algorithm as quicksort almost all of the time, but tracks the recursion depth
to detect the pathological inputs that drive quicksort to O(n^2); when the
depth limit is exceeded, it switches to heapsort, which is guaranteed
O(n*log(n)), though on average slower than quicksort. This is obviously not
of the highest priority, especially given the rarity of pathological cases
for the quicksort implementation used in Go, but such cases do exist and can,
for example, be exploited in denial-of-service attacks.
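The depth-limited fallback described above can be sketched as follows. This is a hypothetical illustration of the introsort idea, not libstdc++'s or Go's actual implementation; the function names (introSort, quick, heapSort) are made up for this sketch:

```go
package main

import "fmt"

// introSort sorts a in place: quicksort until the recursion depth
// exceeds roughly 2*log2(len(a)), then heapsort on the remaining range.
func introSort(a []int) {
	depth := 0
	for n := len(a); n > 0; n >>= 1 {
		depth++
	}
	quick(a, 2*depth)
}

func quick(a []int, depth int) {
	for len(a) > 1 {
		if depth == 0 {
			// Pathological input suspected: switch to guaranteed O(n*log(n)).
			heapSort(a)
			return
		}
		depth--
		p := partition(a)
		quick(a[:p], depth)
		a = a[p+1:] // tail-call the right half iteratively
	}
}

// partition is a simple Lomuto partition around the last element.
func partition(a []int) int {
	pivot := a[len(a)-1]
	i := 0
	for j := 0; j < len(a)-1; j++ {
		if a[j] < pivot {
			a[i], a[j] = a[j], a[i]
			i++
		}
	}
	a[i], a[len(a)-1] = a[len(a)-1], a[i]
	return i
}

// heapSort is in-place and allocation-free, guaranteed O(n*log(n)).
func heapSort(a []int) {
	n := len(a)
	for i := n/2 - 1; i >= 0; i-- {
		siftDown(a, i, n)
	}
	for i := n - 1; i > 0; i-- {
		a[0], a[i] = a[i], a[0]
		siftDown(a, 0, i)
	}
}

func siftDown(a []int, root, n int) {
	for {
		c := 2*root + 1
		if c >= n {
			return
		}
		if c+1 < n && a[c+1] > a[c] {
			c++
		}
		if a[root] >= a[c] {
			return
		}
		a[root], a[c] = a[c], a[root]
		root = c
	}
}

func main() {
	a := []int{5, 2, 9, 1, 5, 6, 3}
	introSort(a)
	fmt.Println(a) // [1 2 3 5 5 6 9]
}
```

Note that the sketch meets the "in place, no allocation" property only in the fallback path; a production version would also handle small ranges with insertion sort, as real introsort implementations do.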
Owner changed to r...@golang.org.
Status changed to Accepted.
Comment 2 by noelgrandin:
If you're going to replace the sort algorithm, Timsort appears to be gaining traction in
Python and Java.
Comment 3 by mimenarrator:
Hmm, looks like same average and guaranteed runtimes as introsort. I wonder if anyone
has done any empirical analysis of which is superior and under what conditions.
Timsort is stable and Introsort is not (because introsort relies on quicksort).
So, I would prefer Timsort.
Comment 5 by mimenarrator:
Ideally, if Introsort is faster, both would be implemented: one for use when stability
is necessary and one for when it is not. If Timsort is faster, only Timsort would need
to be implemented.
Comment 6 by gshimano:
I will be curious to see implementation in Go of any of the parallel sorting algorithms.
Any reason that's not done yet?
Comment 7 by jemfinch:
Because it's derived from merge sort, timsort requires at least O(n/2) extra memory.
That may be cheap in Python (where the lists being sorted are essentially arrays of
pointers to much larger objects), but would not be as universally acceptable in Go.
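The extra memory in question comes from the merge step, which needs a temporary buffer for one of the two halves. A minimal sketch of that step (a hypothetical illustration, not timsort itself; mergeHalves is a made-up name):

```go
package main

import "fmt"

// mergeHalves merges the two sorted halves a[:mid] and a[mid:] into a,
// using a temp buffer of mid (at most len(a)/2) elements — the source
// of the "O(n/2) extra memory" cost of merge-based sorts.
func mergeHalves(a []int, mid int) {
	tmp := make([]int, mid)
	copy(tmp, a[:mid]) // stash the left half
	i, j, k := 0, mid, 0
	for i < len(tmp) && j < len(a) {
		if tmp[i] <= a[j] { // <= keeps the merge stable
			a[k] = tmp[i]
			i++
		} else {
			a[k] = a[j]
			j++
		}
		k++
	}
	copy(a[k:], tmp[i:]) // any leftovers from the stashed left half
}

func main() {
	a := []int{1, 4, 7, 2, 3, 9} // two sorted runs: [1 4 7] and [2 3 9]
	mergeHalves(a, 3)
	fmt.Println(a) // [1 2 3 4 7 9]
}
```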
Comment 8 by l4zycod3r:
> Because it's derived from merge sort, timsort requires at least O(n/2) extra memory.
The temp array holds pointers to the values, not copies of the values themselves.
See http://svn.python.org/projects/python/trunk/Objects/listsort.txt :
"+ timsort can require a temp array containing as many as N//2 pointers,
which means as many as 2*N extra bytes on 32-bit boxes. It can be
expected to require a temp array this large when sorting random data; on
data with significant structure, it may get away without using any extra
heap memory. This appears to be the strongest argument against it, but
compared to the size of an object, 2 temp bytes worst-case (also expected-
case for random data) doesn't scare me much."
the extra memory is a big deal.
criteria for something to replace sort.Sort
1. in place
2. no memory allocation
3. guaranteed n log n time worst case
4. not significantly more complicated than the existing implementation
i don't know of any such algorithms so i'm happy with quicksort for now.
but if someone does, fine.
Labels changed: added priority-low.
Status changed to HelpWanted.
Comment 10 by pgmmpk:
Timsort is now available at package dashboard.
Comment 11 by jckanis:
AFAIK what distinguishes Timsort from other sorting algorithms is that it can take a lot
more advantage of already sorted (or reverse-sorted) runs within the collection, to
minimise the number of comparisons it has to do. Timsort does a lot fewer comparisons
when sorting data that already has some structure. There is (I guess) some time overhead
to do this, so it really pays off when the comparison is expensive, for example because
it involves calling a user-defined function or calling interpreted code. This is why it
is a good choice for Python and why (I think) Java uses Timsort when sorting objects,
but quicksort/introsort when sorting primitive types.
Comment 12 by zzyz666:
We can Fisher–Yates shuffle the elements in linear time and statistically avoid the worst case.
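That suggestion can be sketched as follows: randomize the input with a Fisher–Yates shuffle before sorting, so no fixed input can reliably trigger quicksort's O(n^2) behavior. A hypothetical sketch (shuffleSort is a made-up name), at the cost of one extra O(n) pass and loss of any benefit from pre-sorted input:

```go
package main

import (
	"fmt"
	"math/rand"
	"sort"
)

// shuffleSort applies a Fisher–Yates shuffle, then sorts. An adversary
// who controls the input but not the random source cannot force the
// pathological pivot sequence deterministically.
func shuffleSort(a []int) {
	for i := len(a) - 1; i > 0; i-- {
		j := rand.Intn(i + 1) // uniform pick from a[0..i]
		a[i], a[j] = a[j], a[i]
	}
	sort.Ints(a)
}

func main() {
	a := []int{9, 8, 7, 6, 5, 4, 3, 2, 1}
	shuffleSort(a)
	fmt.Println(a) // [1 2 3 4 5 6 7 8 9]
}
```

Note this only makes the worst case statistically unlikely per call; it does not change the O(n^2) worst-case bound itself.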
This issue was closed by revision 21e49cb.
Status changed to Fixed.
@hatahet, @rsc, regarding 21e49cb, wasn't it simpler to just do shuffling instead? just like zzyz666 pointed out.
@LopezCatherine, this bug is 6 years old. Much has changed in 6 years. Open a new bug instead if something is still suboptimal, but search for open duplicates first.