-
Notifications
You must be signed in to change notification settings - Fork 3.1k
[Backport] Clean implementation of sorts for scala.util.Sorting. #4557
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Merged
Conversation
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Removed code based on Sun JDK sorts and implemented new (basic) sorts from scratch. Deferred to Java Arrays.sort whenever practical. Behavior of `scala.util.Sorting` should be unchanged, but changed documentation to specify when the Java methods are being used (as they're typically very fast). A JUnit test is provided. Performance is important for sorts. Everything is better with this patch, though it could be better yet, as described below. Below are sort times (in microseconds, SEM < 5%) for various 1024-element arrays of small case classes that compare on an int field (quickSort), or int arrays that use custom ordering (stableSort). Note: "degenerate" means there are only 16 values possible, so there are lots of ties. Times are all with fresh data (no re-using cache from run to run). Results: ``` random sorted reverse degenerate big:64k tiny:16 Old Sorting.quickSort 234 181 178 103 25,700 1.4 New Sorting.quickSort 170 27 115 74 18,600 0.8 Old Sorting.stableSort 321 234 236 282 32,600 2.1 New Sorting.stableSort 239 16 194 194 25,100 1.2 java.util.Arrays.sort 124 4 8 105 13,500 0.8 java.util.Arrays.sort|Box 126 15 13 112 13,200 0.9 ``` The new versions are uniformly faster, but uniformly slower than Java sorting. scala.util.Sorting has use cases that don't map easily in to Java unless everything is pre-boxed, but the overhead of pre-boxing is minimal compared to the sort. A snapshot of some of my benchmarking code is below. (Yes, lots of repeating myself--it's dangerous not to when trying to get somewhat accurate benchmarks.) ``` import java.util.Arrays import java.util.Comparator import math.Ordering import util.Sorting import reflect.ClassTag val th = ichi.bench.Thyme.warmed() case class N(i: Int, j: Int) {} val a = Array.fill(1024)( Array.tabulate(1024)(i => N(util.Random.nextInt, i)) ) var ai = 0 val b = Array.fill(1024)( Array.tabulate(1024)(i => N(i, i)) ) var bi = 0 val c = Array.fill(1024)( Array.tabulate(1024)(i => N(1024-i, i)) ) var ci = 0 val d = Array.fill(1024)( Array.tabulate(1024)(i => N(util.Random.nextInt(16), i)) ) var di = 0 val e = Array.fill(16)( Array.tabulate(65536)(i => N(util.Random.nextInt, i)) ) var ei = 0 val f = Array.fill(65535)( Array.tabulate(16)(i => N(util.Random.nextInt, i)) ) var fi = 0 val o = new Ordering[N]{ def compare(a: N, b: N) = if (a.i < b.i) -1 else if (a.i > b.i) 1 else 0 } for (s <- Seq("one", "two", "three")) { println(s) th.pbench{ val x = a(ai).clone; ai = (ai+1)%a.length; Sorting.quickSort(x)(o); x(x.length/3) } th.pbench{ val x = b(bi).clone; bi = (bi+1)%b.length; Sorting.quickSort(x)(o); x(x.length/3) } th.pbench{ val x = c(ci).clone; ci = (ci+1)%c.length; Sorting.quickSort(x)(o); x(x.length/3) } th.pbench{ val x = d(di).clone; di = (di+1)%d.length; Sorting.quickSort(x)(o); x(x.length/3) } th.pbench{ val x = e(ei).clone; ei = (ei+1)%e.length; Sorting.quickSort(x)(o); x(x.length/3) } th.pbench{ val x = f(fi).clone; fi = (fi+1)%f.length; Sorting.quickSort(x)(o); x(x.length/3) } } def ix(ns: Array[N]) = { val is = new Array[Int](ns.length) var i = 0 while (i < ns.length) { is(i) = ns(i).i i += 1 } is } val p = new Ordering[Int]{ def compare(a: Int, b: Int) = if (a > b) 1 else if (a < b) -1 else 0 } for (s <- Seq("one", "two", "three")) { println(s) val tag: ClassTag[Int] = implicitly[ClassTag[Int]] th.pbench{ val x = ix(a(ai)); ai = (ai+1)%a.length; Sorting.stableSort(x)(tag, p); x(x.length/3) } th.pbench{ val x = ix(b(bi)); bi = (bi+1)%b.length; Sorting.stableSort(x)(tag, p); x(x.length/3) } th.pbench{ val x = ix(c(ci)); ci = (ci+1)%c.length; Sorting.stableSort(x)(tag, p); x(x.length/3) } th.pbench{ val x = ix(d(di)); di = (di+1)%d.length; Sorting.stableSort(x)(tag, p); x(x.length/3) } th.pbench{ val x = ix(e(ei)); ei = (ei+1)%e.length; Sorting.stableSort(x)(tag, p); x(x.length/3) } th.pbench{ val x = ix(f(fi)); fi = (fi+1)%f.length; Sorting.stableSort(x)(tag, p); x(x.length/3) } } for (s <- Seq("one", "two", "three")) { println(s) th.pbench{ val x = a(ai).clone; ai = (ai+1)%a.length; Arrays.sort(x, o); x(x.length/3) } th.pbench{ val x = b(bi).clone; bi = (bi+1)%b.length; Arrays.sort(x, o); x(x.length/3) } th.pbench{ val x = c(ci).clone; ci = (ci+1)%c.length; Arrays.sort(x, o); x(x.length/3) } th.pbench{ val x = d(di).clone; di = (di+1)%d.length; Arrays.sort(x, o); x(x.length/3) } th.pbench{ val x = e(ei).clone; ei = (ei+1)%e.length; Arrays.sort(x, o); x(x.length/3) } th.pbench{ val x = f(fi).clone; fi = (fi+1)%f.length; Arrays.sort(x, o); x(x.length/3) } } def bx(is: Array[Int]): Array[java.lang.Integer] = { val Is = new Array[java.lang.Integer](is.length) var i = 0 while (i < is.length) { Is(i) = java.lang.Integer.valueOf(is(i)) i += 1 } Is } def xb(Is: Array[java.lang.Integer]): Array[Int] = { val is = new Array[Int](Is.length) var i = 0 while (i < is.length) { is(i) = Is(i).intValue i += 1 } is } val q = new Comparator[java.lang.Integer]{ def compare(a: java.lang.Integer, b: java.lang.Integer) = o.compare(a.intValue, b.intValue) } for (s <- Seq("one", "two", "three")) { println(s) val tag: ClassTag[Int] = implicitly[ClassTag[Int]] th.pbench{ val x = bx(ix(a(ai))); ai = (ai+1)%a.length; Arrays.sort(x, q); xb(x)(x.length/3) } th.pbench{ val x = bx(ix(b(bi))); bi = (bi+1)%b.length; Arrays.sort(x, q); xb(x)(x.length/3) } th.pbench{ val x = bx(ix(c(ci))); ci = (ci+1)%c.length; Arrays.sort(x, q); xb(x)(x.length/3) } th.pbench{ val x = bx(ix(d(di))); di = (di+1)%d.length; Arrays.sort(x, q); xb(x)(x.length/3) } th.pbench{ val x = bx(ix(e(ei))); ei = (ei+1)%e.length; Arrays.sort(x, q); xb(x)(x.length/3) } th.pbench{ val x = bx(ix(f(fi))); fi = (fi+1)%f.length; Arrays.sort(x, q); xb(x)(x.length/3) } } ```
Backport of #4534 |
adriaanm
added a commit
that referenced
this pull request
Jun 16, 2015
Clean implementation of sorts for scala.util.Sorting.
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Removed code based on Sun JDK sorts and implemented new (basic) sorts from
scratch. Deferred to Java Arrays.sort whenever practical. Behavior of
scala.util.Sorting
should be unchanged, but changed documentation tospecify when the Java methods are being used (as they're typically very fast).
A JUnit test is provided.
Performance is important for sorts. Everything is better with this patch,
though it could be better yet, as described below.
Below are sort times (in microseconds, SEM < 5%) for various 1024-element
arrays of small case classes that compare on an int field (quickSort), or
int arrays that use custom ordering (stableSort). Note: "degenerate"
means there are only 16 values possible, so there are lots of ties.
Times are all with fresh data (no re-using cache from run to run).
Results:
The new versions are uniformly faster, but uniformly slower than Java sorting. scala.util.Sorting has use cases that don't map easily in to Java unless everything is pre-boxed, but the overhead of pre-boxing is minimal compared to the sort.
A snapshot of some of my benchmarking code is below.
(Yes, lots of repeating myself--it's dangerous not to when trying to get
somewhat accurate benchmarks.)