Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clean implementation of sorts for scala.util.Sorting. #4534

Merged
merged 1 commit into from Jun 16, 2015

Conversation

@Ichoran
Copy link
Contributor

@Ichoran Ichoran commented May 31, 2015

Removed code based on Sun JDK sorts and implemented new (basic) sorts from
scratch. Deferred to Java Arrays.sort whenever practical. Behavior of
scala.util.Sorting should be unchanged, but changed documentation to
specify when the Java methods are being used (as they're typically very fast).

A JUnit test is provided.

Performance is important for sorts. Everything is better with this patch,
though it could be better yet, as described below.

Below are sort times (in microseconds, SEM < 5%) for various 1024-element
arrays of small case classes that compare on an int field (quickSort), or
int arrays that use custom ordering (stableSort). Note: "degenerate"
means there are only 16 values possible, so there are lots of ties.
Times are all with fresh data (no re-using cache from run to run).

Results:

                        random  sorted  reverse  degenerate  big:64k  tiny:16
Old Sorting.quickSort     234     181      178         103    25,700      1.4
New Sorting.quickSort     170      27      115          74    18,600      0.8

Old Sorting.stableSort    321     234      236         282    32,600      2.1
New Sorting.stableSort    239      16      194         194    25,100      1.2

java.util.Arrays.sort     124       4        8         105    13,500      0.8
java.util.Arrays.sort|Box 126      15       13         112    13,200      0.9

The new versions are uniformly faster, but uniformly slower than Java sorting. scala.util.Sorting has use cases that don't map easily in to Java unless everything is pre-boxed, but the overhead of pre-boxing is minimal compared to the sort.

A snapshot of some of my benchmarking code is below.

(Yes, lots of repeating myself--it's dangerous not to when trying to get
somewhat accurate benchmarks.)

import java.util.Arrays
import java.util.Comparator
import math.Ordering
import util.Sorting
import reflect.ClassTag

val th = ichi.bench.Thyme.warmed()

case class N(i: Int, j: Int) {}
val a = Array.fill(1024)( Array.tabulate(1024)(i => N(util.Random.nextInt, i)) )
var ai = 0
val b = Array.fill(1024)( Array.tabulate(1024)(i => N(i, i)) )
var bi = 0
val c = Array.fill(1024)( Array.tabulate(1024)(i => N(1024-i, i)) )
var ci = 0
val d = Array.fill(1024)( Array.tabulate(1024)(i => N(util.Random.nextInt(16), i)) )
var di = 0
val e = Array.fill(16)( Array.tabulate(65536)(i => N(util.Random.nextInt, i)) )
var ei = 0
val f = Array.fill(65535)( Array.tabulate(16)(i => N(util.Random.nextInt, i)) )
var fi = 0

val o = new Ordering[N]{ def compare(a: N, b: N) = if (a.i < b.i) -1 else if (a.i > b.i) 1 else 0 }
for (s <- Seq("one", "two", "three")) {
  println(s)
  th.pbench{ val x = a(ai).clone; ai = (ai+1)%a.length; Sorting.quickSort(x)(o); x(x.length/3) }
  th.pbench{ val x = b(bi).clone; bi = (bi+1)%b.length; Sorting.quickSort(x)(o); x(x.length/3) }
  th.pbench{ val x = c(ci).clone; ci = (ci+1)%c.length; Sorting.quickSort(x)(o); x(x.length/3) }
  th.pbench{ val x = d(di).clone; di = (di+1)%d.length; Sorting.quickSort(x)(o); x(x.length/3) }
  th.pbench{ val x = e(ei).clone; ei = (ei+1)%e.length; Sorting.quickSort(x)(o); x(x.length/3) }
  th.pbench{ val x = f(fi).clone; fi = (fi+1)%f.length; Sorting.quickSort(x)(o); x(x.length/3) }
}

def ix(ns: Array[N]) = {
  val is = new Array[Int](ns.length)
  var i = 0
  while (i < ns.length) {
    is(i) = ns(i).i
    i += 1
  }
  is
}
val p = new Ordering[Int]{ def compare(a: Int, b: Int) = if (a > b) 1 else if (a < b) -1 else 0 }
for (s <- Seq("one", "two", "three")) {
  println(s)
  val tag: ClassTag[Int] = implicitly[ClassTag[Int]]
  th.pbench{ val x = ix(a(ai)); ai = (ai+1)%a.length; Sorting.stableSort(x)(tag, p); x(x.length/3) }
  th.pbench{ val x = ix(b(bi)); bi = (bi+1)%b.length; Sorting.stableSort(x)(tag, p); x(x.length/3) }
  th.pbench{ val x = ix(c(ci)); ci = (ci+1)%c.length; Sorting.stableSort(x)(tag, p); x(x.length/3) }
  th.pbench{ val x = ix(d(di)); di = (di+1)%d.length; Sorting.stableSort(x)(tag, p); x(x.length/3) }
  th.pbench{ val x = ix(e(ei)); ei = (ei+1)%e.length; Sorting.stableSort(x)(tag, p); x(x.length/3) }
  th.pbench{ val x = ix(f(fi)); fi = (fi+1)%f.length; Sorting.stableSort(x)(tag, p); x(x.length/3) }
}

for (s <- Seq("one", "two", "three")) {
  println(s)
  th.pbench{ val x = a(ai).clone; ai = (ai+1)%a.length; Arrays.sort(x, o); x(x.length/3) }
  th.pbench{ val x = b(bi).clone; bi = (bi+1)%b.length; Arrays.sort(x, o); x(x.length/3) }
  th.pbench{ val x = c(ci).clone; ci = (ci+1)%c.length; Arrays.sort(x, o); x(x.length/3) }
  th.pbench{ val x = d(di).clone; di = (di+1)%d.length; Arrays.sort(x, o); x(x.length/3) }
  th.pbench{ val x = e(ei).clone; ei = (ei+1)%e.length; Arrays.sort(x, o); x(x.length/3) }
  th.pbench{ val x = f(fi).clone; fi = (fi+1)%f.length; Arrays.sort(x, o); x(x.length/3) }
}

def bx(is: Array[Int]): Array[java.lang.Integer] = {
  val Is = new Array[java.lang.Integer](is.length)
  var i = 0
  while (i < is.length) {
    Is(i) = java.lang.Integer.valueOf(is(i))
    i += 1
  }
  Is
}
def xb(Is: Array[java.lang.Integer]): Array[Int] = {
  val is = new Array[Int](Is.length)
  var i = 0
  while (i < is.length) {
    is(i) = Is(i).intValue
    i += 1
  }
  is
}
val q = new Comparator[java.lang.Integer]{
  def compare(a: java.lang.Integer, b: java.lang.Integer) = o.compare(a.intValue, b.intValue)
}
for (s <- Seq("one", "two", "three")) {
  println(s)
  val tag: ClassTag[Int] = implicitly[ClassTag[Int]]
  th.pbench{ val x = bx(ix(a(ai))); ai = (ai+1)%a.length; Arrays.sort(x, q); xb(x)(x.length/3) }
  th.pbench{ val x = bx(ix(b(bi))); bi = (bi+1)%b.length; Arrays.sort(x, q); xb(x)(x.length/3) }
  th.pbench{ val x = bx(ix(c(ci))); ci = (ci+1)%c.length; Arrays.sort(x, q); xb(x)(x.length/3) }
  th.pbench{ val x = bx(ix(d(di))); di = (di+1)%d.length; Arrays.sort(x, q); xb(x)(x.length/3) }
  th.pbench{ val x = bx(ix(e(ei))); ei = (ei+1)%e.length; Arrays.sort(x, q); xb(x)(x.length/3) }
  th.pbench{ val x = bx(ix(f(fi))); fi = (fi+1)%f.length; Arrays.sort(x, q); xb(x)(x.length/3) }
}
@scala-jenkins scala-jenkins added this to the 2.11.7 milestone May 31, 2015
@adriaanm
Copy link
Member

@adriaanm adriaanm commented Jun 1, 2015

Excellent!

Looks like most of the binary incompatibilities are because the private specialized methods are being made protected:

        // specialized members have to be overridable.
        if (m.isPrivate)
          m.resetFlag(PRIVATE).setFlag(PROTECTED)

It sounds like we should be able to restrict this to members that are actually overridable? (if (m.isOverridableMember)) /cc @retronym, @lrytz, @VladUreche

EDIT: well, something inspired by isOverridableMember, since it's not exactly what we need here (maybe m.isPrivate && !((m hasFlag FINAL) || m.owner.isEffectivelyFinal)?)

@retronym
Copy link
Member

@retronym retronym commented Jun 1, 2015

@adriaanm IIUC, the specialized subclass itself needs to override the formerly-private, ala: https://gist.github.com/retronym/69d44047de3ded4fae57

case a: Array[Byte] => if (ord eq Ordering.Byte) java.util.Arrays.sort(a) else mergeSort[Byte](a, 0, a.length, ord)
case a: Array[Short] => if (ord eq Ordering.Short) java.util.Arrays.sort(a) else mergeSort[Short](a, 0, a.length, ord)
case a: Array[Boolean] => if (ord eq Ordering.Boolean) booleanSort(a) else mergeSort[Boolean](a, 0, a.length, ord)
case a: Array[Unit] =>

This comment has been minimized.

@retronym

retronym Jun 1, 2015
Member

This case is unreachable, as runtime type tests of arrays are covariant:

scala> (Array(()): Any) match { case _: Array[AnyRef] => "Array[AnyRef]" case _: Array[Unit] => "Array[Unit]" }
res1: String = Array[AnyRef]

scala> (Array(()): Any) match { case _: Array[Unit] => "Array[Unit]" case _: Array[AnyRef] => "Array[AnyRef]" }
res4: String = Array[Unit]

scala> (Array(""): Any) match { case _: Array[Unit] => "Array[Unit]" case _: Array[AnyRef] => "Array[AnyRef]" }
res5: String = Array[AnyRef]

This comment has been minimized.

@Ichoran

Ichoran Jun 1, 2015
Author Contributor

Yeah, I realized that but forgot to delete the line. I meant to leave a comment there saying that it wasn't worth putting Array[Unit] first and potentially slowing everyone else down just to catch things that nobody ought to try to sort anyway.

This comment has been minimized.

@adriaanm

adriaanm Jun 1, 2015
Member

Hmm, that was my copy/paste from https://github.com/scala/scala/blob/2.11.x/src/library/scala/runtime/ScalaRunTime.scala#L83

I think it shouldn't be covariant since we're supposed to mimic Java semantics for Array in instanceof checks, right?

This comment has been minimized.

@adriaanm

adriaanm Jun 1, 2015
Member

Haha, I'm going to bed now. Of course Array covariance is there thanks to Java.

This comment has been minimized.

@paulp

paulp Jun 19, 2015
Contributor

array pattern must respect array invariance

In which you yourself state in the category of things to be done:

implement invariant matching for arrays (including updating the analyses)

// TODO: add upper bound: T <: AnyRef, propagate to callers below (not binary compatible)
// Maybe also rename all these methods to `sort`.
@inline private def sort[T](a: Array[T], ord: Ordering[T]): Unit = a match {
case _: Array[AnyRef] => java.util.Arrays.sort(a, ord) // Use original a because match to Array[AnyRef] is a lie; it means Array[T] s.t. T not primitive (can even be value class)

This comment has been minimized.

@retronym

retronym Jun 1, 2015
Member

Java's array sort treats a null ordering as an instruction to use the natural ordering (ie, by casting elements to Comparable) Rather than silently allowing this for null orderings (albeit unlikely), we should throw an NPE here.

scala> val temp = Array("2", "1"); java.util.Arrays.sort(temp, null)
temp: Array[String] = Array(1, 2)

Old behaviour of this wrapper:

scala> scala.util.Sorting.stableSort(List("2", "1"))(implicitly, null)
java.lang.NullPointerException
  at scala.util.Sorting$$anonfun$stableSort$2.apply(Sorting.scala:69)

This comment has been minimized.

@Ichoran

Ichoran Jun 1, 2015
Author Contributor

Good catch. Will fix.

@Ichoran
Copy link
Contributor Author

@Ichoran Ichoran commented Jun 1, 2015

@adriaanm @retronym - Can I just mark everything as whitelisted since it's not part of the public API? That's the usual strategy isn't it?

@adriaanm
Copy link
Member

@adriaanm adriaanm commented Jun 1, 2015

Yep, I think it's safe to whitelist these synthetics. They're in an object. I still think we could keep the specialized members private since there can't be a subclass for the module class of which they are a member.

@adriaanm
Copy link
Member

@adriaanm adriaanm commented Jun 1, 2015

I guess we could go from 13 whitelisted to 12 if we skipped the default argument for mergeSort, but I don't think it's worth changing.

@Ichoran
Copy link
Contributor Author

@Ichoran Ichoran commented Jun 1, 2015

@adriaanm - If we change the compiler to be more private here, it will probably cause other nominal binary compatibility issues as things that were visible disappear. It's something to do for 2.12, I guess; I agree that there's no reason for them to be protected.

@Ichoran Ichoran force-pushed the Ichoran:sorting-reimpl branch from 48501f2 to d8f396c Jun 1, 2015
@adriaanm
Copy link
Member

@adriaanm adriaanm commented Jun 1, 2015

@Ichoran, agreed -- we can't do this in 2.11, but would be small win for binary compatibility in objects in 2.12.

Removed code based on Sun JDK sorts and implemented new (basic) sorts from
scratch.  Deferred to Java Arrays.sort whenever practical.  Behavior of
`scala.util.Sorting` should be unchanged, but changed documentation to
specify when the Java methods are being used (as they're typically very fast).

A JUnit test is provided.

Performance is important for sorts.  Everything is better with this patch,
though it could be better yet, as described below.

Below are sort times (in microseconds, SEM < 5%) for various 1024-element
arrays of small case classes that compare on an int field (quickSort), or
int arrays that use custom ordering (stableSort).  Note: "degenerate"
means there are only 16 values possible, so there are lots of ties.
Times are all with fresh data (no re-using cache from run to run).

Results:

```
                        random  sorted  reverse  degenerate  big:64k  tiny:16
Old Sorting.quickSort     234     181      178         103    25,700      1.4
New Sorting.quickSort     170      27      115          74    18,600      0.8

Old Sorting.stableSort    321     234      236         282    32,600      2.1
New Sorting.stableSort    239      16      194         194    25,100      1.2

java.util.Arrays.sort     124       4        8         105    13,500      0.8
java.util.Arrays.sort|Box 126      15       13         112    13,200      0.9
```

The new versions are uniformly faster, but uniformly slower than Java sorting.  scala.util.Sorting has use cases that don't map easily in to Java unless everything is pre-boxed, but the overhead of pre-boxing is minimal compared to the sort.

A snapshot of some of my benchmarking code is below.

(Yes, lots of repeating myself--it's dangerous not to when trying to get
somewhat accurate benchmarks.)

```
import java.util.Arrays
import java.util.Comparator
import math.Ordering
import util.Sorting
import reflect.ClassTag

val th = ichi.bench.Thyme.warmed()

case class N(i: Int, j: Int) {}
val a = Array.fill(1024)( Array.tabulate(1024)(i => N(util.Random.nextInt, i)) )
var ai = 0
val b = Array.fill(1024)( Array.tabulate(1024)(i => N(i, i)) )
var bi = 0
val c = Array.fill(1024)( Array.tabulate(1024)(i => N(1024-i, i)) )
var ci = 0
val d = Array.fill(1024)( Array.tabulate(1024)(i => N(util.Random.nextInt(16), i)) )
var di = 0
val e = Array.fill(16)( Array.tabulate(65536)(i => N(util.Random.nextInt, i)) )
var ei = 0
val f = Array.fill(65535)( Array.tabulate(16)(i => N(util.Random.nextInt, i)) )
var fi = 0

val o = new Ordering[N]{ def compare(a: N, b: N) = if (a.i < b.i) -1 else if (a.i > b.i) 1 else 0 }
for (s <- Seq("one", "two", "three")) {
  println(s)
  th.pbench{ val x = a(ai).clone; ai = (ai+1)%a.length; Sorting.quickSort(x)(o); x(x.length/3) }
  th.pbench{ val x = b(bi).clone; bi = (bi+1)%b.length; Sorting.quickSort(x)(o); x(x.length/3) }
  th.pbench{ val x = c(ci).clone; ci = (ci+1)%c.length; Sorting.quickSort(x)(o); x(x.length/3) }
  th.pbench{ val x = d(di).clone; di = (di+1)%d.length; Sorting.quickSort(x)(o); x(x.length/3) }
  th.pbench{ val x = e(ei).clone; ei = (ei+1)%e.length; Sorting.quickSort(x)(o); x(x.length/3) }
  th.pbench{ val x = f(fi).clone; fi = (fi+1)%f.length; Sorting.quickSort(x)(o); x(x.length/3) }
}

def ix(ns: Array[N]) = {
  val is = new Array[Int](ns.length)
  var i = 0
  while (i < ns.length) {
    is(i) = ns(i).i
    i += 1
  }
  is
}
val p = new Ordering[Int]{ def compare(a: Int, b: Int) = if (a > b) 1 else if (a < b) -1 else 0 }
for (s <- Seq("one", "two", "three")) {
  println(s)
  val tag: ClassTag[Int] = implicitly[ClassTag[Int]]
  th.pbench{ val x = ix(a(ai)); ai = (ai+1)%a.length; Sorting.stableSort(x)(tag, p); x(x.length/3) }
  th.pbench{ val x = ix(b(bi)); bi = (bi+1)%b.length; Sorting.stableSort(x)(tag, p); x(x.length/3) }
  th.pbench{ val x = ix(c(ci)); ci = (ci+1)%c.length; Sorting.stableSort(x)(tag, p); x(x.length/3) }
  th.pbench{ val x = ix(d(di)); di = (di+1)%d.length; Sorting.stableSort(x)(tag, p); x(x.length/3) }
  th.pbench{ val x = ix(e(ei)); ei = (ei+1)%e.length; Sorting.stableSort(x)(tag, p); x(x.length/3) }
  th.pbench{ val x = ix(f(fi)); fi = (fi+1)%f.length; Sorting.stableSort(x)(tag, p); x(x.length/3) }
}

for (s <- Seq("one", "two", "three")) {
  println(s)
  th.pbench{ val x = a(ai).clone; ai = (ai+1)%a.length; Arrays.sort(x, o); x(x.length/3) }
  th.pbench{ val x = b(bi).clone; bi = (bi+1)%b.length; Arrays.sort(x, o); x(x.length/3) }
  th.pbench{ val x = c(ci).clone; ci = (ci+1)%c.length; Arrays.sort(x, o); x(x.length/3) }
  th.pbench{ val x = d(di).clone; di = (di+1)%d.length; Arrays.sort(x, o); x(x.length/3) }
  th.pbench{ val x = e(ei).clone; ei = (ei+1)%e.length; Arrays.sort(x, o); x(x.length/3) }
  th.pbench{ val x = f(fi).clone; fi = (fi+1)%f.length; Arrays.sort(x, o); x(x.length/3) }
}

def bx(is: Array[Int]): Array[java.lang.Integer] = {
  val Is = new Array[java.lang.Integer](is.length)
  var i = 0
  while (i < is.length) {
    Is(i) = java.lang.Integer.valueOf(is(i))
    i += 1
  }
  Is
}
def xb(Is: Array[java.lang.Integer]): Array[Int] = {
  val is = new Array[Int](Is.length)
  var i = 0
  while (i < is.length) {
    is(i) = Is(i).intValue
    i += 1
  }
  is
}
val q = new Comparator[java.lang.Integer]{
  def compare(a: java.lang.Integer, b: java.lang.Integer) = o.compare(a.intValue, b.intValue)
}
for (s <- Seq("one", "two", "three")) {
  println(s)
  val tag: ClassTag[Int] = implicitly[ClassTag[Int]]
  th.pbench{ val x = bx(ix(a(ai))); ai = (ai+1)%a.length; Arrays.sort(x, q); xb(x)(x.length/3) }
  th.pbench{ val x = bx(ix(b(bi))); bi = (bi+1)%b.length; Arrays.sort(x, q); xb(x)(x.length/3) }
  th.pbench{ val x = bx(ix(c(ci))); ci = (ci+1)%c.length; Arrays.sort(x, q); xb(x)(x.length/3) }
  th.pbench{ val x = bx(ix(d(di))); di = (di+1)%d.length; Arrays.sort(x, q); xb(x)(x.length/3) }
  th.pbench{ val x = bx(ix(e(ei))); ei = (ei+1)%e.length; Arrays.sort(x, q); xb(x)(x.length/3) }
  th.pbench{ val x = bx(ix(f(fi))); fi = (fi+1)%f.length; Arrays.sort(x, q); xb(x)(x.length/3) }
}
```
@Ichoran Ichoran force-pushed the Ichoran:sorting-reimpl branch from d8f396c to 9aae16a Jun 2, 2015
private def booleanSort(a: Array[Boolean]): Unit = {
var i = 0
var n = 0
while (i < a.length) {

This comment has been minimized.

@Arneball

Arneball Jun 2, 2015
Contributor

Wouldn't it be beneficial to pull out a.length in a local variable so that we don't need to do arraylength in every iteration?

This comment has been minimized.

@Ichoran

Ichoran Jun 3, 2015
Author Contributor

Not really. The JVM is very good at optimizing this pattern. It knows it's allowed, as array.length is immutable.

@adriaanm
Copy link
Member

@adriaanm adriaanm commented Jun 16, 2015

LGTM -- @Ichoran, you should probably sign the CLA online at some point :-)

adriaanm added a commit that referenced this pull request Jun 16, 2015
Clean implementation of sorts for scala.util.Sorting.
@adriaanm adriaanm merged commit 43a56fb into scala:2.11.x Jun 16, 2015
4 of 5 checks passed
4 of 5 checks passed
cla @Ichoran, please sign the Scala CLA by clicking on 'Details' -->
Details
integrate-ide [997] SUCCESS. Took 88 min.
Details
validate-main [1329] SUCCESS. Took 105 min.
Details
validate-publish-core [1269] SUCCESS. Took 11 min.
Details
validate-test [824] SUCCESS. Took 50 min.
Details
@paulp
Copy link
Contributor

@paulp paulp commented Jun 19, 2015

Since I left this comment on an "outdated diff", once more for general consumption:

array pattern must respect array invariance

In which you (adriaan) yourself state in the category of things to be done:

implement invariant matching for arrays (including updating the analyses)

@paulp
Copy link
Contributor

@paulp paulp commented Jun 19, 2015

That's a good candidate ticket to be abandoned when you declare jira bankruptcy. Correctness deteriorates over time.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet

6 participants
You can’t perform that action at this time.