New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Champ HashMap/Set filter/filterNot methods should be optimized to not perform any hashes or equality of keys #11143
Labels
Milestone
Comments
Also, that test should be a JUnit or scalacheck test, not a partest I reckon. |
joshlemer
changed the title
Champ HashMap filter/filterNot methods should be optimized to not perform any hashes
Champ HashMap filter/filterNot methods should be optimized to not perform any hashes or equality of keys
Sep 12, 2018
Actually I'll post the test here for future records, and remove the old tests anyways import scala.collection.immutable.{OldHashMap => HashMap}
object Test extends App {
case class Collision(value: Int) extends Ordered[Collision] {
def compare(that: Collision) = value compare that.value
override def hashCode = value / 5
}
def testCorrectness[T: Ordering](n: Int, mkKey: Int => T): Unit = {
val o = implicitly[Ordering[T]]
val s = HashMap.empty[T, Unit] ++ (0 until n).map(x => mkKey(x) -> (()))
for (i <- 0 until n) {
val ki = mkKey(i)
val a = s.filter(kv => o.lt(kv._1, ki))
val b = s.filterNot(kv => o.lt(kv._1, ki))
require(a.size == i && (0 until i).forall(i => a.contains(mkKey(i))))
require(b.size == n - i && (i until n).forall(i => b.contains(mkKey(i))))
}
}
// this tests the structural sharing of the new filter
// I could not come up with a simple test that tests structural sharing when only parts are reused, but
// at least this fails with the old and passes with the new implementation
def testSharing(): Unit = {
val s = HashMap.empty[Int, Unit] ++ (0 until 100).map(_ -> (()))
require(s.filter(_ => true) eq s)
require(s.filterNot(_ => false) eq s)
}
// this tests that neither hashCode nor equals are called during filter
def testNoHashing(): Unit = {
var hashCount = 0
var equalsCount = 0
case class HashCounter(value: Int) extends Ordered[HashCounter] {
def compare(that: HashCounter) = value compare that.value
override def hashCode = {
hashCount += 1
value
}
override def equals(that: Any) = {
equalsCount += 1
that match {
case HashCounter(value) => this.value == value
case _ => false
}
}
}
val s = HashMap.empty[HashCounter, Unit] ++ (0 until 100).map(k => HashCounter(k) -> (()))
val hashCount0 = hashCount
val equalsCount0 = equalsCount
val t = s.filter(_._1 < HashCounter(50))
require(hashCount == hashCount0)
require(equalsCount == equalsCount0)
}
// this tests correctness of filter and filterNot for integer keys
testCorrectness[Int](100, identity _)
// this tests correctness of filter and filterNot for keys with lots of collisions
// this is necessary because usually collisions are rare so the collision-related code is not thoroughly tested
testCorrectness[Collision](100, Collision.apply _)
testSharing()
testNoHashing()
} |
Ditto for HashSet import scala.collection.immutable.{OldHashSet => HashSet}
object Test extends App {
case class Collision(value: Int) extends Ordered[Collision] {
def compare(that:Collision) = value compare that.value
override def hashCode = value / 5
}
def testCorrectness[T : Ordering](n: Int, mkKey: Int => T): Unit = {
val o = implicitly[Ordering[T]]
val s = HashSet.empty[T] ++ (0 until n).map(mkKey)
for (i <- 0 until n) {
val ki = mkKey(i)
val a = s.filter(o.lt(_,ki))
val b = s.filterNot(o.lt(_,ki))
require(a.size == i && (0 until i).forall(i => a.contains(mkKey(i))))
require(b.size == n - i && (i until n).forall(i => b.contains(mkKey(i))))
}
}
// this tests the structural sharing of the new filter
// I could not come up with a simple test that tests structural sharing when only parts are reused, but
// at least this fails with the old and passes with the new implementation
def testSharing(): Unit = {
val s = HashSet.empty[Int] ++ (0 until 100)
require(s.filter(_ => true) eq s)
require(s.filterNot(_ => false) eq s)
}
// this tests that neither hashCode nor equals are called during filter
def testNoHashing(): Unit = {
var hashCount = 0
var equalsCount = 0
case class HashCounter(value:Int) extends Ordered[HashCounter] {
def compare(that:HashCounter) = value compare that.value
override def hashCode = {
hashCount += 1
value
}
override def equals(that:Any) = {
equalsCount += 1
that match {
case HashCounter(value) => this.value == value
case _ => false
}
}
}
val s = HashSet.empty[HashCounter] ++ (0 until 100).map(HashCounter)
val hashCount0 = hashCount
val equalsCount0 = equalsCount
val t = s.filter(_<HashCounter(50))
require(hashCount == hashCount0)
require(equalsCount == equalsCount0)
}
// this tests correctness of filter and filterNot for integer keys
testCorrectness[Int](100, identity _)
// this tests correctness of filter and filterNot for keys with lots of collisions
// this is necessary because usually collisions are rare so the collision-related code is not thoroughly tested
testCorrectness[Collision](100, Collision.apply _)
testSharing()
testNoHashing()
} |
joshlemer
changed the title
Champ HashMap filter/filterNot methods should be optimized to not perform any hashes or equality of keys
Champ HashMap/Set filter/filterNot methods should be optimized to not perform any hashes or equality of keys
Sep 13, 2018
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
there is currently a test for this for the
OldHashMap
implementation, found here. This requirement should probably be brought forward to the new hash collections if possible (Edit: it's definitely possible!)The text was updated successfully, but these errors were encountered: