Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Consider using of tzcnt instruction for better performance in foreach loop and iterator of bitsets #11418

plokhotnyuk opened this Issue Feb 27, 2019 · 0 comments


None yet
2 participants
Copy link

plokhotnyuk commented Feb 27, 2019

This instruction has a low latency and throughput on most contemporary CPUs.

Also it is implemented as intrinsic and turned on by default in JDK 8:

$ java -XX:+PrintFlagsFinal -version | grep Trailing
     bool UseCountTrailingZerosInstruction          = true                                {ARCH product}
java version "1.8.0_192"
Java(TM) SE Runtime Environment (build 1.8.0_192-b12)
Java HotSpot(TM) 64-Bit Server VM (build 25.192-b12, mixed mode)

Below is a snippet how it can be used:

  override def foreach[U](f: Int => U): Unit = {
    val n = nwords
    var i = 0
    while (i < n) {
      var w = word(i)
      while (w != 0) {
        f(java.lang.Long.numberOfTrailingZeros(w) + (i << 6))
        w &= w - 1
      i += 1

Inspired by the Daniel Lemire's blog post.

@SethTisue SethTisue added this to the Backlog milestone Feb 27, 2019

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.