- Remove Array.tabulate and .foreach in hot paths, as they require autoboxing.
- Adjustments based on javap output inspection and JMH benchmarks.
```text
                 Java     Master   This
Huffman.Encode   8.57ms   8.68ms   8.27ms
Huffman.Decode   7.37ms   5.64ms   5.62ms
Clock.Encode     185ns    312ns    192ns
Clock.Decode     206ns    231ns    220ns
```
object Encoder:

  def encode(centis: Array[Int], startTime: Int): Array[Byte] =
    if centis.isEmpty then return Array.emptyByteArray
I'm all for optimizing hot paths, but that's not one. It's invoked only once per game. So I think we should use the idiomatic isEmpty, which btw is supposed to be inlined:
@`inline` def isEmpty: Boolean = xs.length == 0

var i = 0
while i < len do
  writeSigned(values(i), writer)
  i += 1
This is not a hot path, it runs once per game. I don't think we should be re-implementing scala's foreach where it can't possibly make a difference.
Once per ply -- and yes, it does make a difference in the OverallEncodingTest.testEncode bench. (260->190ns/op)
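For readers following the disagreement, here is a minimal sketch of the two loop shapes being compared. The names `LoopSketch`, `sumForeach` and `sumWhile` are hypothetical and not from this PR, and summing stands in for the real `writeSigned` call. Whether the two differ measurably depends on the JIT and the benchmark setup, which is the point under dispute here.

```scala
// Hypothetical illustration, not code from this PR.
object LoopSketch:
  // Idiomatic form: the loop body is passed to foreach as a closure.
  def sumForeach(values: Array[Int]): Long =
    var total = 0L
    values.foreach(v => total += v)
    total

  // Hand-written form, the same shape as the while loop in the diff.
  def sumWhile(values: Array[Int]): Long =
    var total = 0L
    var i = 0
    val len = values.length
    while i < len do
      total += values(i)
      i += 1
    total
```

Both functions return the same result for any input, so a JMH benchmark can swap one for the other to measure the difference in isolation.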
b733ef4
ffe3e19 as expected within statistical error
Also I'm rather confused because I just tried benchmarking master 877666f and got this: 877666f Showing similar/better perf than this PR. Using I'm sure this PR should be better, I just don't understand how it can't be measured on my machine.
That's odd -- I get consistently better benchmarks on my machine, most of the changes here I benchmarked individually, with the exception of a couple I did see a separate odd quirk in
- remove @inline annotations which do nothing
- better line spacing for BitMask field
According to my JMH benches, these do matter. idk
object LinearEstimator:

  @inline def encode(dest: Array[Int], startTime: Int): Unit =
since scala3 we have an inline keyword that guarantees inlining https://docs.scala-lang.org/scala3/guides/macros/inline.html#inline-methods
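A small sketch of the distinction, assuming Scala 3. `InlineSketch` and its methods are hypothetical names used only for illustration. The annotation form is a hint with no guarantee, while the `inline` soft keyword expands the method body at each call site at compile time.

```scala
// Hypothetical illustration, not code from this PR.
object InlineSketch:
  // Annotation form: a hint only, with no guarantee the call is inlined.
  @inline def isEmptyHint(xs: Array[Int]): Boolean = xs.length == 0

  // Scala 3 soft keyword: the body is expanded at the call site by the compiler.
  inline def isEmptyInline(xs: Array[Int]): Boolean = xs.length == 0

  // Both forms compute the same answer; only the compilation strategy differs.
  def agree(xs: Array[Int]): Boolean = isEmptyInline(xs) == isEmptyHint(xs)
```

Any runtime effect of switching between the two would still need to be confirmed with the JMH benchmarks discussed in this thread.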
@ornicar will investigate. Definitely not expected.
Major improvement for Clock encode, some small speed up for move encoding as well.