Skip to content

Memoize KSTypeImpl.hashCode()#2896

Open
shmuelr wants to merge 1 commit into
google:mainfrom
shmuelr:ksptype-hashcode
Open

Memoize KSTypeImpl.hashCode()#2896
shmuelr wants to merge 1 commit into
google:mainfrom
shmuelr:ksptype-hashcode

Conversation

@shmuelr
Copy link
Copy Markdown

@shmuelr shmuelr commented Apr 27, 2026

Cache KSTypeImpl.hashCode() to avoid reopening an Analysis API session on every call.

hashCode() previously called type.fullyExpand().hashCode() which opens a fresh analyze { } session per invocation. Because KSTypeImpl is used as a HashMap key in hot paths
(propertyAsMemberOfCache, functionAsMemberOfCache), this fires hundreds of thousands of times per KSP task.

Fix: private val cachedHashCode: Int by lazy { type.fullyExpand().hashCode() }

Benchmark

Integration bench: full KotlinSymbolProcessing.execute() over 40 test classes (generics, type aliases, inheritance chains), with the processor calling .hashCode() 1.2M times per run in a tight loop. 5 warmup iterations, 8 stable measurement iterations, fresh JVM per variant.

Variant Median Delta
No cache (baseline) ~400 ms
by lazy { } ~130 ms -68%

Consistent across 8 independent runs in varied execution order. The three cached strategies tested (two-field primitive, lazy(NONE), lazy {}) were indistinguishable at wall-clock, within ±15ms of each other across all runs.

Fixes #2576

@shmuelr shmuelr marked this pull request as ready for review April 27, 2026 19:41
@jaschdoc
Copy link
Copy Markdown
Collaborator

jaschdoc commented May 1, 2026

Hi @shmuelr, thank you for the pull request! Do you have any data that shows this is implementation is preferable to just wrapping it in lazy val? If not, let's start with that, since it's a bit simpler and then we can always move to a in-depth implementation if the numbers show that. Cheers! :)

@shmuelr
Copy link
Copy Markdown
Author

shmuelr commented May 1, 2026

Hi @shmuelr, thank you for the pull request! Do you have any data that shows this is implementation is preferable to just wrapping it in lazy val? If not, let's start with that, since it's a bit simpler and then we can always move to a in-depth implementation if the numbers show that. Cheers! :)

Thanks for the review @jaschdoc!

I mirrored the Java String.hashcode and avoided Kotlin's Lazy to avoid the Volatile field & Synchronized block in LazyJVM.kt, given that the hashcode is idempotent it felt overkill to have thread safety here.

I'm happy to benchmark this, what's the best way to go about that? (I've been testing this in a primary project I'm working on, but it's a huge project and susceptible to noise)

@jaschdoc
Copy link
Copy Markdown
Collaborator

jaschdoc commented May 5, 2026

Aha, I think lazy is the best way to go if we don't know the impact yet. In terms of benchmarking, there's not really a good answer there. This is also something we are evaluating, so I cannot help you with that. But any performance numbers you can provide in a reproducible way would be a great help. I know it might be a big ask, but performance improvements are always tricky if they cannot be measured.

I will leave the PR open and come back if I figure out a way to measure the impact of this in a stable way.

@shmuelr
Copy link
Copy Markdown
Author

shmuelr commented May 10, 2026

I took a pass at benchmarking this, I ran the KSP test suite (along with an added 1000 iteration loop) and measuring the timing of the calls (3 warmup passes, then 10 measurement passes)

The raw measurements of the hashcode showed the string.hashcode pattern (with the int field) to be 5x faster then lazy(NONE), but this was nanoseconds difference on my machine Macbook M2 pro (0.39 vs 2.07 ns/call). (this makes sense to me, the direct int usage vs the calls to UnsafeLazyImpl.value + Int? boxing).

But, in the end this only accounted for 1% difference in the hashcode bechmarking speed. The string pattern was 38% faster than the current hashcode calculations, and lazy(None) was 37% faster.

I'm open to changing this to Lazy(None) { } if you want, personally I'd rather get the extra performance but I do understand there is more cognitive load & maintenance for this code now.

@jaschdoc
Copy link
Copy Markdown
Collaborator

Thanks for the measurements! Could you clarify with code snippets exactly what each implementation consists of? It's also unclear to me if you benchmarked the baseline / current implementation.

@shmuelr
Copy link
Copy Markdown
Author

shmuelr commented May 11, 2026

I ran a test locally that sets up these types and calls hashcode in a loop. I pulled the latest codebase as the baseline and compared to 3 variants

  1. Baseline
  2. The String.hashcode pattern (this pr) which uses the int field to store the hash
  3. A Lazy.None approach, 47f1929 (this is what I assumed you meant by using a lazy val)

Hammering it now a bunch, 2 & 3 seem to be neck & neck (~1-3% variance) with the winner alternating.
I think I'm just gonna go with the lazy val, it doesn't seem like theres a material difference in practice

@jaschdoc
Copy link
Copy Markdown
Collaborator

jaschdoc commented May 11, 2026

Cool. What is the performance of the default constructor for lazy? I.e.,

private val cachedHashCode: Int by lazy {
  // fully expanded type etc...
}

@shmuelr
Copy link
Copy Markdown
Author

shmuelr commented May 11, 2026

It's not super clear to me, the initial runs showed a ~10% slowdown with lazy {} vs lazy(None) {}, but the more I run this the more the gap closes (feels like JVM hotspot optimizations are helping here in the test loops)

@jaschdoc
Copy link
Copy Markdown
Collaborator

I see. Let's go with the lazy block with the default constructor for now. It can be really tricky to benchmark the performance of JVM programs without a proper setup exactly due to JIT compilation. This is still something we can improve in the future when there's better metrics for performance. Did you have numbers for how it compares to the baseline? A table with each implementation, it's min, mean, and max timings (after warmup) as well as a relative improvement over the baseline (three columns total) would be really helpful!

@shmuelr
Copy link
Copy Markdown
Author

shmuelr commented May 14, 2026

I see. Let's go with the lazy block with the default constructor for now. It can be really tricky to benchmark the performance of JVM programs without a proper setup exactly due to JIT compilation. This is still something we can improve in the future when there's better metrics for performance. Did you have numbers for how it compares to the baseline? A table with each implementation, it's min, mean, and max timings (after warmup) as well as a relative improvement over the baseline (three columns total) would be really helpful!

Sgtm, updated code & pr description, thank you for the review!

Copy link
Copy Markdown
Collaborator

@jaschdoc jaschdoc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the patch! Could I ask you to squash the changes into a single commit?

@shmuelr shmuelr force-pushed the ksptype-hashcode branch from 77cd68e to 614a981 Compare May 19, 2026 13:36
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[KSP2][Performance]: Memoize KSTypeImpl.fullyExpand() or KSTypeImpl.hashCode()?

2 participants