
Unsafe convergent cache for traces #1101

Closed
RaasAhsan wants to merge 4 commits into typelevel:series/2.x from RaasAhsan:feature/unsafe-trace-cache

Conversation


@RaasAhsan RaasAhsan commented Aug 19, 2020

This should be the last PR to have tracing ready for 2.2.0. The current version of tracing relies on ConcurrentHashMap to keep a cache of traces. As described in #1076, ConcurrentHashMap offers thread safety at the cost of synchronization via read barriers on the hot read path. Because the cache will eventually converge to an optimal set, it's OK if we don't have safe publication (different threads may not see the same cache), as long as the objects we do see are all safely initialized. Safe initialization means that the values an object initializes in its constructor are visible to any thread that observes a shared reference to the object. This property is implicitly guaranteed by the usual synchronization primitives that offer thread safety.

The Java memory model offers another method of achieving safe initialization that is much cheaper than synchronization: final fields. The semantics of final fields dictate that as long as `this` doesn't escape during the constructor of an object, and its final fields are set before the constructor completes, then any thread that observes a shared reference to the initialized object will observe fully initialized values for its final fields, and that guarantee extends transitively.

We leverage these semantics to build a thread-unsafe, convergent cache that achieves a higher read throughput than ConcurrentHashMap.
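To make the idea concrete, here is a minimal sketch of the technique (names like `ConvergentCache` and `Node` are invented for illustration, not the PR's actual code): entries are immutable nodes whose fields are final, the backing array is a plain non-volatile field, and writers simply overwrite slots. A racing reader may miss a recent write, but any node it does observe is fully initialized.

```java
// Hypothetical sketch of an unsafe convergent cache.
// Thread-unsafe by design: no volatile, no locks, no CAS on the read path.
final class ConvergentCache<K, V> {

    // Immutable entry: final fields give safe initialization under the JMM,
    // provided `this` does not escape the constructor.
    static final class Node<K, V> {
        final K key;
        final V value;
        Node(K key, V value) { this.key = key; this.value = value; }
    }

    private Node<K, V>[] table; // deliberately plain, not volatile
    private final int mask;

    @SuppressWarnings("unchecked")
    ConvergentCache(int size) { // size must be a power of two
        this.table = (Node<K, V>[]) new Node[size];
        this.mask = size - 1;
    }

    V get(K key) {
        // Plain read: may observe a stale or empty slot, but never a
        // partially constructed Node, thanks to final-field semantics.
        Node<K, V> n = table[key.hashCode() & mask];
        return (n != null && n.key.equals(key)) ? n.value : null;
    }

    void put(K key, V value) {
        // Plain write: last writer wins; a collision simply overwrites.
        table[key.hashCode() & mask] = new Node<>(key, value);
    }
}
```

Because readers tolerate stale views, different threads may transiently see different cache contents, but all threads converge on the same entries as writes propagate, which is exactly the trade-off the description above relies on.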

TODO:

  • Tests
  • Benchmarks
  • Detailed documentation about the memory model guarantees we are leveraging here
  • Collisions beneath mask
  • Max buffer sizes
  • Rewrite the implementation in Scala, verifying field modifiers

Closes #1076.

```java
    return this.array[hash];
}

public Buffer grow() {
}
```

@RaasAhsan (Author) commented on this diff:

TODO: pass the key value pair that caused the original collision here to insert

```java
public Node<K, V> put(K k, V v) {
    int hash = k.hashCode() & this.mask;
```

@RaasAhsan (Author) commented on this diff:

CHM and HM have another way of generating the hash, will look into that
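For reference, the bit-mixing step that HashMap applies before masking can be sketched as below (ConcurrentHashMap's `spread` does the same XOR and additionally clears the sign bit); the class name `HashMix` is invented here for illustration:

```java
public class HashMix {
    // XOR the high 16 bits into the low 16 bits so that a power-of-two
    // mask does not throw away the upper half of the hash. This mirrors
    // java.util.HashMap.hash(Object).
    static int spread(int h) {
        return h ^ (h >>> 16);
    }

    public static void main(String[] args) {
        int mask = 15; // table of size 16
        // Two hashes that differ only in the high bits collide under plain
        // masking, but spread() separates them.
        System.out.println((spread(0x10000) & mask) + " " + (spread(0x20000) & mask));
    }
}
```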

@retronym

Storing strong references to the lambda classes as the keys of the frame cache smells like a potential classloader leak when the lambda class is loaded from a classloader with a shorter lifecycle than cats-effect's classloader.

java.lang.ClassValue is designed to avoid such leaks. It's also designed to be extremely fast for lookups.
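For readers unfamiliar with it, a minimal `ClassValue` sketch looks like this (class and field names invented for illustration); the JVM computes the value at most once per class and caches it in a structure that does not keep the class, or its loader, alive:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class ClassValueDemo {
    static final AtomicInteger computations = new AtomicInteger();

    // computeValue runs lazily, at most once per Class (modulo races,
    // in which case one winner is kept); subsequent get() calls hit
    // the JVM-managed cache.
    static final ClassValue<String> NAME_CACHE = new ClassValue<String>() {
        @Override
        protected String computeValue(Class<?> type) {
            computations.incrementAndGet();
            return type.getSimpleName();
        }
    };

    public static void main(String[] args) {
        String first = NAME_CACHE.get(String.class);
        String second = NAME_CACHE.get(String.class); // cached, no recompute
        System.out.println(first + " " + second + " " + computations.get());
    }
}
```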

@djspiewak
Member

> java.lang.ClassValue is designed to avoid such leaks. It's also designed to be extremely fast for lookups.

TIL!

@RaasAhsan
Author

> java.lang.ClassValue is designed to avoid such leaks. It's also designed to be extremely fast for lookups.

Didn't know that either, thanks! We've also noticed that Class.hashCode is a pretty costly operation, would be awesome if this could make that cost cheaper

@retronym

> Didn't know that either, thanks! We've also noticed that Class.hashCode is a pretty costly operation, would be awesome if this could make that cost cheaper

That doesn't sound right to me: it statically resolves to Object.hashCode, which the JIT treats as an intrinsic, and it amounts to a read of the identity hash code, which is computed once and stored in the object header. Profilers can sometimes give misleading answers here due to the safepoint-bias problem; try async-profiler and/or Java Flight Recorder with -XX:+DebugNonSafepoints. It's also a good idea to set up a microbenchmark in JMH to investigate these questions (JMH now integrates with async-profiler and JFR via -prof async and -prof jfr).

@RaasAhsan
Author

I'll take a look at those! We did a bit of digging and arrived at the same conclusion about the hashCode being cached in the object header. The aforementioned cost was relative to its absence. I have a JMH benchmark sitting around for a simpler version of what's in this PR, and keying the array by Class.hashCode as opposed to arbitrary integers lowered throughput by 50%, which I thought was odd too

@RaasAhsan
Author

I forgot to mention, the likely implementation for this cache actually won't retain any class references. We're acknowledging that tracing is an imprecise approximation, so in order to speed up reads, we'll only keep one value per array index. This means that we're effectively ignoring collisions, and to lower collision probability, we can grow the size of the array.
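A key-free, one-slot-per-index cache as described above might look like the following sketch (the name `SlotCache` is invented here; this is not the PR's code). Only the value is stored per slot, so no Class reference is retained, and a colliding write silently evicts the previous occupant, which is acceptable because tracing is an approximation anyway:

```java
// Hypothetical sketch: a best-effort cache keyed only by an int hash.
final class SlotCache<V> {
    private final Object[] slots;
    private final int mask;

    SlotCache(int size) { // size must be a power of two
        this.slots = new Object[size];
        this.mask = size - 1;
    }

    @SuppressWarnings("unchecked")
    V get(int hash) {
        // May return a value cached for a different key after a collision;
        // growing the array lowers the collision probability.
        return (V) slots[hash & mask];
    }

    void put(int hash, V value) {
        slots[hash & mask] = value; // collisions overwrite, never chain
    }
}
```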

@RaasAhsan
Author

Going to do this on CE3

@RaasAhsan RaasAhsan closed this Feb 8, 2021


Development

Successfully merging this pull request may close these issues.

Replace CHM in tracing cache with thread unsafe hash table

3 participants