Skip to content

FastThreadLocal's check with virtual threads improvement #15449

@franz1981

Description

@franz1981

#15222 has introduced a way to enable virtual threads (which cannot extends FastThreadLocal) to use many FastThredLocalThread optimizations e.g. adaptive unshared magazines, FastThreadLocal::get, etc etc.

This has introduced a binary search vs the thread's id to verify a thread belong to the known set of registered fast thread local threads.
Since loom enable to create many virtual threads which are not fast thread local thread and we can still have a some which are event loops (see https://micronaut.io/2025/06/30/transitioning-to-virtual-threads-using-the-micronaut-loom-carrier/ and https://github.com/franz1981/Netty-VirtualThread-Scheduler) we would like a O(1) way to detect if a virtual thread doesn't belong to this set (ideally, but we need to measure it - which can be tricky!): in https://github.com/franz1981/netty/tree/4.2_not_fasttl_bench I've implemented a microbench which doesn't stress the branch predictor but still stress the log2(event loops) comparisons for such (negative) test.

I was thinking we could implement such in this way (used in openjdk/jdk#18309):

  • create an array of (fast thread local) Thread's id ordered based on their hashing value
  • store in a long (8 bytes) a bit for each id in the array (hash(id) % 64-th bit position)
  • checking for id presence check first the (hash(id) % 64)-th bit: if there's none, it returns immediately
  • if the bit is set, count the previous other bits (it's a single ASM instruction) to know the position where to start with searching in the array of ids
  • etc etc

This should grant that for decently distributed thread ids and a cheap enough hashing we can have O(1)-ish for both; with slow-path w for loops search (starting from the computed position). Differently from Open addressing with linear probing we cannot just stop searching till the first "empty" is found - since the long id "bloom filter"(ish, again :P ) doesn't allow it (the order of bit set is there to allow the array of ids to be compressed - there are no empty slots, like for hash maps).

wdyt?

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions