Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ArC RequestContext - get rid of LazyValue for container lifecycle events #28330

Merged

Conversation

mkouba
Copy link
Contributor

@mkouba mkouba commented Sep 30, 2022

  • to speed up the request context activation/deactivation in majority of cases

@quarkus-bot quarkus-bot bot added the area/arc Issue related to ARC (dependency injection) label Sep 30, 2022
@quarkus-bot

This comment has been minimized.

@mkouba mkouba force-pushed the arc-req-context-get-rid-of-lazyvalues branch from 36739bd to 5c1bef0 Compare October 3, 2022 08:01
@mkouba
Copy link
Contributor Author

mkouba commented Oct 3, 2022

I've also added the RequestContextActivationBenchmark but the results are a bit surprising. I can see a small perf drop in the 2.12 branch, which is expected - we added some trace logging in the RequestContext. However, I can see another unexpected small perf drop in the main branch (i.e. before the commit in this pull request). Moreover, this commit does not help even though we evidently get rid of several volatile read operations.

UPDATE: I did replace the synchronized block with VarHandle.compareAndSet() and the results are much better!

In any case, I think that these changes make the code more readable. So I'd like to merge this anyway.

@mkouba mkouba force-pushed the arc-req-context-get-rid-of-lazyvalues branch from 5c1bef0 to 6e78bda Compare October 3, 2022 11:46
@mkouba mkouba requested a review from Ladicek October 3, 2022 11:52
Copy link
Contributor

@Ladicek Ladicek left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I'd just call it IS_VALID instead of VALUE, but that's nitpicking.

@mkouba mkouba added the triage/waiting-for-ci Ready to merge when CI successfully finishes label Oct 3, 2022
@mkouba
Copy link
Contributor Author

mkouba commented Oct 3, 2022

@franz1981 yet another piece of the performance puzzle :-)

@franz1981
Copy link
Contributor

franz1981 commented Oct 3, 2022

@mkouba

Just a thing

UPDATE: I did replace the synchronized block with VarHandle.compareAndSet() and the results are much better!

Did you tried with an volatile int field to replace the boolean one and AtomicIntegerFieldUpdater to perform the compareAndSet?
Try it if possible:

  • boolean atomic ops shouldn't exists
  • I don't know what var handle does here because of that(!)

@mkouba
Copy link
Contributor Author

mkouba commented Oct 3, 2022

I did notice that AtomicBoolean is using int but I thought that it's way too much... I mean the VarHandle API is not exactly friendly.

boolean atomic ops shouldn't exists

Why is that? Lookup.findVarHandle() does not seem to forbid boolean values.

@franz1981
Copy link
Contributor

franz1981 commented Oct 3, 2022

Why is that? Lookup.findVarHandle() does not seem to forbid boolean values.

Because there are no atomic ops in any instruction sets involving such tiny types in any CPU archs- meaning that something is happening to make it happen...
With int updater we can see if it brings any improvement/change and what happen

@Ladicek
Copy link
Contributor

Ladicek commented Oct 3, 2022

FTR, my OpenJDK 11 includes atomic ops for booleans (as conversions to/from bytes, see java.lang.invoke.VarHandleBooleans and jdk.internal.misc.Unsafe) and the MethodHandles.Lookup.findVarHandle javadoc seems to agree. I'm no expert for sure, but I don't see why this shouldn't work.

@franz1981
Copy link
Contributor

franz1981 commented Oct 3, 2022

I don't see why this shouldn't work.

not saying it isn't work, but I don't know what it does under the hood :)
ie maybe the class field boolean used with var handle become an int (?) - it's something that worth knowing and use the right type if perf is different

@Ladicek
Copy link
Contributor

Ladicek commented Oct 3, 2022

The field doesn't become an int, but the CAS operation is implemented as if there was an int (with bit masking to avoid changing the nearby memory).

I guess it would be fine to make the field int directly. Currently, there's likely some padding after the boolean, so the object size should stay intact, I guess.

@franz1981
Copy link
Contributor

franz1981 commented Oct 3, 2022

but the CAS operation is implemented as if there was an int (with bit masking to avoid changing the nearby memory).

The field must be aligned as int to make it work, while in the field declaration (maybe) here is boolean (and JOL can show how is packed in the class layout), unless is declared volatile (and I didn't checked TBH).
If you have a reference of the assembly or what the JIT does here, would be great (on JDK source code too)

In C++ sadly makes the ops on types that doesn't support the right "grain" to become a locked one (for real), that's why I'm asking

I don't know how RequestContextState's isValid is declared here

@mkouba
Copy link
Contributor Author

mkouba commented Oct 3, 2022

Ok, I can't say I completely understand all the comments above but let's use int. One last question - VarHandle or AtomicIntegerFieldUpdater? It seems that both use Unsafe under the hood...

@Ladicek
Copy link
Contributor

Ladicek commented Oct 3, 2022

My understanding has been that VarHandle is the modern replacement for Atomic*FieldUpdater, but I might easily be wrong about that :-)

@franz1981
Copy link
Contributor

franz1981 commented Oct 3, 2022

I've verified that the ASM produced for AtomicBoolean vs VarHande is exactly the same (for JDK 11), I'll update with more info just for the record ie -> it's fine VarHandle (yay)

And "mystery" solved too:

  • var handle allow to leave boolean field at the byte granulary and just perform compare and set ops using that granularity
  • AtomicInteger has been updated to use VarHandle but still at integer granulary (their asm diff slightly)

Their perf is exactly the same, because moving byte/int register(s) to memory won't make much difference (cache invalidation is still with much higher granularity anyway and hw barriers are way more costy)

hence

The field doesn't become an int, but the CAS operation is implemented as if there was an int (with bit masking to avoid changing the nearby memory).

the CAS is implemented as native byte (I was wrong, x86 cannot do it with bit-level granularity, my faulty memory, sorry) and no masking is happening if not the one implemented on ASM level eg

lock cmpxchg %r8b,0xc(%r12,%r10,8)

vs

lock cmpxchg %r8d,0xc(%r12,%r10,8)

the b vs d letter change which part of the full register to use ie 1 byte vs 4 bytes

@quarkus-bot

This comment has been minimized.

- get rid of LazyValue for container lifecycle events
- get rid of the synchronized block in the destroy() method
- this should speed up the request context a little bit in majority of cases
@mkouba mkouba force-pushed the arc-req-context-get-rid-of-lazyvalues branch from 6e78bda to 88e9c41 Compare October 3, 2022 17:37
@quarkus-bot

This comment has been minimized.

@quarkus-bot
Copy link

quarkus-bot bot commented Oct 4, 2022

Failing Jobs - Building 88e9c41

Status Name Step Failures Logs Raw logs
✔️ JVM Tests - JDK 11
✔️ JVM Tests - JDK 17
JVM Tests - JDK 17 MacOS M1 Set up runner ⚠️ Check → Logs Raw logs
✔️ JVM Tests - JDK 18

@Ladicek
Copy link
Contributor

Ladicek commented Oct 4, 2022

Ah my bad, I only looked at the Java code, and there's apparently an intrinsic for the boolean (or byte) case too. Nice! However, if using AtomicBoolean ends up with the same code (the optimizer can figure out that the identity of the object is never used? Nice!), then that's probably the best.

@mkouba mkouba merged commit 0bcebb3 into quarkusio:main Oct 4, 2022
@quarkus-bot quarkus-bot bot added this to the 2.14 - main milestone Oct 4, 2022
@quarkus-bot quarkus-bot bot removed the triage/waiting-for-ci Ready to merge when CI successfully finishes label Oct 4, 2022
@Sanne
Copy link
Member

Sanne commented Oct 6, 2022

Nice!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/arc Issue related to ARC (dependency injection)
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants