Interrupt is broken on JVM 20+ #296

bbatsov · 2023-07-03T06:31:21Z

nREPL relied on Thread.stop to kill the session thread when needed and the function was finally removed from Java in Java 20 (after being deprecated for ages). The current nREPL implementation tries Tread.interrupt first and fallbacks on Thread.stop if that fails.

Any ideas for fixing this on Java 20+ are welcome!

See clojure-emacs/cider#3351 for more details.

The text was updated successfully, but these errors were encountered:

socksy · 2023-07-03T11:22:20Z

For reference, this is how it was solved in jshell: openjdk/jdk#10166

Looks like they override the jdk internal(!) MethodVisitor to add a check to a new static method on every single JVM jump instruction. If a new allStop boolean is true, then it generates a ThreadDeath object and throws it. They then instrument all code to be run in their execution environment with it. Then instead of Thread.stop they set this allStop value to be true, then call Thread.interrupt on the original thread. So basically, reimplementing

Few things about that:

:\
ThreadDeath is explicitly deprecated too, so that doesn't sound like a super long term solution. I'm surprised they didn't use the same mechanism that .interrupt uses, but maybe that's not possible from the bytecode (I'm no JVM expert...)
it's not clear what performance impact this might have. I would be personally surprised if I was running code in the repl and it was dramatically slower in running than the final JAR — I would expect it to mess with my code
depending on internal classes is something jshell can do because it's part of the JDK, but I guess it's not something nREPL should be doing?
perhaps this could be done on a clojure level instead, which would certainly make it easier to reason about, but doesn't help you if you're using java interop and the looping is happening there instead

bbatsov · 2023-07-03T11:47:07Z

@socksy Thanks for sharing this! I'm hoping we'd be able to come up with something simpler and more future-proof, although all my research so far indicates that'd be pretty hard. In essence Thread.interrupt expects that the threads you'd be interrupting will be cooperating, but in our case we have no idea what the user is evaluating in the session thread that we'd like to interrupt...

bsless · 2023-07-03T13:15:38Z

I found an official JDK guide regarding this deprecation. Anything there you could use?

bbatsov · 2023-07-04T07:41:05Z

@bsless I don't see anything that can help us there. As we're running user code on the session eval thread we can't really add code there that would make it interruptible.

danielcompton · 2023-07-04T07:59:13Z

@bbatsov Perhaps it's worth bringing this up on the JDK mailing list to see if they would reconsider removing it, perhaps putting it behind a flag?

bbatsov · 2023-07-04T08:01:31Z

I'm guessing that ship has sailed if there's already a release without it. I've never heard of the JDK team going back on a removal.

socksy · 2023-07-04T12:28:00Z

@bbatsov yeah, .interrupt only works if the user code respects it, which is why jshell is injecting in a boolean check in the jvm bytecode for every looping instruction — would that be possible here?

Alternatively, if you wanted to be (even more) hacky, perhaps we could write a fake Thread.stop library using pthreads directly in a JNI based library? 🙃

bbatsov · 2024-05-07T05:27:18Z

FYI - @alexander-yakushev proposed a potential fix for this in #318

@PEZ and @danielcompton - please, take a look, as the fix will require some client co-operation as it involves loading a JVMTI agent.

PEZ · 2024-05-07T06:32:06Z

Thanks for headsup, @bbatsov!

In the PR you say that CIDER will inject this as jack-in params. Can you share what that would look like. As Calva's jack-in is quite similar, I think we can consider to do whatever CIDER does.

alexander-yakushev · 2024-05-07T06:36:34Z

@PEZ

For tools.deps, this means adding -J-Djdk.attach.allowAttachSelf to the launch command. For Leiningen, it will be something similar to update :jvm-opts.

cichli · 2024-05-07T16:37:23Z

I looked at this issue recently too and came up with a POC that uses the JDWP agent and JDI to forcibly stop threads. The JDWP native lib has a way of stopping threads and is part of the JVM, so using it avoids the need to build our own native code. Unfortunately none of that lib is accessible via JNI (as far as I can tell), so we need to start the JDWP agent and communicate with it via JDI.

Just like the native agent solution, the JVM must be invoked with -Djdk.attach.allowAttachSelf. It also requires the JDWP agent to be started with -agentlib:jdwp=transport=dt_socket,server=y,suspend=n – it can’t be loaded dynamically after VM startup. Fortunately, running the JDWP agent introduces very little overhead.

One big tradeoff is that JDWP only supports a single transport. As it stands now the POC would prevent the use of JDWP debuggers completely. This could be mitigated by either:

Only connecting to the JDWP agent when we need to stop a thread, and then disconnecting afterwards – this would mean a JDWP debugger could still be used, but interrupting REPL evals would not work while said debugger is connected.
Keeping the connection persistently but adding an agent to nREPL that acts as a proxy to the JDWP agent using its own socket – anyone who wants to use a debugger would connect to that instead.

It's a bigger imposition on users than #318 but less code and less build complexity. What do you think @alexander-yakushev @bbatsov?

alexander-yakushev · 2024-05-07T21:41:17Z

Thank you, Michael! If only you had shown your POC earlier, I wouldn't bother with the agent, and now we wouldn't have the trouble of choice 😄.

I like the fact about JDWP approach that we don't have to compile and ship any native code. And I don't like that approach having a runtime impact. If I understand the numbers in the article that you linked correctly, then the runtime overhead from JDWP is ~6% which is significant (not arguing whether it is big or not, but it's not within a measurement error). What also bothers me is that the overhead exists even if the debugger is disabled which means it comes from the capabilities requested by JDWP in its Agent_OnLoad section. Those caps influence how JVM behaves in regard to optimizations, tweak JIT and inliner, so you get a differently behaving environment, even if slightly.

Here's the comparison of two solutions as a table:

	Custom agent	JDWP
Uses non-conventional Java features	Yes	Yes
Requires compiling native code for multiple platforms	Yes	No
Needs new Clojure code to be loaded	Yes, ~50 lines	Yes, ~30 lines
Requires additional Java options	-Djdk.attach.allowAttachSelf	-Djdk.attach.allowAttachSelf -agentlib:jdwp=transport=dt_socket,server=y,suspend=n
Has runtime overhead	None that I'm aware of	Yes
Has impact on VM behavior if not used	No, or minimal	Yes
Has additional restrictions	Yes, only works on platforms that we compiled agent for	Yes, conflicts with Java debugger

bbatsov added Bug Help Wanted Java labels Jul 3, 2023

bbatsov mentioned this issue Jul 3, 2023

cider-interrupt does not stop whatever was being evaluated clojure-emacs/cider#3351

Closed

PEZ mentioned this issue Oct 8, 2023

Inform about that Interrupt Evaluation does not awlays work on JVM 20+ BetterThanTomorrow/calva#2324

Open

alexander-yakushev mentioned this issue May 6, 2024

[session] Introduce custom JVMTI agent to restore Thread.stop() on JDK20+ #318

Open

7 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Interrupt is broken on JVM 20+ #296

Interrupt is broken on JVM 20+ #296

bbatsov commented Jul 3, 2023

socksy commented Jul 3, 2023

bbatsov commented Jul 3, 2023

bsless commented Jul 3, 2023

bbatsov commented Jul 4, 2023

danielcompton commented Jul 4, 2023

bbatsov commented Jul 4, 2023

socksy commented Jul 4, 2023

bbatsov commented May 7, 2024

PEZ commented May 7, 2024

alexander-yakushev commented May 7, 2024 •

edited

cichli commented May 7, 2024

alexander-yakushev commented May 7, 2024

Interrupt is broken on JVM 20+ #296

Interrupt is broken on JVM 20+ #296

Comments

bbatsov commented Jul 3, 2023

socksy commented Jul 3, 2023

bbatsov commented Jul 3, 2023

bsless commented Jul 3, 2023

bbatsov commented Jul 4, 2023

danielcompton commented Jul 4, 2023

bbatsov commented Jul 4, 2023

socksy commented Jul 4, 2023

bbatsov commented May 7, 2024

PEZ commented May 7, 2024

alexander-yakushev commented May 7, 2024 • edited

cichli commented May 7, 2024

alexander-yakushev commented May 7, 2024

alexander-yakushev commented May 7, 2024 •

edited