Android: LlmModule Closeable lifecycle and ModelRunner crash fix #19012
Android: LlmModule Closeable lifecycle and ModelRunner crash fix #19012psiddh merged 10 commits intopytorch:mainfrom
Conversation
LlmModule: Add ReentrantReadWriteLock to prevent use-after-free when close() races with generate(). generate()/prefill()/load() acquire the read lock; close() acquires the write lock after calling stop(). The fair lock ensures close() drains in-flight operations. stop() remains lock-free for cross-thread interrupt capability. LlmModule now implements Closeable. resetNative() is deprecated and delegates to close(). ModelRunner: Add early return after loadMethod() failure to prevent cascading crash in warmup/inference loops. This commit was authored with the help of Claude.
Add 3 LlmModule lifecycle tests: - testUseAfterCloseThrows: generate() after close() throws IllegalStateException - testStopAfterCloseIsNoOp: stop() after close() is safe - testCloseIsIdempotent: double close() is safe Fix ModelRunner: call module.destroy() on both error and success paths to prevent native memory leaks. This commit was authored with the help of Claude.
🔗 Helpful Links🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19012
Note: Links to docs will display an error until the docs builds have been completed. ❌ 1 New Failure, 2 Pending, 2 Unrelated FailuresAs of commit 078f425 with merge base 66e4656 ( NEW FAILURE - The following job has failed:
BROKEN TRUNK - The following jobs failed but were present on the merge base:👉 Rebase onto the `viable/strict` branch to avoid these failures
This comment was automatically generated by Dr. CI and updates every 15 minutes. |
This PR needs a
|
There was a problem hiding this comment.
Pull request overview
This PR focuses on preventing lifecycle-related crashes in the Android benchmarking and LLM extension codepaths by tightening native resource cleanup and aligning JNI naming with the Java wrapper pattern.
Changes:
- Ensure benchmark modules are destroyed on both success and load-failure paths.
- Update LlmModule lifecycle handling (introduce
close(), add locking, and wrap native calls with closed-state checks). - Rename the LLM JNI stop binding to
stopNativeand add instrumentation tests covering close/stop behavior.
Reviewed changes
Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.
| File | Description |
|---|---|
extension/benchmark/android/benchmark/app/src/main/java/org/pytorch/minibench/ModelRunner.java |
Adds early-return metrics on load failure and ensures module.destroy() is called on success/failure paths. |
extension/android/jni/jni_layer_llama.cpp |
Renames JNI registration from stop to stopNative to match the Java wrapper API. |
extension/android/executorch_android/src/main/java/org/pytorch/executorch/extension/llm/LlmModule.java |
Introduces Closeable, adds read/write locking and closed-state checks, and splits stop() vs stopNative(). |
extension/android/executorch_android/src/androidTest/java/org/pytorch/executorch/LlmModuleInstrumentationTest.kt |
Adds lifecycle instrumentation tests for use-after-close, stop-after-close, and idempotent close. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
|
Regarding concurrent operations, we don't want to allow concurrent load, generate, or prefill, right? The runtime itself is not thread safe for concurrent execution of the same method, and there is shared KV cache state. |
Remove ReentrantReadWriteLock — the native runtime is not thread-safe for concurrent execution, so the lock gives a false sense of safety without changing the caller's obligations. The caller contract is: call stop(), wait for generate() to return, then close(). Plain boolean mDestroyed + checkNotDestroyed() catches same-thread programmer errors with a clean IllegalStateException. stop() uses the native atomic flag for cross-thread interrupt — that is the intended mechanism. This commit was authored with the help of Claude.
Fair point — removed the ReentrantReadWriteLock. The native runtime isn't thread-safe for concurrent execution, so Open to suggestions. |
Ensures module.destroy() runs even if forward() or etdump() throws. This commit was authored with the help of Claude.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
volatile is needed because stop() is callable from any thread and reads
mDestroyed — without it, a cross-thread close() write may not be visible.
Also fix ambiguous {@link #generate} to reference a specific overload.
This commit was authored with the help of Claude.
The caller contract requires single-threaded access: call stop(), wait for generate() to return, then close(). In that sequence stop() always runs before close(), so cross-thread visibility of mDestroyed is not needed. Plain boolean is sufficient for same-thread programmer error detection. This commit was authored with the help of Claude.
stop() does not need a Java wrapper or mDestroyed guard. The cross-thread safety is handled entirely by the native atomic flag in the C++ runner. Single-threaded lifecycle contract means callers coordinate stop() before close() — same as every other method. This commit was authored with the help of Claude.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Approving to unblock. Thanks for adding the close support - looks good.
Regarding locks, I think it might be worth keeping the lock, just changing to a single lock (as opposed to a RW lock). We do this in the non-LLM module, and it catches a lot of accidental unsafe concurrency.
It's very easy for users to accidentally violate thread safety when calling inference from app code. My other comment on locks was mainly that we probably want to avoid allowing the concurrency that RW lock allowed.
| * @param llmCallback callback object to receive results. | ||
| */ | ||
| public void generate(String prompt, LlmCallback llmCallback) { | ||
| checkNotDestroyed(); |
There was a problem hiding this comment.
Did you intend to remove these checks? From the PR description, it sounds like the intent was to catch use after close.
Serializes access to non-thread-safe native state. Prevents accidental concurrent calls (e.g. double-tap, rotation during generation) from causing native crashes — second caller blocks until the first finishes. - generate()/prefill()/load()/resetContext() use lock() - close() uses tryLock() — fails fast, no ANR risk - stop() stays lock-free (native atomic flag for cross-thread interrupt) This commit was authored with the help of Claude.
3634913 to
0f67b86
Compare
Thanks the intent of adding locks is to ensure that the app is safe from crashes, not concurrency point of view (as underlying native runtime is not thread-safe), but with all the error propagation now in place, hopefully it surfaces the right context / error scenario to the app, (This will also serve App developer all the error context, so that they can triage and correct their flow) |
This commit was authored with the help of Claude.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
If close() is called from an LlmCallback during generate() (same thread, re-entrant lock acquisition), detect via getHoldCount() > 1 and throw instead of freeing native resources mid-call. This commit was authored with the help of Claude.
Summary
Closeableso modules can be used with try-with-resources. Addclose()with explicitcaller contract. Deprecate
resetNative()in favor ofclose(). AddReentrantLockto serialize access tonon-thread-safe native state, matching
Module.java's pattern.loadMethod()failure to prevent cascading crash in warmup/inferenceloops. Fix native memory leak by wrapping benchmark body in try/finally for
module.destroy().LlmModule lifecycle design
ReentrantLockserializes all operations — prevents accidental concurrent access (e.g. double-tap, rotationduring generation) from causing native crashes
generate()/prefill*()/load()/resetContext()acquire the lockclose()usestryLock()— fails fast withIllegalStateExceptioninstead of blocking (no ANR risk)stop()remains a bare native method — lock-free, uses C++ atomic flag for cross-thread interruptstop(), wait forgenerate()to return, thenclose()Tests added
testUseAfterCloseThrows:generate()afterclose()throwsIllegalStateExceptiontestCloseIsIdempotent: doubleclose()is safeTest plan
LlmModuleInstrumentationTesttests passThis PR was authored with the help of Claude.