
Android: LlmModule Closeable lifecycle and ModelRunner crash fix #19012

Merged
psiddh merged 10 commits into pytorch:main from psiddh:followup-crash-prevention
Apr 21, 2026

Conversation

@psiddh
Contributor

@psiddh psiddh commented Apr 20, 2026

Summary

  • LlmModule: Implement Closeable so modules can be used with try-with-resources. Add close() with explicit
    caller contract. Deprecate resetNative() in favor of close(). Add ReentrantLock to serialize access to
    non-thread-safe native state, matching Module.java's pattern.
  • ModelRunner: Add early return after loadMethod() failure to prevent cascading crash in warmup/inference
    loops. Fix native memory leak by wrapping benchmark body in try/finally for module.destroy().
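
A caller-side sketch of the Closeable contract from the first bullet. The paths, constructor arguments, and callback are placeholders rather than the exact API, and close() is assumed not to declare a checked exception.

```java
import org.pytorch.executorch.extension.llm.LlmCallback;
import org.pytorch.executorch.extension.llm.LlmModule;

// Usage sketch only: constructor arguments and the callback are placeholders,
// and close() is assumed not to declare a checked exception.
final class LlmModuleUsageSketch {
  static void runOnce(String modelPath, String tokenizerPath, LlmCallback callback) {
    try (LlmModule module = new LlmModule(modelPath, tokenizerPath, 0.8f)) {
      module.load();
      module.generate("Hello", callback);
    }
    // close() has run here; any further call such as generate() now throws
    // IllegalStateException instead of touching freed native state.
  }
}
```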

LlmModule lifecycle design

  • ReentrantLock serializes all operations — prevents accidental concurrent access (e.g. double-tap, rotation
    during generation) from causing native crashes
  • generate()/prefill*()/load()/resetContext() acquire the lock
  • close() uses tryLock() — fails fast with IllegalStateException instead of blocking (no ANR risk)
  • stop() remains a bare native method — lock-free, uses C++ atomic flag for cross-thread interrupt
  • Caller contract: call stop(), wait for generate() to return, then close()
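
A condensed sketch of the locking pattern described in the list above, under the stated caller contract. The field names and the *Native() hooks are illustrative, not the exact LlmModule internals.

```java
import java.io.Closeable;
import java.util.concurrent.locks.ReentrantLock;

// Sketch of the lock/close/stop split; names are illustrative.
abstract class LlmModuleLockingSketch implements Closeable {
  private final ReentrantLock mLock = new ReentrantLock();
  private boolean mClosed = false;

  public int generate(String prompt) {
    mLock.lock(); // load()/prefill*()/resetContext() take the same lock
    try {
      if (mClosed) {
        throw new IllegalStateException("LlmModule is closed");
      }
      return generateNative(prompt);
    } finally {
      mLock.unlock();
    }
  }

  @Override
  public void close() {
    // tryLock(): fail fast with IllegalStateException rather than block the
    // calling thread (possibly the UI thread) behind an in-flight generate().
    if (!mLock.tryLock()) {
      throw new IllegalStateException("close() while another operation is in flight");
    }
    try {
      if (!mClosed) {
        mClosed = true;
        destroyNative();
      }
    } finally {
      mLock.unlock();
    }
  }

  // stop() stays a bare native method: it flips a C++ atomic flag, so it can
  // interrupt generate() from another thread without taking the lock.
  public abstract void stop();

  abstract int generateNative(String prompt);
  abstract void destroyNative();
}
```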

Tests added

  • testUseAfterCloseThrows: generate() after close() throws IllegalStateException
  • testCloseIsIdempotent: double close() is safe
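
The actual tests are Kotlin instrumentation tests in LlmModuleInstrumentationTest.kt that load real model assets; the following Java sketch only illustrates what they assert. Paths, the constructor signature, and the null callback are placeholders.

```java
import static org.junit.Assert.assertThrows;

import org.junit.Test;
import org.pytorch.executorch.extension.llm.LlmCallback;
import org.pytorch.executorch.extension.llm.LlmModule;

// Illustrative sketch of the two lifecycle tests; signatures are assumptions.
public class LlmModuleLifecycleSketchTest {
  private static final String MODEL = "/data/local/tmp/llm/model.pte";          // placeholder
  private static final String TOKENIZER = "/data/local/tmp/llm/tokenizer.bin";  // placeholder

  @Test
  public void useAfterCloseThrows() {
    LlmModule module = new LlmModule(MODEL, TOKENIZER, 0.8f);
    module.close();
    // The closed-state check fires before any native call is attempted.
    assertThrows(IllegalStateException.class, () -> module.generate("hi", (LlmCallback) null));
  }

  @Test
  public void closeIsIdempotent() {
    LlmModule module = new LlmModule(MODEL, TOKENIZER, 0.8f);
    module.close();
    module.close(); // the second close() must be a silent no-op
  }
}
```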

Test plan

  • Existing LlmModuleInstrumentationTest tests pass
  • New lifecycle tests pass
  • ModelRunner benchmark handles load failures gracefully

This PR was authored with the help of Claude.

psiddh added 2 commits April 20, 2026 13:26
LlmModule: Add ReentrantReadWriteLock to prevent use-after-free when
close() races with generate(). generate()/prefill()/load() acquire the
read lock; close() acquires the write lock after calling stop(). The
fair lock ensures close() drains in-flight operations. stop() remains
lock-free for cross-thread interrupt capability.

LlmModule now implements Closeable. resetNative() is deprecated and
delegates to close().

ModelRunner: Add early return after loadMethod() failure to prevent
cascading crash in warmup/inference loops.

This commit was authored with the help of Claude.
Add 3 LlmModule lifecycle tests:
- testUseAfterCloseThrows: generate() after close() throws IllegalStateException
- testStopAfterCloseIsNoOp: stop() after close() is safe
- testCloseIsIdempotent: double close() is safe

Fix ModelRunner: call module.destroy() on both error and success paths
to prevent native memory leaks.

This commit was authored with the help of Claude.
Copilot AI review requested due to automatic review settings April 20, 2026 22:28
@pytorch-bot

pytorch-bot Bot commented Apr 20, 2026

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/19012

Note: Links to docs will display an error until the docs builds have been completed.

❌ 1 New Failure, 2 Pending, 2 Unrelated Failures

As of commit 078f425 with merge base 66e4656:

NEW FAILURE - The following job has failed:

BROKEN TRUNK - The following jobs failed but were present on the merge base:

👉 Rebase onto the `viable/strict` branch to avoid these failures

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@meta-cla meta-cla Bot added the CLA Signed This label is managed by the Facebook bot. Authors need to sign the CLA before a PR can be reviewed. label Apr 20, 2026
@github-actions

This PR needs a release notes: label

If your change should be included in the release notes (i.e. would users of this library care about this change?), please use a label starting with release notes:. This helps us keep track and include your important work in the next release notes.

To add a label, you can comment to pytorchbot, for example
@pytorchbot label "release notes: none"

For more information, see
https://github.com/pytorch/pytorch/wiki/PyTorch-AutoLabel-Bot#why-categorize-for-release-notes-and-how-does-it-work.

@psiddh psiddh changed the title from "Followup crash prevention" to "Android: LlmModule thread safety (Closeable + ReadWriteLock) and ModelRunner crash fix" on Apr 20, 2026
Contributor

Copilot AI left a comment


Pull request overview

This PR focuses on preventing lifecycle-related crashes in the Android benchmarking and LLM extension codepaths by tightening native resource cleanup and aligning JNI naming with the Java wrapper pattern.

Changes:

  • Ensure benchmark modules are destroyed on both success and load-failure paths.
  • Update LlmModule lifecycle handling (introduce close(), add locking, and wrap native calls with closed-state checks).
  • Rename the LLM JNI stop binding to stopNative and add instrumentation tests covering close/stop behavior.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 3 comments.

File descriptions:
  • extension/benchmark/android/benchmark/app/src/main/java/org/pytorch/minibench/ModelRunner.java: Adds early-return metrics on load failure and ensures module.destroy() is called on both success and failure paths.
  • extension/android/jni/jni_layer_llama.cpp: Renames the JNI registration from stop to stopNative to match the Java wrapper API.
  • extension/android/executorch_android/src/main/java/org/pytorch/executorch/extension/llm/LlmModule.java: Introduces Closeable, adds read/write locking and closed-state checks, and splits stop() vs stopNative().
  • extension/android/executorch_android/src/androidTest/java/org/pytorch/executorch/LlmModuleInstrumentationTest.kt: Adds lifecycle instrumentation tests for use-after-close, stop-after-close, and idempotent close.


@GregoryComer
Member

Regarding concurrent operations, we don't want to allow concurrent load, generate, or prefill, right? The runtime itself is not thread safe for concurrent execution of the same method, and there is shared KV cache state.

Remove ReentrantReadWriteLock — the native runtime is not thread-safe
for concurrent execution, so the lock gives a false sense of safety
without changing the caller's obligations. The caller contract is:
call stop(), wait for generate() to return, then close().

Plain boolean mDestroyed + checkNotDestroyed() catches same-thread
programmer errors with a clean IllegalStateException. stop() uses the
native atomic flag for cross-thread interrupt — that is the intended
mechanism.

This commit was authored with the help of Claude.
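
A minimal sketch of the plain-boolean guard this commit moves to; member names are illustrative, and later commits in this thread reinstate a single lock on top of it.

```java
// Sketch only: a plain boolean catches same-thread use-after-close with a
// clear IllegalStateException instead of a native crash.
abstract class DestroyGuardSketch implements java.io.Closeable {
  private boolean mDestroyed = false;

  final void checkNotDestroyed() {
    if (mDestroyed) {
      throw new IllegalStateException("LlmModule has been closed");
    }
  }

  @Override
  public void close() {
    if (!mDestroyed) {
      mDestroyed = true;
      destroyNative(); // free the native runner exactly once
    }
  }

  abstract void destroyNative();
}
```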
@psiddh psiddh changed the title from "Android: LlmModule thread safety (Closeable + ReadWriteLock) and ModelRunner crash fix" to "Android: LlmModule Closeable lifecycle and ModelRunner crash fix" on Apr 21, 2026
@psiddh
Contributor Author

psiddh commented Apr 21, 2026

Regarding concurrent operations, we don't want to allow concurrent load, generate, or prefill, right? The runtime itself is not thread safe for concurrent execution of the same method, and there is shared KV cache state.

Fair point — removed the ReentrantReadWriteLock. The native runtime isn't thread-safe for concurrent execution, so
the lock gives a false sense of safety without changing the caller's obligations. Now it's just plain mDestroyed +
checkNotDestroyed() to catch same-thread programmer errors, plus Closeable with an explicit contract: call stop(),
wait for generate() to return, then close(). Related lifecycle tests kept.

Open to suggestions.

Ensures module.destroy() runs even if forward() or etdump() throws.

This commit was authored with the help of Claude.
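
A sketch of the two ModelRunner fixes described in this PR, the early return after a loadMethod() failure and the try/finally around the benchmark body. Helper names such as reportError() and runBenchmarkLoops() are illustrative stand-ins for the surrounding benchmark code, not the exact ModelRunner API.

```java
import org.pytorch.executorch.Module;

// Sketch only: helper methods are placeholders for the benchmark code.
final class ModelRunnerSketch {
  static void benchmark(String modelPath) {
    Module module = Module.load(modelPath);
    try {
      int loadError = module.loadMethod("forward");
      if (loadError != 0) {
        // Early return: without this, the warmup/inference loops below would
        // run forward() on a method that never loaded and crash natively.
        reportError(loadError);
        return;
      }
      runBenchmarkLoops(module); // warmup + timed forward() calls, etdump, etc.
    } finally {
      // Runs on both the error and success paths, and even if forward() or
      // etdump() throws, so the native module memory is always released.
      module.destroy();
    }
  }

  private static void reportError(int code) { /* placeholder */ }
  private static void runBenchmarkLoops(Module module) { /* placeholder */ }
}
```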
Copilot AI review requested due to automatic review settings April 21, 2026 01:46
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 2 comments.



psiddh added 2 commits April 20, 2026 19:10
volatile is needed because stop() is callable from any thread and reads
mDestroyed — without it, a cross-thread close() write may not be visible.
Also fix ambiguous {@link #generate} to reference a specific overload.

This commit was authored with the help of Claude.
The caller contract requires single-threaded access: call stop(), wait
for generate() to return, then close(). In that sequence stop() always
runs before close(), so cross-thread visibility of mDestroyed is not
needed. Plain boolean is sufficient for same-thread programmer error
detection.

This commit was authored with the help of Claude.
Copilot AI review requested due to automatic review settings April 21, 2026 02:11
stop() does not need a Java wrapper or mDestroyed guard. The cross-thread
safety is handled entirely by the native atomic flag in the C++ runner.
Single-threaded lifecycle contract means callers coordinate stop() before
close() — same as every other method.

This commit was authored with the help of Claude.
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.



@psiddh psiddh changed the title from "Android: LlmModule Closeable lifecycle and ModelRunner crash fix" to "Title: Android: LlmModule Closeable lifecycle and ModelRunner crash fix" on Apr 21, 2026
@psiddh psiddh changed the title from "Title: Android: LlmModule Closeable lifecycle and ModelRunner crash fix" back to "Android: LlmModule Closeable lifecycle and ModelRunner crash fix" on Apr 21, 2026
Member

@GregoryComer GregoryComer left a comment


Approving to unblock. Thanks for adding the close support - looks good.

Regarding locks, I think it might be worth keeping the lock, just changing to a single lock (as opposed to a RW lock). We do this in the non-LLM module, and it catches a lot of accidental unsafe concurrency.

It's very easy for users to accidentally violate thread safety when calling inference from app code. My other comment on locks was mainly that we probably want to avoid allowing the concurrency that RW lock allowed.

* @param llmCallback callback object to receive results.
*/
public void generate(String prompt, LlmCallback llmCallback) {
checkNotDestroyed();
Member


Did you intend to remove these checks? From the PR description, it sounds like the intent was to catch use after close.

Serializes access to non-thread-safe native state. Prevents accidental
concurrent calls (e.g. double-tap, rotation during generation) from
causing native crashes — second caller blocks until the first finishes.

- generate()/prefill()/load()/resetContext() use lock()
- close() uses tryLock() — fails fast, no ANR risk
- stop() stays lock-free (native atomic flag for cross-thread interrupt)

This commit was authored with the help of Claude.
@psiddh psiddh force-pushed the followup-crash-prevention branch from 3634913 to 0f67b86 Compare April 21, 2026 20:13
@psiddh
Contributor Author

psiddh commented Apr 21, 2026

Approving to unblock. Thanks for adding the close support - looks good.

Regarding locks, I think it might be worth keeping the lock, just changing to a single lock (as opposed to a RW lock). We do this in the non-LLM module, and it catches a lot of accidental unsafe concurrency.

It's very easy for users to accidentally violate thread safety when calling inference from app code. My other comment on locks was mainly that we probably want to avoid allowing the concurrency that RW lock allowed.

Thanks. The intent of adding the lock is to keep the app safe from crashes, not to enable concurrency (the underlying native runtime is not thread-safe). With the error propagation now in place, it should surface the right context and error scenario to the app, giving developers the information they need to triage and correct their flow.

Copilot AI review requested due to automatic review settings April 21, 2026 20:19
Contributor

Copilot AI left a comment


Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.



If close() is called from an LlmCallback during generate() (same
thread, re-entrant lock acquisition), detect via getHoldCount() > 1
and throw instead of freeing native resources mid-call.

This commit was authored with the help of Claude.
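
A sketch of the re-entrancy guard from this commit. If close() is invoked from an LlmCallback while generate() already holds the lock on the same thread, tryLock() succeeds re-entrantly, so the hold count is checked before freeing native state. Field and method names are illustrative, not the exact LlmModule members.

```java
import java.util.concurrent.locks.ReentrantLock;

// Sketch only: illustrates the getHoldCount() check described above.
abstract class ReentrantCloseGuardSketch implements java.io.Closeable {
  private final ReentrantLock mLock = new ReentrantLock();
  private boolean mClosed = false;

  @Override
  public void close() {
    if (!mLock.tryLock()) {
      // Another thread is mid-operation: fail fast instead of blocking.
      throw new IllegalStateException("close() while another operation is in flight");
    }
    try {
      if (mLock.getHoldCount() > 1) {
        // Same thread already holds the lock, i.e. close() was called from an
        // LlmCallback during generate(); freeing native state now would crash.
        throw new IllegalStateException("close() called from a callback during generate()");
      }
      if (!mClosed) {
        mClosed = true;
        destroyNative();
      }
    } finally {
      mLock.unlock();
    }
  }

  abstract void destroyNative();
}
```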
@psiddh psiddh merged commit 54b0148 into pytorch:main Apr 21, 2026
166 of 171 checks passed