ADFA-4388 | embedding model crash by jatezzz · Pull Request #1429 · appdevforall/CodeOnTheGo

jatezzz · 2026-06-19T18:52:32Z

Description

Crash Before Fix:
FATAL EXCEPTION: Llm-RunLoop
SIGABRT
decode: cannot decode batches with this context (calling encode() instead)

Error Message After Fix:
The selected model 'all-MiniLM-L6-v2-ggml-model-f16.gguf' is an embedding model
designed for semantic search and similarity tasks. It cannot be used for chat or
text generation.

Please select a chat/instruct model instead (e.g., models with 'chat', 'instruct', 'conversational' in their name).

Details

telegram-cloud-document-1-5022011912393590501.mp4

Ticket

ADFA-4388

Observation

This fix is based on a consistently reproducible crash with embedding models (e.g., all-MiniLM-L6-v2). The crash occurred 100% of the time when users selected embedding models for chat, resulting in a SIGABRT on the Llm-RunLoop thread.

The multi-layer approach ensures that:

- Native layer catches the issue first (pooling type check)
- Kotlin layer provides fallback validation
- UI layer gracefully handles exceptions and displays helpful guidance

The app now correctly identifies incompatible model types and guides users to select appropriate chat/instruct models, eliminating the crash entirely.
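
The native-layer check described above can be pictured as a guard on the pooling type reported for the loaded context. The following is a minimal Kotlin sketch, not the PR's code: the real check lives in C++, the helper names `requireGenerativeModel` and `poolingErrorOrNull` are hypothetical, and the constant is assumed to mirror `LLAMA_POOLING_TYPE_NONE` from llama.cpp.

```kotlin
// Assumed to mirror LLAMA_POOLING_TYPE_NONE in llama.cpp's llama.h.
const val POOLING_TYPE_NONE = 0

/**
 * Hypothetical guard: throws IllegalStateException when the pooling type
 * indicates an embedding model rather than a generative one.
 */
fun requireGenerativeModel(modelName: String, poolingType: Int) {
    check(poolingType == POOLING_TYPE_NONE) {
        "The selected model '$modelName' is an embedding model. " +
            "It cannot be used for chat or text generation."
    }
}

/** Returns the rejection text instead of throwing, or null if the model is usable. */
fun poolingErrorOrNull(modelName: String, poolingType: Int): String? =
    try {
        requireGenerativeModel(modelName, poolingType)
        null
    } catch (e: IllegalStateException) {
        e.message
    }
```

A caller in the UI layer could show the returned text directly, which is roughly the behavior the "Error Message After Fix" section describes.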

coderabbitai · 2026-06-19T18:59:26Z

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 282295a5-9280-4076-9aed-6699fcf94e57

📥 Commits

Reviewing files that changed from the base of the PR and between 999e7a3 and 595ac56.

📒 Files selected for processing (3)

app/src/main/java/com/itsaky/androidide/agent/repository/LocalLlmRepositoryImpl.kt
app/src/main/java/com/itsaky/androidide/agent/viewmodel/ChatViewModel.kt
resources/src/main/res/values/strings.xml

🚧 Files skipped from review as they are similar to previous changes (2)

resources/src/main/res/values/strings.xml
app/src/main/java/com/itsaky/androidide/agent/viewmodel/ChatViewModel.kt

📝 Walkthrough

Fixed a crash caused by selecting embedding models for chat/text generation by rejecting incompatible models earlier in the load path.
Added layered validation across native, Kotlin, and UI code so embedding models and other unsupported formats are detected before they reach generation.
Introduced ModelLoadResult to distinguish successful loads, user-facing rejections, and unexpected failures.
Updated local model loading flows to preserve and surface specific rejection messages instead of collapsing them into a generic failure state.
Improved chat startup error handling so the UI now shows clearer guidance when a local model cannot be used for chat.
Added user-facing strings explaining unsupported model formats and explicitly directing users to chat/instruct models.
Risk: several validation checks rely on filename heuristics and format inference, so unusual or misnamed models may be rejected or misclassified.
Risk: native and Kotlin error-message matching introduces some duplication and string-based branching, which can be brittle over time.
Best-practice note: the new layered rejection path reduces crash risk, but the extra validation logic should be kept in sync across native and Kotlin layers to avoid inconsistent behavior.
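
To make the three outcomes concrete, here is a hedged sketch of what a `ModelLoadResult` sealed type could look like. Only the variant names (`Loaded`, `Rejected`, `Failed`) and the existence of a `message` come from this thread; the `modelName` field and the `statusText` helper are assumptions for illustration.

```kotlin
// Hypothetical shape; the real class in the PR may carry different fields.
sealed class ModelLoadResult {
    data class Loaded(val modelName: String) : ModelLoadResult()
    data class Rejected(val message: String) : ModelLoadResult()
    data class Failed(val message: String) : ModelLoadResult()
}

/** Maps each outcome to the text a UI might display. */
fun statusText(result: ModelLoadResult): String = when (result) {
    is ModelLoadResult.Loaded -> "Loaded ${result.modelName}"
    // A rejection already carries user-facing guidance, so pass it through unchanged.
    is ModelLoadResult.Rejected -> result.message
    is ModelLoadResult.Failed -> "Failed to load model: ${result.message}"
}
```

Because the `when` is exhaustive over a sealed type, adding a fourth outcome later would produce a compile error at every call site that has not handled it.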

Walkthrough

This PR adds structured model-load outcomes, stronger native and Kotlin validation, repository and ViewModel handling for load results, UI expansion after model selection, and a native library version bump.

Changes

Model Loading and Validation Flow

| Layer / File(s) | Summary |
| --- | --- |
| Native model load, context creation, and JNI utilities `llama-impl/src/main/cpp/llama-android.cpp` | Adds GGUF file validation in model loading, clamps context size to the model training context, rejects embedding models after context creation, and exports JNI helpers for pooling type and model description. |
| Native completion safety checks `llama-impl/src/main/cpp/llama-android.cpp` | Adds pooling-type guards, null and tokenization validation, KV-capacity checks, decode error handling, and generation-loop safety checks in `completion_init` and `completion_loop`. |
| Kotlin bridge load and send validation `llama-impl/src/main/java/android/llama/cpp/LLamaAndroid.kt` | Adds native bindings for model description and pooling type, updates `load()` with native validation and dry-run checks, and strengthens `send()` with pooling checks, context-window validation, and generation-loop guards. |
| Engine format rules and load results `app/src/main/java/com/itsaky/androidide/agent/model/ModelLoadResult.kt`, `app/src/main/java/com/itsaky/androidide/agent/repository/LlmInferenceEngine.kt` | Introduces `ModelLoadResult`, centralizes supported extension and keyword constants, refactors engine model loading to return structured outcomes, and updates format validation and family detection. |
| Repository and ViewModel load result handling `app/src/main/java/com/itsaky/androidide/agent/repository/LocalLlmRepositoryImpl.kt`, `app/src/main/java/com/itsaky/androidide/agent/viewmodel/AiSettingsViewModel.kt`, `resources/src/main/res/values/strings.xml` | Updates repository and ViewModel model-loading branches to handle `ModelLoadResult`, propagate load errors, update agent state, reorganize the LiveData state block, and add model-load error strings. |
| Chat repository setup for local LLM loading `app/src/main/java/com/itsaky/androidide/agent/viewmodel/ChatViewModel.kt` | Refactors local-LLM setup into helper methods that decide when to reload, invoke `initModelFromFile`, handle `ModelLoadResult`, and construct `LocalLlmRepositoryImpl` on success. |
| UI auto-expand and library version bump `app/src/main/java/com/itsaky/androidide/agent/fragments/AiSettingsFragment.kt`, `app/src/main/java/com/itsaky/androidide/utils/DynamicLibraryLoader.kt` | Schedules bottom-sheet expansion and agent-tab selection after model loading in `AiSettingsFragment`, removes an inline comment from the status view lookup, and bumps `LLAMA_LIB_VERSION` from `5` to `8`. |

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant AiSettingsFragment
  participant AiSettingsViewModel
  participant LlmInferenceEngine
  participant LLamaAndroid
  participant NativeCPP

  User->>AiSettingsFragment: selects model file
  AiSettingsFragment->>AiSettingsViewModel: loadModelFromUri(uri)
  AiSettingsViewModel->>LlmInferenceEngine: initModelFromFile(uri)
  LlmInferenceEngine->>LlmInferenceEngine: validateModelFormat(displayName)
  alt unsupported format or embedding filename
    LlmInferenceEngine-->>AiSettingsViewModel: ModelLoadResult.Rejected / Failed
    AiSettingsViewModel-->>AiSettingsFragment: ModelLoadingState.Error(message)
  else accepted
    LlmInferenceEngine->>LLamaAndroid: load(modelPath)
    LLamaAndroid->>NativeCPP: load_model()
    NativeCPP->>NativeCPP: GGUF validation
    LLamaAndroid->>NativeCPP: new_context()
    NativeCPP->>NativeCPP: pooling-type check
    LLamaAndroid->>NativeCPP: completion_init()
    alt validation failure
      NativeCPP-->>LLamaAndroid: throw IllegalStateException
      LLamaAndroid-->>LlmInferenceEngine: exception
      LlmInferenceEngine-->>AiSettingsViewModel: ModelLoadResult.Failed / Rejected
      AiSettingsViewModel-->>AiSettingsFragment: ModelLoadingState.Error(message)
    else success
      LlmInferenceEngine-->>AiSettingsViewModel: ModelLoadResult.Loaded
      AiSettingsViewModel-->>AiSettingsFragment: ModelLoadingState.Loaded
      AiSettingsFragment->>AiSettingsFragment: postDelayed expand bottom sheet and switch to TAB_AGENT
    end
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

itsaky-adfa
Daniel-ADFA

Poem

🐇 Hop hop, the model path is clear,
GGUF bells are ringing near.
The agent tab now leaps in sight,
And bunny paws type through the night.
A softer hop, a safer load,
On carrot trails the changes flowed.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title is concise and matches the main change: preventing crashes from embedding models. |
| Description check | ✅ Passed | The description directly explains the embedding-model crash and the layered fix in the PR. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

coderabbitai

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@app/src/main/java/com/itsaky/androidide/agent/repository/LlmInferenceEngine.kt`:
- Around line 343-344: The error message check in LlmInferenceEngine.kt at the
if statement checking for "embedding model" is case-sensitive and will miss
error messages with different casing like "Embedding model". Make the message
comparison case-insensitive by converting the error message to lowercase before
performing the contains check, or by using a comparison method that accepts an
ignore-case flag. This ensures that all casing variations of the embedding
model error message are detected and the user receives appropriate guidance.
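
The second option in the finding above maps onto the `ignoreCase` parameter of Kotlin's `String.contains`. A small sketch, where the helper name `isEmbeddingModelError` is hypothetical and the nullable parameter is an assumption about how exception messages arrive:

```kotlin
/**
 * Returns true when the (possibly null) error message mentions an embedding
 * model, regardless of letter casing.
 */
fun isEmbeddingModelError(message: String?): Boolean =
    message?.contains("embedding model", ignoreCase = true) == true
```

Comparing against `true` keeps the null case explicit: a missing message is treated as "not an embedding-model error" rather than causing a null dereference.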

In `@llama-impl/src/main/cpp/llama-android.cpp`:
- Around line 814-821: The validation check for `batch->n_tokens < 0` in the
llama-android.cpp decode function is unreachable because it is placed after a
code path that already requires `batch->n_tokens > 0`, meaning negative values
will never reach this validation block. Move the negative `n_tokens` validation
earlier in the function, before any conditional logic that assumes a positive
token count, to ensure the check is actually executed when the batch object has
been corrupted with a negative count value.
- Around line 993-997: The batch null-check block containing the LOGe statement
and return nullptr is positioned too late in the function. Move this entire
safety check block to the very beginning of the function implementation, before
any code that dereferences or uses the batch parameter, to ensure the
null/invalid batch is caught before any potential crash occurs from
dereferencing batch in subsequent operations.
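
Both native findings above are about guard ordering: checks for a null or corrupted batch only help if they run before anything reads from the batch. The affected code is C++, so the following Kotlin sketch is only an analogy of the ordering; `Batch`, `validateBatch`, and the `kvCapacity` parameter are hypothetical stand-ins, not the real `llama_batch` API.

```kotlin
// Hypothetical stand-in for the native batch struct.
class Batch(val nTokens: Int)

/** Returns an error description, or null if the batch is safe to decode. */
fun validateBatch(batch: Batch?, kvCapacity: Int): String? {
    // Guards run first, before any logic that assumes a usable batch.
    if (batch == null) return "batch is null"
    if (batch.nTokens < 0) return "batch has a negative token count: ${batch.nTokens}"
    if (batch.nTokens == 0) return "batch is empty"
    // From here on a positive token count can be relied upon.
    if (batch.nTokens > kvCapacity) return "batch exceeds KV capacity ($kvCapacity)"
    return null
}
```

If the negative-count check were placed after a branch that requires `nTokens > 0`, it could never fire, which is the unreachable-code problem the first finding describes.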

In `@llama-impl/src/main/java/android/llama/cpp/LLamaAndroid.kt`:
- Around line 276-291: The broad catch block around the context validation logic
is catching the intentionally-thrown IllegalStateException about message length
being too long and re-wrapping it with a generic "Failed to validate message
length" error, which loses the specific user-facing error detail. Modify the
exception handling to either rethrow the IllegalStateException that is
explicitly thrown with the "Message is too long for the model's context window"
message, or restructure the try-catch to only catch exceptions from specific
operations like tokenize() and model_n_ctx() calls rather than catching all
exceptions after the explicit throw statement.
- Around line 192-197: The new_batch() and new_sampler() calls in the
initialization sequence do not properly clean up previously allocated native
resources when allocation fails. If new_sampler() fails after new_batch()
succeeds, the batch handle is leaked. Wrap these allocation calls with proper
resource cleanup by implementing a try-catch block that ensures all allocated
native resources (batch, sampler, model, and context handles) are freed before
re-throwing the exception when either new_batch() or new_sampler() fails. Use
the corresponding free functions to release the handles in the correct order.
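
The leak described in the last finding can be avoided by freeing earlier allocations when a later one fails. A minimal sketch under stated assumptions: the lambdas stand in for the JNI calls, `NativeHandles` and `allocateBatchAndSampler` are hypothetical names, and a handle value of `0L` is assumed to signal allocation failure.

```kotlin
/** Hypothetical holder for two native handles. */
data class NativeHandles(val batch: Long, val sampler: Long)

/**
 * Allocates a batch, then a sampler. If the sampler allocation fails,
 * the already-allocated batch is freed before the exception propagates.
 */
fun allocateBatchAndSampler(
    newBatch: () -> Long,
    newSampler: () -> Long,
    freeBatch: (Long) -> Unit,
): NativeHandles {
    val batch = newBatch()
    check(batch != 0L) { "new_batch() failed" }
    return try {
        val sampler = newSampler()
        check(sampler != 0L) { "new_sampler() failed" }
        NativeHandles(batch, sampler)
    } catch (e: Exception) {
        freeBatch(batch) // release the earlier allocation, then rethrow
        throw e
    }
}
```

The same pattern extends to the model and context handles the finding mentions: each later step frees everything allocated before it, in reverse order, before rethrowing.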

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 78d630fd-79f4-49b1-81c7-a75f68d14029

📥 Commits

Reviewing files that changed from the base of the PR and between 7acac35 and 8200a6b.

📒 Files selected for processing (6)

app/src/main/java/com/itsaky/androidide/agent/fragments/AiSettingsFragment.kt
app/src/main/java/com/itsaky/androidide/agent/repository/LlmInferenceEngine.kt
app/src/main/java/com/itsaky/androidide/agent/viewmodel/AiSettingsViewModel.kt
app/src/main/java/com/itsaky/androidide/utils/DynamicLibraryLoader.kt
llama-impl/src/main/cpp/llama-android.cpp
llama-impl/src/main/java/android/llama/cpp/LLamaAndroid.kt

Added multi-layer protection to detect and reject embedding models:

**Native Layer (C++):**
- Check pooling_type in new_context() - reject if not LLAMA_POOLING_TYPE_NONE
- Added get_pooling_type() JNI function for Kotlin validation
- Clear error messages explaining embedding vs generative models

**Kotlin Layer:**
- Validate model during load() in LLamaAndroid.kt
- Catch IllegalStateException and wrap with user-friendly message
- File format validation for ONNX, PyTorch, TensorFlow, etc.

**UI Layer:**
- Proper exception handling in AiSettingsViewModel
- Display error in ModelLoadingState.Error instead of crashing
- Keep bottom sheet expanded after file picker to show error/status

**Infrastructure:**
- Rebuilt llama.cpp AAR with updated native code (v8)
- Updated LLAMA_LIB_VERSION to 8 in DynamicLibraryLoader

The app now gracefully handles embedding models with clear error messages instead of crashing with SIGABRT.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

- Removed inline comments explaining bottom sheet behavior
- Extracted file extension strings to named constants (EXT_*)
- Extracted keyword strings to named constants (KEYWORD_*)
- Improved code maintainability and reduced duplication

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

Reject embedding/Mini models (all-MiniLM, mpnet, e5, *embed*) up front by filename instead of only warning, so they never reach the native loader. The deep C++ pooling_type check remains as a second line of defense. Introduce a ModelLoadResult sealed type (Loaded/Rejected/Failed) so callers can distinguish an unsupported/embedding model from a generic load failure; migrate AiSettingsViewModel, LocalLlmRepositoryImpl and ChatViewModel to it, and extract ChatViewModel's local-LLM setup into focused helpers. Move all user-facing model-load error messages to the resources module strings.xml. Free native model/context/batch handles on load failure paths to avoid leaks, and harden batch null/negative-token guards in the JNI layer.
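
The up-front filename rejection in this commit message can be sketched as a keyword scan over the lowercased file name. The keywords below are taken from the patterns listed above; the constant name, the helper name, and the substring-matching rule are assumptions, and the PR's actual constants (`KEYWORD_*`) may be organized differently.

```kotlin
// Keywords derived from the commit message; the matching rule is an assumption.
val EMBEDDING_KEYWORDS = listOf("minilm", "mpnet", "e5", "embed")

/** Returns true if the file name contains any known embedding-model keyword. */
fun looksLikeEmbeddingModel(fileName: String): Boolean {
    val name = fileName.lowercase()
    return EMBEDDING_KEYWORDS.any { keyword -> keyword in name }
}
```

Plain substring matching is deliberately loose, which illustrates the misclassification risk raised in the walkthrough: a hypothetical chat model named `code5.gguf` would match `e5` and be rejected, which is one reason the native pooling-type check remains as a second line of defense.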

coderabbitai

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@app/src/main/java/com/itsaky/androidide/agent/repository/LlmInferenceEngine.kt`:
- Line 296: The `initModelFromFile` API now returns `ModelLoadResult`, so update
the remaining boolean-style consumer in `LocalLlmIntegrationTest` to stop
passing the result into `assertTrue(...)`. Instead, use the `initModelFromFile`
result directly, assert it is `ModelLoadResult.Loaded`, and then verify the
loaded metadata on that returned value so the test matches the new
`LlmInferenceEngine` contract.
- Around line 326-329: The URI/display-name lookup is happening before the
load-result error handling, so exceptions from resolveModelDisplayName(...) can
escape instead of being converted to ModelLoadResult.Failed. Move the
modelUri.toUri() and resolveModelDisplayName(context, modelUri) work inside the
same try block in the load flow, and use a safe fallback display name in the
catch path so stale or revoked content URIs are handled by the existing failure
result.
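
The second finding above comes down to try-block scope: anything that can throw, including display-name resolution, has to sit inside the `try` for the failure result to cover it. A minimal sketch with hypothetical names (`LoadOutcome`, `loadModel`, and the two lambdas stand in for the engine's real types and for `resolveModelDisplayName`):

```kotlin
// Hypothetical, simplified outcome type for this example only.
sealed class LoadOutcome {
    data class Loaded(val displayName: String) : LoadOutcome()
    data class Failed(val message: String) : LoadOutcome()
}

fun loadModel(
    uri: String,
    resolveDisplayName: (String) -> String,
    load: (String) -> Unit,
): LoadOutcome {
    // Safe fallback used in the error text if name resolution itself throws.
    var displayName = "selected model"
    return try {
        displayName = resolveDisplayName(uri) // may throw for stale or revoked URIs
        load(uri)
        LoadOutcome.Loaded(displayName)
    } catch (e: Exception) {
        LoadOutcome.Failed("Could not load $displayName: ${e.message}")
    }
}
```

With the lookup inside the `try`, a revoked content URI yields a `Failed` result instead of an uncaught exception.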

In
`@app/src/main/java/com/itsaky/androidide/agent/repository/LocalLlmRepositoryImpl.kt`:
- Around line 75-95: The load result handling in LocalLlmRepositoryImpl
currently hides the rejection guidance by mapping ModelLoadResult.Rejected to
the generic failure status. Update the success/rejected/failed branch around
engine.initModelFromFile so that rejected loads surface result.message via
AgentState.Error or the displayed status text instead of
agent_local_model_loaded_failure, while keeping the existing handling for
ModelLoadResult.Failed and the final state updates in place.

In `@app/src/main/java/com/itsaky/androidide/agent/viewmodel/ChatViewModel.kt`:
- Line 184: The local model load handling in ChatViewModel is collapsing
Rejected/Failed into a plain boolean, which causes retrieveAgentResponse to show
only the generic “local model is not loaded” message. Update
handleLocalModelLoadResult and its callers so the specific
ModelLoadResult.message is propagated back to the chat flow instead of returning
just false, and make sure the failure path in retrieveAgentResponse posts that
detailed reason for the UI.
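
One way to keep the specific reason, instead of collapsing it into a boolean, is to store the message next to the success flag. The sketch below is an assumption-laden illustration: `localModelLoadError` is the property name used in a later commit message in this thread, while `LoadResult`, `LocalModelSetup`, `notLoadedMessage`, and the generic text are hypothetical.

```kotlin
// Hypothetical generic fallback text.
const val GENERIC_NOT_LOADED = "The local model is not loaded."

// Simplified stand-in for ModelLoadResult.
sealed class LoadResult {
    object Loaded : LoadResult()
    data class Rejected(val message: String) : LoadResult()
    data class Failed(val message: String) : LoadResult()
}

class LocalModelSetup {
    /** Last rejection or failure reason, or null after a successful load. */
    var localModelLoadError: String? = null
        private set

    /** Returns true on success and records the reason otherwise. */
    fun handleLoadResult(result: LoadResult): Boolean = when (result) {
        LoadResult.Loaded -> {
            localModelLoadError = null
            true
        }
        is LoadResult.Rejected -> {
            localModelLoadError = result.message
            false
        }
        is LoadResult.Failed -> {
            localModelLoadError = result.message
            false
        }
    }

    /** Text to post in the chat when the model is unavailable. */
    fun notLoadedMessage(): String = localModelLoadError ?: GENERIC_NOT_LOADED
}
```

Callers that only need the boolean keep working, while the chat flow can read `notLoadedMessage()` to show the detailed guidance.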

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1f6674b7-5585-4fac-80e5-c2d0df49add5

📥 Commits

Reviewing files that changed from the base of the PR and between 8200a6b and fa0ccd7.

📒 Files selected for processing (8)

app/src/main/java/com/itsaky/androidide/agent/model/ModelLoadResult.kt
app/src/main/java/com/itsaky/androidide/agent/repository/LlmInferenceEngine.kt
app/src/main/java/com/itsaky/androidide/agent/repository/LocalLlmRepositoryImpl.kt
app/src/main/java/com/itsaky/androidide/agent/viewmodel/AiSettingsViewModel.kt
app/src/main/java/com/itsaky/androidide/agent/viewmodel/ChatViewModel.kt
llama-impl/src/main/cpp/llama-android.cpp
llama-impl/src/main/java/android/llama/cpp/LLamaAndroid.kt
resources/src/main/res/values/strings.xml

✅ Files skipped from review due to trivial changes (1)

app/src/main/java/com/itsaky/androidide/agent/model/ModelLoadResult.kt

🚧 Files skipped from review as they are similar to previous changes (2)

llama-impl/src/main/java/android/llama/cpp/LLamaAndroid.kt
llama-impl/src/main/cpp/llama-android.cpp

Local model rejections/failures previously collapsed to a generic message, hiding the embedding-model guidance the load result carries.

- LocalLlmRepositoryImpl.loadModel: emit AgentState.Error(result.message) for Rejected/Failed instead of the generic loaded_failure status.
- ChatViewModel: propagate the rejection/failure message via localModelLoadError so retrieveAgentResponse posts the specific reason, falling back to the generic "model not loaded" text when absent.

jatezzz requested review from Daniel-ADFA, dara-abijo-adfa, hal-eisen-adfa, itsaky-adfa and jomen-adfa June 19, 2026 18:52

claude Bot reviewed Jun 19, 2026

coderabbitai Bot reviewed Jun 19, 2026

hal-eisen-adfa reviewed Jun 19, 2026

Comment thread app/src/main/java/com/itsaky/androidide/agent/repository/LlmInferenceEngine.kt Outdated

hal-eisen-adfa reviewed Jun 19, 2026

Comment thread llama-impl/src/main/java/android/llama/cpp/LLamaAndroid.kt

jomen-adfa approved these changes Jun 22, 2026

jatezzz and others added 4 commits June 25, 2026 11:44

chore(ADFA-4388): Remove explanatory comments from Kotlin files

13baf7a

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>

jatezzz force-pushed the fix/ADFA-4388-embedding-model-crash branch from fa0ccd7 to 999e7a3 Compare June 25, 2026 16:44

coderabbitai Bot reviewed Jun 25, 2026

jatezzz requested review from hal-eisen-adfa and jomen-adfa June 25, 2026 16:45

jomen-adfa approved these changes Jun 26, 2026

Daniel-ADFA approved these changes Jun 26, 2026

Merge branch 'stage' into fix/ADFA-4388-embedding-model-crash

595ac56

jatezzz merged commit 939b191 into stage Jun 26, 2026
2 checks passed

jatezzz deleted the fix/ADFA-4388-embedding-model-crash branch June 26, 2026 17:39
