ADFA-4388 | embedding model crash#1429

Merged: jatezzz merged 6 commits into stage from fix/ADFA-4388-embedding-model-crash on Jun 26, 2026

@jatezzz jatezzz commented Jun 19, 2026

Collaborator

Description

Crash Before Fix:
FATAL EXCEPTION: Llm-RunLoop
SIGABRT
decode: cannot decode batches with this context (calling encode() instead)

Error Message After Fix:
The selected model 'all-MiniLM-L6-v2-ggml-model-f16.gguf' is an embedding model
designed for semantic search and similarity tasks. It cannot be used for chat or
text generation.

Please select a chat/instruct model instead (e.g., models with 'chat', 'instruct', 'conversational' in their name).

Details

telegram-cloud-document-1-5022011912393590501.mp4

Ticket

ADFA-4388

Observation

This fix was developed based on consistent crash reproduction with embedding models (e.g., all-MiniLM-L6-v2). The crash occurred 100% of the time when users selected embedding models for chat, resulting in SIGABRT in the Llm-RunLoop thread.

The multi-layer approach ensures that:

  1. Native layer catches the issue first (pooling type check)
  2. Kotlin layer provides fallback validation
  3. UI layer gracefully handles exceptions and displays helpful guidance

The app now correctly identifies incompatible model types and guides users to select appropriate chat/instruct models, eliminating the crash entirely.
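
The pooling-type check in the first two layers can be pictured with a minimal Kotlin sketch. It assumes the native pooling type reaches Kotlin as an integer and that 0 corresponds to `LLAMA_POOLING_TYPE_NONE` in llama.cpp's `llama.h`. The function name `requireGenerativeModel` and the message text are illustrative and are not taken from the PR's code.

```kotlin
// Assumed to mirror LLAMA_POOLING_TYPE_NONE from llama.cpp's llama.h.
const val POOLING_TYPE_NONE = 0

/**
 * Hypothetical guard: throws IllegalStateException when the pooling type
 * reported by the native layer is anything other than NONE, which this
 * PR treats as the sign of an embedding model.
 */
fun requireGenerativeModel(modelName: String, poolingType: Int) {
    check(poolingType == POOLING_TYPE_NONE) {
        "The selected model '$modelName' is an embedding model and " +
            "cannot be used for chat or text generation."
    }
}
```

A caller in the UI layer would catch the `IllegalStateException` and show its message rather than letting the native decode path abort.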

coderabbitai Bot commented Jun 19, 2026

Contributor

Review Change Stack

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 282295a5-9280-4076-9aed-6699fcf94e57

📥 Commits

Reviewing files that changed from the base of the PR and between 999e7a3 and 595ac56.

📒 Files selected for processing (3)
  • app/src/main/java/com/itsaky/androidide/agent/repository/LocalLlmRepositoryImpl.kt
  • app/src/main/java/com/itsaky/androidide/agent/viewmodel/ChatViewModel.kt
  • resources/src/main/res/values/strings.xml
🚧 Files skipped from review as they are similar to previous changes (2)
  • resources/src/main/res/values/strings.xml
  • app/src/main/java/com/itsaky/androidide/agent/viewmodel/ChatViewModel.kt

📝 Walkthrough
  • Fixed a crash caused by selecting embedding models for chat/text generation by rejecting incompatible models earlier in the load path.

  • Added layered validation across native, Kotlin, and UI code so embedding models and other unsupported formats are detected before they reach generation.

  • Introduced ModelLoadResult to distinguish successful loads, user-facing rejections, and unexpected failures.

  • Updated local model loading flows to preserve and surface specific rejection messages instead of collapsing them into a generic failure state.

  • Improved chat startup error handling so the UI now shows clearer guidance when a local model cannot be used for chat.

  • Added user-facing strings explaining unsupported model formats and explicitly directing users to chat/instruct models.

  • Risk: several validation checks rely on filename heuristics and format inference, so unusual or misnamed models may be rejected or misclassified.

  • Risk: native and Kotlin error-message matching introduces some duplication and string-based branching, which can be brittle over time.

  • Best-practice note: the new layered rejection path reduces crash risk, but the extra validation logic should be kept in sync across native and Kotlin layers to avoid inconsistent behavior.

Walkthrough

This PR adds structured model-load outcomes, stronger native and Kotlin validation, repository and ViewModel handling for load results, UI expansion after model selection, and a native library version bump.

Changes

Model Loading and Validation Flow

| Layer | File(s) | Summary |
| --- | --- | --- |
| Native model load, context creation, and JNI utilities | `llama-impl/src/main/cpp/llama-android.cpp` | Adds GGUF file validation in model loading, clamps context size to the model training context, rejects embedding models after context creation, and exports JNI helpers for pooling type and model description. |
| Native completion safety checks | `llama-impl/src/main/cpp/llama-android.cpp` | Adds pooling-type guards, null and tokenization validation, KV-capacity checks, decode error handling, and generation-loop safety checks in `completion_init` and `completion_loop`. |
| Kotlin bridge load and send validation | `llama-impl/src/main/java/android/llama/cpp/LLamaAndroid.kt` | Adds native bindings for model description and pooling type, updates `load()` with native validation and dry-run checks, and strengthens `send()` with pooling checks, context-window validation, and generation-loop guards. |
| Engine format rules and load results | `app/src/main/java/com/itsaky/androidide/agent/model/ModelLoadResult.kt`, `app/src/main/java/com/itsaky/androidide/agent/repository/LlmInferenceEngine.kt` | Introduces `ModelLoadResult`, centralizes supported extension and keyword constants, refactors engine model loading to return structured outcomes, and updates format validation and family detection. |
| Repository and ViewModel load result handling | `app/src/main/java/com/itsaky/androidide/agent/repository/LocalLlmRepositoryImpl.kt`, `app/src/main/java/com/itsaky/androidide/agent/viewmodel/AiSettingsViewModel.kt`, `resources/src/main/res/values/strings.xml` | Updates repository and ViewModel model-loading branches to handle `ModelLoadResult`, propagate load errors, update agent state, reorganize the LiveData state block, and add model-load error strings. |
| Chat repository setup for local LLM loading | `app/src/main/java/com/itsaky/androidide/agent/viewmodel/ChatViewModel.kt` | Refactors local-LLM setup into helper methods that decide when to reload, invoke `initModelFromFile`, handle `ModelLoadResult`, and construct `LocalLlmRepositoryImpl` on success. |
| UI auto-expand and library version bump | `app/src/main/java/com/itsaky/androidide/agent/fragments/AiSettingsFragment.kt`, `app/src/main/java/com/itsaky/androidide/utils/DynamicLibraryLoader.kt` | Schedules bottom-sheet expansion and agent-tab selection after model loading in `AiSettingsFragment`, removes an inline comment from the status view lookup, and bumps `LLAMA_LIB_VERSION` from 5 to 8. |

Sequence Diagram(s)

sequenceDiagram
  participant User
  participant AiSettingsFragment
  participant AiSettingsViewModel
  participant LlmInferenceEngine
  participant LLamaAndroid
  participant NativeCPP

  User->>AiSettingsFragment: selects model file
  AiSettingsFragment->>AiSettingsViewModel: loadModelFromUri(uri)
  AiSettingsViewModel->>LlmInferenceEngine: initModelFromFile(uri)
  LlmInferenceEngine->>LlmInferenceEngine: validateModelFormat(displayName)
  alt unsupported format or embedding filename
    LlmInferenceEngine-->>AiSettingsViewModel: ModelLoadResult.Rejected / Failed
    AiSettingsViewModel-->>AiSettingsFragment: ModelLoadingState.Error(message)
  else accepted
    LlmInferenceEngine->>LLamaAndroid: load(modelPath)
    LLamaAndroid->>NativeCPP: load_model()
    NativeCPP->>NativeCPP: GGUF validation
    LLamaAndroid->>NativeCPP: new_context()
    NativeCPP->>NativeCPP: pooling-type check
    LLamaAndroid->>NativeCPP: completion_init()
    alt validation failure
      NativeCPP-->>LLamaAndroid: throw IllegalStateException
      LLamaAndroid-->>LlmInferenceEngine: exception
      LlmInferenceEngine-->>AiSettingsViewModel: ModelLoadResult.Failed / Rejected
      AiSettingsViewModel-->>AiSettingsFragment: ModelLoadingState.Error(message)
    else success
      LlmInferenceEngine-->>AiSettingsViewModel: ModelLoadResult.Loaded
      AiSettingsViewModel-->>AiSettingsFragment: ModelLoadingState.Loaded
      AiSettingsFragment->>AiSettingsFragment: postDelayed expand bottom sheet and switch to TAB_AGENT
    end
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Suggested reviewers

  • itsaky-adfa
  • Daniel-ADFA

Poem

🐇 Hop hop, the model path is clear,
GGUF bells are ringing near.
The agent tab now leaps in sight,
And bunny paws type through the night.
A softer hop, a safer load,
On carrot trails the changes flowed.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 16.67% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title is concise and matches the main change: preventing crashes from embedding models. |
| Description check | ✅ Passed | The description directly explains the embedding-model crash and the layered fix in the PR. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 5

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@app/src/main/java/com/itsaky/androidide/agent/repository/LlmInferenceEngine.kt`:
- Around line 343-344: The error-message check in LlmInferenceEngine.kt, the if
statement that looks for "embedding model", is case-sensitive and will miss
error messages with different casing such as "Embedding model". Make the
comparison case-insensitive, either by lowercasing the error message before the
contains check or by calling contains with ignoreCase set to true. This ensures
that every variation of the embedding-model error message is detected and the
user receives appropriate guidance.
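
A minimal sketch of the case-insensitive check described above, using the standard-library `String.contains(other, ignoreCase)` overload. The helper name `isEmbeddingModelError` is hypothetical and the null handling is an assumption about how the error message might arrive.

```kotlin
/**
 * Hypothetical helper: true when a load error message mentions an
 * embedding model, regardless of casing. A null message never matches.
 */
fun isEmbeddingModelError(message: String?): Boolean =
    message?.contains("embedding model", ignoreCase = true) == true
```

Matching on message text stays brittle either way, which is the string-based branching risk the walkthrough already notes.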

In `@llama-impl/src/main/cpp/llama-android.cpp`:
- Around line 814-821: The validation check for `batch->n_tokens < 0` in the
llama-android.cpp decode function is unreachable because it is placed after a
code path that already requires `batch->n_tokens > 0`, meaning negative values
will never reach this validation block. Move the negative `n_tokens` validation
earlier in the function, before any conditional logic that assumes a positive
token count, to ensure the check is actually executed when the batch object has
been corrupted with a negative count value.
- Around line 993-997: The batch null-check block containing the LOGe statement
and return nullptr is positioned too late in the function. Move this entire
safety check block to the very beginning of the function implementation, before
any code that dereferences or uses the batch parameter, to ensure the
null/invalid batch is caught before any potential crash occurs from
dereferencing batch in subsequent operations.

In `@llama-impl/src/main/java/android/llama/cpp/LLamaAndroid.kt`:
- Around line 276-291: The broad catch block around the context validation logic
is catching the intentionally-thrown IllegalStateException about message length
being too long and re-wrapping it with a generic "Failed to validate message
length" error, which loses the specific user-facing error detail. Modify the
exception handling to either rethrow the IllegalStateException that is
explicitly thrown with the "Message is too long for the model's context window"
message, or restructure the try-catch to only catch exceptions from specific
operations like tokenize() and model_n_ctx() calls rather than catching all
exceptions after the explicit throw statement.
- Around line 192-197: The new_batch() and new_sampler() calls in the
initialization sequence do not properly clean up previously allocated native
resources when allocation fails. If new_sampler() fails after new_batch()
succeeds, the batch handle is leaked. Wrap these allocation calls with proper
resource cleanup by implementing a try-catch block that ensures all allocated
native resources (batch, sampler, model, and context handles) are freed before
re-throwing the exception when either new_batch() or new_sampler() fails. Use
the corresponding free functions to release the handles in the correct order.
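
The cleanup requested in the last comment can be sketched as follows. `NativeBridge` is a hypothetical stand-in for the JNI bindings (the real signatures in LLamaAndroid.kt may differ), and treating a zero handle as an allocation failure is an assumption made for the sketch.

```kotlin
/** Hypothetical stand-in for the JNI bindings; not the real API. */
interface NativeBridge {
    fun newBatch(): Long
    fun newSampler(): Long
    fun freeBatch(handle: Long)
    fun freeSampler(handle: Long)
    fun freeContext(handle: Long)
    fun freeModel(handle: Long)
}

data class GenerationHandles(val batch: Long, val sampler: Long)

/**
 * Allocates batch and sampler handles. If either step fails, every handle
 * obtained so far (including the caller's model and context) is released
 * in reverse order of allocation before the exception is rethrown.
 */
fun allocateGenerationHandles(
    native: NativeBridge,
    model: Long,
    context: Long
): GenerationHandles {
    var batch = 0L
    var sampler = 0L
    try {
        batch = native.newBatch()
        check(batch != 0L) { "new_batch() failed" }
        sampler = native.newSampler()
        check(sampler != 0L) { "new_sampler() failed" }
        return GenerationHandles(batch, sampler)
    } catch (e: Exception) {
        // Free only what was actually allocated, newest first.
        if (sampler != 0L) native.freeSampler(sampler)
        if (batch != 0L) native.freeBatch(batch)
        native.freeContext(context)
        native.freeModel(model)
        throw e
    }
}
```

Keeping allocation and cleanup in one function means a failure in `newSampler()` cannot leave the batch handle behind.
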

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 78d630fd-79f4-49b1-81c7-a75f68d14029

📥 Commits

Reviewing files that changed from the base of the PR and between 7acac35 and 8200a6b.

📒 Files selected for processing (6)
  • app/src/main/java/com/itsaky/androidide/agent/fragments/AiSettingsFragment.kt
  • app/src/main/java/com/itsaky/androidide/agent/repository/LlmInferenceEngine.kt
  • app/src/main/java/com/itsaky/androidide/agent/viewmodel/AiSettingsViewModel.kt
  • app/src/main/java/com/itsaky/androidide/utils/DynamicLibraryLoader.kt
  • llama-impl/src/main/cpp/llama-android.cpp
  • llama-impl/src/main/java/android/llama/cpp/LLamaAndroid.kt

Comment thread app/src/main/java/com/itsaky/androidide/agent/repository/LlmInferenceEngine.kt Outdated
Comment thread llama-impl/src/main/cpp/llama-android.cpp Outdated
Comment thread llama-impl/src/main/cpp/llama-android.cpp
Comment thread llama-impl/src/main/java/android/llama/cpp/LLamaAndroid.kt
Comment thread llama-impl/src/main/java/android/llama/cpp/LLamaAndroid.kt
Comment thread app/src/main/java/com/itsaky/androidide/agent/repository/LlmInferenceEngine.kt Outdated
Comment thread llama-impl/src/main/java/android/llama/cpp/LLamaAndroid.kt
jatezzz and others added 4 commits June 25, 2026 11:44
Added multi-layer protection to detect and reject embedding models:

**Native Layer (C++):**
- Check pooling_type in new_context() - reject if not LLAMA_POOLING_TYPE_NONE
- Added get_pooling_type() JNI function for Kotlin validation
- Clear error messages explaining embedding vs generative models

**Kotlin Layer:**
- Validate model during load() in LLamaAndroid.kt
- Catch IllegalStateException and wrap with user-friendly message
- File format validation for ONNX, PyTorch, TensorFlow, etc.

**UI Layer:**
- Proper exception handling in AiSettingsViewModel
- Display error in ModelLoadingState.Error instead of crashing
- Keep bottom sheet expanded after file picker to show error/status

**Infrastructure:**
- Rebuilt llama.cpp AAR with updated native code (v8)
- Updated LLAMA_LIB_VERSION to 8 in DynamicLibraryLoader

The app now gracefully handles embedding models with clear error
messages instead of crashing with SIGABRT.

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Removed inline comments explaining bottom sheet behavior
- Extracted file extension strings to named constants (EXT_*)
- Extracted keyword strings to named constants (KEYWORD_*)
- Improved code maintainability and reduced duplication

Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Reject embedding/Mini models (all-MiniLM, mpnet, e5, *embed*) up front by
filename instead of only warning, so they never reach the native loader.
The deep C++ pooling_type check remains as a second line of defense.

Introduce a ModelLoadResult sealed type (Loaded/Rejected/Failed) so callers
can distinguish an unsupported/embedding model from a generic load failure;
migrate AiSettingsViewModel, LocalLlmRepositoryImpl and ChatViewModel to it,
and extract ChatViewModel's local-LLM setup into focused helpers.

Move all user-facing model-load error messages to the resources module
strings.xml. Free native model/context/batch handles on load failure paths
to avoid leaks, and harden batch null/negative-token guards in the JNI layer.
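
A rough sketch of the sealed result type and the up-front filename rejection described in this commit message. The field names, the keyword list, and the helper names `rejectByFilename` and `describe` are assumptions for illustration; only the three outcome names (Loaded, Rejected, Failed) and the model families come from the message above.

```kotlin
/** Sketch of the three outcomes; field names are assumptions. */
sealed interface ModelLoadResult {
    data class Loaded(val displayName: String) : ModelLoadResult
    data class Rejected(val message: String) : ModelLoadResult
    data class Failed(val message: String, val cause: Throwable? = null) : ModelLoadResult
}

// Hypothetical keyword list derived from the families named in the commit.
val EMBEDDING_KEYWORDS = listOf("minilm", "mpnet", "e5", "embed")

/**
 * Returns Rejected when the file name looks like an embedding model,
 * or null when the native loader should be tried.
 */
fun rejectByFilename(displayName: String): ModelLoadResult.Rejected? {
    val lower = displayName.lowercase()
    val hit = EMBEDDING_KEYWORDS.firstOrNull { it in lower } ?: return null
    return ModelLoadResult.Rejected(
        "The selected model '$displayName' looks like an embedding model " +
            "(matched '$hit'). Please select a chat/instruct model instead."
    )
}

/** Exhaustive `when`: the compiler flags any outcome left unhandled. */
fun describe(result: ModelLoadResult): String = when (result) {
    is ModelLoadResult.Loaded -> "Loaded ${result.displayName}"
    is ModelLoadResult.Rejected -> "Rejected: ${result.message}"
    is ModelLoadResult.Failed -> "Failed: ${result.message}"
}
```

Plain substring matching is simple but can misclassify unusual or misnamed files, which is why the native pooling-type check remains as the second line of defense.
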
@jatezzz jatezzz force-pushed the fix/ADFA-4388-embedding-model-crash branch from fa0ccd7 to 999e7a3 Compare June 25, 2026 16:44

@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 4

🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In
`@app/src/main/java/com/itsaky/androidide/agent/repository/LlmInferenceEngine.kt`:
- Line 296: The `initModelFromFile` API now returns `ModelLoadResult`, so update
the remaining boolean-style consumer in `LocalLlmIntegrationTest` to stop
passing the result into `assertTrue(...)`. Instead, use the `initModelFromFile`
result directly, assert it is `ModelLoadResult.Loaded`, and then verify the
loaded metadata on that returned value so the test matches the new
`LlmInferenceEngine` contract.
- Around line 326-329: The URI/display-name lookup is happening before the
load-result error handling, so exceptions from resolveModelDisplayName(...) can
escape instead of being converted to ModelLoadResult.Failed. Move the
modelUri.toUri() and resolveModelDisplayName(context, modelUri) work inside the
same try block in the load flow, and use a safe fallback display name in the
catch path so stale or revoked content URIs are handled by the existing failure
result.
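
The second point above can be sketched as follows: resolving the display name inside the same `try` turns a lookup failure into a `Failed` result instead of an escaping exception. The function shape, the lambda parameters, and the fallback text are hypothetical. The trimmed `ModelLoadResult` is repeated only to keep the sketch self-contained.

```kotlin
/** Trimmed, hypothetical version of the result type. */
sealed interface ModelLoadResult {
    data class Loaded(val displayName: String) : ModelLoadResult
    data class Failed(val message: String) : ModelLoadResult
}

/**
 * Resolves the display name and loads the model inside one try block so
 * that a stale or revoked URI is reported as Failed with a safe fallback name.
 */
fun loadModel(
    modelUri: String,
    resolveDisplayName: (String) -> String,
    loadNative: (String) -> Unit
): ModelLoadResult {
    var displayName = "selected model" // safe fallback if resolution throws
    return try {
        displayName = resolveDisplayName(modelUri)
        loadNative(modelUri)
        ModelLoadResult.Loaded(displayName)
    } catch (e: Exception) {
        val reason = e.message ?: e.javaClass.simpleName
        ModelLoadResult.Failed("Could not load $displayName: $reason")
    }
}
```

If this logic lived in a suspend function, cancellation exceptions would need to be rethrown rather than converted.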

In
`@app/src/main/java/com/itsaky/androidide/agent/repository/LocalLlmRepositoryImpl.kt`:
- Around line 75-95: The load result handling in LocalLlmRepositoryImpl
currently hides the rejection guidance by mapping ModelLoadResult.Rejected to
the generic failure status. Update the success/rejected/failed branch around
engine.initModelFromFile so that rejected loads surface result.message via
AgentState.Error or the displayed status text instead of
agent_local_model_loaded_failure, while keeping the existing handling for
ModelLoadResult.Failed and the final state updates in place.

In `@app/src/main/java/com/itsaky/androidide/agent/viewmodel/ChatViewModel.kt`:
- Line 184: The local model load handling in ChatViewModel is collapsing
Rejected/Failed into a plain boolean, which causes retrieveAgentResponse to show
only the generic “local model is not loaded” message. Update
handleLocalModelLoadResult and its callers so the specific
ModelLoadResult.message is propagated back to the chat flow instead of returning
just false, and make sure the failure path in retrieveAgentResponse posts that
detailed reason for the UI.

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: CHILL

Plan: Pro

Run ID: 1f6674b7-5585-4fac-80e5-c2d0df49add5

📥 Commits

Reviewing files that changed from the base of the PR and between 8200a6b and fa0ccd7.

📒 Files selected for processing (8)
  • app/src/main/java/com/itsaky/androidide/agent/model/ModelLoadResult.kt
  • app/src/main/java/com/itsaky/androidide/agent/repository/LlmInferenceEngine.kt
  • app/src/main/java/com/itsaky/androidide/agent/repository/LocalLlmRepositoryImpl.kt
  • app/src/main/java/com/itsaky/androidide/agent/viewmodel/AiSettingsViewModel.kt
  • app/src/main/java/com/itsaky/androidide/agent/viewmodel/ChatViewModel.kt
  • llama-impl/src/main/cpp/llama-android.cpp
  • llama-impl/src/main/java/android/llama/cpp/LLamaAndroid.kt
  • resources/src/main/res/values/strings.xml
✅ Files skipped from review due to trivial changes (1)
  • app/src/main/java/com/itsaky/androidide/agent/model/ModelLoadResult.kt
🚧 Files skipped from review as they are similar to previous changes (2)
  • llama-impl/src/main/java/android/llama/cpp/LLamaAndroid.kt
  • llama-impl/src/main/cpp/llama-android.cpp

Comment thread app/src/main/java/com/itsaky/androidide/agent/viewmodel/ChatViewModel.kt Outdated
Local model rejections/failures previously collapsed to a generic
message, hiding the embedding-model guidance the load result carries.

- LocalLlmRepositoryImpl.loadModel: emit AgentState.Error(result.message)
  for Rejected/Failed instead of the generic loaded_failure status.
- ChatViewModel: propagate the rejection/failure message via
  localModelLoadError so retrieveAgentResponse posts the specific reason,
  falling back to the generic "model not loaded" text when absent.
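
The fallback behavior described in this commit message can be sketched like this. Only the name `localModelLoadError` comes from the message. The holder class, its methods, and the generic text are assumptions for illustration.

```kotlin
// Hypothetical generic text; the real string lives in strings.xml.
const val GENERIC_NOT_LOADED = "Local model is not loaded."

/** Hypothetical holder for the last local-model load error. */
class LocalModelState {
    var localModelLoadError: String? = null
        private set

    /** Remembers the specific rejection or failure reason, ignoring blanks. */
    fun onLoadFailed(message: String?) {
        localModelLoadError = message?.takeIf { it.isNotBlank() }
    }

    /** Clears any stale error after a successful load. */
    fun onLoadSucceeded() {
        localModelLoadError = null
    }

    /** Specific reason when one was recorded, otherwise the generic text. */
    fun notLoadedMessage(): String = localModelLoadError ?: GENERIC_NOT_LOADED
}
```

Clearing the stored error on success avoids showing an old rejection message after the user switches to a valid chat model.
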
@jatezzz jatezzz merged commit 939b191 into stage Jun 26, 2026
2 checks passed
@jatezzz jatezzz deleted the fix/ADFA-4388-embedding-model-crash branch June 26, 2026 17:39

4 participants