ADFA-4388 | embedding model crash#1429
Conversation
There was a problem hiding this comment.
Claude Code Review
This repository is configured for manual code reviews. Comment @claude review to trigger a review and subscribe this PR to future pushes, or @claude review once for a one-time review.
Tip: disable this comment in your organization's Code Review settings.
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (3)
🚧 Files skipped from review as they are similar to previous changes (2)
📝 Walkthrough
WalkthroughThis PR adds structured model-load outcomes, stronger native and Kotlin validation, repository and ViewModel handling for load results, UI expansion after model selection, and a native library version bump. ChangesModel Loading and Validation Flow
Sequence Diagram(s)sequenceDiagram
participant User
participant AiSettingsFragment
participant AiSettingsViewModel
participant LlmInferenceEngine
participant LLamaAndroid
participant NativeCPP
User->>AiSettingsFragment: selects model file
AiSettingsFragment->>AiSettingsViewModel: loadModelFromUri(uri)
AiSettingsViewModel->>LlmInferenceEngine: initModelFromFile(uri)
LlmInferenceEngine->>LlmInferenceEngine: validateModelFormat(displayName)
alt unsupported format or embedding filename
LlmInferenceEngine-->>AiSettingsViewModel: ModelLoadResult.Rejected / Failed
AiSettingsViewModel-->>AiSettingsFragment: ModelLoadingState.Error(message)
else accepted
LlmInferenceEngine->>LLamaAndroid: load(modelPath)
LLamaAndroid->>NativeCPP: load_model()
NativeCPP->>NativeCPP: GGUF validation
LLamaAndroid->>NativeCPP: new_context()
NativeCPP->>NativeCPP: pooling-type check
LLamaAndroid->>NativeCPP: completion_init()
alt validation failure
NativeCPP-->>LLamaAndroid: throw IllegalStateException
LLamaAndroid-->>LlmInferenceEngine: exception
LlmInferenceEngine-->>AiSettingsViewModel: ModelLoadResult.Failed / Rejected
AiSettingsViewModel-->>AiSettingsFragment: ModelLoadingState.Error(message)
else success
LlmInferenceEngine-->>AiSettingsViewModel: ModelLoadResult.Loaded
AiSettingsViewModel-->>AiSettingsFragment: ModelLoadingState.Loaded
AiSettingsFragment->>AiSettingsFragment: postDelayed expand bottom sheet and switch to TAB_AGENT
end
end
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 5
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@app/src/main/java/com/itsaky/androidide/agent/repository/LlmInferenceEngine.kt`:
- Around line 343-344: The error message check in LlmInferenceEngine.kt at the
if statement checking for "embedding model" is case-sensitive and will miss
error messages with different casing like "Embedding model". Make the message
comparison case-insensitive by converting the error message to lowercase before
performing the contains check, or by using a case-insensitive comparison method
that ignores the case parameter. This ensures that all variations of the
embedding model error message are properly detected and the user receives
appropriate guidance.
In `@llama-impl/src/main/cpp/llama-android.cpp`:
- Around line 814-821: The validation check for `batch->n_tokens < 0` in the
llama-android.cpp decode function is unreachable because it is placed after a
code path that already requires `batch->n_tokens > 0`, meaning negative values
will never reach this validation block. Move the negative `n_tokens` validation
earlier in the function, before any conditional logic that assumes a positive
token count, to ensure the check is actually executed when the batch object has
been corrupted with a negative count value.
- Around line 993-997: The batch null-check block containing the LOGe statement
and return nullptr is positioned too late in the function. Move this entire
safety check block to the very beginning of the function implementation, before
any code that dereferences or uses the batch parameter, to ensure the
null/invalid batch is caught before any potential crash occurs from
dereferencing batch in subsequent operations.
In `@llama-impl/src/main/java/android/llama/cpp/LLamaAndroid.kt`:
- Around line 276-291: The broad catch block around the context validation logic
is catching the intentionally-thrown IllegalStateException about message length
being too long and re-wrapping it with a generic "Failed to validate message
length" error, which loses the specific user-facing error detail. Modify the
exception handling to either rethrow the IllegalStateException that is
explicitly thrown with the "Message is too long for the model's context window"
message, or restructure the try-catch to only catch exceptions from specific
operations like tokenize() and model_n_ctx() calls rather than catching all
exceptions after the explicit throw statement.
- Around line 192-197: The new_batch() and new_sampler() calls in the
initialization sequence do not properly clean up previously allocated native
resources when allocation fails. If new_sampler() fails after new_batch()
succeeds, the batch handle is leaked. Wrap these allocation calls with proper
resource cleanup by implementing a try-catch block that ensures all allocated
native resources (batch, sampler, model, and context handles) are freed before
re-throwing the exception when either new_batch() or new_sampler() fails. Use
the corresponding free functions to release the handles in the correct order.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 78d630fd-79f4-49b1-81c7-a75f68d14029
📒 Files selected for processing (6)
app/src/main/java/com/itsaky/androidide/agent/fragments/AiSettingsFragment.ktapp/src/main/java/com/itsaky/androidide/agent/repository/LlmInferenceEngine.ktapp/src/main/java/com/itsaky/androidide/agent/viewmodel/AiSettingsViewModel.ktapp/src/main/java/com/itsaky/androidide/utils/DynamicLibraryLoader.ktllama-impl/src/main/cpp/llama-android.cppllama-impl/src/main/java/android/llama/cpp/LLamaAndroid.kt
Added multi-layer protection to detect and reject embedding models: **Native Layer (C++):** - Check pooling_type in new_context() - reject if not LLAMA_POOLING_TYPE_NONE - Added get_pooling_type() JNI function for Kotlin validation - Clear error messages explaining embedding vs generative models **Kotlin Layer:** - Validate model during load() in LLamaAndroid.kt - Catch IllegalStateException and wrap with user-friendly message - File format validation for ONNX, PyTorch, TensorFlow, etc. **UI Layer:** - Proper exception handling in AiSettingsViewModel - Display error in ModelLoadingState.Error instead of crashing - Keep bottom sheet expanded after file picker to show error/status **Infrastructure:** - Rebuilt llama.cpp AAR with updated native code (v8) - Updated LLAMA_LIB_VERSION to 8 in DynamicLibraryLoader The app now gracefully handles embedding models with clear error messages instead of crashing with SIGABRT. Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
- Removed inline comments explaining bottom sheet behavior - Extracted file extension strings to named constants (EXT_*) - Extracted keyword strings to named constants (KEYWORD_*) - Improved code maintainability and reduced duplication Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Reject embedding/Mini models (all-MiniLM, mpnet, e5, *embed*) up front by filename instead of only warning, so they never reach the native loader. The deep C++ pooling_type check remains as a second line of defense. Introduce a ModelLoadResult sealed type (Loaded/Rejected/Failed) so callers can distinguish an unsupported/embedding model from a generic load failure; migrate AiSettingsViewModel, LocalLlmRepositoryImpl and ChatViewModel to it, and extract ChatViewModel's local-LLM setup into focused helpers. Move all user-facing model-load error messages to the resources module strings.xml. Free native model/context/batch handles on load failure paths to avoid leaks, and harden batch null/negative-token guards in the JNI layer.
fa0ccd7 to
999e7a3
Compare
There was a problem hiding this comment.
Actionable comments posted: 4
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In
`@app/src/main/java/com/itsaky/androidide/agent/repository/LlmInferenceEngine.kt`:
- Line 296: The `initModelFromFile` API now returns `ModelLoadResult`, so update
the remaining boolean-style consumer in `LocalLlmIntegrationTest` to stop
passing the result into `assertTrue(...)`. Instead, use the `initModelFromFile`
result directly, assert it is `ModelLoadResult.Loaded`, and then verify the
loaded metadata on that returned value so the test matches the new
`LlmInferenceEngine` contract.
- Around line 326-329: The URI/display-name lookup is happening before the
load-result error handling, so exceptions from resolveModelDisplayName(...) can
escape instead of being converted to ModelLoadResult.Failed. Move the
modelUri.toUri() and resolveModelDisplayName(context, modelUri) work inside the
same try block in the load flow, and use a safe fallback display name in the
catch path so stale or revoked content URIs are handled by the existing failure
result.
In
`@app/src/main/java/com/itsaky/androidide/agent/repository/LocalLlmRepositoryImpl.kt`:
- Around line 75-95: The load result handling in LocalLlmRepositoryImpl
currently hides the rejection guidance by mapping ModelLoadResult.Rejected to
the generic failure status. Update the success/rejected/failed branch around
engine.initModelFromFile so that rejected loads surface result.message via
AgentState.Error or the displayed status text instead of
agent_local_model_loaded_failure, while keeping the existing handling for
ModelLoadResult.Failed and the final state updates in place.
In `@app/src/main/java/com/itsaky/androidide/agent/viewmodel/ChatViewModel.kt`:
- Line 184: The local model load handling in ChatViewModel is collapsing
Rejected/Failed into a plain boolean, which causes retrieveAgentResponse to show
only the generic “local model is not loaded” message. Update
handleLocalModelLoadResult and its callers so the specific
ModelLoadResult.message is propagated back to the chat flow instead of returning
just false, and make sure the failure path in retrieveAgentResponse posts that
detailed reason for the UI.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: 1f6674b7-5585-4fac-80e5-c2d0df49add5
📒 Files selected for processing (8)
app/src/main/java/com/itsaky/androidide/agent/model/ModelLoadResult.ktapp/src/main/java/com/itsaky/androidide/agent/repository/LlmInferenceEngine.ktapp/src/main/java/com/itsaky/androidide/agent/repository/LocalLlmRepositoryImpl.ktapp/src/main/java/com/itsaky/androidide/agent/viewmodel/AiSettingsViewModel.ktapp/src/main/java/com/itsaky/androidide/agent/viewmodel/ChatViewModel.ktllama-impl/src/main/cpp/llama-android.cppllama-impl/src/main/java/android/llama/cpp/LLamaAndroid.ktresources/src/main/res/values/strings.xml
✅ Files skipped from review due to trivial changes (1)
- app/src/main/java/com/itsaky/androidide/agent/model/ModelLoadResult.kt
🚧 Files skipped from review as they are similar to previous changes (2)
- llama-impl/src/main/java/android/llama/cpp/LLamaAndroid.kt
- llama-impl/src/main/cpp/llama-android.cpp
Local model rejections/failures previously collapsed to a generic message, hiding the embedding-model guidance the load result carries. - LocalLlmRepositoryImpl.loadModel: emit AgentState.Error(result.message) for Rejected/Failed instead of the generic loaded_failure status. - ChatViewModel: propagate the rejection/failure message via localModelLoadError so retrieveAgentResponse posts the specific reason, falling back to the generic "model not loaded" text when absent.
Description
Crash Before Fix:
FATAL EXCEPTION: Llm-RunLoop
SIGABRT
decode: cannot decode batches with this context (calling encode() instead)
Error Message After Fix:
The selected model 'all-MiniLM-L6-v2-ggml-model-f16.gguf' is an embedding model
designed for semantic search and similarity tasks. It cannot be used for chat or
text generation.
Please select a chat/instruct model instead (e.g., models with 'chat', 'instruct', 'conversational' in their name).
Details
telegram-cloud-document-1-5022011912393590501.mp4
Ticket
ADFA-4388
Observation
This fix was developed based on consistent crash reproduction with embedding models (e.g., all-MiniLM-L6-v2). The crash occurred 100% of the time when users selected
embedding models for chat, resulting in SIGABRT in the Llm-RunLoop thread.
The multi-layer approach ensures that:
The app now correctly identifies incompatible model types and guides users to select appropriate chat/instruct models, eliminating the crash entirely.