Feature/custom js models #132
Merged
…message expansion
- Modified getSystemMessage() in WebViewBridge to automatically load system message from SystemMessagePreferences when ViewModel is not initialized and current message is empty
- Added import for SystemMessagePreferences
- This ensures system message is displayed on WebView startup without requiring manual restore
- Does not call restoreSystemMessage() as per requirement, but loads directly from app data
Adds a mechanism so a new model's slightly different command syntax
(e.g. "Click('...')" instead of "click(\"...\")") can be supported via
a repo commit to command-patterns.json instead of patching CommandParser.kt
and shipping a new app version.
- CommandParser.CommandType is now public; CommandPatternConfig parses a
remote JSON array of {id, commandType, regex} overrides.
- An override can only attach a new regex to an EXISTING CommandType - the
actual Command construction/execution logic is always the same compiled-in
builder. No new action kind or custom code can be introduced this way.
- WebViewBridge exposes setCommandPatternOverrides()/getCommandPatternOverrides();
the WebView fetches the optional command-patterns.json on window.onAndroidReady()
and pushes it to the bridge.
- CommandPatternOverridesPreferences persists the last received override JSON so
it survives app restarts; PhotoReasoningApplication.onCreate() restores it.
- Added unit tests covering: alternate syntax recognition, unknown commandType
rejection, invalid regex rejection, and clearing overrides. Existing
CommandParserTest cases are unaffected (overrides default to empty).
Execution/queueing/guard logic in PhotoReasoningViewModel/AccessibilityCommandQueue
is intentionally untouched - only the *pattern recognition* layer is now
data-driven and remotely updatable.
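The override rules described in this commit (a regex may only attach to an existing command type; unknown types and invalid regexes are rejected) can be sketched roughly as below. This is an illustrative JavaScript sketch, not the Kotlin CommandPatternConfig from this PR: the command type names in KNOWN_COMMAND_TYPES and the function name parseCommandPatternOverrides are hypothetical, and JavaScript regex syntax is not identical to the regex syntax used on the native side.

```javascript
// Hypothetical stand-in for the compiled-in CommandType values;
// the real names are not listed in this PR text.
const KNOWN_COMMAND_TYPES = new Set(["CLICK", "SCROLL_DOWN"]);

// Parses the text of a command-patterns.json file and keeps only usable overrides.
// Anything malformed is dropped instead of throwing, so a bad remote file
// cannot break the built-in patterns.
function parseCommandPatternOverrides(jsonText) {
  let entries;
  try {
    entries = JSON.parse(jsonText);
  } catch (e) {
    return []; // malformed JSON: behave as if there were no overrides
  }
  if (!Array.isArray(entries)) return [];

  const accepted = [];
  for (const entry of entries) {
    if (!entry || typeof entry.id !== "string" || typeof entry.regex !== "string") continue;
    // An override can only point at a command type that already exists.
    if (!KNOWN_COMMAND_TYPES.has(entry.commandType)) continue;
    let compiled;
    try {
      compiled = new RegExp(entry.regex);
    } catch (e) {
      continue; // invalid regex: reject this entry only
    }
    accepted.push({ id: entry.id, commandType: entry.commandType, regex: compiled });
  }
  return accepted;
}

// Usage: one valid override, one unknown command type, one invalid regex.
const overrides = parseCommandPatternOverrides(JSON.stringify([
  { id: "click-capitalized", commandType: "CLICK", regex: "Click\\('([^']+)'\\)" },
  { id: "bad-type", commandType: "RUN_SCRIPT", regex: ".*" },
  { id: "bad-regex", commandType: "CLICK", regex: "(" },
]));
```

Only the first entry survives here; the accepted regex still feeds the same fixed command builder, which is what keeps the override mechanism from introducing new behavior.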
…unkt 2)
Lets a genuinely new model/provider be added with zero app release: define it in custom-models.json (endpoint, modelName, auth header) and the actual HTTP request is made by JavaScript directly in the WebView (window.onCustomModelRequest, fetch()), not by native networking code. Requires the provider's endpoint to support CORS for browser-style requests - verify this per provider, it is not guaranteed.
Native side (additive only, zero changes to any existing ModelOption's behavior):
- CustomModelDefinition/CustomModelConfig: data class + JSON parser for custom-models.json, completely separate from ModelOption/GenerativeAiViewModelFactory.
- CustomModelRegistry: in-memory active list + active selection, independent of the ModelOption enum.
- CustomModelPreferences: persists the models json, the active selection, and a per-model API key (custom models aren't tied to the existing ApiProvider enum/ApiKeyManager storage).
- WebViewBridge.setSelectedModel(id) now falls back to CustomModelRegistry when id isn't a ModelOption - this is the minimal slice of 'decouple model selection from the enum' (Punkt 1) needed for Punkt 2 to be selectable at all. Removed a dead, orphaned addCustomModel() no-op stub from an earlier, abandoned attempt at this.
- PhotoReasoningViewModel.reason(): if a custom model is active, delegates to the new reasonWithCustomJsModel(), which builds the request context (system message, db entries, sanitized history, user text, base64 images) and emits it on the new customModelRequestEvents SharedFlow instead of calling any provider itself.
- onCustomModelPartialResponse/onCustomModelFinalResponse/onCustomModelError: new public ViewModel methods, called from WebViewBridge once JS has the result. They reuse the exact same replaceAiMessageText/processCommandsIncrementally/finalizeAiMessage/processCommands/saveChatHistory pipeline every other model already uses, so command execution and persistence behave identically.
WebView side (index.html):
- custom-models.json is fetched on window.onAndroidReady() (merged the fetch into the pre-existing onAndroidReady - there were two conflicting definitions of it before this commit, the second silently overwriting the first; fixed as part of this change) and merged into the MODELS array / model picker.
- window.onCustomModelRequest(payloadJson): builds an OpenAI-compatible chat-completions request, calls fetch(), and either parses SSE streaming chunks or a single JSON response, reporting back via the three bridge callbacks above.
- stopGeneration() now also aborts an in-flight custom-model fetch() via AbortController, so Stop works the same way regardless of which model is active.
Docs: docs/custom-models.md (format, API key setup, request flow, explicit limitations: CORS must be verified per provider; generation settings sliders are not yet persisted per custom model; the model's API key is necessarily visible to JS to set the auth header, consistent with the existing getAllApiKeys() exposure).
Tests: CustomModelConfigTest, CustomModelRegistryTest (pure JVM, no Android context needed).
Verified index.html's extracted <script> content with 'node --check' (syntax only, not behavior) and manually traced the Kotlin control flow since I could not run a Gradle build in this environment - please run ./gradlew :app:testDebugUnitTest before merging, and manually verify at least one real custom model end-to-end on a device (CORS support cannot be verified otherwise).
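For the streaming branch of window.onCustomModelRequest, the chunk handling might look roughly like the sketch below. It assumes the common OpenAI-compatible stream shape ("data: {json}" lines carrying choices[0].delta.content, terminated by "data: [DONE]"); createSseParser and its callback are hypothetical names, and the actual index.html code is not shown in this PR text.

```javascript
// Incremental parser for OpenAI-style SSE text chunks.
// Network chunks can end mid-line, so the unfinished tail is buffered
// until the next chunk arrives.
function createSseParser(onDelta) {
  let buffer = "";
  let done = false;
  return {
    push(textChunk) {
      buffer += textChunk;
      const lines = buffer.split("\n");
      buffer = lines.pop(); // keep the (possibly incomplete) last line
      for (const rawLine of lines) {
        const line = rawLine.trim();
        if (!line.startsWith("data:")) continue; // skip blank lines and comments
        const data = line.slice("data:".length).trim();
        if (data === "[DONE]") {
          done = true;
          continue;
        }
        let parsed;
        try {
          parsed = JSON.parse(data);
        } catch (e) {
          continue; // ignore a line that is not valid JSON
        }
        const delta = parsed.choices?.[0]?.delta?.content;
        if (typeof delta === "string" && delta.length > 0) onDelta(delta);
      }
    },
    isDone() {
      return done;
    },
  };
}

// Usage: the second JSON line is split across two chunks on purpose.
let streamedText = "";
const parser = createSseParser((delta) => { streamedText += delta; });
parser.push('data: {"choices":[{"delta":{"content":"Hel"}}]}\n\ndata: {"choi');
parser.push('ces":[{"delta":{"content":"lo"}}]}\n\ndata: [DONE]\n\n');
```

In a real handler each delta would be forwarded through the partial-response bridge callback, and the accumulated text through the final-response callback once the stream ends; passing an AbortController signal to fetch() is what would let stopGeneration() cancel the read loop.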
…ocument image-gen gap
Generation settings (temperature/top-p/top-k):
- GenerationSettingsPreferences was already keyed by an arbitrary string, not by the ModelOption enum - so no new storage was needed (per request: reuse the app's existing data/storage). Only WebViewBridge.getGenerationSettings/saveGenerationSettings needed a fallback to CustomModelRegistry.findById(modelId)?.id when the id isn't a ModelOption, instead of failing and silently no-op'ing.
- reasonWithCustomJsModel now loads these settings and includes temperature/top_p (and top_k, only if supportsTopK) in the payload; window.onCustomModelRequest sends them in the request body. Existing settings UI (sliders) needed no changes.
Images - found and fixed a real gap, not just confirmed existing behavior:
- The current turn's images were already sent correctly (same PuterApiClient.bitmapToBase64DataUri + OpenAI-style image_url content parts every other model uses, gated by supportsScreenshot) - this part was already correct.
- But ScreenOperatorAccessibilityService.executeTakeScreenshotCommand's decision of whether to capture a *real* screenshot vs. text-only screen info during the autonomous 'take screenshot after each command' loop checked the stale, native GenerativeAiViewModelFactory.getCurrentModel().supportsScreenshot - it had no idea a custom model could be active. Without this fix, a custom vision model would only ever receive an image on the very first explicit message, never during autonomous operation. Now checks CustomModelRegistry.getActiveModel() first.
Image-*generating* models: confirmed and documented that these are NOT supported, for custom or built-in models - window.onCustomModelRequest only implements the chat-completions request/response shape (no images-generations equivalent), and addModelBubble() in index.html always HTML-escapes responses as plain text - there is no image-rendering path for an AI's response anywhere in the app. Documented in docs/custom-models.md rather than silently doing nothing if someone tries it.
Could not unit test the WebViewBridge changes (Android Context/SharedPreferences, no Robolectric in this project) or run a real screenshot through the accessibility service - verified by tracing the existing call graph and via the brace-balanced/node --check syntax checks I can run here. Please verify the autonomous screenshot loop with a real custom vision model on a device before relying on it.
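How the generation settings could end up in the request body is sketched below. The payload field names (modelName, messages, settings, supportsTopK, stream) are assumptions made for illustration and are not confirmed by this PR text; only the rule itself (temperature and top_p always, top_k only when supportsTopK) comes from the commit message.

```javascript
// Builds an OpenAI-compatible chat-completions body from a request payload.
// Field names on `payload` are hypothetical.
function buildChatCompletionsBody(payload) {
  const settings = payload.settings || {};
  const body = {
    model: payload.modelName,
    messages: payload.messages,
    stream: payload.stream !== false, // stream unless explicitly disabled
  };
  if (typeof settings.temperature === "number") body.temperature = settings.temperature;
  if (typeof settings.topP === "number") body.top_p = settings.topP;
  // top_k is not accepted by every provider, so it is sent only when the
  // model definition opts in.
  if (payload.supportsTopK === true && typeof settings.topK === "number") {
    body.top_k = settings.topK;
  }
  return body;
}

// Usage: the model does not support top_k, so it is left out of the body.
const exampleBody = buildChatCompletionsBody({
  modelName: "example-model",
  supportsTopK: false,
  messages: [{ role: "user", content: "Hello" }],
  settings: { temperature: 0.7, topP: 0.9, topK: 40 },
});
```

Omitting the key entirely, rather than sending null, avoids providers that reject unknown or null parameters.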
Both bugs had the same root cause: TrialStateDialogs and
PaymentMethodDialog only existed inside the 'else' branch (non-WebView
UI). When the WebView is active those composables were never in the
composition tree, so:
1. Trial-expired popup never showed → fixed by hoisting
TrialStateDialogs *above* the if/else so it always renders as a
native dialog floating over the WebView.
2. Pro button → no billing popup → fixed by having
initiateDonationFromWebView() call launchGooglePlayBilling()
directly, bypassing PaymentMethodDialog (which has no composable
slot in WebView mode).
Additional improvements:
- updateTrialState() now calls window.onTrialStateChanged(isExpired,
isPurchased, message) on the WebView so JS can refresh UI state
(e.g. hide the Pro button) without a reload.
- onPageFinished sends the same event right after onAndroidReady
so the initial state is correct on first load.
- index.html: window.onTrialStateChanged handler added; calls
updateDonationCard() to hide/show the Pro button.
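A minimal sketch of the WebView-side handler follows. The real updateDonationCard() manipulates the DOM and is not shown in this PR text; the version here is a hypothetical stand-in that records the result in a plain object, and the rule for when the Pro button is hidden (once a purchase exists) is an assumption.

```javascript
// Use window inside a WebView; fall back to globalThis so the sketch also runs in Node.
const host = typeof window !== "undefined" ? window : globalThis;

// Stand-in for the DOM state the real page would update.
const trialUi = { proButtonVisible: true, banner: "" };

// Hypothetical replacement for updateDonationCard().
function updateDonationCard(state) {
  // Assumption: the Pro button is hidden once the user has purchased.
  trialUi.proButtonVisible = !state.isPurchased;
  trialUi.banner = state.isExpired && !state.isPurchased ? state.message : "";
}

// Called by the native side with (isExpired, isPurchased, message).
host.onTrialStateChanged = function (isExpired, isPurchased, message) {
  updateDonationCard({
    isExpired: Boolean(isExpired),
    isPurchased: Boolean(isPurchased),
    message: String(message ?? ""),
  });
};

// Usage: what an expired, unpurchased trial would look like.
host.onTrialStateChanged(true, false, "Trial expired");
```

Because the native side sends the same event from onPageFinished, a handler of this shape gives the correct initial state without a separate query.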
The hardcoded DEFAULT_SYSTEM_MESSAGE_ON_FIRST_START constant and the KEY_FIRST_START_COMPLETED first-start init logic are removed from SystemMessagePreferences. The authoritative default now lives exclusively in index.html as DEFAULT_SYSTEM_MSG, so updating it needs only a web bundle change – no app release required.
Flow:
- loadSystemMessage() returns "" when nothing is saved yet.
- Bridge.getSystemMessage() in JS: Android.getSystemMessage()||DEFAULT_SYSTEM_MSG → shows the HTML default on first launch without any native round-trip.
- Bridge.restoreSystemMessage() in JS: calls Android.setSystemMessage(DEFAULT_SYSTEM_MSG) so the default text is persisted and onSystemMessageChanged() fires to update the textarea, exactly like any other setSystemMessage() call.
- Users who already have a custom message stored in SharedPreferences are unaffected – their value is returned as-is.
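The fallback flow above can be sketched as follows. The native Android interface is replaced by an in-memory fake so the sketch runs anywhere; createFakeAndroid and createBridge are hypothetical helper names, and the DEFAULT_SYSTEM_MSG text is a placeholder rather than the app's actual default.

```javascript
const DEFAULT_SYSTEM_MSG = "Example default system message"; // placeholder text

// In-memory fake of the native bridge: returns "" until something is saved,
// and fires the change callback on every setSystemMessage() call.
function createFakeAndroid(onSystemMessageChanged) {
  let stored = "";
  return {
    getSystemMessage: () => stored,
    setSystemMessage(value) {
      stored = value;
      onSystemMessageChanged(value);
    },
  };
}

function createBridge(android) {
  return {
    // An empty string from native means "nothing saved yet": use the web default.
    getSystemMessage() {
      return android.getSystemMessage() || DEFAULT_SYSTEM_MSG;
    },
    // Persist the default through the normal setter so the change callback fires.
    restoreSystemMessage() {
      android.setSystemMessage(DEFAULT_SYSTEM_MSG);
    },
  };
}

// Usage: first launch, then an explicit restore.
let textareaValue = "";
const android = createFakeAndroid((value) => { textareaValue = value; });
const bridge = createBridge(android);
const firstLaunchValue = bridge.getSystemMessage();
const storedBeforeRestore = android.getSystemMessage();
bridge.restoreSystemMessage();
```

One consequence of the `||` fallback is that a user who deliberately saves an empty system message will see the default again; whether that is intended is not stated in the commit message.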
Commit dd2d902 added a webViewInstance?.post{} block inside updateTrialState but accidentally left the original closing brace of the function in place, producing a duplicate '}'. The spurious brace made assembleDebug fail with a Kotlin syntax error.
Contributor
This PR successfully implements a custom model system that allows adding new AI models via JSON configuration without requiring app releases. The implementation is well-architected and follows secure coding practices.
Key strengths:
- Enforces HTTPS-only endpoints for security
- Graceful degradation when parsing malformed JSON configs
- Clean separation between native and WebView-based model handling
- Consistent error handling and logging throughout
The code is production-ready and can be merged.
No description provided.