Feature/custom js models by Android-PowerUser · Pull Request #132 · Android-PowerUser/ScreenOperator

Android-PowerUser · 2026-06-26T09:58:24Z

No description provided.

…message expansion

- Modified getSystemMessage() in WebViewBridge to automatically load the system message from SystemMessagePreferences when the ViewModel is not initialized and the current message is empty.
- Added import for SystemMessagePreferences.
- This ensures the system message is displayed on WebView startup without requiring a manual restore.
- Does not call restoreSystemMessage() as per requirement, but loads directly from app data.

Adds a mechanism so a new model's slightly different command syntax (e.g. "Click('...')" instead of "click(\"...\")") can be supported via a repo commit to command-patterns.json instead of patching CommandParser.kt and shipping a new app version.

- CommandParser.CommandType is now public; CommandPatternConfig parses a remote JSON array of {id, commandType, regex} overrides.
- An override can only attach a new regex to an EXISTING CommandType: the actual Command construction/execution logic is always the same compiled-in builder. No new action kind or custom code can be introduced this way.
- WebViewBridge exposes setCommandPatternOverrides()/getCommandPatternOverrides(); the WebView fetches the optional command-patterns.json on window.onAndroidReady() and pushes it to the bridge.
- CommandPatternOverridesPreferences persists the last received override JSON so it survives app restarts; PhotoReasoningApplication.onCreate() restores it.
- Added unit tests covering: alternate syntax recognition, unknown commandType rejection, invalid regex rejection, and clearing overrides. Existing CommandParserTest cases are unaffected (overrides default to empty).

Execution/queueing/guard logic in PhotoReasoningViewModel/AccessibilityCommandQueue is intentionally untouched; only the *pattern recognition* layer is now data-driven and remotely updatable.
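The override rules above can be sketched as follows. This is an illustration in JavaScript only; the app's actual parser is the Kotlin CommandPatternConfig, whose code is not shown in this PR text. The KNOWN_COMMAND_TYPES values, the function names, and the sample entries are hypothetical. Only the {id, commandType, regex} entry shape and the two rejection rules (unknown commandType, invalid regex) come from the commit message.

```javascript
// Hypothetical set of compiled-in command types; the real
// CommandParser.CommandType values are not listed in this PR text.
const KNOWN_COMMAND_TYPES = new Set(["CLICK_BUTTON", "TAKE_SCREENSHOT"]);

// Parses a JSON array of {id, commandType, regex} overrides and keeps only
// entries that name a known command type and carry a compilable regex.
function parseCommandPatternOverrides(json) {
  let entries;
  try {
    entries = JSON.parse(json);
  } catch (e) {
    return []; // malformed JSON: behave as if no overrides exist
  }
  if (!Array.isArray(entries)) return [];

  const accepted = [];
  for (const entry of entries) {
    if (!entry || typeof entry.id !== "string" || typeof entry.regex !== "string") continue;
    if (!KNOWN_COMMAND_TYPES.has(entry.commandType)) continue; // unknown commandType rejected
    let pattern;
    try {
      pattern = new RegExp(entry.regex);
    } catch (e) {
      continue; // invalid regex rejected
    }
    accepted.push({ id: entry.id, commandType: entry.commandType, pattern });
  }
  return accepted;
}

// Returns the command type and first capture group of the first matching
// override, or null when no override matches.
function matchOverride(overrides, text) {
  for (const o of overrides) {
    const m = o.pattern.exec(text);
    if (m) return { commandType: o.commandType, argument: m[1] !== undefined ? m[1] : null };
  }
  return null;
}

const sampleOverrides = parseCommandPatternOverrides(JSON.stringify([
  { id: "click-single-quotes", commandType: "CLICK_BUTTON", regex: "Click\\('([^']+)'\\)" },
  { id: "bad-type", commandType: "LAUNCH_ROCKET", regex: "x" },
  { id: "bad-regex", commandType: "CLICK_BUTTON", regex: "(" },
]));
```

Note that an accepted override only maps text onto an existing command type; what happens after the match stays with the compiled-in builder, which is the safety property the commit message emphasizes.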

…unkt 2)

Lets a genuinely new model/provider be added with zero app release: define it in custom-models.json (endpoint, modelName, auth header) and the actual HTTP request is made by JavaScript directly in the WebView (window.onCustomModelRequest, fetch()), not by native networking code. Requires the provider's endpoint to support CORS for browser-style requests; verify this per provider, it is not guaranteed.

Native side (additive only, zero changes to any existing ModelOption's behavior):

- CustomModelDefinition/CustomModelConfig: data class + JSON parser for custom-models.json, completely separate from ModelOption/GenerativeAiViewModelFactory.
- CustomModelRegistry: in-memory active list + active selection, independent of the ModelOption enum.
- CustomModelPreferences: persists the models json, the active selection, and a per-model API key (custom models aren't tied to the existing ApiProvider enum/ApiKeyManager storage).
- WebViewBridge.setSelectedModel(id) now falls back to CustomModelRegistry when id isn't a ModelOption. This is the minimal slice of 'decouple model selection from the enum' (Punkt 1) needed for Punkt 2 to be selectable at all. Removed a dead, orphaned addCustomModel() no-op stub from an earlier, abandoned attempt at this.
- PhotoReasoningViewModel.reason(): if a custom model is active, delegates to the new reasonWithCustomJsModel(), which builds the request context (system message, db entries, sanitized history, user text, base64 images) and emits it on the new customModelRequestEvents SharedFlow instead of calling any provider itself.
- onCustomModelPartialResponse/onCustomModelFinalResponse/onCustomModelError: new public ViewModel methods, called from WebViewBridge once JS has the result. They reuse the exact same replaceAiMessageText/processCommandsIncrementally/finalizeAiMessage/processCommands/saveChatHistory pipeline every other model already uses, so command execution and persistence behave identically.

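A minimal sketch of how custom-models.json definitions could be validated and then resolved during model selection. It is written in JavaScript for illustration; the native parser is the Kotlin CustomModelConfig, which is not shown here. The JSON key names (id, endpoint, modelName, authHeader, supportsScreenshot, supportsTopK), the default header name, and both function names are assumptions. The HTTPS-only rule and the tolerant handling of malformed input follow the bot review further down this page.

```javascript
// Parses the text of a custom-models.json file into validated definitions.
// Key names are assumed for this sketch; the real file format is documented
// in docs/custom-models.md, which is not reproduced in this PR text.
function parseCustomModels(json) {
  let raw;
  try {
    raw = JSON.parse(json);
  } catch (e) {
    return []; // malformed file: degrade to "no custom models"
  }
  if (!Array.isArray(raw)) return [];

  const models = [];
  for (const m of raw) {
    if (!m || typeof m.id !== "string" || typeof m.modelName !== "string") continue;
    let url;
    try {
      url = new URL(m.endpoint);
    } catch (e) {
      continue; // missing or invalid absolute URL
    }
    if (url.protocol !== "https:") continue; // HTTPS-only endpoints
    models.push({
      id: m.id,
      endpoint: url.toString(),
      modelName: m.modelName,
      authHeader: typeof m.authHeader === "string" ? m.authHeader : "Authorization",
      supportsScreenshot: m.supportsScreenshot === true,
      supportsTopK: m.supportsTopK === true,
    });
  }
  return models;
}

// Selection falls back to the custom list when the id is not a built-in one,
// mirroring the setSelectedModel(id) fallback described above.
function resolveModel(id, builtInIds, customModels) {
  if (builtInIds.includes(id)) return { kind: "builtin", id };
  const custom = customModels.find((m) => m.id === id);
  return custom ? { kind: "custom", id: custom.id } : null;
}
```

Skipping a bad entry instead of rejecting the whole file means one typo in the remote JSON does not remove every other custom model from the picker.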
WebView side (index.html):

- custom-models.json is fetched on window.onAndroidReady() (merged the fetch into the pre-existing onAndroidReady; there were two conflicting definitions of it before this commit, the second silently overwriting the first; fixed as part of this change) and merged into the MODELS array / model picker.
- window.onCustomModelRequest(payloadJson): builds an OpenAI-compatible chat-completions request, calls fetch(), and either parses SSE streaming chunks or a single JSON response, reporting back via the three bridge callbacks above.
- stopGeneration() now also aborts an in-flight custom-model fetch() via AbortController, so Stop works the same way regardless of which model is active.

Docs: docs/custom-models.md (format, API key setup, request flow, explicit limitations: CORS must be verified per provider; generation settings sliders are not yet persisted per custom model; the model's API key is necessarily visible to JS to set the auth header, consistent with the existing getAllApiKeys() exposure).

Tests: CustomModelConfigTest, CustomModelRegistryTest (pure JVM, no Android context needed). Verified index.html's extracted <script> content with 'node --check' (syntax only, not behavior) and manually traced the Kotlin control flow since I could not run a Gradle build in this environment. Please run ./gradlew :app:testDebugUnitTest before merging, and manually verify at least one real custom model end-to-end on a device (CORS support cannot be verified otherwise).
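The SSE branch of window.onCustomModelRequest can be sketched as below. This is not the code from index.html, which this PR text does not show. The function names are hypothetical, and the event shape (`data:` lines carrying `choices[0].delta.content`, terminated by `data: [DONE]`) is assumed to follow the usual OpenAI-style streaming format.

```javascript
// Splits buffered SSE text into complete "data:" lines and returns the text
// deltas found in them, plus the incomplete tail that must wait for more bytes.
function consumeSseBuffer(buffer) {
  const lines = buffer.split("\n");
  const rest = lines.pop(); // last piece may be an incomplete line
  const deltas = [];
  let done = false;
  for (const rawLine of lines) {
    const line = rawLine.trim();
    if (!line.startsWith("data:")) continue; // skip blank lines and other fields
    const data = line.slice(5).trim();
    if (data === "[DONE]") {
      done = true;
      break;
    }
    try {
      const chunk = JSON.parse(data);
      const text = chunk.choices && chunk.choices[0] && chunk.choices[0].delta
        ? chunk.choices[0].delta.content
        : undefined;
      if (typeof text === "string") deltas.push(text);
    } catch (e) {
      // ignore a malformed event rather than aborting the whole stream
    }
  }
  return { deltas, rest, done };
}

// Feeds network chunks through the parser and reports the accumulated text
// after each chunk that produced output (the role the partial-response
// bridge callback plays in this PR).
function accumulateStream(chunks, onPartial) {
  let buffer = "";
  let full = "";
  for (const chunk of chunks) {
    const result = consumeSseBuffer(buffer + chunk);
    buffer = result.rest;
    if (result.deltas.length > 0) {
      full += result.deltas.join("");
      onPartial(full);
    }
    if (result.done) break;
  }
  return full;
}
```

In the page itself the chunks would come from reading the fetch() response body, and the AbortController signal passed to that fetch() is what lets stopGeneration() cancel the read. Keeping the unfinished tail in `rest` matters because a network chunk can end in the middle of a JSON event.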

…ocument image-gen gap

Generation settings (temperature/top-p/top-k):

- GenerationSettingsPreferences was already keyed by an arbitrary string, not by the ModelOption enum, so no new storage was needed (per request: reuse the app's existing data/storage). Only WebViewBridge.getGenerationSettings/saveGenerationSettings needed a fallback to CustomModelRegistry.findById(modelId)?.id when the id isn't a ModelOption, instead of failing and silently no-op'ing.
- reasonWithCustomJsModel now loads these settings and includes temperature/top_p (and top_k, only if supportsTopK) in the payload; window.onCustomModelRequest sends them in the request body. Existing settings UI (sliders) needed no changes.

Images (found and fixed a real gap, not just confirmed existing behavior):

- The current turn's images were already sent correctly (same PuterApiClient.bitmapToBase64DataUri + OpenAI-style image_url content parts every other model uses, gated by supportsScreenshot); this part was already correct.
- But ScreenOperatorAccessibilityService.executeTakeScreenshotCommand's decision of whether to capture a *real* screenshot vs. text-only screen info during the autonomous 'take screenshot after each command' loop checked the stale, native GenerativeAiViewModelFactory.getCurrentModel().supportsScreenshot; it had no idea a custom model could be active. Without this fix, a custom vision model would only ever receive an image on the very first explicit message, never during autonomous operation. Now checks CustomModelRegistry.getActiveModel() first.

Image-*generating* models: confirmed and documented that these are NOT supported, for custom or built-in models. window.onCustomModelRequest only implements the chat-completions request/response shape (no images-generations equivalent), and addModelBubble() in index.html always HTML-escapes responses as plain text, so there is no image-rendering path for an AI's response anywhere in the app. Documented in docs/custom-models.md rather than silently doing nothing if someone tries it.

Could not unit test the WebViewBridge changes (Android Context/SharedPreferences, no Robolectric in this project) or run a real screenshot through the accessibility service. Verified by tracing the existing call graph and via the brace-balanced/node --check syntax checks I can run here. Please verify the autonomous screenshot loop with a real custom vision model on a device before relying on it.
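As a rough sketch of how the generation settings and images described above could end up in the request body: the payload field names (userText, images, systemMessage, modelName, settings, supportsTopK) and the function name are assumptions, and db entries and chat history are left out for brevity. The `image_url` content parts and the conditional `top_k` follow the commit message; the remaining body fields assume the usual OpenAI-compatible chat-completions shape.

```javascript
// Builds an OpenAI-style chat-completions request body from a request payload.
// The payload field names used here are assumptions made for this sketch.
function buildRequestBody(payload) {
  // The user turn is a list of content parts: one text part, then any images.
  const content = [{ type: "text", text: payload.userText }];
  for (const dataUri of payload.images || []) {
    content.push({ type: "image_url", image_url: { url: dataUri } });
  }

  const messages = [];
  if (payload.systemMessage) {
    messages.push({ role: "system", content: payload.systemMessage });
  }
  messages.push({ role: "user", content });

  const body = { model: payload.modelName, messages, stream: true };
  const s = payload.settings || {};
  if (typeof s.temperature === "number") body.temperature = s.temperature;
  if (typeof s.topP === "number") body.top_p = s.topP;
  // top_k is not accepted by every OpenAI-compatible API, so it is sent only
  // when the model definition opts in.
  if (payload.supportsTopK && typeof s.topK === "number") body.top_k = s.topK;
  return body;
}
```

Omitting `top_k` entirely, rather than sending a null or default value, avoids validation errors on providers that reject unknown parameters.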

…ebViewBridge

Both bugs had the same root cause: TrialStateDialogs and PaymentMethodDialog only existed inside the 'else' branch (non-WebView UI). When the WebView is active those composables were never in the composition tree, so:

1. Trial-expired popup never showed → fixed by hoisting TrialStateDialogs *above* the if/else so it always renders as a native dialog floating over the WebView.
2. Pro button → no billing popup → fixed by having initiateDonationFromWebView() call launchGooglePlayBilling() directly, bypassing PaymentMethodDialog (which has no composable slot in WebView mode).

Additional improvements:

- updateTrialState() now calls window.onTrialStateChanged(isExpired, isPurchased, message) on the WebView so JS can refresh UI state (e.g. hide the Pro button) without a reload.
- onPageFinished sends the same event right after onAndroidReady so the initial state is correct on first load.
- index.html: window.onTrialStateChanged handler added; calls updateDonationCard() to hide/show the Pro button.
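One way the window.onTrialStateChanged handler could be structured is sketched below. The state object, the factory function, and the idea of passing a visibility flag to updateDonationCard are all hypothetical; the real updateDonationCard() signature is not shown in this PR text. Hiding the Pro button once a purchase exists is an assumption drawn from the "hide the Pro button" example above.

```javascript
// Minimal stand-in for the page state that index.html keeps; names are hypothetical.
const trialState = { isExpired: false, isPurchased: false, message: "" };

// Decides whether the Pro button should be visible (assumed rule: hide it
// as soon as a purchase exists).
function shouldShowProButton(state) {
  return !state.isPurchased;
}

// Creates the handler the native side would invoke as
// window.onTrialStateChanged(isExpired, isPurchased, message).
function createTrialStateHandler(state, updateDonationCard) {
  return function onTrialStateChanged(isExpired, isPurchased, message) {
    state.isExpired = Boolean(isExpired);
    state.isPurchased = Boolean(isPurchased);
    state.message = message || "";
    // Refresh the UI in place so no page reload is needed.
    updateDonationCard(shouldShowProButton(state));
  };
}
```

In the page this would be wired up along the lines of `window.onTrialStateChanged = createTrialStateHandler(trialState, updateDonationCard)`. Because the native side sends the same event from onPageFinished, the handler also covers the initial state on first load.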

The hardcoded DEFAULT_SYSTEM_MESSAGE_ON_FIRST_START constant and the KEY_FIRST_START_COMPLETED first-start init logic are removed from SystemMessagePreferences. The authoritative default now lives exclusively in index.html as DEFAULT_SYSTEM_MSG, so updating it needs only a web bundle change, with no app release required.

Flow:

- loadSystemMessage() returns "" when nothing is saved yet.
- Bridge.getSystemMessage() in JS: Android.getSystemMessage()||DEFAULT_SYSTEM_MSG → shows the HTML default on first launch without any native round-trip.
- Bridge.restoreSystemMessage() in JS: calls Android.setSystemMessage(DEFAULT_SYSTEM_MSG) so the default text is persisted and onSystemMessageChanged() fires to update the textarea, exactly like any other setSystemMessage() call.
- Users who already have a custom message stored in SharedPreferences are unaffected: their value is returned as-is.
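The flow above can be exercised without a device by swapping the native bridge for an in-memory stand-in. In this sketch, createFakeAndroid and createBridge are hypothetical helpers and the DEFAULT_SYSTEM_MSG text is a placeholder, since the real default in index.html is not shown in this PR text. The two Bridge methods follow the expressions quoted in the commit message.

```javascript
// Placeholder text; the real DEFAULT_SYSTEM_MSG in index.html is not shown here.
const DEFAULT_SYSTEM_MSG = "(default system message placeholder)";

// In-memory stand-in for the native bridge object exposed to the WebView.
function createFakeAndroid(onSystemMessageChanged) {
  let stored = ""; // loadSystemMessage() returns "" when nothing is saved yet
  return {
    getSystemMessage: () => stored,
    setSystemMessage: (text) => {
      stored = text;
      onSystemMessageChanged(text); // native side notifies the page of the change
    },
  };
}

// JS-side wrapper following the flow described above.
function createBridge(Android) {
  return {
    // An empty string is falsy, so a fresh install shows the HTML default.
    getSystemMessage: () => Android.getSystemMessage() || DEFAULT_SYSTEM_MSG,
    // Persists the default through the normal setter so the change callback fires.
    restoreSystemMessage: () => Android.setSystemMessage(DEFAULT_SYSTEM_MSG),
  };
}
```

One consequence of the `||` fallback is that a user who deliberately saves an empty system message will see the default again, because an empty stored value is indistinguishable from "nothing saved yet".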

Commit dd2d902 added a webViewInstance?.post{} block inside updateTrialState but accidentally left the original closing brace of the function in place, producing a duplicate '}'. The spurious brace made assembleDebug fail with a Kotlin syntax error.

amazon-q-developer

This PR successfully implements a custom model system that allows adding new AI models via JSON configuration without requiring app releases. The implementation is well-architected and follows secure coding practices.

Key strengths:

- Enforces HTTPS-only endpoints for security
- Graceful degradation when parsing malformed JSON configs
- Clean separation between native and WebView-based model handling
- Consistent error handling and logging throughout

The code is production-ready and can be merged.


Android-PowerUser and others added 11 commits June 19, 2026 17:09

Fix webview sync, back button, termux mode, media picker, and system …

2a389c7

…message expansion

Fix WebView, Bridge, keyboard overlay, media picker, and sync issues

c3d0868

Point WebView content URLs to feature/webview-test branch

0eb7544

fix: remove duplicate @JavascriptInterface function declarations in W…

2bc29c4

…ebViewBridge

amazon-q-developer Bot reviewed Jun 26, 2026


Android-PowerUser merged commit 3360c24 into main Jun 26, 2026
5 checks passed

Android-PowerUser deleted the feature/custom-js-models branch July 1, 2026 15:04
