Make local AI models actually chat: honest connector detection + in-browser WebLLM + Nano download #349
Merged
crs48 merged 8 commits on Jun 30, 2026
…el connector gaps)
Traces the disabled composer to a detection/instantiation split: the dropdown labels WebLLM and Gemini Nano 'available' from shallow probes (navigator.gpu / 'LanguageModel' in globalThis), but neither can be instantiated in a plain browser. WebLLM has no engine-injection path wired up (createWebLLMProvider is never called by the app, and @mlc-ai/web-llm is in no package.json), and Nano's probe checks API presence, not availability()/download readiness. provider===null silently leaves ready=false → box disabled, and for WebLLM no setupHint renders.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: xNet Test <test@xnet.dev>
WebLLM was reported available whenever navigator.gpu existed, and Gemini Nano whenever the LanguageModel global existed, but neither can run without, respectively, a host-supplied engine and a fully-downloaded model. Picking such a tier silently disabled the composer. Detection now gates webllm on a new ConnectorEnv.hasWebLLMEngine probe (AND WebGPU) and prompt-api on availability()==='available'. Adds promptApiAvailability + downloadPromptApiModel helpers for the panel's download gesture.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: xNet Test <test@xnet.dev>
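The gating rule in this commit message can be illustrated with a minimal sketch. It is not the code from `@xnetjs/plugins`: the shape of `ConnectorEnv` beyond `hasWebLLMEngine`, the members of `PromptApiAvailability`, and the `detectTiers` / `TierStatus` names are assumptions made for illustration. The environment is passed in already probed so the rule itself stays synchronous.

```typescript
// Availability states of the Prompt API model; the member list is an assumption.
type PromptApiAvailability = 'unavailable' | 'downloadable' | 'downloading' | 'available';

// Hypothetical snapshot of the environment. Only `hasWebLLMEngine` is named in the PR.
interface ConnectorEnv {
  hasWebGPU: boolean;
  hasWebLLMEngine: boolean;
  // null means the LanguageModel global is absent altogether.
  promptApi: PromptApiAvailability | null;
}

type Tier = 'webllm' | 'prompt-api';

interface TierStatus {
  tier: Tier;
  available: boolean;
  reason?: string; // only set when the tier is not available
}

// A tier counts as available only when it can actually be instantiated.
function detectTiers(env: ConnectorEnv): TierStatus[] {
  const webllm: TierStatus =
    env.hasWebGPU && env.hasWebLLMEngine
      ? { tier: 'webllm', available: true }
      : {
          tier: 'webllm',
          available: false,
          reason: env.hasWebGPU ? 'no engine supplied by the host' : 'WebGPU not present',
        };

  const promptApi: TierStatus =
    env.promptApi === 'available'
      ? { tier: 'prompt-api', available: true }
      : {
          tier: 'prompt-api',
          available: false,
          reason: env.promptApi === null ? 'Prompt API not present' : `model is ${env.promptApi}`,
        };

  return [webllm, promptApi];
}
```

With this shape, a browser that has WebGPU but no wired engine, or a Nano model that is merely `'downloadable'`, is reported as unavailable together with a reason the panel could display.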
The composer's disabled box used to be silent: a failed/null provider was swallowed, and the placeholder said 'Select and configure a model above' even when a model was already selected. Now a .catch on the build effect routes failures into the visible error line, a not-ready reason renders for every selected-but-not-ready tier, and the placeholder distinguishes 'no model selected' from 'preparing' / 'configure the model above'.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: xNet Test <test@xnet.dev>
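The three placeholder cases and the not-ready reason described above can be expressed as one pure function. This is a sketch under assumptions, not the `AiChatPanel` implementation: `ComposerInput`, `ComposerView`, `composerView`, `toErrorLine`, and the exact strings are hypothetical.

```typescript
// Hypothetical inputs the panel would already hold in state.
interface ComposerInput {
  selectedTier: string | null;
  ready: boolean;
  building: boolean;
  error: string | null;
}

interface ComposerView {
  disabled: boolean;
  placeholder: string;
  notReadyReason: string | null;
}

// Every selected-but-not-ready state yields a visible reason, never a silent dead box.
function composerView(input: ComposerInput): ComposerView {
  if (input.selectedTier === null) {
    return { disabled: true, placeholder: 'No model selected', notReadyReason: null };
  }
  if (input.ready) {
    return { disabled: false, placeholder: 'Message the model', notReadyReason: null };
  }
  if (input.error !== null) {
    return { disabled: true, placeholder: 'Configure the model above', notReadyReason: input.error };
  }
  if (input.building) {
    return {
      disabled: true,
      placeholder: 'Preparing…',
      notReadyReason: `${input.selectedTier} is still being prepared`,
    };
  }
  return {
    disabled: true,
    placeholder: 'Configure the model above',
    notReadyReason: `${input.selectedTier} is selected but not ready`,
  };
}

// Turns whatever a rejected provider build throws into a line of text for the UI.
function toErrorLine(err: unknown): string {
  return err instanceof Error ? err.message : String(err);
}
```

In the build effect, the promise returned by the provider build would end in something like `.catch((err) => setError(toErrorLine(err)))`, where `setError` stands in for whatever state setter the panel uses.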
Finishes 0174 tier A: a host-supplied @mlc-ai/web-llm engine (lazy-imported in ai-webllm-engine.ts) is wired through resolveProvider and advertised via hasWebLLMEngine, so the 'In-browser model' tier downloads a small model on an explicit 'run' gesture (no surprise download), shows progress, then chats fully on-device. webllm rejoins USABLE_TIERS. Gemini Nano gains a gesture-driven 'Download' button (create({ monitor })) for the 'downloadable' state, re-detecting once ready. CSP connect-src now allows the WebLLM weight/library CDNs.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: xNet Test <test@xnet.dev>
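The "explicit gesture, no surprise download" behaviour can be sketched as a small gate around an injected loader. This is an illustration, not the contents of `ai-webllm-engine.ts`; `createEngineGate`, `EngineGate`, and `ProgressFn` are hypothetical names, and the loader is injected so the sketch runs without the real library.

```typescript
type ProgressFn = (fraction: number) => void;

interface EngineGate<E> {
  hasStarted(): boolean;
  run(onProgress: ProgressFn): Promise<E>;
}

// Wraps an expensive loader so that nothing is fetched until run() is called,
// and repeated gestures share one in-flight (or finished) load.
function createEngineGate<E>(load: (onProgress: ProgressFn) => Promise<E>): EngineGate<E> {
  let pending: Promise<E> | null = null;
  return {
    hasStarted: () => pending !== null,
    run(onProgress: ProgressFn): Promise<E> {
      if (pending === null) {
        // Only the first caller's onProgress is wired to the loader.
        pending = load(onProgress).catch((err: unknown) => {
          pending = null; // a failed load may be retried by a later gesture
          throw err;
        });
      }
      return pending;
    },
  };
}
```

In the app, the loader would presumably be a dynamic `import('@mlc-ai/web-llm')` followed by the library's engine factory (for example `CreateMLCEngine` with its `initProgressCallback` option), which is what keeps the library in a lazily fetched chunk. That wiring is an assumption here and is left out so the block stays self-contained.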
Two validation items are live-hardware smokes (WebGPU download+stream; Dia browser) left unchecked with notes on what was verified in CI.
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: xNet Test <test@xnet.dev>
🖼️ UI changes in this PR: No visual differences detected in the changed UI.
Preview removed for PR #349.
crs48 added a commit that referenced this pull request Jun 30, 2026
## What & why

Follow-up to [#349](#349) (exploration 0252). A **live browser smoke** of the new in-browser WebLLM tier caught a real bug the headless CI couldn't: model **weights are served from `us.aws.cdn.hf.co`** (Hugging Face's Xet CDN, a `*.hf.co` host), but #349's CSP only allowed `huggingface.co` / `*.huggingface.co`. So the model *config* fetch succeeded while the *weight* download was blocked by `connect-src`, and the failure surfaced as a (mis-worded) cloud-key "CORS block" error.

## Changes

- **CSP** (`apps/web/index.html`): add `https://*.hf.co` to `connect-src` so the HF Xet/LFS CDN that serves model weights is reachable.
- **Error message** (`ai-webllm-engine.ts`): wrap WebLLM init failures in a WebLLM-specific message, so a future failure reads sensibly instead of being rewritten as the generic cloud-key/Ollama CORS hint by `errorMessage`.
- Checks off the previously-pending live-WebGPU validation item in 0252.

## Verification (live, in a WebGPU browser)

Drove the real app (test-bypass identity → Companion → select "In-browser model" → "Run"):

- ✅ Weights streamed from `us.aws.cdn.hf.co` with the progress bar climbing 0 → 100%, with **no CSP violation** (confirmed via a `securitypolicyviolation` listener; before the fix it fired `connect-src` on `*.hf.co`).
- ✅ The composer **enabled**, and the in-tab model replied to two prompts (e.g. *"A local-first app … stores its data locally on the device."*), with inference running fully on-device.
- ✅ `apps/web` typecheck clean; `AiChatPanel` tests pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)
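Why the config fetch passed while the weight download was blocked follows from how a CSP host-source wildcard matches. The sketch below is a simplified model of that matching (host only; scheme, port, and path are ignored), written for illustration rather than taken from the PR. `hostMatchesSource` and `isConnectAllowed` are hypothetical names.

```typescript
// Simplified CSP host-source match: "*.example.com" matches any subdomain of
// example.com (but not example.com itself); anything else must match exactly.
function hostMatchesSource(host: string, source: string): boolean {
  const h = host.toLowerCase();
  const s = source.toLowerCase();
  if (s.startsWith('*.')) {
    return h.endsWith(s.slice(1)); // compare against ".example.com"
  }
  return h === s;
}

function isConnectAllowed(host: string, sources: string[]): boolean {
  return sources.some((source) => hostMatchesSource(host, source));
}

// Host lists as described in the text above (other allowed hosts omitted).
const before = ['huggingface.co', '*.huggingface.co'];
const after = [...before, '*.hf.co'];

console.log(isConnectAllowed('huggingface.co', before));   // config host: allowed
console.log(isConnectAllowed('us.aws.cdn.hf.co', before)); // weight host: blocked
console.log(isConnectAllowed('us.aws.cdn.hf.co', after));  // weight host: allowed
```

The wildcard covers multi-level subdomains, which is why a single `*.hf.co` entry is enough for `us.aws.cdn.hf.co`.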
## Why

The AI chat box sat disabled with no explanation for nearly every in-browser option (Gemini Nano, WebLLM, and, expectedly, any keyless cloud tier) in both Chrome and Dia. Root cause (explored in `docs/explorations/0252`): a detection/instantiation split. The dropdown labelled a tier "available" from a shallow probe (`navigator.gpu`, `'LanguageModel' in globalThis`), but:

- WebLLM: `createWebLLMProvider` was never called by the app and `@mlc-ai/web-llm` was in no `package.json`. Picking it silently disabled the composer (the code literally called it "the webllm trap").
- Gemini Nano: the probe checked API presence, not `availability()`, so a `'downloadable'` model left the box dead.
- A `null`/throwing provider was swallowed (no `.catch`), so the box stayed disabled with no error and, for WebLLM, no hint.

## What changed

### A: honest detection (`@xnetjs/plugins`)

- New `ConnectorEnv.hasWebLLMEngine` probe; `webllm` is available only when WebGPU and a host-supplied engine are present.
- The `prompt-api` probe now reads `LanguageModel.availability()` and treats only `'available'` as ready.
- New `promptApiAvailability()` + `downloadPromptApiModel()` helpers (+ `PromptApiAvailability` / `LanguageModelMonitor` types).

### D: never a silent dead box (`apps/web`)

- A `.catch` on the build effect surfaces failures; a not-ready reason renders for every selected-but-not-ready tier; the placeholder no longer says "select a model" once one is selected.

### B: WebLLM actually runs (`apps/web`)

- `ai-webllm-engine.ts` lazily imports `@mlc-ai/web-llm` (code-split into a chunk fetched only on a "run" gesture, so there is no surprise download), wired through `resolveProvider`; `webllm` rejoins `USABLE_TIERS`; CSP `connect-src` allows the weight/library CDNs.

### C: Gemini Nano download gesture (`apps/web`)

- A gesture-driven "Download" button (`create({ monitor })`) handles the `'downloadable'` state and re-detects once the model is ready.

## Verification

- `detect.test.ts`, `providers.test.ts`, `ai-chat-connector.test.ts`, `AiChatPanel.test.tsx`: 68 pass, with new assertions for engine-gated availability, the Nano probe, and both in-tab gestures.
- All 780 tests in the touched areas pass; typecheck clean.
- `@mlc-ai/web-llm` confirmed to land in a lazy chunk (`import(...)`, fetched only on the gesture), not the eager entry.

Implements exploration 0252 (13/13 implementation items; 6/8 validation, 2 live-hardware smokes pending). Finishes the last mile of `0174` tier A.

🤖 Generated with Claude Code
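The Gemini Nano download gesture from section C can be sketched against a minimal stand-in for the Prompt API. This is not the PR's `downloadPromptApiModel()`: `downloadNanoModel` and the `*Like` interfaces are hypothetical, and the sketch assumes the `downloadprogress` event reports `loaded` as a fraction between 0 and 1.

```typescript
// Minimal stand-ins for the parts of the Prompt API the gesture touches.
interface DownloadProgressEventLike {
  loaded: number; // assumed to be a fraction in [0, 1]
}

interface MonitorLike {
  addEventListener(
    type: 'downloadprogress',
    listener: (event: DownloadProgressEventLike) => void,
  ): void;
}

interface LanguageModelLike {
  create(options: { monitor(m: MonitorLike): void }): Promise<unknown>;
}

// Intended to be called from a click handler: starts the model download and
// reports whole-number percentages while it runs.
function downloadNanoModel(
  lm: LanguageModelLike,
  onProgress: (percent: number) => void,
): Promise<unknown> {
  return lm.create({
    monitor(m: MonitorLike): void {
      m.addEventListener('downloadprogress', (event) => {
        onProgress(Math.round(event.loaded * 100));
      });
    },
  });
}
```

Once the returned promise settles, the panel would re-run detection so the tier flips from `'downloadable'` to `'available'`. Passing the model object in, rather than reading a global, keeps the helper testable with a fake.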