Make local AI models actually chat: honest connector detection + in-browser WebLLM + Nano download #349

Merged
crs48 merged 8 commits into main from claude/0252-why-the-ai-chat-box-is-disabled-local-model-conn on Jun 30, 2026

@crs48 crs48 commented Jun 30, 2026

Why

The AI chat box sat disabled with no explanation for nearly every in-browser option — Gemini Nano, WebLLM, and (expectedly) any keyless cloud tier — in both Chrome and Dia. Root cause (explored in docs/explorations/0252): a detection/instantiation split. The dropdown labelled a tier "available" from a shallow probe (navigator.gpu, 'LanguageModel' in globalThis), but:

  • WebLLM had no engine-injection path wired up — createWebLLMProvider was never called by the app and @mlc-ai/web-llm was in no package.json. Picking it silently disabled the composer (the code literally called it "the webllm trap").
  • Gemini Nano detection checked API presence, not model readiness; a 'downloadable' model left the box dead.
  • A null/throwing provider was swallowed (no .catch), so the box stayed disabled with no error and, for WebLLM, no hint.

What changed

A — honest detection (@xnetjs/plugins)

  • New ConnectorEnv.hasWebLLMEngine probe; webllm is available only when WebGPU and a host-supplied engine are present.
  • prompt-api probe now reads LanguageModel.availability() and treats only 'available' as ready.
  • New promptApiAvailability() + downloadPromptApiModel() helpers (+ PromptApiAvailability / LanguageModelMonitor types).
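
The readiness rule in section A can be sketched roughly as below. This is a minimal illustration, not the code in @xnetjs/plugins: `ConnectorEnvSketch`, `probePromptApiSketch`, `webllmReady`, and `promptApiReady` are hypothetical names, and the real `ConnectorEnv` and `promptApiAvailability()` shapes may differ. It assumes `LanguageModel.availability()` resolves to one of 'unavailable', 'downloadable', 'downloading', or 'available'.

```typescript
// Hypothetical sketch: names and shapes are assumptions, not the real plugin API.
type Availability = 'unavailable' | 'downloadable' | 'downloading' | 'available';

const KNOWN_STATES: readonly Availability[] = [
  'unavailable',
  'downloadable',
  'downloading',
  'available',
];

// Minimal view of the Prompt API global that this sketch relies on.
interface LanguageModelLike {
  availability(): Promise<string>;
}

interface ConnectorEnvSketch {
  hasWebGPU: boolean;
  hasWebLLMEngine: boolean;
  promptApi: Availability;
}

// Reads model readiness instead of mere API presence.
async function probePromptApiSketch(
  scope: { LanguageModel?: LanguageModelLike },
): Promise<Availability> {
  const lm = scope.LanguageModel;
  if (!lm || typeof lm.availability !== 'function') return 'unavailable';
  try {
    const state = await lm.availability();
    const known = KNOWN_STATES.find((s) => s === state);
    return known !== undefined ? known : 'unavailable';
  } catch {
    // A throwing probe is treated as "not ready" rather than as a crash.
    return 'unavailable';
  }
}

// webllm needs WebGPU AND a host-supplied engine.
function webllmReady(env: ConnectorEnvSketch): boolean {
  return env.hasWebGPU && env.hasWebLLMEngine;
}

// prompt-api is ready only when the model is fully downloaded.
function promptApiReady(env: ConnectorEnvSketch): boolean {
  return env.promptApi === 'available';
}
```

Keeping the async probe separate from the pure readiness checks lets the checks be unit-tested without a browser, which matches the kind of assertions listed under Verification.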

D — never a silent dead box (apps/web)

  • .catch on the build effect surfaces failures; a not-ready reason renders for every selected-but-not-ready tier; the placeholder no longer says "select a model" once one is selected.
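
One way to picture the "never a silent dead box" rule is a build step that always resolves to either a provider or a visible reason. The sketch below is an assumption-laden illustration, not the panel's actual effect: `ComposerState`, `resolveComposerState`, `composerPlaceholder`, and all placeholder strings are hypothetical.

```typescript
// Hypothetical sketch: not the AiChatPanel implementation.
type ComposerState<P> =
  | { ready: true; provider: P }
  | { ready: false; reason: string };

// Every outcome of the build, including null and a thrown error,
// ends in a state the UI can render.
async function resolveComposerState<P>(
  build: () => Promise<P | null>,
  notReadyReason: string,
): Promise<ComposerState<P>> {
  try {
    const provider = await build();
    if (provider === null) return { ready: false, reason: notReadyReason };
    return { ready: true, provider };
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    return { ready: false, reason: message };
  }
}

// The placeholder distinguishes "nothing selected" from "selected but not ready".
function composerPlaceholder(
  selectedTier: string | null,
  state: ComposerState<unknown> | 'building',
): string {
  if (selectedTier === null) return 'Select a model above';
  if (state === 'building') return 'Preparing the model...';
  return state.ready ? 'Message the model' : 'Configure the model above';
}
```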

B — WebLLM actually runs (apps/web)

  • New ai-webllm-engine.ts lazily imports @mlc-ai/web-llm (code-splits into a chunk fetched only on a "run" gesture — no surprise download), wired through resolveProvider; webllm rejoins USABLE_TIERS; CSP connect-src allows the weight/library CDNs.
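
The lazy-import pattern described in section B might look something like the sketch below. It is not the contents of ai-webllm-engine.ts: `createLazyEngine`, `WebLLMModuleLike`, and `InitProgress` are hypothetical, and the importer is injected so the sketch runs without the package installed. In the app the importer would presumably be `() => import('@mlc-ai/web-llm')`, whose `CreateMLCEngine` export accepts an `initProgressCallback`.

```typescript
// Hypothetical sketch: the real ai-webllm-engine.ts may be structured differently.
interface InitProgress {
  progress: number; // fraction between 0 and 1
  text: string;
}

// The subset of the @mlc-ai/web-llm module that this sketch uses.
interface WebLLMModuleLike {
  CreateMLCEngine(
    modelId: string,
    config?: { initProgressCallback?: (p: InitProgress) => void },
  ): Promise<unknown>;
}

// Returns a "run" function. Nothing is imported or downloaded until it is called.
function createLazyEngine(
  importer: () => Promise<WebLLMModuleLike>,
  modelId: string,
): (onProgress: (fraction: number) => void) => Promise<unknown> {
  let pending: Promise<unknown> | null = null;
  return function run(onProgress) {
    if (pending === null) {
      pending = importer()
        .then((mod) =>
          mod.CreateMLCEngine(modelId, {
            initProgressCallback: (p) => onProgress(p.progress),
          }),
        )
        .catch((err: unknown) => {
          // Allow a retry on the next gesture after a failed load.
          pending = null;
          throw err;
        });
    }
    // Later callers share the first load; only the first onProgress is wired up.
    return pending;
  };
}
```

Calling `run` only from the click handler of the "run" button keeps the chunk and the weights off the network until the user asks for them.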

C — Gemini Nano download gesture (apps/web)

  • A "Download Gemini Nano" button triggers the Chrome download (from a user gesture, with progress), then re-detects.
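
A rough sketch of the download gesture from section C follows. It assumes the Prompt API shape in which `LanguageModel.create({ monitor })` hands the callback an event target that emits `downloadprogress` events with `loaded` as a fraction between 0 and 1. `downloadThenRedetect` and the `...Like` interfaces are hypothetical and are not the PR's `downloadPromptApiModel()` helper; the model object is injected so the sketch can run outside Chrome.

```typescript
// Hypothetical sketch: not the PR's downloadPromptApiModel() helper.
interface DownloadMonitorLike {
  addEventListener(
    type: 'downloadprogress',
    listener: (event: { loaded: number }) => void,
  ): void;
}

interface LanguageModelFactoryLike {
  availability(): Promise<string>;
  create(options: {
    monitor(m: DownloadMonitorLike): void;
  }): Promise<{ destroy?(): void }>;
}

// Intended to be called from a click handler, since the download is
// meant to start from a user gesture.
async function downloadThenRedetect(
  lm: LanguageModelFactoryLike,
  onProgress: (fraction: number) => void,
): Promise<string> {
  const session = await lm.create({
    monitor(m) {
      m.addEventListener('downloadprogress', (event) => onProgress(event.loaded));
    },
  });
  // The session only existed to trigger the download; release it if possible.
  if (session.destroy) session.destroy();
  // Re-detect so the UI reflects the new state instead of assuming success.
  return lm.availability();
}
```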

Verification

  • detect.test.ts, providers.test.ts, ai-chat-connector.test.ts, AiChatPanel.test.tsx: 68 pass, with new assertions for engine-gated availability, the Nano probe, and both in-tab gestures. All 780 tests in the touched areas pass; typecheck is clean.
  • Production build succeeds; @mlc-ai/web-llm confirmed to land in a lazy chunk (import(...), fetched only on the gesture), not the eager entry.
  • Not exercised in CI (left unchecked in the doc): the live multi-GB WebGPU download+stream and the Dia browser — both need real hardware. Wiring, bundling, code-split, gesture/progress UI, and detection logic are verified.

Implements exploration 0252 (13/13 implementation items; 6/8 validation, 2 live-hardware smokes pending). Finishes the last mile of 0174 tier A.

🤖 Generated with Claude Code

xNet Test and others added 8 commits June 30, 2026 07:33
…el connector gaps)

Traces the disabled composer to a detection/instantiation split: the dropdown labels WebLLM and Gemini Nano 'available' from shallow probes (navigator.gpu / 'LanguageModel' in globalThis), but neither can be instantiated in a plain browser — WebLLM has no engine-injection path wired up (createWebLLMProvider is never called by the app, @mlc-ai/web-llm is in no package.json) and Nano's probe checks API presence, not availability()/download readiness. provider===null silently leaves ready=false → box disabled, and for WebLLM no setupHint renders.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: xNet Test <test@xnet.dev>
WebLLM was reported available whenever navigator.gpu existed, and Gemini Nano whenever the LanguageModel global existed — but neither can run without, respectively, a host-supplied engine and a fully-downloaded model. Picking such a tier silently disabled the composer. Detection now gates webllm on a new ConnectorEnv.hasWebLLMEngine probe (AND WebGPU) and prompt-api on availability()==='available'. Adds promptApiAvailability + downloadPromptApiModel helpers for the panel's download gesture.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: xNet Test <test@xnet.dev>
The composer's disabled box used to be silent: a failed/null provider was swallowed, and the placeholder said 'Select and configure a model above' even when a model was already selected. Now the build effect .catch-es failures into the visible error line, a not-ready reason renders for every selected-but-not-ready tier, and the placeholder distinguishes 'no model selected' from 'preparing' / 'configure the model above'.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: xNet Test <test@xnet.dev>
Finishes 0174 tier A: a host-supplied @mlc-ai/web-llm engine (lazy-imported in ai-webllm-engine.ts) is wired through resolveProvider and advertised via hasWebLLMEngine, so the 'In-browser model' tier downloads a small model on an explicit 'run' gesture (no surprise download), shows progress, then chats fully on-device. webllm rejoins USABLE_TIERS. Gemini Nano gains a gesture-driven 'Download' button (create({ monitor })) for the 'downloadable' state, re-detecting once ready. CSP connect-src now allows the WebLLM weight/library CDNs.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: xNet Test <test@xnet.dev>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: xNet Test <test@xnet.dev>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: xNet Test <test@xnet.dev>
Two validation items are live-hardware smokes (WebGPU download+stream; Dia browser) left unchecked with notes on what was verified in CI.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: xNet Test <test@xnet.dev>
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Signed-off-by: xNet Test <test@xnet.dev>
@crs48 crs48 temporarily deployed to pr-349 June 30, 2026 15:06 — with GitHub Actions Inactive
@github-actions

🖼️ UI changes in this PR

No visual differences detected in the changed UI.

CI run

github-actions Bot added a commit that referenced this pull request Jun 30, 2026
github-actions Bot commented Jun 30, 2026

Preview removed for PR #349.

github-actions Bot added a commit that referenced this pull request Jun 30, 2026
@crs48 crs48 merged commit 5976ab3 into main Jun 30, 2026
16 of 17 checks passed
@crs48 crs48 deleted the claude/0252-why-the-ai-chat-box-is-disabled-local-model-conn branch June 30, 2026 15:17
github-actions Bot added a commit that referenced this pull request Jun 30, 2026
crs48 added a commit that referenced this pull request Jun 30, 2026
## What & why

Follow-up to [#349](#349) (exploration
0252). A **live browser smoke** of the new in-browser WebLLM tier caught
a real bug the headless CI couldn't: model **weights are served from
`us.aws.cdn.hf.co`** (Hugging Face's Xet CDN, a `*.hf.co` host), but
#349's CSP only allowed `huggingface.co` / `*.huggingface.co`. So the
model *config* fetch succeeded while the *weight* download was
`connect-src`-blocked — and the failure surfaced as a (mis-worded)
cloud-key "CORS block" error.
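
The host mismatch can be illustrated with a simplified model of CSP host-source matching. This sketch ignores schemes, ports, and paths and is not a full implementation of the CSP algorithm; `hostSourceMatches` and `connectAllowed` are hypothetical helpers, and the two lists model only the Hugging Face entries mentioned above.

```typescript
// Simplified sketch of CSP host-source matching (hosts only).
function hostSourceMatches(pattern: string, host: string): boolean {
  const p = pattern.toLowerCase();
  const h = host.toLowerCase();
  if (p.startsWith('*.')) {
    // '*.hf.co' matches any subdomain of hf.co, but not hf.co itself.
    const suffix = p.slice(1); // '.hf.co'
    return h.length > suffix.length && h.endsWith(suffix);
  }
  return p === h;
}

function connectAllowed(patterns: readonly string[], host: string): boolean {
  return patterns.some((pattern) => hostSourceMatches(pattern, host));
}

// Entries as described for #349, then with the follow-up fix applied.
const allowedBefore = ['huggingface.co', '*.huggingface.co'];
const allowedAfter = [...allowedBefore, '*.hf.co'];

const configHostAllowed = connectAllowed(allowedBefore, 'huggingface.co');
const weightsAllowedBefore = connectAllowed(allowedBefore, 'us.aws.cdn.hf.co');
const weightsAllowedAfter = connectAllowed(allowedAfter, 'us.aws.cdn.hf.co');
```

Under this model the config host passes both lists, while the weight host passes only once `*.hf.co` is present, which matches the observed "config succeeds, weights blocked" behaviour.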

## Changes

- **CSP** (`apps/web/index.html`): add `https://*.hf.co` to
`connect-src` so the HF Xet/LFS CDN that serves model weights is
reachable.
- **Error message** (`ai-webllm-engine.ts`): wrap WebLLM init failures
in a WebLLM-specific message, so a future failure reads sensibly instead
of being rewritten as the generic cloud-key/Ollama CORS hint by
`errorMessage`.
- Checks off the previously-pending live-WebGPU validation item in 0252.
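
The error-message change could be approximated as below. This is a guess at the shape rather than the code in `ai-webllm-engine.ts`: `withWebLLMErrorContext`, the error name, and the message wording are all hypothetical.

```typescript
// Hypothetical sketch: wraps init failures so they read as WebLLM failures
// before any generic error formatter sees them.
async function withWebLLMErrorContext<T>(init: () => Promise<T>): Promise<T> {
  try {
    return await init();
  } catch (err) {
    const detail = err instanceof Error ? err.message : String(err);
    const wrapped = new Error('In-browser model (WebLLM) failed to start: ' + detail);
    wrapped.name = 'WebLLMInitError';
    throw wrapped;
  }
}
```

Because the wrapped message names WebLLM explicitly, a blocked weight download would no longer surface as a cloud-key or Ollama CORS hint.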

## Verification (live, in a WebGPU browser)

Drove the real app (test-bypass identity → Companion → select
"In-browser model" → "Run"):
- ✅ Weights streamed from `us.aws.cdn.hf.co` with the progress bar
climbing 0 → 100% — **no CSP violation** (confirmed via
`securitypolicyviolation` listener; before the fix it fired
`connect-src` on `*.hf.co`).
- ✅ The composer **enabled**, and the in-tab model replied to two
prompts (e.g. *"A local-first app … stores its data locally on the
device."*) — inference running fully on-device.
- ✅ `apps/web` typecheck clean; `AiChatPanel` tests pass.

🤖 Generated with [Claude Code](https://claude.com/claude-code)