v0.19.0
Release Notes — v0.19.0 (April 18, 2026)
✦ AI Streaming UX + Model Selector
Summary
This release fixes the core AI streaming experience and gives users control over which model answers their queries.
Responses now appear token-by-token in real time, a thinking indicator covers the context-scan wait, and a compact model selector in the modal header lets users pick Claude, GPT, Gemini, or any other Copilot-registered model on the fly.
Added
- AI model selector (inline in modal header)
  A compact <select> dropdown appears in the top-right of the AI modal (before the close button) when any Copilot language models are registered in VS Code.
  - Defaults to Auto (respects the workspai.preferredModel workspace setting)
  - Premium Copilot users see their full model roster; free users see their available models
  - Selection is remembered per session and sent with every query
  - Disabled (grayed out) while a response is streaming
- Thinking indicator
  Animated three-dot bouncing indicator shown between query submission and the arrival of the first token.
  Replaces the empty modal body and blinking cursor that made users think the extension had frozen during the context-scan phase.
- MarkdownRenderer component (webview-ui/src/components/MarkdownRenderer.tsx)
  Zero-dependency markdown renderer for AI responses supporting:
  - Headings (H1–H3), bold, italic, bold-italic
  - Inline code and fenced code blocks with language label
  - Ordered and unordered lists
  - Horizontal rules, paragraphs
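
To illustrate the block-level pass that a zero-dependency renderer like this needs, here is a minimal sketch of a line-based block parser. It is not the actual MarkdownRenderer code: the Block type, the regular expressions, and this parseBlocks signature are assumptions made for the example, and inline formatting (bold, italic, inline code) is left out.

```typescript
// Hypothetical block model; the real MarkdownRenderer types are not shown in these notes.
type Block =
  | { kind: "heading"; level: number; text: string }
  | { kind: "code"; lang: string; text: string }
  | { kind: "list"; ordered: boolean; items: string[] }
  | { kind: "rule" }
  | { kind: "paragraph"; text: string };

const FENCE = "`".repeat(3); // built at runtime so this sample can sit inside a fenced block
const HEADING = /^(#{1,3})\s+(.*)$/;
const RULE = /^(-{3,}|\*{3,}|_{3,})\s*$/;
const LIST_ITEM = /^(?:[-*+]|(\d+)\.)\s+(.*)$/;

function startsBlock(line: string): boolean {
  return line.startsWith(FENCE) || HEADING.test(line) || RULE.test(line) || LIST_ITEM.test(line);
}

function parseBlocks(source: string): Block[] {
  const lines = source.split("\n");
  const blocks: Block[] = [];
  let i = 0;
  while (i < lines.length) {
    const line = lines[i];
    const heading = HEADING.exec(line);
    const item = LIST_ITEM.exec(line);
    if (line.startsWith(FENCE)) {
      // Fenced code: the text after the opening fence is the language label.
      const body: string[] = [];
      i++;
      while (i < lines.length && !lines[i].startsWith(FENCE)) body.push(lines[i++]);
      i++; // skip the closing fence (or run past the end if it is missing)
      blocks.push({ kind: "code", lang: line.slice(FENCE.length).trim(), text: body.join("\n") });
    } else if (heading) {
      blocks.push({ kind: "heading", level: heading[1].length, text: heading[2] });
      i++;
    } else if (RULE.test(line)) {
      blocks.push({ kind: "rule" });
      i++;
    } else if (item) {
      // Group consecutive items of the same kind (ordered vs unordered) into one list.
      const ordered = item[1] !== undefined;
      const items: string[] = [];
      let m: RegExpExecArray | null = item;
      while (m && (m[1] !== undefined) === ordered) {
        items.push(m[2]);
        i++;
        m = i < lines.length ? LIST_ITEM.exec(lines[i]) : null;
      }
      blocks.push({ kind: "list", ordered, items });
    } else if (line.trim() === "") {
      i++; // blank lines only separate blocks
    } else {
      // Paragraph: join consecutive plain lines until a blank line or another block starts.
      const para = [line];
      i++;
      while (i < lines.length && lines[i].trim() !== "" && !startsBlock(lines[i])) para.push(lines[i++]);
      blocks.push({ kind: "paragraph", text: para.join(" ") });
    }
  }
  return blocks;
}
```

With this sketch, parseBlocks("# Title\n\n- a\n- b") yields a heading block followed by one unordered list block with two items. Each line is visited once, so a single pass is linear in the length of the response.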
Fixed
- Real-time streaming: "all at once" delivery resolved
  The extension host previously called postMessage hundreds of times per second, often many times within the same event-loop tick. VS Code's IPC layer batched all of these into a single delivery, so the webview received nothing until the stream finished, and then the full text appeared at once.
  Fixed by introducing a 50 ms setInterval flush loop in welcomePanel.ts: tokens accumulate in a string buffer and are delivered as one message per interval, giving the webview ~20 smooth updates per second.
- Streaming render overhead eliminated
  MarkdownRenderer previously ran parseBlocks() (an O(n) operation) on every requestAnimationFrame during streaming, creating an O(n²) bottleneck that worsened as the response grew.
  Fixed by rendering the raw content prop directly during streaming (no parse, no internal state) and performing a single parseBlocks pass after isStreaming becomes false.
- Menu item console warning
  workbench.actions.treeView.rapidkitProjects.collapseAll was referenced in contributes.menus but not in contributes.commands, producing a VS Code activation warning.
  Fixed by removing the manual menu entry and enabling "showCollapseAll": true on the rapidkitProjects view definition. VS Code renders the collapse button natively with no registration required.
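
The 50 ms flush loop described under the real-time streaming fix can be sketched roughly as follows. This is an illustration only, not the code in welcomePanel.ts: the TokenFlushBuffer class, its method names, and the injected post callback (standing in for a wrapper around the webview's postMessage) are all hypothetical.

```typescript
// Hypothetical helper; the actual code in welcomePanel.ts is not shown in these notes.
class TokenFlushBuffer {
  private buffer = "";
  private timer: ReturnType<typeof setInterval> | undefined;
  private readonly post: (chunk: string) => void;
  private readonly intervalMs: number;

  // `post` is assumed to forward one chunk to the webview, e.g. via postMessage.
  constructor(post: (chunk: string) => void, intervalMs = 50) {
    this.post = post;
    this.intervalMs = intervalMs;
  }

  /** Start the periodic flush loop; calling it twice has no extra effect. */
  start(): void {
    if (this.timer === undefined) {
      this.timer = setInterval(() => this.flush(), this.intervalMs);
    }
  }

  /** Called for every token from the model; only appends, never posts. */
  push(token: string): void {
    this.buffer += token;
  }

  /** Deliver everything accumulated so far as a single message, if any. */
  flush(): void {
    if (this.buffer.length === 0) return;
    const chunk = this.buffer;
    this.buffer = "";
    this.post(chunk);
  }

  /** Stop the loop and deliver whatever is still buffered. */
  stop(): void {
    if (this.timer !== undefined) {
      clearInterval(this.timer);
      this.timer = undefined;
    }
    this.flush();
  }
}
```

In this sketch the stream handler would call start() before reading tokens, push() for each token, and stop() when the stream ends, so the final partial chunk is not lost. At a 50 ms interval the webview receives at most 1000 / 50 = 20 messages per second regardless of how fast tokens arrive.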
Changed
- Quick-prompt suggestion chips are now hidden while the modal is in a streaming or thinking state (previously they were shown as disabled, which added visual noise)
- listAvailableModels() added to aiService.ts as a public API, used by the model selector to enumerate registered LM chat models
- streamAIResponse() accepts an optional preferredModelId parameter; when provided, the exact model is used instead of the workspace preference
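
The precedence between an explicit preferredModelId and the workspace preference could look roughly like the sketch below. The ModelInfo shape, the resolveModel helper, the "auto" sentinel, and the fallback to the first registered model are assumptions for illustration; the notes do not specify how aiService.ts handles an unknown id or what format workspai.preferredModel uses.

```typescript
// Minimal shape of a registered chat model; field names are assumptions for this sketch.
interface ModelInfo {
  id: string;
  name: string;
}

const AUTO = "auto"; // hypothetical value sent when the selector is left on Auto

function resolveModel(
  models: ModelInfo[],
  preferredModelId?: string,
  workspacePreference?: string,
): ModelInfo | undefined {
  // 1. An explicit selection wins when that exact model is registered.
  if (preferredModelId && preferredModelId !== AUTO) {
    const exact = models.find((m) => m.id === preferredModelId);
    if (exact) return exact;
  }
  // 2. Otherwise use the workspace preference, assumed here to hold a model id.
  if (workspacePreference) {
    const preferred = models.find((m) => m.id === workspacePreference);
    if (preferred) return preferred;
  }
  // 3. Assumed fallback: the first registered model, or undefined if none exist.
  return models[0];
}
```

Keeping the resolution in one pure function makes the Auto case and the explicit case easy to test without a running extension host.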