GitHub - derekshreds/LunaCode: An agentic coding assistant for VS Code, powered by OpenRouter. Set up with just an API key and a model id. Features Standard/Auto/Plan modes, an agentic tool loop (read/edit/grep/run), session history, live cost & cache-hit analytics, and prompt-cache optimization for cheap, fast long sessions.

Luna Code

AI Coding Agent for VS Code

Luna Code is an agentic coding assistant that runs entirely on OpenRouter. Point it at any model OpenRouter supports with just an API key and a model id — no other configuration required.

It's built for fast codebase navigation, efficient agentic sessions, and prompt-cache hit optimization so long sessions stay cheap and fast.

Features

One-line setup. Paste an OpenRouter API key, pick a model, go.
Three modes:
- Standard — approve each file edit and shell command before it runs.
- Auto — runs edits and commands autonomously without prompting (commands in lunacode.alwaysDenyCommands are still hard-blocked).
- Plan — read-only research + planning. The agent investigates the code and proposes a concrete plan, but can't edit files or run mutating commands.
Session history. Every conversation is saved per-workspace. Click the history button in the header to browse, reload, or delete prior sessions.
Usage & cost analytics. A live meter shows session-total cost plus the last turn's tokens and cache-hit rate. The usage window (bar-chart button) reports spend and tokens over the last 30 / 60 / 90 days, with a daily cost chart, daily token chart, and a per-model cost/usage breakdown.
Refined "thinking" UX. While the model reasons, an animated Thinking… indicator sits just above the composer; when it finishes it collapses to a quiet "Thought for Ns" marker — no noisy, expandable reasoning blocks.
Agentic tool loop. Reads files, lists/globs/greps the workspace, runs builds & tests, checks language-server diagnostics, and edits code — looping until the task is done. Independent read-only lookups batched into one response run in parallel, and after every edit the file's language-server errors are auto-appended to the tool result so the model fixes breakage without an extra round-trip.
Explore sub-agent. An explore tool delegates open-ended research ("how does auth work here?") to a disposable sub-agent with its own context (lunacode.subagentModel, cheap model recommended). Only the digest returns to the main conversation, keeping it small and cache-friendly.
Surgical reads. A file_outline tool (language-server symbols with line ranges) plus read_file offset/limit paging — the agent pulls 40 relevant lines instead of whole files.
Project memory. A LUNA.md at the workspace root is loaded into the system prompt every session; the agent is instructed to record durable conventions and gotchas there.
Turn checkpoints. Files changed by an agent turn are snapshotted; an ↩ revert chip in the meter restores the last turn's edits (stack of 10).
MCP servers. Connect stdio Model Context Protocol servers via lunacode.mcpServers (or the settings GUI); their tools appear to the agent as mcp__<server>__<tool> with approval gating in Standard mode.
Cache-warmth dot. The meter shows whether the provider prompt cache is still warm (~5 min TTL) — a cold cache means the next message re-writes it at full input price. Optional pre-warm (lunacode.prewarmCache) writes the cache when a session opens so the first message starts warm.
Mid-turn steering. Messages sent while the agent is working are injected into the running task at the next step — no waiting for the turn to end.
Live task checklist. For multi-step work the agent maintains a visible plan (set_tasks) rendered above the composer with per-step progress.
Eager tool execution. Read-only tool calls start running while the model is still streaming the rest of its response, overlapping generation with I/O.
Resilient streaming. Automatic retry with backoff on 429/5xx before any tokens arrive, a hung-stream watchdog, and optional fallback models (lunacode.fallbackModels) via OpenRouter routing — a status line notes when a fallback served the response.
@-file mentions. Type @ in the composer for a fuzzy file picker; the chosen path is inserted so the agent reads exactly the file you meant.
Turn review & commit. The ± chip shows side-by-side diffs of the last turn's edits with one-click git commit (message generated by the cheap summarizer model); ↻/✎ chips retry or edit-and-resend your last message.
Context inspector. Click the session cost in the meter to see exactly what's in the context window: totals vs budget, system-prompt size, the largest items, and the estimated cost of the next (cached) call.
Multi-file patches. An apply_patch tool edits many files in one model round-trip instead of one call per file.
Background processes. start_process / read_process / stop_process let the agent run a dev server, probe it, read the logs, and iterate.
Session budget guardrail. lunacode.sessionBudgetUsd pauses the agent (even in Auto mode) and asks before spending past your limit.
Editor-aware. Each message can carry your active file + selection (lunacode.includeActiveFile). The right-click menu adds Fix Problems in This File, Refactor Selection…, and Explain Selection, and every diagnostic's lightbulb offers Fix with Luna Code. Multi-root workspaces pick their working folder via Select Working Folder.
Slash commands. /commit, /review, /tests built in, plus your own templates via lunacode.customCommands — with autocomplete in the composer.
Image paste. Paste up to 3 screenshots into the composer (for multimodal models via OpenRouter).
Worktree sandbox. lunacode.worktreeMode runs the agent in a separate git worktree; merge or discard its changes via the command palette.
Format after edit. Optionally run the workspace formatter on every file the agent touches (lunacode.formatAfterEdit).
Calm, readable streaming. Scrolling up pauses auto-follow; long code blocks are height-capped with click-to-expand; a live ~N tok counter shows progress during long silent generations; and an actions menu (⋯) gathers review/revert/retry/edit/export with plain-text labels.
Live tool output. Commands stream their stdout into the tool card as they run (last few lines, click for the full log), background processes show their startup output, and the explore sub-agent's lookups stream into its card so its research is visible.
Monorepo memory. Nested LUNA.md files in subdirectories load alongside the root one, each labeled with its path.
Cache-hit optimized. A stable system-prompt prefix plus rolling cache_control breakpoints maximize provider prompt caching (Anthropic / Gemini via OpenRouter; automatic for OpenAI). The composer shows a live cache hit % and token/cost meter.
Context management. Cache-aware compaction: history stays append-only (so prompt-cache hits keep landing) until a price-aware budget is hit — sized from the model's context window and its input price, so a fully cached turn stays under a target cost (lunacode.autoBudgetCarryCostUsd). A compaction event then supersedes stale duplicate file reads and replaces the oldest turns with a structured checkpoint summary written by a cheap summarizer model (lunacode.summarizerModel), driving the context down to a floor (lunacode.compactionTargetRatio) so events stay rare.
Settings GUI. A gear button in the panel header opens an in-chat settings sheet — models, context/cost budgets, generation, privacy routing, and command allow/deny lists — with instant apply and two-way sync with VS Code's settings editor.
Modern UI. A clean neutral-dark interface with purple accents, streaming responses, collapsible reasoning, tool cards, and inline diff approvals.
Open anywhere. Use Luna Code in the Activity Bar sidebar or pop it out into an editor tab (button in the panel title bar). To dock it on the right like Claude Code, drag the Luna Code icon into the Secondary Side Bar — VS Code remembers the placement. All surfaces share one live session.
Private by default. Unless you change lunacode.dataCollection, every request sets OpenRouter's provider.data_collection to "deny", so traffic is routed only to providers that do not store or train on your prompts. An optional, stricter Zero-Data-Retention (ZDR) mode is available via lunacode.zeroDataRetention.
Secure. Your API key is stored in VS Code's encrypted SecretStorage, never in settings or files.
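The price-aware budget mentioned under Context management comes down to a little arithmetic. The sketch below is only an illustration, not Luna Code's implementation: the function name, the `cacheReadDiscount` field, and the `headroom` fraction are assumptions. It picks the largest context size whose fully cached re-read would cost no more than a target amount, capped by a share of the model's context window.

```typescript
// Hypothetical sketch: none of these names come from Luna Code's source.
interface ModelPricing {
  contextWindow: number;     // model context window, in tokens
  inputUsdPerMTok: number;   // uncached input price, USD per million tokens
  cacheReadDiscount: number; // ASSUMED fraction of the input price paid on a cache hit
}

/**
 * Largest token budget such that re-reading the whole context from cache
 * costs at most `carryCostUsd`, never exceeding `headroom` of the window.
 */
function priceAwareBudget(m: ModelPricing, carryCostUsd: number, headroom = 0.9): number {
  const cachedUsdPerToken = (m.inputUsdPerMTok / 1e6) * m.cacheReadDiscount;
  const windowCap = Math.floor(m.contextWindow * headroom);
  if (cachedUsdPerToken <= 0) return windowCap; // free model: only the window limits the budget
  const costCap = Math.floor(carryCostUsd / cachedUsdPerToken);
  return Math.min(windowCap, costCap);
}

// Illustrative prices only. A cheap model ends up window-limited,
// an expensive one is limited by the carry-cost target instead.
const cheapBudget = priceAwareBudget(
  { contextWindow: 200000, inputUsdPerMTok: 0.5, cacheReadDiscount: 0.1 }, 0.02);
const priceyBudget = priceAwareBudget(
  { contextWindow: 200000, inputUsdPerMTok: 15, cacheReadDiscount: 0.1 }, 0.02);
```

Under these assumptions an expensive model compacts much earlier than a cheap one, which matches the stated goal of keeping a fully cached turn under a target cost.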

Getting started

Build the extension:
```
npm install
npm run compile
```
Press F5 in VS Code to launch the Extension Development Host.
Click the Luna Code icon in the Activity Bar.
Click Set OpenRouter API Key and paste your key (sk-or-v1-…).
Click the model chip in the header to pick a model (or browse all OpenRouter models).
Type a request and hit Enter.

Keyboard shortcuts

| Shortcut | Action |
| --- | --- |
| `Ctrl/Cmd + Shift + L` | Focus the Luna Code chat |
| `Ctrl/Cmd + Shift + K` | Add the current editor selection to chat |

Configuration

All settings live under the lunacode.* namespace (Settings → Extensions → Luna Code):

| Setting | Default | Description |
| --- | --- | --- |
| `lunacode.model` | `deepseek/deepseek-v4-flash` | OpenRouter model id (use the picker / Browse all for current ids). |
| `lunacode.baseUrl` | `https://openrouter.ai/api/v1` | API base URL (override for proxies). |
| `lunacode.defaultMode` | `standard` | `standard` \| `auto` \| `plan`. |
| `lunacode.maxTokens` | `0` | Max tokens per turn. `0` = use the model's full output limit (avoids truncating large `write_file` calls). |
| `lunacode.temperature` | `0` | Sampling temperature. |
| `lunacode.enablePromptCaching` | `true` | Insert `cache_control` breakpoints. |
| `lunacode.dataCollection` | `deny` | `deny` routes only to providers that don't store/train on prompts; `allow` permits all. |
| `lunacode.zeroDataRetention` | `false` | Stricter: only route to Zero-Data-Retention endpoints. |
| `lunacode.maxContextTokens` | `180000` | Budget before older context is compacted. |
| `lunacode.autoApproveCommands` | common read-only cmds | Auto-approved even in Standard mode. |
| `lunacode.alwaysDenyCommands` | destructive cmds | Always blocked, any mode. |
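As an illustration of how some of these settings might be consumed, the sketch below mirrors several defaults from the table in a plain object and shows the `0` convention for `lunacode.maxTokens`. The `LunaSettings` interface, the `withDefaults` and `resolveMaxTokens` helpers, and the clamping to the model limit are hypothetical. Inside an extension the values would normally come from VS Code's configuration API rather than a literal.

```typescript
// Hypothetical mirror of a few rows of the settings table; not the extension's real types.
type Mode = "standard" | "auto" | "plan";

interface LunaSettings {
  model: string;
  baseUrl: string;
  defaultMode: Mode;
  maxTokens: number;
  temperature: number;
  enablePromptCaching: boolean;
  dataCollection: "deny" | "allow";
  zeroDataRetention: boolean;
  maxContextTokens: number;
}

// Default values copied from the table above.
const DEFAULTS: LunaSettings = {
  model: "deepseek/deepseek-v4-flash",
  baseUrl: "https://openrouter.ai/api/v1",
  defaultMode: "standard",
  maxTokens: 0,
  temperature: 0,
  enablePromptCaching: true,
  dataCollection: "deny",
  zeroDataRetention: false,
  maxContextTokens: 180000,
};

// Overlay user overrides on the defaults without mutating either object.
function withDefaults(overrides: Partial<LunaSettings>): LunaSettings {
  return { ...DEFAULTS, ...overrides };
}

// 0 means "use the model's full output limit". Clamping a positive value
// to that limit is an assumption of this sketch.
function resolveMaxTokens(setting: number, modelOutputLimit: number): number {
  return setting > 0 ? Math.min(setting, modelOutputLimit) : modelOutputLimit;
}

const planSettings = withDefaults({ defaultMode: "plan" });
const effectiveMax = resolveMaxTokens(planSettings.maxTokens, 8192); // 8192 is a made-up limit
```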

Tools the agent can use

| Tool | Mutating | Purpose |
| --- | --- | --- |
| `read_file` | no | Read a file (with paging). |
| `list_dir` | no | List a directory. |
| `glob` | no | Find files by glob pattern. |
| `grep` | no | Regex search across the workspace. |
| `get_diagnostics` | no | Read language-server errors/warnings. |
| `write_file` | yes | Create/overwrite a file. |
| `edit_file` | yes | Exact-string targeted edit. |
| `run_command` | yes | Run a shell command (PowerShell on Windows, sh elsewhere). |

Mutating tools are hidden entirely in Plan mode and gated by approval in Standard mode.
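The gating rules for the three modes can be sketched as a small decision function. This is a minimal illustration under stated assumptions, not the extension's code: the type names, the `decide` function, and the prefix-based command matching are all hypothetical, and the two lists stand in for lunacode.autoApproveCommands and lunacode.alwaysDenyCommands.

```typescript
// Hypothetical sketch of the gating rules; names and matching logic are assumptions.
type Mode = "standard" | "auto" | "plan";
type Decision = "run" | "ask" | "blocked" | "hidden";

interface ToolCall {
  tool: string;
  mutating: boolean;
  command?: string; // only set for run_command
}

interface CommandLists {
  autoApprove: string[]; // stands in for lunacode.autoApproveCommands
  alwaysDeny: string[];  // stands in for lunacode.alwaysDenyCommands
}

// ASSUMED matching rule: a command matches when it equals an entry
// or starts with "<entry> ".
function matches(command: string, entries: string[]): boolean {
  const c = command.trim();
  return entries.some((e) => c === e || c.startsWith(e + " "));
}

function decide(call: ToolCall, mode: Mode, lists: CommandLists): Decision {
  if (!call.mutating) return "run";     // read-only tools always run
  if (mode === "plan") return "hidden"; // mutating tools are not offered in Plan mode
  if (call.command !== undefined) {
    if (matches(call.command, lists.alwaysDeny)) return "blocked"; // hard block, any mode
    if (matches(call.command, lists.autoApprove)) return "run";    // pre-approved command
  }
  return mode === "auto" ? "run" : "ask"; // Standard asks, Auto proceeds
}

const lists: CommandLists = { autoApprove: ["git status"], alwaysDeny: ["rm -rf"] };
const denied = decide({ tool: "run_command", mutating: true, command: "rm -rf /tmp/x" }, "auto", lists);
const asked = decide({ tool: "edit_file", mutating: true }, "standard", lists);
```

Checking the deny list before the mode means a denied command stays blocked even in Auto mode, as the settings table describes.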

How cache optimization works

OpenRouter forwards cache_control breakpoints to providers that support prompt caching. Luna Code:

- Keeps the system prompt + tool definitions byte-stable across a session and marks the end of the system prompt as a cache breakpoint.
- Places a rolling breakpoint on the latest message each request so the entire accumulated conversation becomes a cached prefix for the next call.
- Only ever appends to the message list, never reorders, so prefixes stay valid for cache reuse.

For OpenAI models, caching is automatic and these hints are safely ignored.
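The breakpoint placement described above can be sketched as follows. This is a simplified illustration, not the contents of contextManager.ts: the `TextPart` and `Message` types, `part`, and `buildRequest` are hypothetical, and the sketch assumes the Anthropic-style `cache_control: { type: "ephemeral" }` marker on a text content part.

```typescript
// Hypothetical sketch; the types are simplified and are not Luna Code's own.
interface TextPart {
  type: "text";
  text: string;
  cache_control?: { type: "ephemeral" };
}

interface Message {
  role: "system" | "user" | "assistant";
  content: TextPart[];
}

function part(text: string, cached = false): TextPart {
  return cached
    ? { type: "text", text, cache_control: { type: "ephemeral" } }
    : { type: "text", text };
}

/**
 * Builds the request messages: a fixed breakpoint at the end of the system
 * prompt and a rolling one on the last part of the newest message.
 * The stored history is copied rather than mutated, so it stays append-only
 * and marker-free between requests.
 */
function buildRequest(systemPrompt: string, history: Message[]): Message[] {
  const system: Message = { role: "system", content: [part(systemPrompt, true)] };
  const messages = history.map((m, i): Message => {
    if (i !== history.length - 1) return m; // older messages pass through unchanged
    const parts = m.content.map((p, j) =>
      j === m.content.length - 1 ? part(p.text, true) : p
    );
    return { ...m, content: parts };
  });
  return [system, ...messages];
}

const history: Message[] = [
  { role: "user", content: [part("Rename foo to bar")] },
  { role: "assistant", content: [part("Done.")] },
  { role: "user", content: [part("Now run the tests")] },
];
const request = buildRequest("You are a coding agent.", history);
```

Because the marker is applied to a copy at request time, the next request moves it to the new last message while everything before it remains an unchanged prefix.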

Project structure

src/
  extension.ts            activation + commands
  config.ts               settings + SecretStorage for the API key
  modes.ts                Standard / Auto / Plan definitions
  openrouter/
    client.ts             streaming Chat Completions client
    types.ts              message + cache_control types
  agent/
    agent.ts              the agentic tool loop
    systemPrompt.ts       system prompt (stable cache prefix)
    contextManager.ts     cache breakpoints + context compaction
    tools/                read/write/edit/list/glob/grep/run/diagnostics
  webview/
    provider.ts           webview host + approval bridge
    protocol.ts           host <-> webview message types
    ui/                   the webview front-end (main.ts, markdown.ts)
media/
  webview.css             the dark-purple theme

License

MIT
