feat(edit): add /api/edit/completions endpoint by markijbema · Pull Request #3516 · Kilo-Org/cloud

markijbema · 2026-05-27T08:35:38Z

Summary

Adds a new proxy endpoint at /api/edit/completions for edit completion requests, backed by Inception's Mercury Edit endpoint.
Mirrors the existing /api/fim/completions auth, balance, organization policy, BYOK, upstream proxy, and usage logging flow while validating Inception's single-user-message edit contract.
Tags usage as edit_completions, reports cached input token discounts, and returns unsupported_edit_model for unsupported model prefixes.

Verification

Manual review of the route against the existing /api/fim/completions route to confirm the auth, balance, org-policy, BYOK, proxying, and usage logging flow remains aligned.
Manual review that the public route, error type, API kind, logs, tests, and PR metadata consistently use edit terminology.

Visual Changes

N/A — backend-only.

Reviewer Notes

Only Inception models prefixed with inception/ are supported for now.
Inception's /v1/edit/completions endpoint accepts exactly one role: "user" message and does not support streaming, so the route rejects unsupported message shapes and stream: true before proxying.
Future work: route through the Kilo gateway instead of calling Inception directly, and add other edit providers once they expose an equivalent endpoint.
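
To make the contract in the notes above concrete, here is a minimal sketch of the pre-proxy checks. It is not the route's actual code: the `EditRequestSketch` type, the `validateEditRequestSketch` name, and the `streaming_not_supported` / `invalid_messages` error strings are assumptions for illustration. Only `unsupported_edit_model` and the `inception/` prefix come from this PR.

```typescript
// Hypothetical request shape, based on the chat-style body described in this PR.
type EditMessageSketch = { role: string; content: string };
type EditRequestSketch = {
  model: string;
  messages: EditMessageSketch[];
  stream?: boolean;
};

// Returns an error type string, or null when the request may be proxied.
function validateEditRequestSketch(body: EditRequestSketch): string | null {
  // Only Inception models are supported for now.
  if (!body.model.startsWith('inception/')) {
    return 'unsupported_edit_model';
  }
  // The upstream edit endpoint does not support streaming.
  if (body.stream === true) {
    return 'streaming_not_supported'; // hypothetical error name
  }
  // The upstream contract is exactly one message with role "user".
  const [first] = body.messages;
  if (body.messages.length !== 1 || first?.role !== 'user') {
    return 'invalid_messages'; // hypothetical error name
  }
  return null;
}
```

Rejecting these shapes before proxying keeps upstream 4xx responses out of the usage log and gives callers a stable error type.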

Adds a new proxy route at /api/nextedit/completions that mirrors the shape of /api/fim/completions but targets Inception's /v1/edit/completions endpoint. For now only the 'inception/' provider prefix is supported; the route accepts a chat-style messages array (single user message, no system role — Inception bakes the system prompt server-side) and returns the upstream response unchanged. Adds 'nextedit_completions' to GatewayApiKindSchema and a matching 'unsupported_nextedit_model' error type. Pricing reuses Inception's FIM per-token rates.

…tive max_tokens

Address kilo-code-bot review:
- Wrap parseNextEditUsageFromString in .catch so non-JSON upstream error bodies surface as Sentry events instead of silent unhandled rejections inside after().
- Constrain max_tokens to a positive integer at the schema level so values like -1 can no longer slip past the !max_tokens / >limit guard.
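
The `max_tokens` point is easy to miss, so here is a small sketch of why `-1` passed the older guard. The limit value and both function names are hypothetical. The real route enforces this in its request schema, which is not reproduced here.

```typescript
const MAX_TOKENS_LIMIT_SKETCH = 4096; // hypothetical limit, for illustration only

// The guard shape named in the commit message: `!max_tokens` or `> limit` rejects.
// A negative number is truthy and below the limit, so it passes.
function passesLooseGuard(maxTokens: number | undefined): boolean {
  const rejected = !maxTokens || maxTokens > MAX_TOKENS_LIMIT_SKETCH;
  return !rejected;
}

// Schema-level constraint: the value must be a positive integer.
function isPositiveInteger(value: unknown): value is number {
  return typeof value === 'number' && Number.isInteger(value) && value > 0;
}
```

With both checks in place, `-1` and `1.5` are rejected by the positive-integer constraint before the limit guard runs.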

kilo-code-bot · 2026-05-27T08:42:05Z

Code Review Summary

Status: No Issues Found | Recommendation: Merge

Executive Summary

All previously flagged issues have been resolved. The latest commit correctly sets system_prompt_length: 0 in extractEditPromptInfo (removing the misleading total-message-content-length calculation) and adds a covering test.

Resolved Issues (from previous reviews)

| File | Issue | Status |
| --- | --- | --- |
| `apps/web/src/lib/ai-gateway/llm-proxy-helpers.test.ts` | Test description said "flags an error when usage is absent" but `hasError` was never asserted | ✅ Fixed |
| `apps/web/src/lib/ai-gateway/llm-proxy-helpers.ts` | `system_prompt_length` was set to total message content length rather than 0 | ✅ Fixed |

Files Reviewed (7 files)

apps/web/next.config.mjs — no issues
apps/web/src/app/api/edit/completions/route.ts — no issues
apps/web/src/app/api/edit/completions/route.test.ts — no issues
apps/web/src/lib/ai-gateway/llm-proxy-helpers.ts — no issues
apps/web/src/lib/ai-gateway/llm-proxy-helpers.test.ts — no issues
apps/web/src/lib/proxy-error-types.ts — no issues
packages/db/src/schema-types.ts — no issues

Reviewed by claude-sonnet-4.6 · 141,478 tokens

Review guidance: REVIEW.md from base branch main

When a request is BYOK we already zero `cost_mUsd` so the user is not billed. The cache discount we computed for that same request needs to be zeroed too, otherwise the usage row claims a discount on spend that never happened and distorts cache-savings reporting. Mirrors the canonical `processOpenRouterUsage` BYOK/free zeroing.
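
As a rough illustration of the zeroing described above (not the helper's actual code), the sketch below takes a computed cost and cache discount and produces the values to persist. The function and type names are hypothetical. The `cost_mUsd`, `cacheDiscount_mUsd`, and `market_cost` field names follow the ones used elsewhere in this PR, and treating `market_cost` as the un-zeroed computed cost is an assumption.

```typescript
type ComputedCostSketch = { cost_mUsd: number; cacheDiscount_mUsd: number };
type UsageRowSketch = ComputedCostSketch & { market_cost: number };

// For BYOK requests the user is not billed, so both the cost and the
// cache discount are zeroed; market_cost keeps the upstream cost.
function applyByokZeroingSketch(computed: ComputedCostSketch, isByok: boolean): UsageRowSketch {
  return {
    market_cost: computed.cost_mUsd,
    cost_mUsd: isByok ? 0 : computed.cost_mUsd,
    cacheDiscount_mUsd: isByok ? 0 : computed.cacheDiscount_mUsd,
  };
}
```

Zeroing both fields together keeps the row self-consistent: a discount is only ever reported against spend that was actually charged.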

The route test previously stubbed out `countAndStoreEditUsage` entirely, so the whole billing/usage code path was untested. Replace that with a mock at the boundary (`logMicrodollarUsage` + `next/server.after`) so the route tests now exercise the real helper end-to-end and assert on the persisted usage row.

New cases:
- non-BYOK requests persist the computed cost and cache discount.
- BYOK requests persist `cost_mUsd: 0` and `cacheDiscount_mUsd: 0`, while `market_cost` retains the upstream cost.
- Upstream responses without a `usage` block persist a zero-cost row rather than silently dropping it.
- Unsupported models surface as `unsupported_edit_model`.
- Negative `max_tokens` is rejected by the schema.
- Empty-balance non-BYOK requests return `insufficient_credits`.

Helper-level tests for `parseEditUsageFromResponse` add coverage for:
- flat Inception pricing when `cached_input_tokens` is absent.
- zero cost / hasError when `usage` is missing.
- hasError on upstream 4xx responses.
- clamping `cached_input_tokens` greater than `prompt_tokens`.

`countAndStoreEditUsage` gets direct unit tests for the BYOK zeroing and the no-body short-circuit.
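
The helper-level cases listed above can be pictured with a small parsing sketch. This is an assumption-laden illustration, not `parseEditUsageFromResponse` itself: the function name, the return shape, and the `completion_tokens` field are hypothetical, and placing `cached_input_tokens` directly under `usage` is a guess at the upstream layout.

```typescript
type ParsedUsageSketch = {
  promptTokens: number;
  cachedInputTokens: number;
  completionTokens: number;
  hasError: boolean;
};

// Coerce anything that is not a finite number to 0.
function toCount(value: unknown): number {
  return typeof value === 'number' && Number.isFinite(value) ? value : 0;
}

function parseUsageSketch(status: number, bodyText: string): ParsedUsageSketch {
  let usage: Record<string, unknown> | undefined;
  try {
    const parsed: unknown = JSON.parse(bodyText);
    if (typeof parsed === 'object' && parsed !== null && 'usage' in parsed) {
      const candidate = (parsed as { usage: unknown }).usage;
      if (typeof candidate === 'object' && candidate !== null) {
        usage = candidate as Record<string, unknown>;
      }
    }
  } catch {
    // Non-JSON upstream body: fall through with no usage and flag an error.
  }
  const promptTokens = toCount(usage?.['prompt_tokens']);
  return {
    promptTokens,
    // Clamp so cached tokens never exceed the prompt tokens they are part of.
    cachedInputTokens: Math.min(toCount(usage?.['cached_input_tokens']), promptTokens),
    completionTokens: toCount(usage?.['completion_tokens']),
    // Missing usage and upstream 4xx/5xx responses are both flagged.
    hasError: status >= 400 || usage === undefined,
  };
}
```

Returning zeroed counts instead of throwing is what lets the caller persist a zero-cost row when `usage` is absent.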

Two doc-only comments captured during review:
- Document why `/api/edit/completions` refuses requests when the org has `data_collection: 'deny'`. The OpenRouter/Vercel paths honor that flag by setting it on the upstream request body, which lets those gateways pick a sub-provider with a no-training contract. This route bypasses both gateways and posts straight to Inception, whose edit endpoint has no per-request opt-out. Their published privacy policy treats prompt content as collectible for model training on the standard API tier; only the enterprise tier advertises no-training / no-retention. Refusing here is the only way to honor the org's stated intent until that changes.
- Cite the source for the Inception Mercury Edit 2 per-token rates ($0.25 / $0.025 / $0.75 per 1M tokens) used in `computeEditMicrodollarCost`. Mercury 2 (chat) shares the same rates per Inception's own model and blog pages.
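
For reference, the arithmetic behind those rates can be sketched as follows. This is not the body of `computeEditMicrodollarCost`: the function name, the input type, and the `completion_tokens` field are hypothetical, and reading the three rates as input / cached input / output (in that order) is an assumption. Costs are expressed in microdollars, so $0.25 per 1M tokens is 250000 microdollars per 1M tokens.

```typescript
// Rates in microdollars per 1M tokens (assumed order: input, cached input, output).
const RATE_INPUT = 250000; // $0.25
const RATE_CACHED_INPUT = 25000; // $0.025
const RATE_OUTPUT = 750000; // $0.75
const TOKENS_PER_MILLION = 1000000;

type EditUsageSketch = {
  prompt_tokens: number;
  cached_input_tokens?: number;
  completion_tokens: number;
};

function computeEditCostSketch(usage: EditUsageSketch): { cost_mUsd: number; cacheDiscount_mUsd: number } {
  // Clamp cached tokens into [0, prompt_tokens]; absent means no cache hit.
  const cached = Math.min(Math.max(usage.cached_input_tokens ?? 0, 0), usage.prompt_tokens);
  const uncached = usage.prompt_tokens - cached;
  const cost =
    (uncached * RATE_INPUT + cached * RATE_CACHED_INPUT + usage.completion_tokens * RATE_OUTPUT) /
    TOKENS_PER_MILLION;
  // The discount is what the cached tokens would have cost extra at the full input rate.
  const discount = (cached * (RATE_INPUT - RATE_CACHED_INPUT)) / TOKENS_PER_MILLION;
  return { cost_mUsd: Math.round(cost), cacheDiscount_mUsd: Math.round(discount) };
}
```

Under these assumptions, a request with 1000 prompt tokens (400 of them cached) and 200 completion tokens costs 150 + 10 + 150 = 310 microdollars, with a cache discount of 90 microdollars.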

kilo-code-bot Bot added 3 commits May 26, 2026 08:10

chore: retrigger ci

5fe458a

kilo-code-bot Bot reviewed May 27, 2026

Comment thread apps/web/src/app/api/nextedit/completions/route.ts Outdated

Comment thread apps/web/src/app/api/edit/completions/route.ts

Comment thread apps/web/src/lib/ai-gateway/llm-proxy-helpers.ts Outdated

markijbema and others added 4 commits May 27, 2026 10:58

fix(nextedit): address review suggestions

cfedb82

fix(nextedit): respect organization data collection denial

97c2233

fix(nextedit): enforce upstream message contract

4427153

fix(nextedit): bill cached input at discounted rate

031b833

markijbema mentioned this pull request May 28, 2026

feat(vscode,gateway): route Mercury Next Edit through Kilo Gateway Kilo-Org/kilocode#10644

Merged

markijbema added 2 commits May 28, 2026 10:20

fix(gateway): rename nextedit endpoint to edit

9b45471

test(gateway): fix edit route test typing

cc0af58

markijbema changed the title ~~feat(nextedit): add /api/nextedit/completions endpoint~~ feat(edit): add /api/edit/completions endpoint May 28, 2026

kilo-code-bot Bot added 2 commits May 28, 2026 09:11

kilo-code-bot Bot reviewed May 28, 2026

Comment thread apps/web/src/lib/ai-gateway/llm-proxy-helpers.test.ts Outdated

kilo-code-bot Bot and others added 3 commits May 28, 2026 09:59

test(edit): clarify missing usage expectation

3acb852

fix(edit): route endpoint through global API

45c3d7e

chrarnoldus reviewed May 28, 2026

Comment thread apps/web/src/lib/ai-gateway/llm-proxy-helpers.ts Outdated

chrarnoldus approved these changes May 28, 2026

fix(edit): use zero system prompt length

598c59e

markijbema enabled auto-merge (squash) May 28, 2026 11:44

markijbema merged commit 5b1108a into main May 28, 2026
49 checks passed

markijbema deleted the mark/nextedit-endpoint branch May 28, 2026 11:45

markijbema merged 15 commits into main from mark/nextedit-endpoint


2 participants
