feat(edit): add /api/edit/completions endpoint #3516

Merged
markijbema merged 15 commits into main from mark/nextedit-endpoint
May 28, 2026

Conversation

@markijbema
Contributor

@markijbema markijbema commented May 27, 2026

Summary

  • Adds a new proxy endpoint at /api/edit/completions for edit completion requests, backed by Inception's Mercury Edit endpoint.
  • Mirrors the existing /api/fim/completions auth, balance, organization policy, BYOK, upstream proxy, and usage logging flow while validating Inception's single-user-message edit contract.
  • Tags usage as edit_completions, reports cached input token discounts, and returns unsupported_edit_model for unsupported model prefixes.

Verification

  • Manual review of the route against the existing /api/fim/completions route to confirm the auth, balance, org-policy, BYOK, proxying, and usage logging flow remains aligned.
  • Manual review that the public route, error type, API kind, logs, tests, and PR metadata consistently use edit terminology.

Visual Changes

N/A — backend-only.

Reviewer Notes

  • Only Inception models prefixed with inception/ are supported for now.
  • Inception's /v1/edit/completions endpoint accepts exactly one role: "user" message and does not support streaming, so the route rejects unsupported message shapes and stream: true before proxying.
  • Future work: route through the Kilo gateway instead of calling Inception directly, and add other edit providers once they expose an equivalent endpoint.
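
The pre-proxy checks described in these notes could look roughly like the sketch below. It is not the route's actual code: the type names, the `validateEditRequest` helper, the 400 status, and every error string except `unsupported_edit_model` are hypothetical, and stripping the `inception/` prefix before proxying is an assumption.

```typescript
// Hypothetical request shape, following the chat-style body described above.
type EditMessage = { role: string; content: string };
type EditRequestBody = { model: string; messages: EditMessage[]; stream?: boolean };

type ValidationResult =
  | { ok: true; upstreamModel: string }
  | { ok: false; status: number; error: string };

const SUPPORTED_PREFIX = "inception/";

function validateEditRequest(body: EditRequestBody): ValidationResult {
  // Only Inception models are supported for now.
  if (!body.model.startsWith(SUPPORTED_PREFIX)) {
    return { ok: false, status: 400, error: "unsupported_edit_model" };
  }
  // The upstream edit endpoint does not support streaming.
  if (body.stream === true) {
    return { ok: false, status: 400, error: "streaming_not_supported" }; // hypothetical error name
  }
  // Exactly one message, and it must carry role "user".
  const [only] = body.messages;
  if (body.messages.length !== 1 || only === undefined || only.role !== "user") {
    return { ok: false, status: 400, error: "invalid_messages" }; // hypothetical error name
  }
  // Assumption: the provider prefix is stripped before the request goes upstream.
  return { ok: true, upstreamModel: body.model.slice(SUPPORTED_PREFIX.length) };
}
```

Rejecting these shapes before proxying keeps the failure local and cheap, instead of forwarding a request the upstream endpoint is known to refuse.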

kilo-code-bot Bot added 3 commits May 26, 2026 08:10
Adds a new proxy route at /api/nextedit/completions that mirrors the
shape of /api/fim/completions but targets Inception's
/v1/edit/completions endpoint. For now only the 'inception/' provider
prefix is supported; the route accepts a chat-style messages array
(single user message, no system role — Inception bakes the system
prompt server-side) and returns the upstream response unchanged.

Adds 'nextedit_completions' to GatewayApiKindSchema and a matching
'unsupported_nextedit_model' error type. Pricing reuses Inception's
FIM per-token rates.
…tive max_tokens

Address kilo-code-bot review:
- Wrap parseNextEditUsageFromString in .catch so non-JSON upstream
  error bodies surface as Sentry events instead of silent unhandled
  rejections inside after().
- Constrain max_tokens to a positive integer at the schema level so
  values like -1 can no longer slip past the !max_tokens / >limit
  guard.
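
To illustrate the second point, here is a minimal sketch of why a truthiness-plus-upper-bound guard lets `-1` through and what a positive-integer check adds. The limit value and both function names are hypothetical, and the PR enforces this in the request schema rather than with a hand-written type guard like this one.

```typescript
const MAX_TOKENS_LIMIT = 1000; // hypothetical limit, for illustration only

// Guard of the shape described above: rejects missing, zero, or too-large values.
// Returns true when the request would be rejected.
function looseGuardRejects(maxTokens: number | undefined): boolean {
  return !maxTokens || maxTokens > MAX_TOKENS_LIMIT;
}

// Stricter check: the value must be a positive integer within the limit.
function isValidMaxTokens(value: unknown): value is number {
  return (
    typeof value === "number" &&
    Number.isInteger(value) &&
    value > 0 &&
    value <= MAX_TOKENS_LIMIT
  );
}

// -1 is truthy and not greater than the limit, so the loose guard does not reject it.
const negativeSlipsThrough = looseGuardRejects(-1) === false;
// The positive-integer check rejects the same value.
const negativeRejected = isValidMaxTokens(-1) === false;
```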
Comment thread apps/web/src/app/api/nextedit/completions/route.ts Outdated
Comment thread apps/web/src/app/api/edit/completions/route.ts
Comment thread apps/web/src/lib/ai-gateway/llm-proxy-helpers.ts Outdated
@kilo-code-bot
Contributor

kilo-code-bot Bot commented May 27, 2026

Code Review Summary

Status: No Issues Found | Recommendation: Merge

Executive Summary

All previously flagged issues have been resolved. The latest commit correctly sets system_prompt_length: 0 in extractEditPromptInfo (removing the misleading total-message-content-length calculation) and adds a covering test.

Resolved Issues (from previous reviews)
| File | Issue | Status |
| --- | --- | --- |
| apps/web/src/lib/ai-gateway/llm-proxy-helpers.test.ts | Test description said "flags an error when usage is absent" but hasError was never asserted | ✅ Fixed |
| apps/web/src/lib/ai-gateway/llm-proxy-helpers.ts | system_prompt_length was set to total message content length rather than 0 | ✅ Fixed |
Files Reviewed (7 files)
  • apps/web/next.config.mjs — no issues
  • apps/web/src/app/api/edit/completions/route.ts — no issues
  • apps/web/src/app/api/edit/completions/route.test.ts — no issues
  • apps/web/src/lib/ai-gateway/llm-proxy-helpers.ts — no issues
  • apps/web/src/lib/ai-gateway/llm-proxy-helpers.test.ts — no issues
  • apps/web/src/lib/proxy-error-types.ts — no issues
  • packages/db/src/schema-types.ts — no issues

Reviewed by claude-sonnet-4.6 · 141,478 tokens

Review guidance: REVIEW.md from base branch main

@markijbema markijbema changed the title from "feat(nextedit): add /api/nextedit/completions endpoint" to "feat(edit): add /api/edit/completions endpoint" May 28, 2026
kilo-code-bot Bot added 2 commits May 28, 2026 09:11
When a request is BYOK we already zero `cost_mUsd` so the user is not
billed. The cache discount we computed for that same request needs to
be zeroed too, otherwise the usage row claims a discount on spend that
never happened and distorts cache-savings reporting. Mirrors the
canonical `processOpenRouterUsage` BYOK/free zeroing.
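
A minimal sketch of the zeroing described in this commit, assuming the usage row carries the `cost_mUsd`, `cacheDiscount_mUsd`, and `market_cost` fields named in this PR. The `applyByokZeroing` helper and its signature are hypothetical and not the code in `llm-proxy-helpers.ts`.

```typescript
type UsageCost = {
  cost_mUsd: number; // what the user is billed
  cacheDiscount_mUsd: number; // savings attributed to cached input tokens
  market_cost: number; // upstream cost, kept even when nothing is billed
};

function applyByokZeroing(
  computedCost_mUsd: number,
  computedCacheDiscount_mUsd: number,
  isByok: boolean,
): UsageCost {
  return {
    // BYOK users pay the provider directly, so nothing is billed here.
    cost_mUsd: isByok ? 0 : computedCost_mUsd,
    // A discount on spend that never happened would distort cache-savings reporting.
    cacheDiscount_mUsd: isByok ? 0 : computedCacheDiscount_mUsd,
    // The upstream cost is retained for reporting in both cases.
    market_cost: computedCost_mUsd,
  };
}
```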
The route test previously stubbed out `countAndStoreEditUsage` entirely,
so the whole billing/usage code path was untested. Replace that with a
mock at the boundary (`logMicrodollarUsage` + `next/server.after`) so
the route tests now exercise the real helper end-to-end and assert on
the persisted usage row. New cases:

- non-BYOK requests persist the computed cost and cache discount.
- BYOK requests persist `cost_mUsd: 0` and `cacheDiscount_mUsd: 0`,
  while `market_cost` retains the upstream cost.
- Upstream responses without a `usage` block persist a zero-cost row
  rather than silently dropping it.
- Unsupported models surface as `unsupported_edit_model`.
- Negative `max_tokens` is rejected by the schema.
- Empty-balance non-BYOK requests return `insufficient_credits`.

Helper-level tests for `parseEditUsageFromResponse` add coverage for:
- flat Inception pricing when `cached_input_tokens` is absent.
- zero cost / hasError when `usage` is missing.
- hasError on upstream 4xx responses.
- clamping `cached_input_tokens` greater than `prompt_tokens`.

`countAndStoreEditUsage` gets direct unit tests for the BYOK zeroing
and the no-body short-circuit.
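
The helper-level cases above can be pictured with the following sketch. It is not `parseEditUsageFromResponse` itself: the function name, return shape, and the `completion_tokens` field are assumptions, and non-JSON bodies are folded into `hasError` here for simplicity, whereas the real helper is wrapped in `.catch` by the route.

```typescript
type ParsedEditUsage = {
  promptTokens: number;
  cachedInputTokens: number;
  completionTokens: number;
  hasError: boolean;
};

function parseEditUsageSketch(body: string, status: number): ParsedEditUsage {
  const empty = { promptTokens: 0, cachedInputTokens: 0, completionTokens: 0 };
  let parsed: unknown;
  try {
    parsed = JSON.parse(body);
  } catch {
    // Simplification: treat an unparseable body as an error row with zero usage.
    return { ...empty, hasError: true };
  }
  const usage = (parsed as { usage?: Record<string, unknown> } | null)?.usage;
  // Missing usage block: zero usage, flagged as an error.
  if (!usage || typeof usage !== "object") return { ...empty, hasError: true };

  // Coerce anything that is not a positive finite number to 0.
  const num = (v: unknown): number =>
    typeof v === "number" && Number.isFinite(v) && v > 0 ? v : 0;

  const promptTokens = num(usage["prompt_tokens"]);
  // Clamp: cached tokens can never exceed the prompt tokens they are part of.
  const cachedInputTokens = Math.min(num(usage["cached_input_tokens"]), promptTokens);
  const completionTokens = num(usage["completion_tokens"]); // assumed field name

  // Upstream 4xx/5xx responses are flagged even if a usage block is present.
  return { promptTokens, cachedInputTokens, completionTokens, hasError: status >= 400 };
}
```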
Comment thread apps/web/src/lib/ai-gateway/llm-proxy-helpers.test.ts Outdated
kilo-code-bot Bot and others added 3 commits May 28, 2026 09:59
Two doc-only comments captured during review:

- Document why `/api/edit/completions` refuses requests when the org
  has `data_collection: 'deny'`. The OpenRouter/Vercel paths honor that
  flag by setting it on the upstream request body, which lets those
  gateways pick a sub-provider with a no-training contract. This route
  bypasses both gateways and posts straight to Inception, whose edit
  endpoint has no per-request opt-out. Their published privacy policy
  treats prompt content as collectible for model training on the
  standard API tier; only the enterprise tier advertises no-training /
  no-retention. Refusing here is the only way to honor the org's
  stated intent until that changes.

- Cite the source for the Inception Mercury Edit 2 per-token rates
  ($0.25 / $0.025 / $0.75 per 1M tokens) used in
  `computeEditMicrodollarCost`. Mercury 2 (chat) shares the same
  rates per Inception's own model and blog pages.
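
As a worked example of the rates cited above, here is a sketch of a per-request cost calculation. It assumes the three figures map to input, cached input, and output tokens in that order, that `mUsd` means microdollars, and that results are rounded to whole microdollars. The function name and return shape are hypothetical and not the signature of `computeEditMicrodollarCost`.

```typescript
// USD per 1M tokens. Since 1 USD = 1,000,000 microdollars, the same numbers
// are also microdollars per single token.
const RATE_INPUT = 0.25; // assumed: uncached input tokens
const RATE_CACHED_INPUT = 0.025; // assumed: cached input tokens
const RATE_OUTPUT = 0.75; // assumed: output tokens

function computeEditCostSketch(
  promptTokens: number,
  cachedInputTokens: number,
  completionTokens: number,
): { cost_mUsd: number; cacheDiscount_mUsd: number } {
  // Clamp cached tokens into [0, promptTokens] before pricing.
  const cached = Math.min(Math.max(cachedInputTokens, 0), promptTokens);
  const uncached = promptTokens - cached;

  const cost =
    uncached * RATE_INPUT + cached * RATE_CACHED_INPUT + completionTokens * RATE_OUTPUT;
  // Discount = what the cached tokens would have cost at the full input rate,
  // minus what they cost at the cached rate.
  const discount = cached * (RATE_INPUT - RATE_CACHED_INPUT);

  return { cost_mUsd: Math.round(cost), cacheDiscount_mUsd: Math.round(discount) };
}
```

With 1,000,000 prompt tokens of which 400,000 are cached, plus 100,000 completion tokens, this yields 150,000 + 10,000 + 75,000 = 235,000 microdollars ($0.235) and a cache discount of 90,000 microdollars. With no cached tokens the calculation reduces to flat input and output pricing.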
Comment thread apps/web/src/lib/ai-gateway/llm-proxy-helpers.ts Outdated
@markijbema markijbema enabled auto-merge (squash) May 28, 2026 11:44
@markijbema markijbema merged commit 5b1108a into main May 28, 2026
49 checks passed
@markijbema markijbema deleted the mark/nextedit-endpoint branch May 28, 2026 11:45
2 participants