feat(edit): add /api/edit/completions endpoint #3516
Merged
Adds a new proxy route at /api/nextedit/completions that mirrors the shape of /api/fim/completions but targets Inception's /v1/edit/completions endpoint. For now only the 'inception/' provider prefix is supported; the route accepts a chat-style messages array (single user message, no system role — Inception bakes the system prompt server-side) and returns the upstream response unchanged. Adds 'nextedit_completions' to GatewayApiKindSchema and a matching 'unsupported_nextedit_model' error type. Pricing reuses Inception's FIM per-token rates.
…tive max_tokens
Address kilo-code-bot review:
- Wrap parseNextEditUsageFromString in .catch so non-JSON upstream error bodies surface as Sentry events instead of silent unhandled rejections inside after().
- Constrain max_tokens to a positive integer at the schema level so values like -1 can no longer slip past the !max_tokens / >limit guard.
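To illustrate why a negative `max_tokens` could slip through, here is a minimal sketch comparing a loose guard with a positive-integer check. The function names and the `MAX_TOKENS_LIMIT` value are hypothetical and not taken from the PR; the real route enforces this in its request schema rather than with hand-written functions.

```typescript
// Hypothetical upper bound; the PR does not state the real limit.
const MAX_TOKENS_LIMIT = 1000;

// Loose guard of the kind described in the commit message:
// it rejects only falsy or too-large values, so -1 passes.
function passesLooseGuard(maxTokens: number | undefined): boolean {
  return !(!maxTokens || maxTokens > MAX_TOKENS_LIMIT);
}

// Schema-style rule: the value must be a positive integer.
function isPositiveInteger(value: unknown): value is number {
  return typeof value === "number" && Number.isInteger(value) && value > 0;
}

// Strict guard: positive integer that also respects the limit.
function passesStrictGuard(maxTokens: unknown): boolean {
  return isPositiveInteger(maxTokens) && maxTokens <= MAX_TOKENS_LIMIT;
}

console.log(passesLooseGuard(-1));  // true: -1 is truthy and not above the limit
console.log(passesStrictGuard(-1)); // false: rejected before any limit check
```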
Code Review Summary
Status: No Issues Found | Recommendation: Merge
Executive Summary
All previously flagged issues have been resolved. The latest commit correctly sets
Resolved Issues (from previous reviews)
Files Reviewed (7 files)
Reviewed by claude-sonnet-4.6 · 141,478 tokens
Review guidance: REVIEW.md from base branch
When a request is BYOK we already zero `cost_mUsd` so the user is not billed. The cache discount we computed for that same request needs to be zeroed too, otherwise the usage row claims a discount on spend that never happened and distorts cache-savings reporting. Mirrors the canonical `processOpenRouterUsage` BYOK/free zeroing.
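A minimal sketch of the BYOK zeroing described in this commit. The `EditUsageRowSketch` type and `buildUsageRowSketch` function are hypothetical; only the field names `cost_mUsd`, `cacheDiscount_mUsd`, and `market_cost` come from the PR text, and the real usage row likely carries more fields.

```typescript
// Hypothetical, trimmed-down usage row; only fields named in the PR are modeled.
interface EditUsageRowSketch {
  cost_mUsd: number;
  cacheDiscount_mUsd: number;
  market_cost: number;
}

function buildUsageRowSketch(
  computedCost_mUsd: number,
  computedCacheDiscount_mUsd: number,
  isByok: boolean,
): EditUsageRowSketch {
  return {
    // BYOK users pay the provider directly, so nothing is billed here.
    cost_mUsd: isByok ? 0 : computedCost_mUsd,
    // A discount on spend that never happened would distort reporting.
    cacheDiscount_mUsd: isByok ? 0 : computedCacheDiscount_mUsd,
    // The upstream cost is retained either way.
    market_cost: computedCost_mUsd,
  };
}
```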
The route test previously stubbed out `countAndStoreEditUsage` entirely, so the whole billing/usage code path was untested. Replace that with a mock at the boundary (`logMicrodollarUsage` + `next/server.after`) so the route tests now exercise the real helper end-to-end and assert on the persisted usage row.
New cases:
- non-BYOK requests persist the computed cost and cache discount.
- BYOK requests persist `cost_mUsd: 0` and `cacheDiscount_mUsd: 0`, while `market_cost` retains the upstream cost.
- Upstream responses without a `usage` block persist a zero-cost row rather than silently dropping it.
- Unsupported models surface as `unsupported_edit_model`.
- Negative `max_tokens` is rejected by the schema.
- Empty-balance non-BYOK requests return `insufficient_credits`.
Helper-level tests for `parseEditUsageFromResponse` add coverage for:
- flat Inception pricing when `cached_input_tokens` is absent.
- zero cost / hasError when `usage` is missing.
- hasError on upstream 4xx responses.
- clamping `cached_input_tokens` greater than `prompt_tokens`.
`countAndStoreEditUsage` gets direct unit tests for the BYOK zeroing and the no-body short-circuit.
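The helper-level cases listed in this commit could be sketched roughly as follows. This is not the actual `parseEditUsageFromResponse`; the function name, the result shape, and the `completion_tokens` field are assumptions. It also reads "zero cost / hasError when `usage` is missing" as "zero tokens and `hasError: true`", which is only one possible interpretation.

```typescript
// Assumed upstream usage shape; prompt_tokens and cached_input_tokens are named
// in the PR, completion_tokens is an assumption.
interface UpstreamUsageSketch {
  prompt_tokens?: number;
  completion_tokens?: number;
  cached_input_tokens?: number;
}

interface ParsedEditUsageSketch {
  inputTokens: number;
  cachedInputTokens: number;
  outputTokens: number;
  hasError: boolean;
}

function parseEditUsageSketch(
  status: number,
  body: { usage?: UpstreamUsageSketch } | null,
): ParsedEditUsageSketch {
  const usage = body?.usage;
  // Upstream errors and missing usage both yield a zero-token row flagged as an error.
  if (status >= 400 || !usage) {
    return { inputTokens: 0, cachedInputTokens: 0, outputTokens: 0, hasError: true };
  }
  const inputTokens = Math.max(0, usage.prompt_tokens ?? 0);
  // Clamp so cached tokens can never exceed the prompt tokens they are part of.
  const cachedInputTokens = Math.min(Math.max(0, usage.cached_input_tokens ?? 0), inputTokens);
  const outputTokens = Math.max(0, usage.completion_tokens ?? 0);
  return { inputTokens, cachedInputTokens, outputTokens, hasError: false };
}
```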
Two doc-only comments captured during review:
- Document why `/api/edit/completions` refuses requests when the org has `data_collection: 'deny'`. The OpenRouter/Vercel paths honor that flag by setting it on the upstream request body, which lets those gateways pick a sub-provider with a no-training contract. This route bypasses both gateways and posts straight to Inception, whose edit endpoint has no per-request opt-out. Their published privacy policy treats prompt content as collectible for model training on the standard API tier; only the enterprise tier advertises no-training / no-retention. Refusing here is the only way to honor the org's stated intent until that changes.
- Cite the source for the Inception Mercury Edit 2 per-token rates ($0.25 / $0.025 / $0.75 per 1M tokens) used in `computeEditMicrodollarCost`. Mercury 2 (chat) shares the same rates per Inception's own model and blog pages.
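As a worked example of those rates, here is a sketch of a microdollar cost calculation. It is not the actual `computeEditMicrodollarCost`. It assumes the three figures are the input, cached input, and output rates in that order, and that results are rounded to whole microdollars. Because $1 per 1M tokens equals 1 microdollar per token, the rates are stored per 1,000 tokens to keep the arithmetic in integers.

```typescript
// Assumed mapping of the cited rates, in microdollars per 1,000 tokens.
const INPUT_RATE_PER_1K = 250;        // $0.25 per 1M tokens
const CACHED_INPUT_RATE_PER_1K = 25;  // $0.025 per 1M tokens
const OUTPUT_RATE_PER_1K = 750;       // $0.75 per 1M tokens

interface EditCostSketch {
  cost_mUsd: number;
  cacheDiscount_mUsd: number;
}

function computeEditCostSketch(
  promptTokens: number,
  cachedInputTokens: number,
  outputTokens: number,
): EditCostSketch {
  // Cached tokens are a subset of prompt tokens, so clamp defensively.
  const cached = Math.min(Math.max(0, cachedInputTokens), promptTokens);
  const uncached = promptTokens - cached;
  const cost_mUsd = Math.round(
    (uncached * INPUT_RATE_PER_1K +
      cached * CACHED_INPUT_RATE_PER_1K +
      outputTokens * OUTPUT_RATE_PER_1K) / 1000,
  );
  // Discount is what the cached tokens would have cost extra at the full input rate.
  const cacheDiscount_mUsd = Math.round(
    (cached * (INPUT_RATE_PER_1K - CACHED_INPUT_RATE_PER_1K)) / 1000,
  );
  return { cost_mUsd, cacheDiscount_mUsd };
}
```

For 1,000 prompt tokens of which 400 are cached, plus 200 output tokens, this gives 150 + 10 + 150 = 310 microdollars, with a cache discount of 90 microdollars.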
chrarnoldus
reviewed
May 28, 2026
chrarnoldus
approved these changes
May 28, 2026
Summary
- Adds `/api/edit/completions` for edit completion requests, backed by Inception's Mercury Edit endpoint.
- Mirrors the `/api/fim/completions` auth, balance, organization policy, BYOK, upstream proxy, and usage logging flow while validating Inception's single-user-message edit contract.
- Adds the `edit_completions` API kind, reports cached input token discounts, and returns `unsupported_edit_model` for unsupported model prefixes.
Verification
- `/api/fim/completions` route to confirm the auth, balance, org-policy, BYOK, proxying, and usage logging flow remains aligned.
- `edit` terminology.
Visual Changes
N/A — backend-only.
Reviewer Notes
- Only models with the `inception/` prefix are supported for now.
- Inception's `/v1/edit/completions` endpoint accepts exactly one `role: "user"` message and does not support streaming, so the route rejects unsupported message shapes and `stream: true` before proxying.
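The pre-proxy checks described in these notes might look roughly like the sketch below. The types, the function name, and the `streaming_not_supported` and `invalid_messages` error strings are hypothetical; only `unsupported_edit_model`, the `inception/` prefix, and the single-user-message and no-streaming rules come from the PR.

```typescript
interface EditMessageSketch {
  role: string;
  content: string;
}

interface EditRequestSketch {
  model: string;
  messages: EditMessageSketch[];
  stream?: boolean;
}

interface ValidationSketch {
  ok: boolean;
  error?: string;
}

function validateEditRequestSketch(body: EditRequestSketch): ValidationSketch {
  // Only the inception/ provider prefix is supported for now.
  if (!body.model.startsWith("inception/")) {
    return { ok: false, error: "unsupported_edit_model" };
  }
  // The upstream edit endpoint does not support streaming.
  if (body.stream === true) {
    return { ok: false, error: "streaming_not_supported" }; // hypothetical error name
  }
  // Exactly one message, and it must have the user role.
  const [only] = body.messages;
  if (body.messages.length !== 1 || only?.role !== "user") {
    return { ok: false, error: "invalid_messages" }; // hypothetical error name
  }
  return { ok: true };
}
```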