feat(platform): OpenAI Chat Completions API compatibility layer #1251
📝 Walkthrough
This PR implements OpenAI-compatible HTTP endpoints for chat completions and model listing. It adds HTTP handlers for …
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~55 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 8
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@services/platform/convex/lib/agent_response/generate_response.ts`:
- Around line 622-639: The follow-up generateText calls in the empty-tool-result
retry, the "length" continuation branch, and the timeout recovery branch are not
receiving generationParams so they fall back to model defaults; update the code
so every generateText invocation (including retries and continuation/recovery
paths referenced around the empty-tool-result retry, the "length" continuation,
and timeout recovery) merges or forwards the same generationParams (temperature,
maxTokens, topP, frequencyPenalty, presencePenalty, stopSequences) used in the
initial call—either by passing generationParams directly into each
generateText(...) call or by constructing a single merged params object (e.g.,
mergedGenerationParams) and using it everywhere to ensure max_tokens, stop, and
sampling settings are preserved across all branches.
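One way to keep the retry, continuation, and recovery branches from drifting back to model defaults is to bind the params once and reuse the bound function everywhere. The sketch below is a minimal illustration under assumptions: `GenerationParams`, `CallOptions`, `GenerateFn`, and `withGenerationParams` are hypothetical names, and `GenerateFn` is only a stand-in for the real `generateText` call, whose exact options in this codebase are not shown here.

```typescript
// Hypothetical shapes for illustration; the PR's real types may differ.
interface GenerationParams {
  temperature?: number;
  maxTokens?: number;
  topP?: number;
  frequencyPenalty?: number;
  presencePenalty?: number;
  stopSequences?: string[];
}

interface CallOptions extends GenerationParams {
  prompt: string;
}

// Stand-in for the real generateText call (assumption, not the AI SDK signature).
type GenerateFn<T> = (options: CallOptions) => Promise<T>;

// Bind the request's generation params once. Every call made through the
// returned function (initial, retry, continuation, recovery) receives them;
// options passed explicitly at a call site still take precedence.
function withGenerationParams<T>(
  generate: GenerateFn<T>,
  params: GenerationParams,
): GenerateFn<T> {
  return (options) => generate({ ...params, ...options });
}
```

With this shape, each branch would call the bound function instead of the raw one, so a forgotten argument at one call site cannot silently drop `maxTokens` or `stopSequences`.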
In `@services/platform/convex/openai_compat/http_actions.ts`:
- Around line 356-371: The chat call currently passes only
lastUserMessage.content to
internal.openai_compat.internal_actions.chatViaOpenAI, which drops prior
conversation turns; change the payload in ctx.runAction to pass the full
conversation history (e.g., the messages array or reconstructed list of
system/assistant/user messages) rather than just lastUserMessage.content,
preserving threadId, enableStreaming/shouldStream, generationParams and
responseFormat so multi-turn context is forwarded to chatViaOpenAI.
- Around line 766-785: The code currently treats any terminal
persistentStreaming state as a normal "stop" by always calling
buildChatCompletionChunk(..., 'stop', ...); update the logic in the polling loop
to capture the actual terminal status (e.g., const terminalStatus = body.status)
and when creating finalChunk pass a finish_reason that reflects that status
(e.g., 'stop' for 'done', 'error' for 'error', and 'timeout' or 'length' for
'timeout'), and include any available error/timeout details if present; modify
the call to buildChatCompletionChunk(completionId, model, {}, finishReason,
created) so streaming clients can distinguish successful completions from
failures.
- Around line 184-190: convertToModelMessages currently calls JSON.parse on
tc.function.arguments (from msg.tool_calls) without validation, causing a thrown
error to bubble up as a 500; change convertToModelMessages to defensively parse
tc.function.arguments inside a try/catch, and when JSON.parse fails throw an
invalid_request_error mapped to a 400 (e.g., construct and throw the same error
type used elsewhere for request validation failures so callers see a 400),
referencing msg.tool_calls and tc.function.arguments so malformed client input
on continuation requests is rejected with a 400 instead of producing a server
error.
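The defensive parsing requested in the last item above could look roughly like the following. This is a sketch, not the PR's code: `InvalidRequestError` and `parseToolCallArguments` are hypothetical names, and the extra check that the parsed value is a JSON object is an added assumption rather than something the review asks for.

```typescript
// Hypothetical error type; the PR's real validation error may be named differently.
class InvalidRequestError extends Error {
  readonly status = 400;
  readonly type = "invalid_request_error";

  constructor(message: string) {
    super(message);
    this.name = "InvalidRequestError";
  }
}

// Parse the JSON string found in tool_calls[].function.arguments, turning
// malformed client input into a 400-style error instead of an uncaught SyntaxError.
function parseToolCallArguments(
  raw: string,
  toolCallId: string,
): Record<string, unknown> {
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    throw new InvalidRequestError(
      `tool call ${toolCallId}: function.arguments is not valid JSON`,
    );
  }
  // Assumption: arguments are expected to be a JSON object, not a scalar or array.
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new InvalidRequestError(
      `tool call ${toolCallId}: function.arguments must be a JSON object`,
    );
  }
  return parsed as Record<string, unknown>;
}
```

The HTTP handler would then map any error carrying `status === 400` to an OpenAI-style error body, so the failure never surfaces as a 500.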
In `@services/platform/convex/openai_compat/internal_actions.ts`:
- Around line 230-237: The code only scrubs args.message via scrubMessagePii,
but downstream flow (e.g., conversationMessages forwarded into streamText on
tool continuations) allows unsanitized client-supplied transcript to bypass PII
scrubbing; update the logic in internal_actions.ts to run scrubMessagePii over
the entire conversation transcript (e.g., the conversationMessages array/string
that is passed into streamText and any continuation payloads) rather than only
args.message, and ensure all places noted (including the similar block around
lines 301-309) replace uses of raw conversation messages with the returned
sanitized value before forwarding to streamText or other continuation handlers.
- Around line 272-282: The current mutation call to
internal.openai_compat.internal_mutations.createThreadAndSaveMessage is
re-saving the original user `message` when `hasConversation` is true, causing
duplication; change the payload so you save the tool/assistant output (or an
empty/placeholder) instead of the original user turn when `hasConversation` is
true — e.g., compute a `messageToSave` (use the assistant/tool-generated text or
omit saving) and pass that into the mutation instead of `message` for the block
with `createThreadAndSaveMessage` and the analogous block around lines 296-304;
ensure the same `messageToSave` logic is used before calling `streamText` so
stored thread history matches the actual OpenAI conversation.
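For the PII finding above, the idea of scrubbing the whole transcript rather than only the latest message can be sketched as a pure mapping step. Everything here is assumed for illustration: `ChatMessage` is a simplified message shape, `Scrubber` stands in for `scrubMessagePii` (whose real signature is not shown in this review and may be async), and `redactEmails` is a toy scrubber used only to make the example runnable.

```typescript
// Hypothetical, simplified message shape; real OpenAI messages can also carry
// content parts and tool calls, which would need the same treatment.
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string | null;
}

// Stand-in for the PR's scrubMessagePii (assumed signature).
type Scrubber = (text: string) => string;

// Return a sanitized copy of the transcript; the input array is not mutated.
function scrubConversation(
  messages: readonly ChatMessage[],
  scrub: Scrubber,
): ChatMessage[] {
  return messages.map((m) =>
    typeof m.content === "string" ? { ...m, content: scrub(m.content) } : { ...m },
  );
}

// Toy scrubber for the example only: masks anything that looks like an email address.
const redactEmails: Scrubber = (text) =>
  text.replace(/[\w.+-]+@[\w-]+\.[\w.-]+/g, "[email]");
```

The sanitized array, not the raw client input, is what would be forwarded to `streamText` and any continuation payload.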
In `@services/platform/convex/openai_compat/response_format.ts`:
- Line 18: The FinishReason union type should include additional OpenAI finish
reasons for forward compatibility; update the FinishReason type definition (the
`FinishReason` alias in response_format.ts) to add 'content_filter' to the union
(e.g., type FinishReason = 'stop' | 'length' | 'tool_calls' | 'content_filter')
so the type accurately represents possible OpenAI API responses.
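A small sketch of the widened type, with a runtime guard derived from the same list so the two cannot drift apart. `FINISH_REASONS` and `isFinishReason` are hypothetical additions for illustration; the review itself only asks for the extra union member.

```typescript
// Single source of truth for the allowed values (hypothetical constant).
const FINISH_REASONS = ["stop", "length", "tool_calls", "content_filter"] as const;

// Equivalent to: 'stop' | 'length' | 'tool_calls' | 'content_filter'
type FinishReason = (typeof FINISH_REASONS)[number];

// Runtime guard for values coming from outside the type system,
// for example a status string read from a stream.
function isFinishReason(value: unknown): value is FinishReason {
  return (
    typeof value === "string" &&
    (FINISH_REASONS as readonly string[]).indexOf(value) !== -1
  );
}
```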
In `@services/platform/scripts/test_openai_compat.py`:
- Around line 127-139: The named tests (e.g., the "4. Generation params
(temperature=0.1, max_tokens=15)" block that calls
client.chat.completions.create and the json_object and stop test blocks) are too
permissive—update each to assert the actual feature: for generation params
assert the response finish_reason (or token-count) reflects truncation at
max_tokens and that the returned text length is bounded by max_tokens (and/or
that finish_reason == "length"); for the json_object test call json.loads(...)
on r.choices[0].message.content and assert it parses without exception and
matches expected keys/types; for the stop-sequence test, request a known stop
token and assert the returned text ends before or does not contain content after
that stop sequence (and/or that finish_reason indicates stop), using the same
client.chat.completions.create call sites to locate and modify the assertions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 5cf12ca9-1966-4c4d-bcf7-a8547676735e
⛔ Files ignored due to path filters (1)
- `services/platform/convex/_generated/api.d.ts` is excluded by `!**/_generated/**`
📒 Files selected for processing (16)
- `services/platform/convex/betterAuth/schema.ts`
- `services/platform/convex/http.ts`
- `services/platform/convex/lib/agent_chat/internal_actions.ts`
- `services/platform/convex/lib/agent_chat/start_agent_chat.ts`
- `services/platform/convex/lib/agent_chat/types.ts`
- `services/platform/convex/lib/agent_response/generate_response.ts`
- `services/platform/convex/lib/agent_response/types.ts`
- `services/platform/convex/lib/rate_limiter/index.ts`
- `services/platform/convex/openai_compat/http_actions.ts`
- `services/platform/convex/openai_compat/internal_actions.ts`
- `services/platform/convex/openai_compat/internal_mutations.ts`
- `services/platform/convex/openai_compat/internal_queries.ts`
- `services/platform/convex/openai_compat/response_format.ts`
- `services/platform/convex/openai_compat/tool_conversion.ts`
- `services/platform/scripts/test_openai_compat.py`
- `tools/cli/src/index.ts`
Add /api/v1/chat/completions and /api/v1/models endpoints that follow the OpenAI API schema, enabling existing OpenAI integrations to work with the platform without modification. Closes #1199
…AI compat endpoint
Support two modes in the Chat Completions API: agent mode (server-side tools, async generation) and client tool mode (client-defined tools, direct streamText). Thread generation parameters (temperature, max_tokens, top_p, etc.) from the OpenAI request through to the underlying LLM call.
…tion history and tool_choice
Convert full OpenAI message history to AI SDK ModelMessage format for multi-round tool calling instead of extracting only tool messages. Add tool_choice mapping (auto/none/required/specific function) from OpenAI format to AI SDK format. Update tool call extraction for AI SDK v6 content-based step format. Also expand e2e test coverage from 9 to 16 tests and fix CLI exit handling to use process.exit(1) and parseAsync().
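The `tool_choice` mapping mentioned in this commit can be illustrated with a short sketch. The type and function names (`OpenAIToolChoice`, `SdkToolChoice`, `mapToolChoice`) are hypothetical and the PR's actual implementation may differ; the shapes follow the OpenAI request format on one side and the AI SDK's `toolChoice` option on the other.

```typescript
// OpenAI request side: a string mode or a specific function selector.
type OpenAIToolChoice =
  | "auto"
  | "none"
  | "required"
  | { type: "function"; function: { name: string } };

// AI SDK side: the same string modes, or a specific tool by name.
type SdkToolChoice =
  | "auto"
  | "none"
  | "required"
  | { type: "tool"; toolName: string };

// Map OpenAI tool_choice to the AI SDK toolChoice option.
// undefined is passed through so the SDK default applies.
function mapToolChoice(
  choice: OpenAIToolChoice | undefined,
): SdkToolChoice | undefined {
  if (choice === undefined) return undefined;
  if (typeof choice === "string") return choice;
  return { type: "tool", toolName: choice.function.name };
}
```

The three string modes carry over unchanged, so only the specific-function form needs restructuring.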
- Add comprehensive API documentation for OpenAI-compatible endpoints to docs/api-reference.md with Python, Node.js, curl examples
- Inject /api/v1/chat/completions and /api/v1/models paths into the generated OpenAPI spec with full request/response schemas
- Update convex-helpers patch version 0.1.113 → 0.1.114
- Regenerate public/openapi.json with OpenAI Compatible tag and schemas

Closes #1199
Replace the full 263-endpoint spec (452KB) with a curated public spec (12.5KB) containing only the OpenAI-compatible endpoints. The Swagger UI at /docs now loads instantly and shows only externally relevant APIs.
The #root div had overflow:clip which prevented scrolling on the standalone docs page. Override overflow and height for html, body, and #root when the swagger-ui-standalone class is present.
41b4dc0 to
9c0a2d8
Summary
- … `/v1/chat/completions` endpoint that proxies requests to Tale agents, enabling any OpenAI SDK client to interact with Tale agents seamlessly
- … `tool_choice` control, generation params (`temperature`, `max_tokens`, `top_p`, `stop`), `response_format: json_object`, and a `/v1/models` listing endpoint
- … `ModelMessage` format) for multi-turn tool calling, rate limiting, PII scrubbing, and governance policy enforcement
- … `process.exit(1)` and `parseAsync()` for proper error propagation
Test plan
- … `services/platform/scripts/test_openai_compat.py` end-to-end (16 tests covering basic chat, streaming, tool calling, multi-round continuation, tool_choice variants, agent mode, stop sequences, and error cases)
- … `openai.ChatCompletion.create`) works against the endpoint with streaming and non-streaming modes
- … `npx convex dev` type-checks cleanly with the new modules
Summary by CodeRabbit
New Features
Tests