
feat(platform): OpenAI Chat Completions API compatibility layer #1251

Merged: larryro merged 9 commits into main from feat/1199-openai-chat-completions-compat on Apr 9, 2026
@larryro (Collaborator) commented Apr 9, 2026

Summary

  • Adds an OpenAI-compatible /api/v1/chat/completions endpoint that proxies requests to Tale agents, so any OpenAI SDK client can talk to a Tale agent
  • Supports both streaming and non-streaming responses, client-side tool calling (single-round and multi-round continuation), tool_choice control, generation params (temperature, max_tokens, top_p, stop), response_format: json_object, and an /api/v1/models listing endpoint
  • Includes full conversation history conversion (OpenAI → AI SDK ModelMessage format) for multi-turn tool calling, rate limiting, PII scrubbing, and governance policy enforcement
  • Fixes CLI exit handling to use process.exit(1) and parseAsync() for proper error propagation
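
As a rough illustration of the request shape described above, the sketch below builds an OpenAI-style chat completion request for the compatibility endpoint. The base URL, the API key, the model name `my-agent`, and the helper `buildChatRequest` are all placeholders invented for this example and are not part of the PR. The `/api/v1/chat/completions` path comes from the walkthrough and commit messages, and the bearer header is what OpenAI SDK clients send by default.

```typescript
// Sketch only: assembles the pieces of an OpenAI-style request.
// `baseUrl`, `apiKey`, and the model name are placeholders (assumptions).
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

interface ChatRequestOptions {
  model: string;
  messages: ChatMessage[];
  stream?: boolean;
  temperature?: number;
  max_tokens?: number;
  top_p?: number;
  stop?: string | string[];
}

interface HttpRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: returns a plain description of the HTTP call
// instead of performing it, so it can be inspected or passed to any client.
function buildChatRequest(
  baseUrl: string,
  apiKey: string,
  options: ChatRequestOptions,
): HttpRequest {
  return {
    // Strip trailing slashes so the path is not doubled up.
    url: `${baseUrl.replace(/\/+$/, "")}/api/v1/chat/completions`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(options),
  };
}

const request = buildChatRequest("https://tale.example.com/", "sk-placeholder", {
  model: "my-agent",
  messages: [{ role: "user", content: "Hello" }],
  stream: false,
  temperature: 0.1,
  max_tokens: 15,
});
```

The returned `url`, `method`, `headers`, and `body` can be handed to whichever HTTP client is in use; setting `stream: true` would select the streaming path mentioned in the summary.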

Test plan

  • Run services/platform/scripts/test_openai_compat.py end-to-end (16 tests covering basic chat, streaming, tool calling, multi-round continuation, tool_choice variants, agent mode, stop sequences, and error cases)
  • Verify OpenAI Python SDK (client.chat.completions.create) works against the endpoint with streaming and non-streaming modes
  • Test with an agent that has server-side tools configured (agent mode path)
  • Confirm rate limiting and API key validation return proper HTTP error codes
  • Verify npx convex dev type-checks cleanly with the new modules

Summary by CodeRabbit

  • New Features

    • Added OpenAI-compatible API endpoints for chat completions and model listing
    • Support for generation parameters (temperature, max tokens, frequency and presence penalties, stop sequences)
    • Function calling and tool support for API requests
    • JSON response format option
    • Rate limiting for API operations
  • Tests

    • Added comprehensive test suite for OpenAI-compatible API endpoints

coderabbitai Bot commented Apr 9, 2026

Walkthrough

This PR implements OpenAI-compatible HTTP endpoints for chat completions and model listing. It adds HTTP handlers for POST /api/v1/chat/completions, GET /api/v1/models, and CORS preflight routes that support both streaming and non-streaming chat modes, tool calling with continuation, and agent-based responses. Supporting infrastructure includes internal mutations and queries for thread/message management, organization resolution, and PII scrubbing. Generation parameters are propagated throughout the agent response pipeline. The PR also adds a database index on the apikey table, rate-limiting rules for OpenAI chat operations, a comprehensive Python test suite, and improves CLI error handling.

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~55 minutes

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 60.71%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title clearly and accurately summarizes the main change: adding an OpenAI-compatible Chat Completions API layer to the platform. |

@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 8

🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@services/platform/convex/lib/agent_response/generate_response.ts`:
- Around line 622-639: The follow-up generateText calls in the empty-tool-result
retry, the "length" continuation branch, and the timeout recovery branch are not
receiving generationParams so they fall back to model defaults; update the code
so every generateText invocation (including retries and continuation/recovery
paths referenced around the empty-tool-result retry, the "length" continuation,
and timeout recovery) merges or forwards the same generationParams (temperature,
maxTokens, topP, frequencyPenalty, presencePenalty, stopSequences) used in the
initial call—either by passing generationParams directly into each
generateText(...) call or by constructing a single merged params object (e.g.,
mergedGenerationParams) and using it everywhere to ensure max_tokens, stop, and
sampling settings are preserved across all branches.
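
One way to read the suggestion above is sketched here: convert the OpenAI request fields into a single params object once, then spread that same object into every call. The field names come from the comment itself. The interfaces, `toGenerationParams`, and `buildCall` are hypothetical stand-ins, since the real `generateText` call sites are not shown on this page.

```typescript
// Hypothetical shapes; field names follow the review comment above.
interface OpenAIGenerationFields {
  temperature?: number;
  max_tokens?: number;
  top_p?: number;
  frequency_penalty?: number;
  presence_penalty?: number;
  stop?: string | string[];
}

interface GenerationParams {
  temperature?: number;
  maxTokens?: number;
  topP?: number;
  frequencyPenalty?: number;
  presencePenalty?: number;
  stopSequences?: string[];
}

// Convert once, at the edge of the request handler.
function toGenerationParams(req: OpenAIGenerationFields): GenerationParams {
  const params: GenerationParams = {};
  if (req.temperature !== undefined) params.temperature = req.temperature;
  if (req.max_tokens !== undefined) params.maxTokens = req.max_tokens;
  if (req.top_p !== undefined) params.topP = req.top_p;
  if (req.frequency_penalty !== undefined) params.frequencyPenalty = req.frequency_penalty;
  if (req.presence_penalty !== undefined) params.presencePenalty = req.presence_penalty;
  if (req.stop !== undefined) {
    // OpenAI accepts a single string or an array; normalize to an array.
    params.stopSequences = Array.isArray(req.stop) ? req.stop : [req.stop];
  }
  return params;
}

// Stand-in for the options passed to a generateText-style call (assumption).
interface CallOptions extends GenerationParams {
  prompt: string;
}

// Every branch builds its options through this helper, so no retry or
// continuation path can silently fall back to model defaults.
function buildCall(prompt: string, params: GenerationParams): CallOptions {
  return { ...params, prompt };
}

const mergedGenerationParams = toGenerationParams({
  temperature: 0.1,
  max_tokens: 15,
  stop: "END",
});

const initialCall = buildCall("Summarize the ticket.", mergedGenerationParams);
const lengthContinuation = buildCall("Continue.", mergedGenerationParams);
const timeoutRecovery = buildCall("Retry after timeout.", mergedGenerationParams);
```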

In `@services/platform/convex/openai_compat/http_actions.ts`:
- Around line 356-371: The chat call currently passes only
lastUserMessage.content to
internal.openai_compat.internal_actions.chatViaOpenAI, which drops prior
conversation turns; change the payload in ctx.runAction to pass the full
conversation history (e.g., the messages array or reconstructed list of
system/assistant/user messages) rather than just lastUserMessage.content,
preserving threadId, enableStreaming/shouldStream, generationParams and
responseFormat so multi-turn context is forwarded to chatViaOpenAI.
- Around line 766-785: The code currently treats any terminal
persistentStreaming state as a normal "stop" by always calling
buildChatCompletionChunk(..., 'stop', ...); update the logic in the polling loop
to capture the actual terminal status (e.g., const terminalStatus = body.status)
and when creating finalChunk pass a finish_reason that reflects that status
(e.g., 'stop' for 'done', 'error' for 'error', and 'timeout' or 'length' for
'timeout'), and include any available error/timeout details if present; modify
the call to buildChatCompletionChunk(completionId, model, {}, finishReason,
created) so streaming clients can distinguish successful completions from
failures.
- Around line 184-190: convertToModelMessages currently calls JSON.parse on
tc.function.arguments (from msg.tool_calls) without validation, causing a thrown
error to bubble up as a 500; change convertToModelMessages to defensively parse
tc.function.arguments inside a try/catch, and when JSON.parse fails throw an
invalid_request_error mapped to a 400 (e.g., construct and throw the same error
type used elsewhere for request validation failures so callers see a 400),
referencing msg.tool_calls and tc.function.arguments so malformed client input
on continuation requests is rejected with a 400 instead of producing a server
error.
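
The last item above (defensive parsing of `tc.function.arguments`) could look roughly like the sketch below. `InvalidRequestError` is a hypothetical error class, because the PR's actual validation error type is not shown here, and the extra "must be a JSON object" check is an assumption beyond what the comment asks for.

```typescript
// Hypothetical error type carrying the HTTP status the handler should return.
class InvalidRequestError extends Error {
  readonly status = 400;
  readonly type = "invalid_request_error";

  constructor(message: string) {
    super(message);
    this.name = "InvalidRequestError";
  }
}

// Shape of an OpenAI assistant tool call as sent back by the client.
interface ToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

function parseToolCallArguments(tc: ToolCall): Record<string, unknown> {
  let parsed: unknown;
  try {
    parsed = JSON.parse(tc.function.arguments);
  } catch {
    // Malformed client input becomes a 400 instead of an unhandled 500.
    throw new InvalidRequestError(
      `tool call ${tc.id}: function.arguments is not valid JSON`,
    );
  }
  // Assumption: arguments are expected to be a JSON object, not a bare value.
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new InvalidRequestError(
      `tool call ${tc.id}: function.arguments must be a JSON object`,
    );
  }
  return parsed as Record<string, unknown>;
}
```

The HTTP handler would then map any error with `status === 400` to an OpenAI-style error body rather than letting it surface as a server error.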

In `@services/platform/convex/openai_compat/internal_actions.ts`:
- Around line 230-237: The code only scrubs args.message via scrubMessagePii,
but downstream flow (e.g., conversationMessages forwarded into streamText on
tool continuations) allows unsanitized client-supplied transcript to bypass PII
scrubbing; update the logic in internal_actions.ts to run scrubMessagePii over
the entire conversation transcript (e.g., the conversationMessages array/string
that is passed into streamText and any continuation payloads) rather than only
args.message, and ensure all places noted (including the similar block around
lines 301-309) replace uses of raw conversation messages with the returned
sanitized value before forwarding to streamText or other continuation handlers.
- Around line 272-282: The current mutation call to
internal.openai_compat.internal_mutations.createThreadAndSaveMessage is
re-saving the original user `message` when `hasConversation` is true, causing
duplication; change the payload so you save the tool/assistant output (or an
empty/placeholder) instead of the original user turn when `hasConversation` is
true — e.g., compute a `messageToSave` (use the assistant/tool-generated text or
omit saving) and pass that into the mutation instead of `message` for the block
with `createThreadAndSaveMessage` and the analogous block around lines 296-304;
ensure the same `messageToSave` logic is used before calling `streamText` so
stored thread history matches the actual OpenAI conversation.
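
For the first `internal_actions.ts` item above, a minimal sketch of scrubbing the whole transcript rather than only the latest message follows. `scrubText` is a toy stand-in that only masks email addresses. The real `scrubMessagePii` signature and rules are not shown on this page, so treat both helpers and the `TranscriptMessage` shape as assumptions.

```typescript
// Simplified transcript entry; real OpenAI messages can carry more fields.
interface TranscriptMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string | null;
}

// Toy stand-in for scrubMessagePii (assumption): masks email addresses only.
function scrubText(text: string): string {
  return text.replace(
    /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g,
    "[REDACTED_EMAIL]",
  );
}

// Scrub every message, not just the last user turn, and return a new array
// so callers cannot accidentally forward the unsanitized original.
function scrubTranscript(messages: TranscriptMessage[]): TranscriptMessage[] {
  return messages.map((m) =>
    m.content === null ? m : { ...m, content: scrubText(m.content) },
  );
}

const rawTranscript: TranscriptMessage[] = [
  { role: "user", content: "Email alice@example.com about the invoice." },
  { role: "assistant", content: null },
  { role: "tool", content: "{\"to\":\"bob@example.org\",\"sent\":true}" },
];

const safeTranscript = scrubTranscript(rawTranscript);
```

Only `safeTranscript` would be forwarded to `streamText` or to a continuation handler in this arrangement.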

In `@services/platform/convex/openai_compat/response_format.ts`:
- Line 18: The FinishReason union type should include additional OpenAI finish
reasons for forward compatibility; update the FinishReason type definition (the
`FinishReason` alias in response_format.ts) to add 'content_filter' to the union
(e.g., type FinishReason = 'stop' | 'length' | 'tool_calls' | 'content_filter')
so the type accurately represents possible OpenAI API responses.
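
A small sketch of the widened union, with a runtime guard and an exhaustive switch so that adding a member later produces a compile error wherever it is unhandled. The helper names `isFinishReason` and `describeFinishReason` are illustrative and do not come from the PR.

```typescript
// The four values named in the review comment above.
const FINISH_REASONS = ["stop", "length", "tool_calls", "content_filter"] as const;
type FinishReason = (typeof FINISH_REASONS)[number];

// Runtime guard for values coming from an upstream provider.
function isFinishReason(value: unknown): value is FinishReason {
  return (
    typeof value === "string" &&
    (FINISH_REASONS as readonly string[]).indexOf(value) !== -1
  );
}

// Exhaustive switch: the `never` assignment fails to compile if a new
// member is added to the union without a matching case.
function describeFinishReason(reason: FinishReason): string {
  switch (reason) {
    case "stop":
      return "model finished naturally or hit a stop sequence";
    case "length":
      return "output was cut off by the token limit";
    case "tool_calls":
      return "model requested one or more tool calls";
    case "content_filter":
      return "output was withheld by a content filter";
    default: {
      const unreachable: never = reason;
      return unreachable;
    }
  }
}
```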

In `@services/platform/scripts/test_openai_compat.py`:
- Around line 127-139: The named tests (e.g., the "4. Generation params
(temperature=0.1, max_tokens=15)" block that calls
client.chat.completions.create and the json_object and stop test blocks) are too
permissive—update each to assert the actual feature: for generation params
assert the response finish_reason (or token-count) reflects truncation at
max_tokens and that the returned text length is bounded by max_tokens (and/or
that finish_reason == "length"); for the json_object test call json.loads(...)
on r.choices[0].message.content and assert it parses without exception and
matches expected keys/types; for the stop-sequence test, request a known stop
token and assert the returned text ends before or does not contain content after
that stop sequence (and/or that finish_reason indicates stop), using the same
client.chat.completions.create call sites to locate and modify the assertions.
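
The three stricter checks described above can be sketched as small predicates over a response choice. The test script itself is Python; the checks are written in TypeScript here only to match the platform code, and the `CompletionChoice` shape and sample values are hand-written assumptions rather than captured responses.

```typescript
// Minimal slice of a non-streaming chat completion choice.
interface CompletionChoice {
  finish_reason: string;
  message: { content: string | null };
}

// max_tokens: a deliberately small limit should end with finish_reason "length".
function truncatedByMaxTokens(choice: CompletionChoice): boolean {
  return choice.finish_reason === "length";
}

// response_format json_object: content must parse, and parse to an object.
function contentIsJsonObject(choice: CompletionChoice): boolean {
  if (choice.message.content === null) return false;
  try {
    const parsed: unknown = JSON.parse(choice.message.content);
    return typeof parsed === "object" && parsed !== null && !Array.isArray(parsed);
  } catch {
    return false;
  }
}

// stop: the stop sequence itself must not appear in the returned text.
function stoppedBeforeSequence(choice: CompletionChoice, stop: string): boolean {
  const content = choice.message.content ?? "";
  return choice.finish_reason === "stop" && content.indexOf(stop) === -1;
}

// Hand-written sample choices (assumptions, not real output).
const truncated: CompletionChoice = {
  finish_reason: "length",
  message: { content: "The quick brown fox jumps over" },
};
const jsonReply: CompletionChoice = {
  finish_reason: "stop",
  message: { content: "{\"name\":\"Ada\",\"age\":36}" },
};
const stopped: CompletionChoice = {
  finish_reason: "stop",
  message: { content: "1, 2, 3, 4, " },
};
```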

ℹ️ Review info
⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5cf12ca9-1966-4c4d-bcf7-a8547676735e

📥 Commits

Reviewing files that changed from the base of the PR and between 69db2b5 and e1f80d8.

⛔ Files ignored due to path filters (1)
  • services/platform/convex/_generated/api.d.ts is excluded by !**/_generated/**
📒 Files selected for processing (16)
  • services/platform/convex/betterAuth/schema.ts
  • services/platform/convex/http.ts
  • services/platform/convex/lib/agent_chat/internal_actions.ts
  • services/platform/convex/lib/agent_chat/start_agent_chat.ts
  • services/platform/convex/lib/agent_chat/types.ts
  • services/platform/convex/lib/agent_response/generate_response.ts
  • services/platform/convex/lib/agent_response/types.ts
  • services/platform/convex/lib/rate_limiter/index.ts
  • services/platform/convex/openai_compat/http_actions.ts
  • services/platform/convex/openai_compat/internal_actions.ts
  • services/platform/convex/openai_compat/internal_mutations.ts
  • services/platform/convex/openai_compat/internal_queries.ts
  • services/platform/convex/openai_compat/response_format.ts
  • services/platform/convex/openai_compat/tool_conversion.ts
  • services/platform/scripts/test_openai_compat.py
  • tools/cli/src/index.ts

larryro added 8 commits April 9, 2026 20:01
Add /api/v1/chat/completions and /api/v1/models endpoints that follow
the OpenAI API schema, enabling existing OpenAI integrations to work
with the platform without modification.

Closes #1199
…AI compat endpoint

Support two modes in the Chat Completions API: agent mode (server-side tools,
async generation) and client tool mode (client-defined tools, direct streamText).
Thread generation parameters (temperature, max_tokens, top_p, etc.) from the
OpenAI request through to the underlying LLM call.
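
A sketch of how the two modes might be told apart. The rule used here (a non-empty `tools` array in the request selects client tool mode, anything else goes to agent mode) is an assumption inferred from the description above, not something this page confirms; `selectMode` and its types are hypothetical.

```typescript
type ChatMode = "agent" | "client_tools";

// Only the field relevant to mode selection is modeled here.
interface ModeRequest {
  tools?: unknown[];
}

// Assumed rule: client-defined tools present -> client tool mode;
// otherwise the agent runs with its own server-side tools.
function selectMode(req: ModeRequest): ChatMode {
  return req.tools !== undefined && req.tools.length > 0 ? "client_tools" : "agent";
}
```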
…tion history and tool_choice

Convert full OpenAI message history to AI SDK ModelMessage format for
multi-round tool calling instead of extracting only tool messages. Add
tool_choice mapping (auto/none/required/specific function) from OpenAI
format to AI SDK format. Update tool call extraction for AI SDK v6
content-based step format.
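
The tool_choice mapping mentioned above could be sketched as follows, assuming the AI SDK's `toolChoice` option takes `'auto'`, `'none'`, `'required'`, or `{ type: 'tool', toolName }`. `mapToolChoice` and the two type aliases are illustrative names, not the PR's.

```typescript
// OpenAI request format for tool_choice.
type OpenAIToolChoice =
  | "auto"
  | "none"
  | "required"
  | { type: "function"; function: { name: string } };

// Target format (assumed AI SDK toolChoice shape).
type SdkToolChoice =
  | "auto"
  | "none"
  | "required"
  | { type: "tool"; toolName: string };

function mapToolChoice(choice: OpenAIToolChoice | undefined): SdkToolChoice | undefined {
  // Leave the option unset so the downstream default applies.
  if (choice === undefined) return undefined;
  // The three string variants share the same spelling on both sides.
  if (typeof choice === "string") return choice;
  // A specific function becomes a specific tool.
  return { type: "tool", toolName: choice.function.name };
}
```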

Also expand e2e test coverage from 9 to 16 tests and fix CLI exit
handling to use process.exit(1) and parseAsync().

- Add comprehensive API documentation for OpenAI-compatible endpoints
  to docs/api-reference.md with Python, Node.js, curl examples
- Inject /api/v1/chat/completions and /api/v1/models paths into the
  generated OpenAPI spec with full request/response schemas
- Update convex-helpers patch version 0.1.113 → 0.1.114
- Regenerate public/openapi.json with OpenAI Compatible tag and schemas

Closes #1199

Replace the full 263-endpoint spec (452KB) with a curated public spec
(12.5KB) containing only the OpenAI-compatible endpoints. The Swagger UI
at /docs now loads instantly and shows only externally relevant APIs.

The #root div had overflow:clip which prevented scrolling on the
standalone docs page. Override overflow and height for html, body,
and #root when the swagger-ui-standalone class is present.
@larryro force-pushed the feat/1199-openai-chat-completions-compat branch from 41b4dc0 to 9c0a2d8 on April 9, 2026 12:02
@larryro merged commit 33e3fe6 into main on Apr 9, 2026 (24 checks passed)
@larryro deleted the feat/1199-openai-chat-completions-compat branch on April 9, 2026 12:13