feat(platform): OpenAI Chat Completions API compatibility layer by larryro · Pull Request #1251 · tale-project/tale

larryro · 2026-04-09T11:38:34Z

Summary

- Adds an OpenAI-compatible /v1/chat/completions endpoint that proxies requests to Tale agents, enabling any OpenAI SDK client to interact with Tale agents seamlessly
- Supports both streaming and non-streaming responses, client-side tool calling (single-round and multi-round continuation), tool_choice control, generation params (temperature, max_tokens, top_p, stop), response_format: json_object, and a /v1/models listing endpoint
- Includes full conversation history conversion (OpenAI → AI SDK ModelMessage format) for multi-turn tool calling, rate limiting, PII scrubbing, and governance policy enforcement
- Fixes CLI exit handling to use process.exit(1) and parseAsync() for proper error propagation
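For orientation, here is a minimal sketch of the request an OpenAI-style client would send to the new endpoint. The host name, the API key, the `my-agent` model value, and the `prepareChatRequest` helper are illustrative assumptions, not taken from the PR. The Bearer header follows what OpenAI SDKs send by default.

```typescript
// Subset of the OpenAI chat request body covering the fields named in the summary.
interface ChatMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

interface ChatCompletionRequest {
  model: string; // assumed to identify a Tale agent; the exact mapping is not shown in the PR
  messages: ChatMessage[];
  stream?: boolean;
  temperature?: number;
  max_tokens?: number;
  top_p?: number;
  stop?: string | string[];
  response_format?: { type: "json_object" | "text" };
}

interface PreparedRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Hypothetical helper: builds the HTTP request without sending it.
function prepareChatRequest(
  baseUrl: string,
  apiKey: string,
  req: ChatCompletionRequest,
): PreparedRequest {
  return {
    // Strip trailing slashes so the path is joined exactly once.
    url: `${baseUrl.replace(/\/+$/, "")}/chat/completions`,
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify(req),
  };
}

// Placeholder host and key; substitute real deployment values.
const prepared = prepareChatRequest("https://tale.example.com/api/v1/", "sk-placeholder", {
  model: "my-agent",
  messages: [{ role: "user", content: "Hello" }],
  temperature: 0.1,
  max_tokens: 64,
  stop: ["END"],
});
```

The resulting `url`, `headers`, and `body` can be passed to any HTTP client, for example `fetch(prepared.url, prepared)`.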

Test plan

- Run services/platform/scripts/test_openai_compat.py end-to-end (16 tests covering basic chat, streaming, tool calling, multi-round continuation, tool_choice variants, agent mode, stop sequences, and error cases)
- Verify the OpenAI Python SDK (client.chat.completions.create) works against the endpoint in both streaming and non-streaming modes
- Test with an agent that has server-side tools configured (agent mode path)
- Confirm rate limiting and API key validation return proper HTTP error codes
- Verify npx convex dev type-checks cleanly with the new modules

Summary by CodeRabbit

New Features
- Added OpenAI-compatible API endpoints for chat completions and model listing
- Support for generation parameters (temperature, max tokens, token penalties, stop sequences)
- Function calling and tool support for API requests
- JSON response format option
- Rate limiting for API operations
Tests
- Added comprehensive test suite for OpenAI-compatible API endpoints

coderabbitai · 2026-04-09T11:49:26Z

Walkthrough

This PR implements OpenAI-compatible HTTP endpoints for chat completions and model listing. It adds HTTP handlers for POST /api/v1/chat/completions, GET /api/v1/models, and CORS preflight routes that support both streaming and non-streaming chat modes, tool calling with continuation, and agent-based responses. Supporting infrastructure includes internal mutations and queries for thread/message management, organization resolution, and PII scrubbing. Generation parameters are propagated throughout the agent response pipeline. The PR also adds a database index on the apikey table, rate-limiting rules for OpenAI chat operations, a comprehensive Python test suite, and improves CLI error handling.
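The streaming mode mentioned above follows the OpenAI server-sent-events convention: each chunk is a `data: <json>` frame followed by a blank line, and the stream ends with `data: [DONE]`. The sketch below illustrates that framing only. `buildChunk` and `framesForText` are hypothetical stand-ins; the PR's own `buildChatCompletionChunk` is referenced in the review but its body is not shown here.

```typescript
type FinishReason = "stop" | "length" | "tool_calls";

interface ChunkDelta {
  role?: "assistant";
  content?: string;
}

interface ChatCompletionChunk {
  id: string;
  object: "chat.completion.chunk";
  created: number;
  model: string;
  choices: { index: number; delta: ChunkDelta; finish_reason: FinishReason | null }[];
}

// Hypothetical chunk builder with the same argument order the review comment quotes.
function buildChunk(
  id: string,
  model: string,
  delta: ChunkDelta,
  finishReason: FinishReason | null,
  created: number,
): ChatCompletionChunk {
  return {
    id,
    object: "chat.completion.chunk",
    created,
    model,
    choices: [{ index: 0, delta, finish_reason: finishReason }],
  };
}

// One SSE frame: a "data:" line terminated by a blank line.
function toSseFrame(chunk: ChatCompletionChunk): string {
  return `data: ${JSON.stringify(chunk)}\n\n`;
}

const SSE_DONE = "data: [DONE]\n\n";

// Role chunk first, then one chunk per text piece, then an empty delta
// carrying the finish reason, then the [DONE] sentinel.
function framesForText(id: string, model: string, created: number, pieces: string[]): string[] {
  const frames = [toSseFrame(buildChunk(id, model, { role: "assistant" }, null, created))];
  for (const piece of pieces) {
    frames.push(toSseFrame(buildChunk(id, model, { content: piece }, null, created)));
  }
  frames.push(toSseFrame(buildChunk(id, model, {}, "stop", created)));
  frames.push(SSE_DONE);
  return frames;
}

const frames = framesForText("chatcmpl-example", "my-agent", 1775734714, ["Hel", "lo"]);
```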

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~55 minutes

Possibly related PRs

- feat(platform): add API key authentication and management #366: Introduces apikey table and related features; this PR adds a database index on the same table for key lookups.
- feat(platform): improve chat UX with persisted drafts and error handling #502: Modifies the agent response pipeline in generate_response.ts and internal_actions.ts; this PR extends the same functions with generation parameter support.
- feat(platform): add optimistic messages for existing chat threads #627: Adds error handling and message operations to runAgentGeneration in internal_actions.ts; this PR extends the same function with generation parameter threading.

🚥 Pre-merge checks | ✅ 2 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 60.71% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (2 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The pull request title clearly and accurately summarizes the main change: adding an OpenAI-compatible Chat Completions API layer to the platform. |

coderabbitai

Actionable comments posted: 8

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@services/platform/convex/lib/agent_response/generate_response.ts`:
- Around line 622-639: The follow-up generateText calls in the empty-tool-result
retry, the "length" continuation branch, and the timeout recovery branch are not
receiving generationParams so they fall back to model defaults; update the code
so every generateText invocation (including retries and continuation/recovery
paths referenced around the empty-tool-result retry, the "length" continuation,
and timeout recovery) merges or forwards the same generationParams (temperature,
maxTokens, topP, frequencyPenalty, presencePenalty, stopSequences) used in the
initial call—either by passing generationParams directly into each
generateText(...) call or by constructing a single merged params object (e.g.,
mergedGenerationParams) and using it everywhere to ensure max_tokens, stop, and
sampling settings are preserved across all branches.
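One way to read this suggestion is to bind the parameters once and route every call through the bound function, so that retry and continuation branches cannot forget them. The sketch below assumes a generic `generate` function; `bindGenerationParams` and `fakeGenerate` are hypothetical names, and the real `generateText` call in generate_response.ts takes more arguments than shown.

```typescript
// Field names follow the list in the review comment.
interface GenerationParams {
  temperature?: number;
  maxTokens?: number;
  topP?: number;
  frequencyPenalty?: number;
  presencePenalty?: number;
  stopSequences?: string[];
}

// Hypothetical wrapper: returns a function that merges the same params
// into every call. Per-call args are spread last, so a call site can
// still override a value explicitly.
function bindGenerationParams<A extends object, R>(
  generate: (args: A & GenerationParams) => R,
  params: GenerationParams,
): (args: A) => R {
  return (args: A) => generate({ ...params, ...args });
}

// Stand-in for the model call; it only records what it was given.
const calls: (GenerationParams & { prompt: string })[] = [];
const fakeGenerate = (args: GenerationParams & { prompt: string }): { text: string } => {
  calls.push(args);
  return { text: "ok" };
};

const generate = bindGenerationParams<{ prompt: string }, { text: string }>(fakeGenerate, {
  temperature: 0.1,
  maxTokens: 15,
  stopSequences: ["END"],
});

// The three branches named in the comment all go through the bound function.
generate({ prompt: "initial call" });
generate({ prompt: "empty-tool-result retry" });
generate({ prompt: "length continuation" });
```

Binding once keeps the branches from drifting apart: a new recovery path added later inherits the limits without any extra plumbing.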

In `@services/platform/convex/openai_compat/http_actions.ts`:
- Around line 356-371: The chat call currently passes only
lastUserMessage.content to
internal.openai_compat.internal_actions.chatViaOpenAI, which drops prior
conversation turns; change the payload in ctx.runAction to pass the full
conversation history (e.g., the messages array or reconstructed list of
system/assistant/user messages) rather than just lastUserMessage.content,
preserving threadId, enableStreaming/shouldStream, generationParams and
responseFormat so multi-turn context is forwarded to chatViaOpenAI.
- Around line 766-785: The code currently treats any terminal
persistentStreaming state as a normal "stop" by always calling
buildChatCompletionChunk(..., 'stop', ...); update the logic in the polling loop
to capture the actual terminal status (e.g., const terminalStatus = body.status)
and when creating finalChunk pass a finish_reason that reflects that status
(e.g., 'stop' for 'done', 'error' for 'error', and 'timeout' or 'length' for
'timeout'), and include any available error/timeout details if present; modify
the call to buildChatCompletionChunk(completionId, model, {}, finishReason,
created) so streaming clients can distinguish successful completions from
failures.
- Around line 184-190: convertToModelMessages currently calls JSON.parse on
tc.function.arguments (from msg.tool_calls) without validation, causing a thrown
error to bubble up as a 500; change convertToModelMessages to defensively parse
tc.function.arguments inside a try/catch, and when JSON.parse fails throw an
invalid_request_error mapped to a 400 (e.g., construct and throw the same error
type used elsewhere for request validation failures so callers see a 400),
referencing msg.tool_calls and tc.function.arguments so malformed client input
on continuation requests is rejected with a 400 instead of producing a server
error.
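The last item above, defensive parsing of tool-call arguments, could look roughly like the following. `InvalidRequestError` and `parseToolCallArguments` are hypothetical names for illustration; the PR's actual error type for request validation is not shown in this conversation.

```typescript
// Hypothetical error type carrying the HTTP status the handler should return.
class InvalidRequestError extends Error {
  readonly status: number = 400;
  readonly type: string = "invalid_request_error";

  constructor(message: string) {
    super(message);
    this.name = "InvalidRequestError";
    // Keep instanceof working even when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Shape of an OpenAI assistant tool call as sent back by the client.
interface OpenAiToolCall {
  id: string;
  type: "function";
  function: { name: string; arguments: string };
}

// Parses client-supplied arguments and turns malformed input into a
// 400-class error instead of letting a SyntaxError surface as a 500.
function parseToolCallArguments(tc: OpenAiToolCall): Record<string, unknown> {
  let parsed: unknown;
  try {
    parsed = JSON.parse(tc.function.arguments);
  } catch {
    throw new InvalidRequestError(
      `tool call ${tc.id}: function.arguments is not valid JSON`,
    );
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new InvalidRequestError(
      `tool call ${tc.id}: function.arguments must be a JSON object`,
    );
  }
  return parsed as Record<string, unknown>;
}

const parsedArgs = parseToolCallArguments({
  id: "call_1",
  type: "function",
  function: { name: "get_weather", arguments: '{"city":"Paris"}' },
});
```

The HTTP handler can then catch `InvalidRequestError` and respond with its `status` and `type`, leaving every other exception as a genuine server error.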

In `@services/platform/convex/openai_compat/internal_actions.ts`:
- Around line 230-237: The code only scrubs args.message via scrubMessagePii,
but downstream flow (e.g., conversationMessages forwarded into streamText on
tool continuations) allows unsanitized client-supplied transcript to bypass PII
scrubbing; update the logic in internal_actions.ts to run scrubMessagePii over
the entire conversation transcript (e.g., the conversationMessages array/string
that is passed into streamText and any continuation payloads) rather than only
args.message, and ensure all places noted (including the similar block around
lines 301-309) replace uses of raw conversation messages with the returned
sanitized value before forwarding to streamText or other continuation handlers.
- Around line 272-282: The current mutation call to
internal.openai_compat.internal_mutations.createThreadAndSaveMessage is
re-saving the original user `message` when `hasConversation` is true, causing
duplication; change the payload so you save the tool/assistant output (or an
empty/placeholder) instead of the original user turn when `hasConversation` is
true — e.g., compute a `messageToSave` (use the assistant/tool-generated text or
omit saving) and pass that into the mutation instead of `message` for the block
with `createThreadAndSaveMessage` and the analogous block around lines 296-304;
ensure the same `messageToSave` logic is used before calling `streamText` so
stored thread history matches the actual OpenAI conversation.
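The first point, scrubbing the whole transcript rather than only the latest message, can be sketched as a map over the conversation. The `redactEmails` function below is a deliberately simple stand-in that only redacts email addresses; the behaviour of the real `scrubMessagePii` is not shown in this PR conversation, and `scrubConversation` is a hypothetical helper name.

```typescript
type Scrubber = (text: string) => string;

interface ConversationMessage {
  role: "system" | "user" | "assistant" | "tool";
  content: string;
}

// Applies the scrubber to every message and returns new objects, so the
// caller cannot accidentally forward the raw transcript afterwards.
function scrubConversation(
  messages: ConversationMessage[],
  scrub: Scrubber,
): ConversationMessage[] {
  return messages.map((m) => ({ ...m, content: scrub(m.content) }));
}

// Stand-in scrubber for illustration only (assumption, not the PR's logic).
const redactEmails: Scrubber = (text) =>
  text.replace(/[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g, "[REDACTED_EMAIL]");

const conversation: ConversationMessage[] = [
  { role: "user", content: "Email alice@example.com about it" },
  { role: "assistant", content: "Looking up the contact." },
  { role: "tool", content: '{"contact":"bob@example.org"}' },
];

// Pass this sanitized copy, not `conversation`, to the model call.
const scrubbed = scrubConversation(conversation, redactEmails);
```

Tool results are included on purpose: on continuation requests they are client-supplied text just like user turns.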

In `@services/platform/convex/openai_compat/response_format.ts`:
- Line 18: The FinishReason union type should include additional OpenAI finish
reasons for forward compatibility; update the FinishReason type definition (the
`FinishReason` alias in response_format.ts) to add 'content_filter' to the union
(e.g., type FinishReason = 'stop' | 'length' | 'tool_calls' | 'content_filter')
so the type accurately represents possible OpenAI API responses.
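A sketch of the widened union, together with a small normalizer, is shown below. The hyphenated inputs (`tool-calls`, `content-filter`) are an assumption about AI SDK-style finish reasons, and falling back to `stop` for unrecognised values is an illustrative choice rather than the PR's behaviour.

```typescript
// The union from the review comment, including content_filter.
type FinishReason = "stop" | "length" | "tool_calls" | "content_filter";

// Hypothetical normalizer: maps an upstream finish reason onto the
// OpenAI-style union so the response type never carries an unknown string.
function toOpenAiFinishReason(raw: string | null | undefined): FinishReason {
  switch (raw) {
    case "stop":
      return "stop";
    case "length":
      return "length";
    case "tool_calls":
    case "tool-calls": // assumed upstream spelling
      return "tool_calls";
    case "content_filter":
    case "content-filter": // assumed upstream spelling
      return "content_filter";
    default:
      // Assumption: treat anything unrecognised as a normal stop.
      return "stop";
  }
}

const normalized = toOpenAiFinishReason("content-filter");
```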

In `@services/platform/scripts/test_openai_compat.py`:
- Around line 127-139: The named tests (e.g., the "4. Generation params
(temperature=0.1, max_tokens=15)" block that calls
client.chat.completions.create and the json_object and stop test blocks) are too
permissive—update each to assert the actual feature: for generation params
assert the response finish_reason (or token-count) reflects truncation at
max_tokens and that the returned text length is bounded by max_tokens (and/or
that finish_reason == "length"); for the json_object test call json.loads(...)
on r.choices[0].message.content and assert it parses without exception and
matches expected keys/types; for the stop-sequence test, request a known stop
token and assert the returned text ends before or does not contain content after
that stop sequence (and/or that finish_reason indicates stop), using the same
client.chat.completions.create call sites to locate and modify the assertions.
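The stricter checks described above can be expressed as small predicates over the response object. The test script itself is Python; the sketch below states the same checks in TypeScript, using a simplified `CompletionLike` shape and hypothetical helper names. Each helper returns a list of failure messages, which is empty when the check passes.

```typescript
interface CompletionLike {
  choices: { message: { content: string | null }; finish_reason: string }[];
  usage?: { completion_tokens: number };
}

// max_tokens: the response should report truncation and stay within the limit.
function checkMaxTokens(r: CompletionLike, maxTokens: number): string[] {
  const choice = r.choices[0];
  if (!choice) return ["no choices returned"];
  const failures: string[] = [];
  if (choice.finish_reason !== "length") {
    failures.push(`expected finish_reason "length", got "${choice.finish_reason}"`);
  }
  if (r.usage && r.usage.completion_tokens > maxTokens) {
    failures.push(`completion_tokens ${r.usage.completion_tokens} exceeds ${maxTokens}`);
  }
  return failures;
}

// json_object: the content must parse and contain the expected keys.
function checkJsonObject(r: CompletionLike, requiredKeys: string[]): string[] {
  const content = r.choices[0]?.message.content;
  if (typeof content !== "string") return ["no message content"];
  let parsed: unknown;
  try {
    parsed = JSON.parse(content);
  } catch {
    return ["content is not valid JSON"];
  }
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    return ["content is not a JSON object"];
  }
  const obj = parsed as Record<string, unknown>;
  return requiredKeys.filter((k) => !(k in obj)).map((k) => `missing key "${k}"`);
}

// stop: the stop sequence itself must not appear in the returned text.
function checkStop(r: CompletionLike, stop: string): string[] {
  const choice = r.choices[0];
  if (!choice) return ["no choices returned"];
  const failures: string[] = [];
  if ((choice.message.content ?? "").includes(stop)) {
    failures.push(`content contains stop sequence "${stop}"`);
  }
  if (choice.finish_reason !== "stop") {
    failures.push(`expected finish_reason "stop", got "${choice.finish_reason}"`);
  }
  return failures;
}

// Hand-written sample response, not real endpoint output.
const truncatedSample: CompletionLike = {
  choices: [{ message: { content: "Once upon a" }, finish_reason: "length" }],
  usage: { completion_tokens: 15 },
};
const maxTokenFailures = checkMaxTokens(truncatedSample, 15);
```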

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 5cf12ca9-1966-4c4d-bcf7-a8547676735e

📥 Commits

Reviewing files that changed from the base of the PR and between 69db2b5 and e1f80d8.

⛔ Files ignored due to path filters (1)

services/platform/convex/_generated/api.d.ts is excluded by !**/_generated/**

📒 Files selected for processing (16)

services/platform/convex/betterAuth/schema.ts
services/platform/convex/http.ts
services/platform/convex/lib/agent_chat/internal_actions.ts
services/platform/convex/lib/agent_chat/start_agent_chat.ts
services/platform/convex/lib/agent_chat/types.ts
services/platform/convex/lib/agent_response/generate_response.ts
services/platform/convex/lib/agent_response/types.ts
services/platform/convex/lib/rate_limiter/index.ts
services/platform/convex/openai_compat/http_actions.ts
services/platform/convex/openai_compat/internal_actions.ts
services/platform/convex/openai_compat/internal_mutations.ts
services/platform/convex/openai_compat/internal_queries.ts
services/platform/convex/openai_compat/response_format.ts
services/platform/convex/openai_compat/tool_conversion.ts
services/platform/scripts/test_openai_compat.py
tools/cli/src/index.ts

…AI compat endpoint Support two modes in the Chat Completions API: agent mode (server-side tools, async generation) and client tool mode (client-defined tools, direct streamText). Thread generation parameters (temperature, max_tokens, top_p, etc.) from the OpenAI request through to the underlying LLM call.

…tion history and tool_choice Convert full OpenAI message history to AI SDK ModelMessage format for multi-round tool calling instead of extracting only tool messages. Add tool_choice mapping (auto/none/required/specific function) from OpenAI format to AI SDK format. Update tool call extraction for AI SDK v6 content-based step format. Also expand e2e test coverage from 9 to 16 tests and fix CLI exit handling to use process.exit(1) and parseAsync().
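The tool_choice mapping described in this commit message can be sketched as below. The OpenAI request accepts `auto`, `none`, `required`, or a specific function object, and the AI SDK's `toolChoice` option accepts the same three strings or `{ type: "tool", toolName }`. `mapToolChoice` is a hypothetical name; the PR's own function is not shown in this conversation.

```typescript
// OpenAI request format for tool_choice.
type OpenAiToolChoice =
  | "auto"
  | "none"
  | "required"
  | { type: "function"; function: { name: string } };

// AI SDK format for the toolChoice option.
type SdkToolChoice =
  | "auto"
  | "none"
  | "required"
  | { type: "tool"; toolName: string };

// Hypothetical mapper: string modes pass through unchanged, and a
// specific function selection is rewritten into the SDK's object shape.
function mapToolChoice(choice: OpenAiToolChoice | undefined): SdkToolChoice | undefined {
  if (choice === undefined) return undefined; // let the SDK apply its default
  if (typeof choice === "string") return choice;
  return { type: "tool", toolName: choice.function.name };
}

const mappedChoice = mapToolChoice({ type: "function", function: { name: "get_weather" } });
```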

- Add comprehensive API documentation for OpenAI-compatible endpoints to docs/api-reference.md with Python, Node.js, curl examples - Inject /api/v1/chat/completions and /api/v1/models paths into the generated OpenAPI spec with full request/response schemas - Update convex-helpers patch version 0.1.113 → 0.1.114 - Regenerate public/openapi.json with OpenAI Compatible tag and schemas Closes #1199

greptile-apps Bot reviewed Apr 9, 2026

coderabbitai Bot requested changes Apr 9, 2026

larryro added 8 commits April 9, 2026 20:01

feat(platform): add OpenAI Chat Completions API compatibility layer

39ac295

Add /api/v1/chat/completions and /api/v1/models endpoints that follow the OpenAI API schema, enabling existing OpenAI integrations to work with the platform without modification. Closes #1199

chore(platform): update auto-generated api.d.ts

6209498

refactor(platform): output only public API endpoints in openapi.json

9fdfd17

Replace the full 263-endpoint spec (452KB) with a curated public spec (12.5KB) containing only the OpenAI-compatible endpoints. The Swagger UI at /docs now loads instantly and shows only externally relevant APIs.

fix(platform): fix swagger UI page scroll on /docs route

5303da2

The #root div had overflow:clip which prevented scrolling on the standalone docs page. Override overflow and height for html, body, and #root when the swagger-ui-standalone class is present.

style(platform): fix formatting in generate-openapi script and openap…

9c0a2d8

…i.json

larryro force-pushed the feat/1199-openai-chat-completions-compat branch from 41b4dc0 to 9c0a2d8 Compare April 9, 2026 12:02

coderabbitai Bot approved these changes Apr 9, 2026

fix(platform): replace useless spread with Array.from in generate-ope…

b6c2436

…napi

larryro merged commit 33e3fe6 into main Apr 9, 2026
24 checks passed

larryro deleted the feat/1199-openai-chat-completions-compat branch April 9, 2026 12:13

larryro mentioned this pull request Apr 9, 2026

Platform API is not compatible with OpenAI Chat Completions API schema #1199

Closed

This was referenced Apr 10, 2026

feat(platform): add REST API with OpenAPI spec and API key auth #1295

Merged

feat(platform): refactor OpenAI-compat API as direct model gateway #1440

Merged

coderabbitai Bot mentioned this pull request Apr 19, 2026

feat(platform): time-based usage analytics in governance #1575

Merged

8 tasks

larryro merged 9 commits into main from feat/1199-openai-chat-completions-compat
