feat(platform): refactor OpenAI-compat API as direct model gateway #1440
…1428)

Replace agent-based routing with direct model access. The `model` field now accepts real provider model IDs (e.g., `anthropic/claude-sonnet-4.6`) instead of agent slugs, routing requests directly to the configured provider.

Changes:
- Rewrite `/api/v1/models` to return real provider models with `owned_by`
- Add `chatDirectModel` internal action: direct model resolution with full governance (PII scrubbing, mandatory system prompt, rate limiting, usage ledger, audit log, circuit breaker + failover)
- Rewrite `chatCompletionsHandler` to use direct model mode
- Add `stream_options.include_usage` support per OpenAI spec
- Return real token usage in responses (non-streaming and streaming)
- Split OpenAPI `ChatMessage` into role-specific schemas with discriminator
- Backport Citation schema into `generate-openapi.ts`
- Strip `$`-prefixed keys from tool parameters (Convex reserved prefix)
- Remove dead agent-mode code (pollOpenAIResponse, streamOpenAIResponse)
- Add unit tests for usage in response builders
- Add AI SDK v6 integration test script
Replace agent slug examples with real provider model IDs in the API documentation and OpenAPI schema examples.
Remove outdated agent mode / client tool mode descriptions. The endpoint is now a direct model gateway.
📝 Walkthrough
This PR refactors the OpenAI-compatible chat completions endpoint from a two-mode implementation (persistent agent streaming vs. client tool mode) to a single direct model gateway architecture. It introduces usage tracking through
Estimated code review effort
🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 5
✅ Passed checks (5 passed)
Direct model mode should be stateless — API calls should not appear in the UI thread history. Replace createThreadAndSaveMessage with a transient ID used only for usage tracking and audit log correlation.
Actionable comments posted: 11
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (4)
services/platform/convex/openai_compat/http_actions.ts (2)
414-433: ⚠️ Potential issue | 🟠 Major
Invalid model IDs now fall through as `500 server_error`.
This refactor switched the execution path to `chatDirectModel`, but the downstream error handling is still the old agent-era matcher. Missing-model failures from the new path (`Model "..." not found...`) will miss the special case and return a 500 instead of an OpenAI-style `404 model_not_found`.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@services/platform/convex/openai_compat/http_actions.ts` around lines 414 - 433, The new path calls internal.openai_compat.internal_actions.chatDirectModel but missing-model errors from that action (e.g. messages like 'Model "..." not found' or an internal code indicating missing model) are not translated and fall through to handleChatError causing a 500; update the catch to detect the chatDirectModel missing-model failure and route it to the OpenAI-style model-not-found handler (either call the existing handleModelNotFound helper or extend handleChatError to inspect errors coming from chatDirectModel for the missing-model signature/code and return a 404 model_not_found response) so missing models produce the correct OpenAI-style 404.
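One way the missing-model translation described above might look is sketched below. This is a minimal illustration, not the repository's `handleChatError`: the helper name `toOpenAIError`, the regex, and the response texts are assumptions, and the only detail taken from the review is the `Model "..." not found` message shape.

```typescript
// Hypothetical helper; the real handleChatError in http_actions.ts may differ.
interface OpenAIErrorBody {
  error: {
    message: string;
    type: string;
    param: string | null;
    code: string | null;
  };
}

interface MappedError {
  status: number;
  body: OpenAIErrorBody;
}

// Assumed message signature, based on the text quoted in the review.
// Not anchored at the start, in case the runtime prefixes the message.
const MODEL_NOT_FOUND = /Model "[^"]*" not found/;

function toOpenAIError(error: unknown, model: string): MappedError {
  const message = error instanceof Error ? error.message : String(error);

  if (MODEL_NOT_FOUND.test(message)) {
    // Missing model: answer with a 404 in the OpenAI error envelope.
    return {
      status: 404,
      body: {
        error: {
          message: `The model '${model}' does not exist or you do not have access to it.`,
          type: 'invalid_request_error',
          param: null,
          code: 'model_not_found',
        },
      },
    };
  }

  // Anything unrecognized stays a generic server error.
  return {
    status: 500,
    body: {
      error: {
        message: 'Internal server error',
        type: 'server_error',
        param: null,
        code: null,
      },
    },
  };
}
```

Matching on message text is brittle; if `chatDirectModel` can surface a structured error code instead, checking that code would likely be the sturdier choice.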
706-731: ⚠️ Potential issue | 🟠 Major
`/api/v1/models` is no longer organization-scoped.
The handler authenticates the caller, then calls `getAllModelIds({})` without resolving membership or `X-Organization-Slug`. In `services/platform/convex/providers/file_actions.ts`, that falls back to `'default'`, so users in non-default orgs can list a different catalog than the one `/api/v1/chat/completions` will actually route against.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@services/platform/convex/openai_compat/http_actions.ts` around lines 706 - 731, modelsListHandler currently authenticates but then calls internal.providers.file_actions.getAllModelIds with an empty object, causing it to fall back to the 'default' org; change the handler to resolve the caller's organization (using authenticateRequest result or the X-Organization-Slug header / membership info available on ctx after auth) and pass that organization identifier into getAllModelIds so the model list matches the org that /api/v1/chat/completions will use; update the call site in modelsListHandler to supply { organization: resolvedOrg } (or the proper param name expected by getAllModelIds) instead of {}.
services/platform/public/openapi.json (2)
4245-4268: ⚠️ Potential issue | 🟠 Major
Add `required` fields to `ToolCall` schema.
The ToolCall object currently allows an empty `{}` to validate, breaking client code that correlates `tool_call_id` back to the assistant turn. Per OpenAI's API specification, `id`, `type`, and `function` are required fields.
Proposed fix
 "ToolCall": {
   "type": "object",
+  "required": [
+    "id",
+    "type",
+    "function"
+  ],
   "properties": {
     "id": {
       "type": "string"
     },
     "type": {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@services/platform/public/openapi.json` around lines 4245 - 4268, Update the OpenAPI schema for the ToolCall object to require the id, type, and function properties so an empty object no longer validates; modify the "ToolCall" schema (the object with "properties" containing "id", "type", and "function") to include a "required" array with ["id","type","function"] and ensure the existing "function" property definition remains unchanged.
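To illustrate why the `required` list matters to clients, here is a small sketch of a runtime guard that mirrors the tightened schema. The `ToolCall` interface and `isToolCall` function are illustrative names, not code from this PR; the `name` and `arguments` fields follow OpenAI's tool call shape and are assumed to match the existing `function` property definition.

```typescript
// Illustrative client-side type; assumed to mirror the OpenAPI ToolCall schema.
interface ToolCall {
  id: string;
  type: 'function';
  function: { name: string; arguments: string };
}

// Rejects `{}` and any object missing id, type, or function.
function isToolCall(value: unknown): value is ToolCall {
  if (typeof value !== 'object' || value === null) return false;
  const v = value as Record<string, unknown>;

  const fn = v.function;
  if (typeof fn !== 'object' || fn === null) return false;
  const f = fn as Record<string, unknown>;

  return (
    typeof v.id === 'string' &&
    v.type === 'function' &&
    typeof f.name === 'string' &&
    typeof f.arguments === 'string'
  );
}
```

With `required` present in the spec, generated clients can treat `id` as always defined, so a guard like this would only be needed for untyped input.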
4085-4238: ⚠️ Potential issue | 🟠 Major
Schemas restrict content to string only and have incorrect nullability; user messages cannot support multimodal content.
OpenAI's Chat Completions API specifies that `content` can be a string OR an array of content parts (supporting multimodal inputs: `text`, `image_url`, `input_audio`, etc.). Additionally:
- Assistant `content` is optional when `tool_calls` is present
- Tool message `content` is optional, but the schema contradictorily marks it both required and nullable
- User messages must support arrays for multimodal capability (vision models)
The current schemas restrict all message types to `"type": "string"` only, rejecting valid multimodal payloads, and the required/nullable patterns for assistant and tool messages don't align with OpenAI's actual API behavior.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@services/platform/public/openapi.json` around lines 4085 - 4238, Update the message schemas to support multimodal content and correct nullability/required rules: introduce a reusable ContentPart schema (or inline oneOf) and change ChatMessageAssistant.content, ChatMessageUser.content, and ChatMessageTool.content to accept either a string OR an array of ContentPart (e.g., oneOf: [{type: string}, {type: array, items: ContentPart}]); remove content from the required list for ChatMessageAssistant and make it nullable (since assistant content may be omitted when tool_calls is present), and remove content from the required list for ChatMessageTool while keeping tool_call_id required and content nullable; ensure ChatMessageUser still requires content but accepts the string|array form to enable multimodal user inputs.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@services/platform/convex/openai_compat/http_actions.ts`:
- Around line 333-360: When messages already represent a tool-result
continuation (isContinuation from hasToolInteraction), the call to
handleDirectModelMode still passes lastUserMessage: lastUserMessage.content (the
earlier saved variable), causing the previous user prompt to be re-saved;
instead, derive lastUserMessage from the continuation context. Update the code
around isContinuation/conversationMessages to set the value passed to
handleDirectModelMode to the most recent user message in conversationMessages
(e.g., find the last item with role === 'user' in conversationMessages generated
by convertToModelMessages) when isContinuation is true, otherwise keep using
lastUserMessage.content; ensure the change references the existing symbols
isContinuation, conversationMessages, lastUserMessage, convertToModelMessages,
hasToolInteraction, and handleDirectModelMode.
- Around line 366-377: stripDollarKeys is removing $-prefixed JSON Schema
keywords (like $ref, $defs, $schema) that OpenAI tool parameter schemas rely on;
stop stripping them when preparing/sending tool parameter schemas to OpenAI by
removing any calls to stripDollarKeys for tool parameter serialization/HTTP
request construction and ensure the original object is sent unchanged (leave
stripDollarKeys only for Convex-persisted data if needed). Locate usages of
stripDollarKeys in the code path that builds OpenAI tool parameter payloads and
remove or conditionalize those calls so $-prefixed keys are preserved.
In `@services/platform/convex/openai_compat/internal_actions.ts`:
- Around line 543-582: The code builds genParams and then calls streamText but
never forwards the requested responseFormat, so JSON-object responses are lost;
update the streamText call (where streamText is invoked) to include the
responseFormat coming from the request (e.g. prefer args.responseFormat if
present otherwise genParams.responseFormat) and pass it through or map it to the
streamText parameter name (responseFormat or response_format) used by
streamText; ensure you coerce/validate the value the same way http_actions.ts
does (accept the enum/string shape) and add the conditional spread like
...(responseFormat != null && { responseFormat }) so the direct-model path
honors {"type":"json_object"} requests.
- Around line 388-455: The code is billing and auditing using the pre-resolution
modelId; update usages to the resolved provider model by passing
resolved.modelData.modelId into estimateCostCents and replacing model: modelId
in the audit log metadata (and any other places in this block that reference
modelId for cost/audit) so costs and audit entries reflect the actual resolved
model; ensure you still fall back safely if resolved or resolved.modelData is
undefined.
In `@services/platform/public/openapi.json`:
- Around line 94-99: The OpenAPI operation with operationId "listModels"
(endpoint /api/v1/models) incorrectly describes returning "available agents" and
referencing `visibleInChat`; update the description to state it lists provider
models (e.g., provider-specific model entries) and align it with the ModelList
schema — remove mention of agents and `visibleInChat`, clarify it returns
provider model metadata for the direct model gateway instead. Ensure the
summary/description and any examples reflect "models" not "agents" so generated
docs match the new gateway behavior.
- Around line 2919-2922: The schema for the "model" property currently documents
bare model IDs; update its description and example to use provider-prefixed
model IDs (e.g., "anthropic/claude-sonnet-4-20250514" or "openai/gpt-4o") so it
matches the Quick Start and ModelList conventions. Locate the "model" property
object (the "type", "description", "example" fields) in the OpenAPI JSON and
change the description text to mention provider-prefixed IDs and update the
"example" string to a provider-prefixed ID; ensure any other occurrences of the
bare example in the same schema are updated consistently.
- Around line 4111-4147: The Citation schema currently allows empty objects;
require at least the index so clients can resolve [N] markers by adding a
required field to the Citation schema: add "required": ["index"] under the
"Citation" object and also consider adding "additionalProperties": false (or
"minProperties": 1) to prevent {} from validating; update the "Citation" schema
in openapi.json (the "Citation" object and its "index" property) accordingly.
- Around line 4027-4049: The OpenAPI ChatMessage union and discriminator are
missing the `developer` role which will cause validation failures for requests
using that role; update the "ChatMessage" oneOf to include a new
ChatMessageDeveloper schema, add "developer":
"#/components/schemas/ChatMessageDeveloper" to the discriminator.mapping
alongside system/user/assistant/tool, and create a matching
components.schemas.ChatMessageDeveloper definition (mirror the structure of
ChatMessageSystem but with role enum/value "developer") so the spec accepts
developer-role messages.
In `@services/platform/scripts/generate-openapi.ts`:
- Around line 148-153: The example value for the OpenAPI "model" schema is
inconsistent with the IDs returned by GET /api/v1/models and the provider
examples; update the example in the model property inside generate-openapi.ts
(the model: { ... } schema) to use the fully-qualified provider-prefixed form
(e.g., "anthropic/claude-sonnet-4.6" or whatever exact shape your /api/v1/models
returns) so generated docs and provider-examples share the same identifier
format.
In `@services/platform/scripts/test-openai-compat.ts`:
- Around line 144-157: The test harness currently logs errors from main() but
still exits with a zero status; modify the invocation of main() so that any
rejection causes the process to exit non‑zero: in
services/platform/scripts/test-openai-compat.ts update the main().catch(...)
handler used where main() is called to log the error (console.error or
processLogger) and then call process.exit(1) so testListModels,
testGenerateText, testStreamText, testToolCalling, or testMultiTurnToolCalling
failures fail the CI run.
- Around line 13-20: The API key is hardcoded in API_KEY and injected into
headers when calling createOpenAICompatible; move this secret into an
environment variable (e.g., process.env.OPENAI_API_KEY) and update the provider
initialization to read that env var instead of the literal string, add a runtime
check that throws or logs a clear error if the env var is missing, and remove
the hardcoded API_KEY constant so no credential remains in source control.
---
Outside diff comments:
In `@services/platform/convex/openai_compat/http_actions.ts`:
- Around line 414-433: The new path calls
internal.openai_compat.internal_actions.chatDirectModel but missing-model errors
from that action (e.g. messages like 'Model "..." not found' or an internal code
indicating missing model) are not translated and fall through to handleChatError
causing a 500; update the catch to detect the chatDirectModel missing-model
failure and route it to the OpenAI-style model-not-found handler (either call
the existing handleModelNotFound helper or extend handleChatError to inspect
errors coming from chatDirectModel for the missing-model signature/code and
return a 404 model_not_found response) so missing models produce the correct
OpenAI-style 404.
- Around line 706-731: modelsListHandler currently authenticates but then calls
internal.providers.file_actions.getAllModelIds with an empty object, causing it
to fall back to the 'default' org; change the handler to resolve the caller's
organization (using authenticateRequest result or the X-Organization-Slug header
/ membership info available on ctx after auth) and pass that organization
identifier into getAllModelIds so the model list matches the org that
/api/v1/chat/completions will use; update the call site in modelsListHandler to
supply { organization: resolvedOrg } (or the proper param name expected by
getAllModelIds) instead of {}.
In `@services/platform/public/openapi.json`:
- Around line 4245-4268: Update the OpenAPI schema for the ToolCall object to
require the id, type, and function properties so an empty object no longer
validates; modify the "ToolCall" schema (the object with "properties" containing
"id", "type", and "function") to include a "required" array with
["id","type","function"] and ensure the existing "function" property definition
remains unchanged.
- Around line 4085-4238: Update the message schemas to support multimodal
content and correct nullability/required rules: introduce a reusable ContentPart
schema (or inline oneOf) and change ChatMessageAssistant.content,
ChatMessageUser.content, and ChatMessageTool.content to accept either a string
OR an array of ContentPart (e.g., oneOf: [{type: string}, {type: array, items:
ContentPart}]); remove content from the required list for ChatMessageAssistant
and make it nullable (since assistant content may be omitted when tool_calls is
present), and remove content from the required list for ChatMessageTool while
keeping tool_call_id required and content nullable; ensure ChatMessageUser still
requires content but accepts the string|array form to enable multimodal user
inputs.
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro
Run ID: 60455d24-428d-403e-8902-78d1d3df0ac4
📒 Files selected for processing (9)
- services/platform/convex/openai_compat/http_actions.ts
- services/platform/convex/openai_compat/internal_actions.ts
- services/platform/convex/openai_compat/internal_queries.ts
- services/platform/convex/openai_compat/response_format.test.ts
- services/platform/convex/openai_compat/response_format.ts
- services/platform/convex/providers/file_actions.ts
- services/platform/public/openapi.json
- services/platform/scripts/generate-openapi.ts
- services/platform/scripts/test-openai-compat.ts
  const isContinuation = hasToolInteraction(messages);
  const conversationMessages = isContinuation
    ? convertToModelMessages(messages)
    : undefined;

  // -----------------------------------------------------------------------
  // Agent mode: server-side tools, async generation via persistent stream
  // -----------------------------------------------------------------------
  let chatResult: { threadId: string; streamId: string };
  try {
    chatResult = await ctx.runAction(
      internal.openai_compat.internal_actions.chatViaOpenAI,
      {
        agentSlug: model,
        organizationId: orgInfo.organizationId,
        userId: user.userId,
        userEmail: user.email,
        userName: user.name,
        message: lastUserMessage.content,
        threadId,
        enableStreaming: shouldStream,
        generationParams,
        responseFormat,
      },
    );
  } catch (error) {
    return handleChatError(error, model);
  }
  // Strip $-prefixed keys from tool parameters (Convex reserves $ prefix)
  const tools = body.tools?.map((t) => ({
    ...t,
    function: {
      ...t.function,
      parameters: t.function.parameters
        ? stripDollarKeys(t.function.parameters)
        : undefined,
    },
  }));

  if (shouldStream) {
    return streamOpenAIResponse(ctx, chatResult, model);
  }
  return pollOpenAIResponse(ctx, chatResult, model);
  return handleDirectModelMode(ctx, {
    model,
    messages,
    lastUserMessage: lastUserMessage.content,
    tools,
    toolChoice: body.tool_choice,
    shouldStream,
    includeUsage,
    threadId,
    generationParams,
    responseFormat,
    conversationMessages,
Tool-result continuations persist the wrong user message.
When messages already contains an assistant tool call plus a trailing tool result, conversationMessages carries the correct continuation context, but this code still forwards lastUserMessage to chatDirectModel. That action later saves message to the thread, so every tool round-trip duplicates the earlier user prompt instead of recording the current continuation state.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@services/platform/convex/openai_compat/http_actions.ts` around lines 333 -
360, When messages already represent a tool-result continuation (isContinuation
from hasToolInteraction), the call to handleDirectModelMode still passes
lastUserMessage: lastUserMessage.content (the earlier saved variable), causing
the previous user prompt to be re-saved; instead, derive lastUserMessage from
the continuation context. Update the code around
isContinuation/conversationMessages to set the value passed to
handleDirectModelMode to the most recent user message in conversationMessages
(e.g., find the last item with role === 'user' in conversationMessages generated
by convertToModelMessages) when isContinuation is true, otherwise keep using
lastUserMessage.content; ensure the change references the existing symbols
isContinuation, conversationMessages, lastUserMessage, convertToModelMessages,
hasToolInteraction, and handleDirectModelMode.
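The lookup that this prompt describes could be sketched roughly as follows. `SimpleMessage` and `lastUserContent` are hypothetical stand-ins: the real element type returned by `convertToModelMessages` is not shown in this review, so the message shape here is an assumption.

```typescript
// Assumed, simplified message shape; the actual model message type may differ.
type Role = 'system' | 'user' | 'assistant' | 'tool';

interface SimpleMessage {
  role: Role;
  content: string | null;
}

// Walk backwards and return the content of the most recent user message.
function lastUserContent(
  messages: readonly SimpleMessage[],
): string | undefined {
  for (let i = messages.length - 1; i >= 0; i--) {
    const m = messages[i];
    if (m && m.role === 'user' && m.content !== null) return m.content;
  }
  return undefined;
}

// A tool-result continuation: user prompt, assistant tool call, tool result.
const history: SimpleMessage[] = [
  { role: 'user', content: 'What is the weather in Paris?' },
  { role: 'assistant', content: null },
  { role: 'tool', content: '{"tempC": 18}' },
];

console.log(lastUserContent(history));
```

In a continuation shaped like `history`, the most recent user message is still the original prompt, so this lookup alone may not prevent the duplicate save; skipping persistence for continuations might be the more direct fix.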
/** Recursively strip keys starting with '$' (Convex reserves this prefix). */
function stripDollarKeys(obj: unknown): unknown {
  if (Array.isArray(obj)) return obj.map(stripDollarKeys);
  if (typeof obj === 'object' && obj !== null) {
    const result: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(obj)) {
      if (!k.startsWith('$')) result[k] = stripDollarKeys(v);
    }
    return result;
  }
  return obj;
}
🧩 Analysis chain
🌐 Web query:
In JSON Schema, what are $ref, $defs, and $schema used for, and does removing $-prefixed keys change schema behavior or validity?
💡 Result:
$ref is used to reference another schema (internal via JSON Pointer like #/$defs/name or external via URI), replacing the current subschema with the referenced one during validation. $defs provides a standardized location to define reusable subschemas within the same document, which do not directly affect validation but serve as targets for $ref; it replaced 'definitions' in older drafts. $schema declares the JSON Schema dialect/version (e.g., "https://json-schema.org/draft/2020-12/schema") at the root, identifying the meta-schema for validating the schema itself and guiding implementations on semantics. Removing $-prefixed keys changes behavior: removing $schema may cause implementations to assume a default dialect or fail schema validation; removing $defs makes referenced subschemas unavailable, causing $ref resolution failures; removing $ref replaces the referenced validation logic with nothing (true schema), invalidating intended constraints. These are core keywords (not applicators in older drafts), so removal affects validity against the declared meta-schema and runtime behavior. Unknown keywords are typically ignored in instance validation but fail schema (meta) validation.
Citations:
- 1: https://json-schema.org/understanding-json-schema/reference/schema?highlight=dynamicref
- 2: https://json-schema.org/understanding-json-schema/structuring
- 3: https://mintlify.com/json-schema-org/json-schema-spec/reference/core/defs
- 4: https://json-schema.org/understanding-json-schema/structuring?highlight=ref
- 5: https://mintlify.com/json-schema-org/json-schema-spec/reference/core/ref
- 6: https://json-schema.org/understanding-json-schema/reference/schema
- 7: https://github.com/json-schema-org/json-schema-spec/blob/main/specs/jsonschema-core.md
- 8: https://json-schema.org/draft/2020-12/vocab/core
🏁 Script executed:
# Find calls to stripDollarKeys to understand its usage
rg "stripDollarKeys" services/platform/convex/openai_compat/http_actions.ts -B 5 -A 5
Repository: tale-project/tale
Length of output: 776
🏁 Script executed:
# Check the full file to understand context around the function definition
wc -l services/platform/convex/openai_compat/http_actions.ts
Repository: tale-project/tale
Length of output: 119
🏁 Script executed:
# Read the section around the function and its callers
sed -n '360,400p' services/platform/convex/openai_compat/http_actions.ts
Repository: tale-project/tale
Length of output: 1308
🏁 Script executed:
# Search for any comments or context around Convex and $ prefix
rg "Convex.*\\\$|reserves.*prefix" services/platform/convex/ -B 3 -A 3
Repository: tale-project/tale
Length of output: 3771
🏁 Script executed:
# Look for Convex documentation or configuration files
fd -e md -e txt -e json "convex|README" services/platform/ | head -20
Repository: tale-project/tale
Length of output: 288
🏁 Script executed:
# Search for any related issues or comments about tool parameters and Convex
rg "tool.*parameter|parameter.*schema" services/platform/convex/openai_compat/http_actions.ts -B 3 -A 3
Repository: tale-project/tale
Length of output: 260
🏁 Script executed:
# Check if there's a Convex configuration file that documents constraints
cat services/platform/convex.json 2>/dev/null || cat services/platform/convex/convex.json 2>/dev/null || echo "No convex.json found"
Repository: tale-project/tale
Length of output: 333
Remove stripDollarKeys() from tool parameters—Convex's $ prefix constraint doesn't apply to outgoing OpenAI API requests.
The stripDollarKeys() call removes JSON Schema keywords ($ref, $defs, $schema) that are essential for schema composition and validation. These keywords are required by OpenAI's tool parameter schemas to function correctly. Convex's field name restrictions apply only to data persisted in Convex, not to third-party API request bodies. Tool schemas should be sent to OpenAI exactly as the caller provided them; stripping $-prefixed keys causes parameter validation to fail and breaks schema composition.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@services/platform/convex/openai_compat/http_actions.ts` around lines 366 -
377, stripDollarKeys is removing $-prefixed JSON Schema keywords (like $ref,
$defs, $schema) that OpenAI tool parameter schemas rely on; stop stripping them
when preparing/sending tool parameter schemas to OpenAI by removing any calls to
stripDollarKeys for tool parameter serialization/HTTP request construction and
ensure the original object is sent unchanged (leave stripDollarKeys only for
Convex-persisted data if needed). Locate usages of stripDollarKeys in the code
path that builds OpenAI tool parameter payloads and remove or conditionalize
those calls so $-prefixed keys are preserved.
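The effect under discussion can be reproduced in isolation. The sketch below copies `stripDollarKeys` from the diff and applies it to a made-up tool parameter schema; the schema and the JSON-string round trip at the end are assumptions for illustration, not behavior confirmed for this codebase or for Convex.

```typescript
/** Copied from the diff: recursively strip keys starting with '$'. */
function stripDollarKeys(obj: unknown): unknown {
  if (Array.isArray(obj)) return obj.map(stripDollarKeys);
  if (typeof obj === 'object' && obj !== null) {
    const result: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(obj)) {
      if (!k.startsWith('$')) result[k] = stripDollarKeys(v);
    }
    return result;
  }
  return obj;
}

// Hypothetical tool parameter schema that reuses a definition via $defs/$ref.
const schema = {
  type: 'object',
  properties: { address: { $ref: '#/$defs/address' } },
  $defs: {
    address: { type: 'object', properties: { city: { type: 'string' } } },
  },
};

// Both $defs and $ref disappear, leaving `address` as an unconstrained {}.
const stripped = JSON.stringify(stripDollarKeys(schema));
console.log(stripped);
// prints {"type":"object","properties":{"address":{}}}

// Assumption: if `$` keys cannot cross the Convex boundary as object fields,
// carrying the schema as a JSON string would keep them intact.
const encoded: string = JSON.stringify(schema);
const decoded = JSON.parse(encoded) as typeof schema;
console.log('$defs' in decoded); // prints true
```

Whether the string round trip is needed depends on where the `$` restriction is actually enforced, which is worth confirming against Convex's documentation before changing the call site.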
// Track usage for client tool mode (this path bypasses onAgentComplete)
const usage = await result.usage;
if (usage && args.organizationId) {
const inputTokens = usage?.inputTokens ?? 0;
const outputTokens = usage?.outputTokens ?? 0;
const totalTokens = inputTokens + outputTokens;

if (usage && args.organizationId && (inputTokens > 0 || outputTokens > 0)) {
  const { estimateCostCents } =
    await import('../governance/cost_estimation');
  const costCents = estimateCostCents(modelId, inputTokens, outputTokens);
  await ctx
    .runMutation(
      internal.governance.internal_mutations.incrementUsageLedger,
      {
        organizationId: args.organizationId,
        userId: args.userId ?? 'system',
        inputTokens,
        outputTokens,
        costEstimateCents: costCents,
        timestamp: Date.now(),
      },
    )
    .catch((error) => {
      console.error(
        '[OpenAI-compat:clientTools] Failed to increment usage ledger:',
        error,
      );
    });

  // AI audit log for OpenAI-compat client tool mode
  await ctx
    .runMutation(internal.audit_logs.internal_mutations.createAuditLog, {
      organizationId: args.organizationId,
      actorId: args.userId ?? 'system',
      actorType: 'api' as const,
      action: 'ai.completion',
      category: 'ai' as const,
      resourceType: 'agent_completion',
      resourceId: threadId,
      status: 'success' as const,
      metadata: {
        model: modelId,
        inputTokens,
        outputTokens,
        totalTokens,
        costEstimateCents: costCents,
        threadId,
        agentType: 'openai_compat',
        toolCallCount: toolCalls.length,
      },
    })
    .catch((error) => {
      console.error(
        '[OpenAI-compat:clientTools] Failed to write AI audit log:',
        error,
      );
    });
}

return {
  threadId,
  text: text || null,
  toolCalls: toolCalls.length > 0 ? toolCalls : null,
  finishReason,
  inputTokens,
  outputTokens,
  totalTokens,
  resolvedModel: resolved.modelData.modelId,
Bill and audit against the resolved provider model, not the pre-resolution ID.
This block still prices usage with modelId, which can be an agent alias, a tag/default, or a value that later failed over. You already have resolved.modelData.modelId; using the unresolved value can skew cost estimation and leave the audit trail pointing at the wrong model.
💸 Suggested change
- const costCents = estimateCostCents(modelId, inputTokens, outputTokens);
+ const costCents = estimateCostCents(
+ resolved.modelData.modelId,
+ inputTokens,
+ outputTokens,
+ );
@@
- model: modelId,
+ model: resolved.modelData.modelId,
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
// Track usage for client tool mode (this path bypasses onAgentComplete)
const usage = await result.usage;
const inputTokens = usage?.inputTokens ?? 0;
const outputTokens = usage?.outputTokens ?? 0;
const totalTokens = inputTokens + outputTokens;
if (usage && args.organizationId && (inputTokens > 0 || outputTokens > 0)) {
  const { estimateCostCents } =
    await import('../governance/cost_estimation');
  const costCents = estimateCostCents(
    resolved.modelData.modelId,
    inputTokens,
    outputTokens,
  );
  await ctx
    .runMutation(
      internal.governance.internal_mutations.incrementUsageLedger,
      {
        organizationId: args.organizationId,
        userId: args.userId ?? 'system',
        inputTokens,
        outputTokens,
        costEstimateCents: costCents,
        timestamp: Date.now(),
      },
    )
    .catch((error) => {
      console.error(
        '[OpenAI-compat:clientTools] Failed to increment usage ledger:',
        error,
      );
    });
  // AI audit log for OpenAI-compat client tool mode
  await ctx
    .runMutation(internal.audit_logs.internal_mutations.createAuditLog, {
      organizationId: args.organizationId,
      actorId: args.userId ?? 'system',
      actorType: 'api' as const,
      action: 'ai.completion',
      category: 'ai' as const,
      resourceType: 'agent_completion',
      resourceId: threadId,
      status: 'success' as const,
      metadata: {
        model: resolved.modelData.modelId,
        inputTokens,
        outputTokens,
        totalTokens,
        costEstimateCents: costCents,
        threadId,
        agentType: 'openai_compat',
        toolCallCount: toolCalls.length,
      },
    })
    .catch((error) => {
      console.error(
        '[OpenAI-compat:clientTools] Failed to write AI audit log:',
        error,
      );
    });
}
return {
  threadId,
  text: text || null,
  toolCalls: toolCalls.length > 0 ? toolCalls : null,
  finishReason,
  inputTokens,
  outputTokens,
  totalTokens,
  resolvedModel: resolved.modelData.modelId,
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@services/platform/convex/openai_compat/internal_actions.ts` around lines 388
- 455, The code is billing and auditing using the pre-resolution modelId; update
usages to the resolved provider model by passing resolved.modelData.modelId into
estimateCostCents and replacing model: modelId in the audit log metadata (and
any other places in this block that reference modelId for cost/audit) so costs
and audit entries reflect the actual resolved model; ensure you still fall back
safely if resolved or resolved.modelData is undefined.
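A minimal sketch of the fallback this finding asks for, assuming the resolver result exposes `modelData.modelId` as in the diff above. The `ResolvedModel` interface and the `billingModelId` helper are hypothetical names introduced for illustration, not part of the PR.

```typescript
// Hypothetical shape of the resolver result; only `modelData.modelId`
// is taken from the diff above, everything else is an assumption.
interface ResolvedModel {
  modelData?: { modelId?: string };
}

// Prefer the resolved provider model ID for billing and audit entries,
// and fall back to the requested ID when resolution data is missing.
function billingModelId(
  requestedId: string,
  resolved?: ResolvedModel | null,
): string {
  return resolved?.modelData?.modelId ?? requestedId;
}
```

The same value could then be passed to both `estimateCostCents` and the audit log `metadata.model` so the two records cannot drift apart.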
| // Build generation params | ||
| // oxlint-disable-next-line typescript/no-unsafe-type-assertion -- generationParams is v.any() from Convex validator; shape is controlled by http_actions.ts buildGenerationParams | ||
| const genParams = (args.generationParams ?? {}) as Record<string, unknown>; | ||
|
|
||
| // Build messages — use full conversation if provided, otherwise single message | ||
| const hasConversation = | ||
| args.conversationMessages && | ||
| Array.isArray(args.conversationMessages) && | ||
| args.conversationMessages.length > 0; | ||
|
|
||
| const messages: ModelMessage[] = hasConversation | ||
| ? // oxlint-disable-next-line typescript/no-unsafe-type-assertion -- conversationMessages is built by convertToModelMessages in http_actions.ts; shape matches ModelMessage[] | ||
| (args.conversationMessages as ModelMessage[]) | ||
| : [{ role: 'user' as const, content: message }]; | ||
|
|
||
| const result = streamText({ | ||
| model: resolved.languageModel, | ||
| system: systemPrompt, | ||
| messages, | ||
| ...(aiTools && { tools: aiTools }), | ||
| ...(args.toolChoice != null && { | ||
| toolChoice: mapToolChoice(args.toolChoice), | ||
| }), | ||
| ...(genParams.temperature != null && { | ||
| temperature: Number(genParams.temperature), | ||
| }), | ||
| ...(genParams.maxTokens != null && { | ||
| maxTokens: Number(genParams.maxTokens), | ||
| }), | ||
| ...(genParams.topP != null && { topP: Number(genParams.topP) }), | ||
| ...(genParams.frequencyPenalty != null && { | ||
| frequencyPenalty: Number(genParams.frequencyPenalty), | ||
| }), | ||
| ...(genParams.presencePenalty != null && { | ||
| presencePenalty: Number(genParams.presencePenalty), | ||
| }), | ||
| ...(Array.isArray(genParams.stopSequences) && { | ||
| stopSequences: genParams.stopSequences, | ||
| }), | ||
| }); |
response_format is accepted here but never applied.
The new direct-model path receives responseFormat from http_actions.ts, then drops it before calling streamText. Requests asking for {"type":"json_object"} will silently behave like plain text completions on this route.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@services/platform/convex/openai_compat/internal_actions.ts` around lines 543
- 582, The code builds genParams and then calls streamText but never forwards
the requested responseFormat, so JSON-object responses are lost; update the
streamText call (where streamText is invoked) to include the responseFormat
coming from the request (e.g. prefer args.responseFormat if present otherwise
genParams.responseFormat) and pass it through or map it to the streamText
parameter name (responseFormat or response_format) used by streamText; ensure
you coerce/validate the value the same way http_actions.ts does (accept the
enum/string shape) and add the conditional spread like ...(responseFormat !=
null && { responseFormat }) so the direct-model path honors
{"type":"json_object"} requests.
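The validation step described above might look like the sketch below. It only normalizes the incoming value and builds a conditional spread object. The option name `responseFormat` follows the wording of this finding and is an assumption, as are the helper names; whether `streamText` accepts such an option directly is not confirmed in this thread.

```typescript
// Subset of OpenAI's `response_format` shapes handled in this sketch.
type ResponseFormat = { type: 'text' | 'json_object' };

// Accept only the known shapes; anything else is treated as "not set".
function normalizeResponseFormat(input: unknown): ResponseFormat | undefined {
  if (typeof input !== 'object' || input === null) return undefined;
  const type = (input as { type?: unknown }).type;
  if (type === 'text' || type === 'json_object') return { type };
  return undefined;
}

// Hypothetical helper: produces an object suitable for a conditional spread,
// mirroring the `...(x != null && { x })` pattern used in the diff above.
function responseFormatOptions(input: unknown): { responseFormat?: ResponseFormat } {
  const responseFormat = normalizeResponseFormat(input);
  return responseFormat ? { responseFormat } : {};
}
```

Rejecting unknown shapes up front keeps the direct-model path from forwarding arbitrary client input to the provider.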
| "tags": [ | ||
| "OpenAI Compatible" | ||
| ], | ||
| "summary": "List models", | ||
| "description": "List available agents as OpenAI-compatible models. Only agents with `visibleInChat: true` are returned.", | ||
| "operationId": "listModels", |
Update /api/v1/models to describe provider models, not agents.
This operation still says it lists “available agents” gated by visibleInChat, which contradicts the rest of this PR and the ModelList schema. That will mislead anyone reading the generated docs for the new direct model gateway.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@services/platform/public/openapi.json` around lines 94 - 99, The OpenAPI
operation with operationId "listModels" (endpoint /api/v1/models) incorrectly
describes returning "available agents" and referencing `visibleInChat`; update
the description to state it lists provider models (e.g., provider-specific model
entries) and align it with the ModelList schema — remove mention of agents and
`visibleInChat`, clarify it returns provider model metadata for the direct model
gateway instead. Ensure the summary/description and any examples reflect
"models" not "agents" so generated docs match the new gateway behavior.
  "ChatMessage": {
-   "type": "object",
-   "required": ["role"],
-   "properties": {
-     "role": {
-       "type": "string",
-       "enum": ["system", "user", "assistant", "tool"]
+   "oneOf": [
+     {
+       "$ref": "#/components/schemas/ChatMessageSystem"
      },
-     "content": {
-       "oneOf": [
-         {
-           "type": "string"
-         },
-         {
-           "type": "null"
-         }
-       ],
-       "description": "Message content."
+     {
+       "$ref": "#/components/schemas/ChatMessageUser"
      },
-     "tool_calls": {
-       "type": "array",
-       "items": {
-         "$ref": "#/components/schemas/ToolCall"
-       },
-       "description": "Tool calls (assistant messages only)."
+     {
+       "$ref": "#/components/schemas/ChatMessageAssistant"
      },
-     "tool_call_id": {
-       "type": "string",
-       "description": "ID of the tool call this result is for (tool messages only)."
+     {
+       "$ref": "#/components/schemas/ChatMessageTool"
      }
+   ],
+   "discriminator": {
+     "propertyName": "role",
+     "mapping": {
+       "system": "#/components/schemas/ChatMessageSystem",
+       "user": "#/components/schemas/ChatMessageUser",
+       "assistant": "#/components/schemas/ChatMessageAssistant",
+       "tool": "#/components/schemas/ChatMessageTool"
+     }
🧩 Analysis chain
🌐 Web query:
OpenAI Chat Completions API developer message type
💡 Result:
The OpenAI Chat Completions API supports a "developer" message type (role: "developer"). This is a message role for developer-provided instructions that the model must follow, regardless of user messages. It replaces the previous "system" role for o1 models and newer reasoning models.
Key details:
- Structure: ChatCompletionDeveloperMessageParam = { content: string or array of ChatCompletionContentPartText (text only), role: "developer", name?: string }
- Purpose: High-priority instructions (ahead of "user" messages in the chain of command).
- Compatibility: For o1+ models, "system" auto-converts to "developer"; for older models like GPT-4o, "developer" auto-converts to "system".
- Usage example: [ { "role": "developer", "content": "You are a helpful assistant." }, { "role": "user", "content": "Hello!" } ]
Standard roles: "system" (legacy), "developer" (new), "user", "assistant", "tool".
Citations:
- 1: https://developers.openai.com/api/reference/chat-completions/overview/
- 2: https://platform.openai.com/docs/api-reference/chat/message-list
- 3: https://developers.openai.com/api/reference/typescript/resources/chat
- 4: https://developers.openai.com/api/docs/api-reference/chat/object
- 5: https://developers.openai.com/api/reference/resources/chat/subresources/completions/methods/create/
- 6: https://developers.openai.com/api/reference/java/resources/chat/subresources/completions/methods/create/
- 7: https://developers.openai.com/api/reference/resources/chat
🏁 Script executed:
find . -name "openapi.json" -path "*/services/platform/public/*" | head -5
Repository: tale-project/tale
Length of output: 100
🏁 Script executed:
# Check the content around the specified lines
if [ -f "services/platform/public/openapi.json" ]; then
wc -l services/platform/public/openapi.json
fi
Repository: tale-project/tale
Length of output: 103
🏁 Script executed:
# Search for ChatMessageDeveloper in the openapi.json file
if [ -f "services/platform/public/openapi.json" ]; then
grep -n "ChatMessageDeveloper" services/platform/public/openapi.json
fi
Repository: tale-project/tale
Length of output: 43
🏁 Script executed:
# Read the specific lines mentioned in the review (4027-4049)
sed -n '4027,4049p' services/platform/public/openapi.json
Repository: tale-project/tale
Length of output: 804
🏁 Script executed:
# Search for other ChatMessage schema definitions to understand the pattern
grep -n "ChatMessage" services/platform/public/openapi.json | head -20
Repository: tale-project/tale
Length of output: 896
🏁 Script executed:
# Check if there are any other references to "developer" role in the file
grep -n '"developer"' services/platform/public/openapi.json
Repository: tale-project/tale
Length of output: 43
Add developer to the message discriminator mapping and schema.
OpenAI's Chat Completions API supports a developer message type, which is recommended for newer models (o1+ and reasoning models) and replaces the use of system messages. Without including this role in the discriminator mapping and creating the corresponding schema, requests using the developer role will fail validation against this spec, making it incompatible with current OpenAI API usage patterns.
Proposed fix
"ChatMessage": {
"oneOf": [
+ {
+ "$ref": "#/components/schemas/ChatMessageDeveloper"
+ },
{
"$ref": "#/components/schemas/ChatMessageSystem"
},
{
"$ref": "#/components/schemas/ChatMessageUser"
@@
"discriminator": {
"propertyName": "role",
"mapping": {
+ "developer": "#/components/schemas/ChatMessageDeveloper",
"system": "#/components/schemas/ChatMessageSystem",
"user": "#/components/schemas/ChatMessageUser",
"assistant": "#/components/schemas/ChatMessageAssistant",
"tool": "#/components/schemas/ChatMessageTool"
}
}
},
+ "ChatMessageDeveloper": {
+ "type": "object",
+ "required": [
+ "role",
+ "content"
+ ],
+ "properties": {
+ "role": {
+ "type": "string",
+ "enum": [
+ "developer"
+ ]
+ },
+ "content": {
+ "type": "string",
+ "description": "Developer instruction content."
+ }
+ }
+ },
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| "ChatMessage": { | |
| "oneOf": [ | |
| { | |
| "$ref": "#/components/schemas/ChatMessageDeveloper" | |
| }, | |
| { | |
| "$ref": "#/components/schemas/ChatMessageSystem" | |
| }, | |
| { | |
| "$ref": "#/components/schemas/ChatMessageUser" | |
| }, | |
| { | |
| "$ref": "#/components/schemas/ChatMessageAssistant" | |
| }, | |
| { | |
| "$ref": "#/components/schemas/ChatMessageTool" | |
| } | |
| ], | |
| "discriminator": { | |
| "propertyName": "role", | |
| "mapping": { | |
| "developer": "#/components/schemas/ChatMessageDeveloper", | |
| "system": "#/components/schemas/ChatMessageSystem", | |
| "user": "#/components/schemas/ChatMessageUser", | |
| "assistant": "#/components/schemas/ChatMessageAssistant", | |
| "tool": "#/components/schemas/ChatMessageTool" | |
| } | |
| } | |
| }, | |
| "ChatMessageDeveloper": { | |
| "type": "object", | |
| "required": [ | |
| "role", | |
| "content" | |
| ], | |
| "properties": { | |
| "role": { | |
| "type": "string", | |
| "enum": [ | |
| "developer" | |
| ] | |
| }, | |
| "content": { | |
| "type": "string", | |
| "description": "Developer instruction content." | |
| } | |
| } | |
| }, |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@services/platform/public/openapi.json` around lines 4027 - 4049, The OpenAPI
ChatMessage union and discriminator are missing the `developer` role which will
cause validation failures for requests using that role; update the "ChatMessage"
oneOf to include a new ChatMessageDeveloper schema, add "developer":
"#/components/schemas/ChatMessageDeveloper" to the discriminator.mapping
alongside system/user/assistant/tool, and create a matching
components.schemas.ChatMessageDeveloper definition (mirror the structure of
ChatMessageSystem but with role enum/value "developer") so the spec accepts
developer-role messages.
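On the TypeScript side, the extended union could be modelled as a discriminated union on `role`, as sketched below. The field sets are simplified from the schemas discussed above, and `normalizeDeveloperRole` is a hypothetical helper: mapping `developer` to `system` before calling a provider is one possible policy, assumed here for illustration rather than taken from the PR.

```typescript
// Simplified message union, discriminated on `role` like the OpenAPI schema.
type ChatMessage =
  | { role: 'developer'; content: string }
  | { role: 'system'; content: string }
  | { role: 'user'; content: string }
  | { role: 'assistant'; content: string | null }
  | { role: 'tool'; content: string; tool_call_id: string };

type ProviderMessage = Exclude<ChatMessage, { role: 'developer' }>;

// Hypothetical policy: treat `developer` as `system` for providers that
// do not distinguish the two roles; all other messages pass through.
function normalizeDeveloperRole(message: ChatMessage): ProviderMessage {
  if (message.role === 'developer') {
    return { role: 'system', content: message.content };
  }
  return message;
}
```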
| "Citation": { | ||
| "type": "object", | ||
| "properties": { | ||
| "index": { | ||
| "type": "integer", | ||
| "description": "Citation index corresponding to [N] markers in text." | ||
| }, | ||
| "type": { | ||
| "type": "string", | ||
| "enum": [ | ||
| "rag", | ||
| "web" | ||
| ], | ||
| "description": "Source type: RAG knowledge base or web search." | ||
| }, | ||
| "source": { | ||
| "type": "string", | ||
| "description": "Source name or title." | ||
| }, | ||
| "fileId": { | ||
| "type": "string", | ||
| "description": "File ID for RAG citations." | ||
| }, | ||
| "url": { | ||
| "type": "string", | ||
| "description": "URL for web citations." | ||
| }, | ||
| "page": { | ||
| "type": "integer", | ||
| "description": "Page number for document citations." | ||
| }, | ||
| "relevance": { | ||
| "type": "number", | ||
| "description": "Relevance score (0-1)." | ||
| } | ||
| } | ||
| }, |
Don’t allow empty citation objects.
citations are described as backing [N] markers in the response text, but this schema now makes every field optional. That means {} validates even though a client cannot resolve a marker without at least an index.
Proposed fix
"Citation": {
"type": "object",
+ "required": [
+ "index"
+ ],
"properties": {
"index": {
"type": "integer",
"description": "Citation index corresponding to [N] markers in text."
},
📝 Committable suggestion
| "Citation": { | |
| "type": "object", | |
| "required": [ | |
| "index" | |
| ], | |
| "properties": { | |
| "index": { | |
| "type": "integer", | |
| "description": "Citation index corresponding to [N] markers in text." | |
| }, | |
| "type": { | |
| "type": "string", | |
| "enum": [ | |
| "rag", | |
| "web" | |
| ], | |
| "description": "Source type: RAG knowledge base or web search." | |
| }, | |
| "source": { | |
| "type": "string", | |
| "description": "Source name or title." | |
| }, | |
| "fileId": { | |
| "type": "string", | |
| "description": "File ID for RAG citations." | |
| }, | |
| "url": { | |
| "type": "string", | |
| "description": "URL for web citations." | |
| }, | |
| "page": { | |
| "type": "integer", | |
| "description": "Page number for document citations." | |
| }, | |
| "relevance": { | |
| "type": "number", | |
| "description": "Relevance score (0-1)." | |
| } | |
| } | |
| }, |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@services/platform/public/openapi.json` around lines 4111 - 4147, The Citation
schema currently allows empty objects; require at least the index so clients can
resolve [N] markers by adding a required field to the Citation schema: add
"required": ["index"] under the "Citation" object and also consider adding
"additionalProperties": false (or "minProperties": 1) to prevent {} from
validating; update the "Citation" schema in openapi.json (the "Citation" object
and its "index" property) accordingly.
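A client consuming these citations might guard against empty objects with a check like the one below. This is a sketch under the assumption that `index` is the only required field, as the finding proposes; the `Citation` interface mirrors the schema properties above, and `isCitation` is a hypothetical helper name.

```typescript
// Mirrors the Citation schema above, with `index` required.
interface Citation {
  index: number;
  type?: 'rag' | 'web';
  source?: string;
  fileId?: string;
  url?: string;
  page?: number;
  relevance?: number;
}

// Runtime guard: an object counts as a citation only if it carries an
// integer `index`, since that is what resolves the [N] markers in text.
function isCitation(value: unknown): value is Citation {
  if (typeof value !== 'object' || value === null) return false;
  const index = (value as { index?: unknown }).index;
  return typeof index === 'number' && Number.isInteger(index);
}
```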
  model: {
    type: 'string',
-   description: 'Agent slug (e.g., "chat-agent").',
-   example: 'chat-agent',
+   description:
+     'Model ID (e.g., "claude-sonnet-4-20250514"). Use GET /api/v1/models to list available models.',
+   example: 'claude-sonnet-4-20250514',
  },
Keep the model example consistent with the IDs returned by /api/v1/models.
This example now shows an unqualified model ID, while the same generated spec later documents/provider-examples use IDs like anthropic/claude-sonnet-4.6. Copy-pasting the current example gives clients a different identifier shape than the one this API advertises elsewhere.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@services/platform/scripts/generate-openapi.ts` around lines 148 - 153, The
example value for the OpenAPI "model" schema is inconsistent with the IDs
returned by GET /api/v1/models and the provider examples; update the example in
the model property inside generate-openapi.ts (the model: { ... } schema) to use
the fully-qualified provider-prefixed form (e.g., "anthropic/claude-sonnet-4.6"
or whatever exact shape your /api/v1/models returns) so generated docs and
provider-examples share the same identifier format.
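To illustrate the difference between the two identifier shapes, here is a small sketch that splits a provider-prefixed ID. It assumes the `provider/model` form shown in the PR description (`anthropic/claude-sonnet-4.6`); `parseModelId` is a hypothetical helper, and the thread does not confirm that the gateway rejects unqualified IDs.

```typescript
// Split "provider/model" on the first slash. Returns null when the ID
// has no provider prefix or either part is empty.
function parseModelId(id: string): { provider: string; model: string } | null {
  const slash = id.indexOf('/');
  if (slash <= 0 || slash === id.length - 1) return null;
  return { provider: id.slice(0, slash), model: id.slice(slash + 1) };
}
```

Under this assumption, the current example value `claude-sonnet-4-20250514` would not parse, while `anthropic/claude-sonnet-4.6` would.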
| const BASE_URL = 'http://localhost:3000/api/v1'; | ||
| const API_KEY = | ||
| 'taleDsYqAacBOcFDlGBISiORAmxkQHhNEChqBgagAngCaaReIsBGfAREKtZTckLmyeqn'; | ||
|
|
||
| const provider = createOpenAICompatible({ | ||
| name: 'tale', | ||
| baseURL: BASE_URL, | ||
| headers: { Authorization: `Bearer ${API_KEY}` }, |
Move the Bearer token out of source control.
This script checks in a live API key and wires it straight into request headers. Even for a local helper, that leaks credentials into the repo and makes rotation/auditing harder.
🔐 Suggested change
-const BASE_URL = 'http://localhost:3000/api/v1';
-const API_KEY =
- 'taleDsYqAacBOcFDlGBISiORAmxkQHhNEChqBgagAngCaaReIsBGfAREKtZTckLmyeqn';
+const BASE_URL =
+ process.env.OPENAI_COMPAT_BASE_URL ?? 'http://localhost:3000/api/v1';
+const API_KEY = process.env.OPENAI_COMPAT_API_KEY;
+
+if (!API_KEY) {
+ throw new Error('OPENAI_COMPAT_API_KEY is required');
+}
As per coding guidelines: DO NOT hardcode secrets, API keys, or credentials. Use environment variables instead.
📝 Committable suggestion
| const BASE_URL = | |
| process.env.OPENAI_COMPAT_BASE_URL ?? 'http://localhost:3000/api/v1'; | |
| const API_KEY = process.env.OPENAI_COMPAT_API_KEY; | |
| if (!API_KEY) { | |
| throw new Error('OPENAI_COMPAT_API_KEY is required'); | |
| } | |
| const provider = createOpenAICompatible({ | |
| name: 'tale', | |
| baseURL: BASE_URL, | |
| headers: { Authorization: `Bearer ${API_KEY}` }, |
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@services/platform/scripts/test-openai-compat.ts` around lines 13 - 20, The
API key is hardcoded in API_KEY and injected into headers when calling
createOpenAICompatible; move this secret into an environment variable (e.g.,
process.env.OPENAI_API_KEY) and update the provider initialization to read that
env var instead of the literal string, add a runtime check that throws or logs a
clear error if the env var is missing, and remove the hardcoded API_KEY constant
so no credential remains in source control.
| async function main() { | ||
| console.log('Testing OpenAI-compatible API with AI SDK v6'); | ||
| console.log(`Base URL: ${BASE_URL}`); | ||
|
|
||
| const modelId = await testListModels(); | ||
| await testGenerateText(modelId); | ||
| await testStreamText(modelId); | ||
| await testToolCalling(modelId); | ||
| await testMultiTurnToolCalling(modelId); | ||
|
|
||
| console.log('\n═══ All tests done ═══\n'); | ||
| } | ||
|
|
||
| main().catch(console.error); |
Fail the process when a test step throws.
main().catch(console.error) logs the failure but still exits successfully, so CI or wrapper scripts can report a passing integration run after a broken request.
🧪 Suggested change
-main().catch(console.error);
+main().catch((error) => {
+ console.error(error);
+ process.exitCode = 1;
+});
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@services/platform/scripts/test-openai-compat.ts` around lines 144 - 157, The
test harness currently logs errors from main() but still exits with a zero
status; modify the invocation of main() so that any rejection causes the process
to exit non‑zero: in services/platform/scripts/test-openai-compat.ts update the
main().catch(...) handler used where main() is called to log the error
(console.error or processLogger) and then call process.exit(1) so
testListModels, testGenerateText, testStreamText, testToolCalling, or
testMultiTurnToolCalling failures fail the CI run.
- Remove `X-Thread-Id` header parameter (stateless direct model mode)
- Delete `chatViaOpenAI`, `chatViaOpenAIWithTools`, `listVisibleAgents` actions
- Delete `resolveOrgSlug` helper (no longer needed)
- Clean up unused imports (`readdir`, `components`, `getString`, `agents/file_utils`)
- Rename `ToolCallResult` to `DirectModelResult` with `requestId` field
Summary
Closes #1428
- Refactor `/api/v1/chat/completions` from agent-based routing to a direct model gateway: the `model` field now accepts real provider model IDs (e.g., `anthropic/claude-sonnet-4.6`) instead of agent slugs
- `/api/v1/models` now returns real provider models with an `owned_by` field
- Add `stream_options.include_usage` support per OpenAI spec (opt-in streaming usage chunk with `choices: []`)
- Split OpenAPI `ChatMessage` into role-specific schemas (`ChatMessageSystem`, `ChatMessageUser`, `ChatMessageAssistant`, `ChatMessageTool`) with discriminator
- Backport Citation schema into `generate-openapi.ts` (was previously hand-edited in openapi.json)
- Strip `$`-prefixed keys from tool parameters to work around Convex reserved prefix

Test plan
- `generateText`, `streamText`, tool calling (single + multi-turn) all passing

Summary by CodeRabbit
Release Notes
New Features
- `stream_options.include_usage` parameter to include token usage statistics in streaming responses

Improvements