feat(platform): refactor OpenAI-compat API as direct model gateway by larryro · Pull Request #1440 · tale-project/tale

larryro · 2026-04-12T10:32:55Z

Summary

Closes #1428

Refactors /api/v1/chat/completions from agent-based routing to a direct model gateway — model field now accepts real provider model IDs (e.g., anthropic/claude-sonnet-4.6) instead of agent slugs
/api/v1/models now returns real provider models with owned_by field
Adds stream_options.include_usage support per OpenAI spec (opt-in streaming usage chunk with choices: [])
Returns real token usage in all response paths (non-streaming + streaming)
Splits OpenAPI ChatMessage into role-specific schemas (ChatMessageSystem, ChatMessageUser, ChatMessageAssistant, ChatMessageTool) with discriminator
Backports Citation schema into generate-openapi.ts (was previously hand-edited in openapi.json)
Full governance preserved in direct mode: auth, rate limiting, PII scrubbing, mandatory system prompt, usage ledger, audit log, circuit breaker + failover
Strips $-prefixed keys from tool parameters to work around Convex reserved prefix
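
As a rough illustration of the `stream_options.include_usage` behaviour described above, the sketch below builds the extra streaming chunk whose `choices` array is empty and which carries token usage. The helper names (`toOpenAIUsage`, `buildUsageChunk`, `toSseLine`) and the `TokenUsage` type are hypothetical and are not taken from this PR; only the field names of the OpenAI usage object follow the public spec.

```typescript
// Hypothetical internal token counts (field names follow the PR walkthrough).
interface TokenUsage {
  inputTokens: number;
  outputTokens: number;
}

// OpenAI-style usage object.
interface OpenAIUsage {
  prompt_tokens: number;
  completion_tokens: number;
  total_tokens: number;
}

// Map internal counts to the OpenAI field names.
function toOpenAIUsage(usage: TokenUsage): OpenAIUsage {
  return {
    prompt_tokens: usage.inputTokens,
    completion_tokens: usage.outputTokens,
    total_tokens: usage.inputTokens + usage.outputTokens,
  };
}

// Build the opt-in usage chunk: same envelope as other chunks, but with
// an empty `choices` array and a populated `usage` object.
function buildUsageChunk(
  id: string,
  model: string,
  created: number,
  usage: TokenUsage,
) {
  return {
    id,
    object: 'chat.completion.chunk' as const,
    created,
    model,
    choices: [] as never[],
    usage: toOpenAIUsage(usage),
  };
}

// Serialize a payload as one server-sent event.
function toSseLine(payload: unknown): string {
  return `data: ${JSON.stringify(payload)}\n\n`;
}
```

A handler would presumably emit `toSseLine(buildUsageChunk(...))` after the last content chunk and before `data: [DONE]`, and only when the client opted in.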

Test plan

TypeScript: 0 errors
Lint: 0 errors
Unit tests: 28/28 passing (includes new usage tests)
AI SDK v6 integration test: generateText, streamText, tool calling (single + multi-turn) all passing
Governance verified: mandatory system prompt (German response) + PII scrubbing (block mode) both working in direct model mode

Summary by CodeRabbit

Release Notes

New Features
- Added stream_options.include_usage parameter to include token usage statistics in streaming responses
Improvements
- Model listing API now displays available models with provider metadata
- Chat completions API updated to use model IDs instead of agent identifiers
- Enhanced token usage tracking with detailed input and output token counts
- Improved API documentation and usage examples

…1428)

Replace agent-based routing with direct model access. The `model` field now accepts real provider model IDs (e.g., `anthropic/claude-sonnet-4.6`) instead of agent slugs, routing requests directly to the configured provider.

Changes:
- Rewrite `/api/v1/models` to return real provider models with `owned_by`
- Add `chatDirectModel` internal action: direct model resolution with full governance (PII scrubbing, mandatory system prompt, rate limiting, usage ledger, audit log, circuit breaker + failover)
- Rewrite `chatCompletionsHandler` to use direct model mode
- Add `stream_options.include_usage` support per OpenAI spec
- Return real token usage in responses (non-streaming and streaming)
- Split OpenAPI `ChatMessage` into role-specific schemas with discriminator
- Backport Citation schema into `generate-openapi.ts`
- Strip `$`-prefixed keys from tool parameters (Convex reserved prefix)
- Remove dead agent-mode code (pollOpenAIResponse, streamOpenAIResponse)
- Add unit tests for usage in response builders
- Add AI SDK v6 integration test script

Replace agent slug examples with real provider model IDs in the API documentation and OpenAPI schema examples.

Remove outdated agent mode / client tool mode descriptions. The endpoint is now a direct model gateway.

coderabbitai · 2026-04-12T10:35:23Z

Walkthrough

This PR refactors the OpenAI-compatible chat completions endpoint from a two-mode implementation (persistent agent streaming vs. client tool mode) to a single direct model gateway architecture. It introduces usage tracking through stream_options.include_usage parameter, enabling SSE chunks with token counts during responses. The model listing endpoint switches from listing organization-scoped agents to provider model IDs with metadata (providerName, displayName). Response types and schemas are extended throughout to include inputTokens, outputTokens, totalTokens, and resolvedModel fields. The OpenAPI specification is comprehensively updated to improve OpenAI schema compatibility and reflect the new request/response structure.
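
To make the new model listing concrete, here is a minimal sketch of how provider model records might be mapped to an OpenAI-style model list. The `ProviderModel` shape and the choice to copy `providerName` into `owned_by` are assumptions made for illustration; the actual handler in this PR may derive these fields differently.

```typescript
// Hypothetical record shape; the PR only mentions providerName/displayName metadata.
interface ProviderModel {
  id: string; // e.g. "anthropic/claude-sonnet-4.6"
  providerName: string;
  displayName: string;
}

// OpenAI-style model entry.
interface OpenAIModel {
  id: string;
  object: 'model';
  created: number;
  owned_by: string;
}

// Map provider records to the list envelope returned by GET /v1/models.
// Assumption: owned_by is taken from the provider name.
function toModelList(models: ProviderModel[], created: number) {
  return {
    object: 'list' as const,
    data: models.map(
      (m): OpenAIModel => ({
        id: m.id,
        object: 'model',
        created,
        owned_by: m.providerName,
      }),
    ),
  };
}
```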

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

feat(platform): OpenAI Chat Completions API compatibility layer #1251: Introduces the original two-mode OpenAI-compat implementation that this PR replaces with a direct model gateway flow.
feat(platform): enforce mandatory governance system prompt #1257: Modifies services/platform/convex/openai_compat/internal_actions.ts with system prompt and tool handling logic affecting the same code paths.
feat(platform): add BYOM support and per-model config #1319: Related changes to provider/model metadata and the getAllModelIds surface (providerName/displayName fields).

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title 'refactor OpenAI-compat API as direct model gateway' accurately describes the primary change: converting from agent-based routing to direct provider model access.
Linked Issues check	✅ Passed	All key objectives from `#1428` are addressed: real provider model IDs replace custom 'chat-agent' identifier, actual token usage is returned, OpenAPI schema is refactored for compatibility, non-standard fields are removed, and standard OpenAI SDKs work without modification.
Out of Scope Changes check	✅ Passed	All changes are scoped to the direct model gateway refactor: model resolution logic, token usage tracking, OpenAPI schema alignment, governance enforcement, and supporting utilities are all directly aligned with the PR objectives.
Docstring Coverage	✅ Passed	Docstring coverage is 83.33% which is sufficient. The required threshold is 80.00%.


Direct model mode should be stateless — API calls should not appear in the UI thread history. Replace createThreadAndSaveMessage with a transient ID used only for usage tracking and audit log correlation.

coderabbitai

Actionable comments posted: 11

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (4)

services/platform/convex/openai_compat/http_actions.ts (2)
414-433: ⚠️ Potential issue | 🟠 Major

Invalid model IDs now fall through as 500 server_error.

This refactor switched the execution path to chatDirectModel, but the downstream error handling is still the old agent-era matcher. Missing-model failures from the new path (Model "..." not found...) will miss the special case and return a 500 instead of OpenAI-style 404 model_not_found.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@services/platform/convex/openai_compat/http_actions.ts` around lines 414 -
433, The new path calls internal.openai_compat.internal_actions.chatDirectModel
but missing-model errors from that action (e.g. messages like 'Model "..." not
found' or an internal code indicating missing model) are not translated and fall
through to handleChatError causing a 500; update the catch to detect the
chatDirectModel missing-model failure and route it to the OpenAI-style
model-not-found handler (either call the existing handleModelNotFound helper or
extend handleChatError to inspect errors coming from chatDirectModel for the
missing-model signature/code and return a 404 model_not_found response) so
missing models produce the correct OpenAI-style 404.
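
One possible shape for that translation is sketched below. It assumes the missing-model failure can be recognised by the message pattern quoted in the comment (`Model "..." not found`); `toOpenAIError` is a hypothetical helper, not the existing `handleChatError`, and a structured error code would be more robust than message matching if the action can provide one.

```typescript
interface OpenAIErrorResponse {
  status: number;
  body: {
    error: {
      message: string;
      type: string;
      param: string | null;
      code: string | null;
    };
  };
}

// Assumption: the action reports a missing model with this message shape.
// The pattern is unanchored because runtimes may prefix the message.
const MODEL_NOT_FOUND = /Model ".*?" not found/;

// Hypothetical mapper from a thrown error to an OpenAI-style error response.
function toOpenAIError(error: unknown, model: string): OpenAIErrorResponse {
  const message = error instanceof Error ? error.message : String(error);
  if (MODEL_NOT_FOUND.test(message)) {
    return {
      status: 404,
      body: {
        error: {
          message: `The model '${model}' does not exist or is not available.`,
          type: 'invalid_request_error',
          param: 'model',
          code: 'model_not_found',
        },
      },
    };
  }
  // Anything unrecognised stays a generic server error.
  return {
    status: 500,
    body: {
      error: {
        message: 'Internal server error',
        type: 'server_error',
        param: null,
        code: null,
      },
    },
  };
}
```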
706-731: ⚠️ Potential issue | 🟠 Major

/api/v1/models is no longer organization-scoped.

The handler authenticates the caller, then calls getAllModelIds({}) without resolving membership or X-Organization-Slug. In services/platform/convex/providers/file_actions.ts, that falls back to 'default', so users in non-default orgs can list a different catalog than the one /api/v1/chat/completions will actually route against.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@services/platform/convex/openai_compat/http_actions.ts` around lines 706 -
731, modelsListHandler currently authenticates but then calls
internal.providers.file_actions.getAllModelIds with an empty object, causing it
to fall back to the 'default' org; change the handler to resolve the caller's
organization (using authenticateRequest result or the X-Organization-Slug header
/ membership info available on ctx after auth) and pass that organization
identifier into getAllModelIds so the model list matches the org that
/api/v1/chat/completions will use; update the call site in modelsListHandler to
supply { organization: resolvedOrg } (or the proper param name expected by
getAllModelIds) instead of {}.
services/platform/public/openapi.json (2)
4245-4268: ⚠️ Potential issue | 🟠 Major

Add required fields to ToolCall schema.

The ToolCall object currently allows an empty {} to validate, breaking client code that correlates tool_call_id back to the assistant turn. Per OpenAI's API specification, id, type, and function are required fields.
Proposed fix
       "ToolCall": {
         "type": "object",
+        "required": [
+          "id",
+          "type",
+          "function"
+        ],
         "properties": {
           "id": {
             "type": "string"
           },
           "type": {
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@services/platform/public/openapi.json` around lines 4245 - 4268, Update the
OpenAPI schema for the ToolCall object to require the id, type, and function
properties so an empty object no longer validates; modify the "ToolCall" schema
(the object with "properties" containing "id", "type", and "function") to
include a "required" array with ["id","type","function"] and ensure the existing
"function" property definition remains unchanged.
4085-4238: ⚠️ Potential issue | 🟠 Major

Schemas restrict content to string only and have incorrect nullability; user messages cannot support multimodal content.

OpenAI's Chat Completions API specifies that content can be a string OR an array of content parts (supporting multimodal inputs: text, image_url, input_audio, etc.). Additionally:

Assistant content is optional when tool_calls is present

Tool message content is optional, but the schema contradictorily marks it both required and nullable

User messages must support arrays for multimodal capability (vision models)

The current schemas restrict all message types to "type": "string" only, rejecting valid multimodal payloads, and the required/nullable patterns for assistant and tool messages don't align with OpenAI's actual API behavior.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@services/platform/public/openapi.json` around lines 4085 - 4238, Update the
message schemas to support multimodal content and correct nullability/required
rules: introduce a reusable ContentPart schema (or inline oneOf) and change
ChatMessageAssistant.content, ChatMessageUser.content, and
ChatMessageTool.content to accept either a string OR an array of ContentPart
(e.g., oneOf: [{type: string}, {type: array, items: ContentPart}]); remove
content from the required list for ChatMessageAssistant and make it nullable
(since assistant content may be omitted when tool_calls is present), and remove
content from the required list for ChatMessageTool while keeping tool_call_id
required and content nullable; ensure ChatMessageUser still requires content but
accepts the string|array form to enable multimodal user inputs.

🤖 Prompt for all review comments with AI agents

Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@services/platform/convex/openai_compat/http_actions.ts`:
- Around line 333-360: When messages already represent a tool-result
continuation (isContinuation from hasToolInteraction), the call to
handleDirectModelMode still passes lastUserMessage: lastUserMessage.content (the
earlier saved variable), causing the previous user prompt to be re-saved;
instead, derive lastUserMessage from the continuation context. Update the code
around isContinuation/conversationMessages to set the value passed to
handleDirectModelMode to the most recent user message in conversationMessages
(e.g., find the last item with role === 'user' in conversationMessages generated
by convertToModelMessages) when isContinuation is true, otherwise keep using
lastUserMessage.content; ensure the change references the existing symbols
isContinuation, conversationMessages, lastUserMessage, convertToModelMessages,
hasToolInteraction, and handleDirectModelMode.
- Around line 366-377: stripDollarKeys is removing $-prefixed JSON Schema
keywords (like $ref, $defs, $schema) that OpenAI tool parameter schemas rely on;
stop stripping them when preparing/sending tool parameter schemas to OpenAI by
removing any calls to stripDollarKeys for tool parameter serialization/HTTP
request construction and ensure the original object is sent unchanged (leave
stripDollarKeys only for Convex-persisted data if needed). Locate usages of
stripDollarKeys in the code path that builds OpenAI tool parameter payloads and
remove or conditionalize those calls so $-prefixed keys are preserved.

In `@services/platform/convex/openai_compat/internal_actions.ts`:
- Around line 543-582: The code builds genParams and then calls streamText but
never forwards the requested responseFormat, so JSON-object responses are lost;
update the streamText call (where streamText is invoked) to include the
responseFormat coming from the request (e.g. prefer args.responseFormat if
present otherwise genParams.responseFormat) and pass it through or map it to the
streamText parameter name (responseFormat or response_format) used by
streamText; ensure you coerce/validate the value the same way http_actions.ts
does (accept the enum/string shape) and add the conditional spread like
...(responseFormat != null && { responseFormat }) so the direct-model path
honors {"type":"json_object"} requests.
- Around line 388-455: The code is billing and auditing using the pre-resolution
modelId; update usages to the resolved provider model by passing
resolved.modelData.modelId into estimateCostCents and replacing model: modelId
in the audit log metadata (and any other places in this block that reference
modelId for cost/audit) so costs and audit entries reflect the actual resolved
model; ensure you still fall back safely if resolved or resolved.modelData is
undefined.

In `@services/platform/public/openapi.json`:
- Around line 94-99: The OpenAPI operation with operationId "listModels"
(endpoint /api/v1/models) incorrectly describes returning "available agents" and
referencing `visibleInChat`; update the description to state it lists provider
models (e.g., provider-specific model entries) and align it with the ModelList
schema — remove mention of agents and `visibleInChat`, clarify it returns
provider model metadata for the direct model gateway instead. Ensure the
summary/description and any examples reflect "models" not "agents" so generated
docs match the new gateway behavior.
- Around line 2919-2922: The schema for the "model" property currently documents
bare model IDs; update its description and example to use provider-prefixed
model IDs (e.g., "anthropic/claude-sonnet-4-20250514" or "openai/gpt-4o") so it
matches the Quick Start and ModelList conventions. Locate the "model" property
object (the "type", "description", "example" fields) in the OpenAPI JSON and
change the description text to mention provider-prefixed IDs and update the
"example" string to a provider-prefixed ID; ensure any other occurrences of the
bare example in the same schema are updated consistently.
- Around line 4111-4147: The Citation schema currently allows empty objects;
require at least the index so clients can resolve [N] markers by adding a
required field to the Citation schema: add "required": ["index"] under the
"Citation" object and also consider adding "additionalProperties": false (or
"minProperties": 1) to prevent {} from validating; update the "Citation" schema
in openapi.json (the "Citation" object and its "index" property) accordingly.
- Around line 4027-4049: The OpenAPI ChatMessage union and discriminator are
missing the `developer` role which will cause validation failures for requests
using that role; update the "ChatMessage" oneOf to include a new
ChatMessageDeveloper schema, add "developer":
"#/components/schemas/ChatMessageDeveloper" to the discriminator.mapping
alongside system/user/assistant/tool, and create a matching
components.schemas.ChatMessageDeveloper definition (mirror the structure of
ChatMessageSystem but with role enum/value "developer") so the spec accepts
developer-role messages.

In `@services/platform/scripts/generate-openapi.ts`:
- Around line 148-153: The example value for the OpenAPI "model" schema is
inconsistent with the IDs returned by GET /api/v1/models and the provider
examples; update the example in the model property inside generate-openapi.ts
(the model: { ... } schema) to use the fully-qualified provider-prefixed form
(e.g., "anthropic/claude-sonnet-4.6" or whatever exact shape your /api/v1/models
returns) so generated docs and provider-examples share the same identifier
format.

In `@services/platform/scripts/test-openai-compat.ts`:
- Around line 144-157: The test harness currently logs errors from main() but
still exits with a zero status; modify the invocation of main() so that any
rejection causes the process to exit non‑zero: in
services/platform/scripts/test-openai-compat.ts update the main().catch(...)
handler used where main() is called to log the error (console.error or
processLogger) and then call process.exit(1) so testListModels,
testGenerateText, testStreamText, testToolCalling, or testMultiTurnToolCalling
failures fail the CI run.
- Around line 13-20: The API key is hardcoded in API_KEY and injected into
headers when calling createOpenAICompatible; move this secret into an
environment variable (e.g., process.env.OPENAI_API_KEY) and update the provider
initialization to read that env var instead of the literal string, add a runtime
check that throws or logs a clear error if the env var is missing, and remove
the hardcoded API_KEY constant so no credential remains in source control.
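
The two script fixes above could look roughly like this. Both helpers are hypothetical and take their dependencies as parameters so the sketch stays self-contained; the environment variable name `OPENAI_API_KEY` is the one suggested in the comment, not necessarily the one the script ends up using.

```typescript
// Hypothetical helper: read a required secret from an env-like record and
// fail loudly when it is missing or blank.
function requireEnv(
  env: Record<string, string | undefined>,
  name: string,
): string {
  const value = env[name];
  if (value === undefined || value.trim() === '') {
    throw new Error(`Missing required environment variable ${name}`);
  }
  return value;
}

// Hypothetical wrapper: run an async entry point and request a non-zero
// exit code if it rejects, so a failing test fails the CI run.
async function runMain(
  main: () => Promise<void>,
  exit: (code: number) => void,
): Promise<void> {
  try {
    await main();
  } catch (error) {
    console.error(error);
    exit(1);
  }
}
```

In the script this would presumably be wired up as `requireEnv(process.env, 'OPENAI_API_KEY')` and `runMain(main, (code) => process.exit(code))`.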

ℹ️ Review info

⚙️ Run configuration

Configuration used: Organization UI

Review profile: ASSERTIVE

Plan: Pro

Run ID: 60455d24-428d-403e-8902-78d1d3df0ac4

📥 Commits

Reviewing files that changed from the base of the PR and between 52c2dff and 6d39aa9.

📒 Files selected for processing (9)

services/platform/convex/openai_compat/http_actions.ts
services/platform/convex/openai_compat/internal_actions.ts
services/platform/convex/openai_compat/internal_queries.ts
services/platform/convex/openai_compat/response_format.test.ts
services/platform/convex/openai_compat/response_format.ts
services/platform/convex/providers/file_actions.ts
services/platform/public/openapi.json
services/platform/scripts/generate-openapi.ts
services/platform/scripts/test-openai-compat.ts

coderabbitai · 2026-04-12T10:44:41Z

+  const isContinuation = hasToolInteraction(messages);
+  const conversationMessages = isContinuation
+    ? convertToModelMessages(messages)
+    : undefined;

-  // -----------------------------------------------------------------------
-  // Agent mode: server-side tools, async generation via persistent stream
-  // -----------------------------------------------------------------------
-  let chatResult: { threadId: string; streamId: string };
-  try {
-    chatResult = await ctx.runAction(
-      internal.openai_compat.internal_actions.chatViaOpenAI,
-      {
-        agentSlug: model,
-        organizationId: orgInfo.organizationId,
-        userId: user.userId,
-        userEmail: user.email,
-        userName: user.name,
-        message: lastUserMessage.content,
-        threadId,
-        enableStreaming: shouldStream,
-        generationParams,
-        responseFormat,
-      },
-    );
-  } catch (error) {
-    return handleChatError(error, model);
-  }
+  // Strip $-prefixed keys from tool parameters (Convex reserves $ prefix)
+  const tools = body.tools?.map((t) => ({
+    ...t,
+    function: {
+      ...t.function,
+      parameters: t.function.parameters
+        ? stripDollarKeys(t.function.parameters)
+        : undefined,
+    },
+  }));

-  if (shouldStream) {
-    return streamOpenAIResponse(ctx, chatResult, model);
-  }
-  return pollOpenAIResponse(ctx, chatResult, model);
+  return handleDirectModelMode(ctx, {
+    model,
+    messages,
+    lastUserMessage: lastUserMessage.content,
+    tools,
+    toolChoice: body.tool_choice,
+    shouldStream,
+    includeUsage,
+    threadId,
+    generationParams,
+    responseFormat,
+    conversationMessages,


⚠️ Potential issue | 🟠 Major

Tool-result continuations persist the wrong user message.

When messages already contains an assistant tool call plus a trailing tool result, conversationMessages carries the correct continuation context, but this code still forwards lastUserMessage to chatDirectModel. That action later saves message to the thread, so every tool round-trip duplicates the earlier user prompt instead of recording the current continuation state.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@services/platform/convex/openai_compat/http_actions.ts` around lines 333 - 360, When messages already represent a tool-result continuation (isContinuation from hasToolInteraction), the call to handleDirectModelMode still passes lastUserMessage: lastUserMessage.content (the earlier saved variable), causing the previous user prompt to be re-saved; instead, derive lastUserMessage from the continuation context. Update the code around isContinuation/conversationMessages to set the value passed to handleDirectModelMode to the most recent user message in conversationMessages (e.g., find the last item with role === 'user' in conversationMessages generated by convertToModelMessages) when isContinuation is true, otherwise keep using lastUserMessage.content; ensure the change references the existing symbols isContinuation, conversationMessages, lastUserMessage, convertToModelMessages, hasToolInteraction, and handleDirectModelMode.
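
A minimal sketch of the lookup this comment asks for, assuming messages are plain OpenAI-style objects. `findLastUserContent` and the simplified `ChatMessage` type are hypothetical; the real code would operate on whatever `convertToModelMessages` returns.

```typescript
// Simplified message shape for illustration.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant' | 'tool';
  content: string | null;
}

// Hypothetical helper: walk backwards and return the most recent user
// message text, or undefined if the conversation has none.
function findLastUserContent(messages: ChatMessage[]): string | undefined {
  for (let i = messages.length - 1; i >= 0; i--) {
    const message = messages[i];
    if (
      message !== undefined &&
      message.role === 'user' &&
      typeof message.content === 'string'
    ) {
      return message.content;
    }
  }
  return undefined;
}
```

In a continuation, the trailing entries are an assistant tool call and a tool result, so the lookup skips them and lands on the user turn that started the exchange.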

coderabbitai · 2026-04-12T10:44:41Z

+/** Recursively strip keys starting with '$' (Convex reserves this prefix). */
+function stripDollarKeys(obj: unknown): unknown {
+  if (Array.isArray(obj)) return obj.map(stripDollarKeys);
+  if (typeof obj === 'object' && obj !== null) {
+    const result: Record<string, unknown> = {};
+    for (const [k, v] of Object.entries(obj)) {
+      if (!k.startsWith('$')) result[k] = stripDollarKeys(v);
+    }
+    return result;
+  }
+  return obj;
+}


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

In JSON Schema, what are $ref, $defs, and $schema used for, and does removing $-prefixed keys change schema behavior or validity?

💡 Result:

$ref is used to reference another schema (internal via JSON Pointer like #/$defs/name or external via URI), replacing the current subschema with the referenced one during validation. $defs provides a standardized location to define reusable subschemas within the same document, which do not directly affect validation but serve as targets for $ref; it replaced 'definitions' in older drafts. $schema declares the JSON Schema dialect/version (e.g., "https://json-schema.org/draft/2020-12/schema") at the root, identifying the meta-schema for validating the schema itself and guiding implementations on semantics. Removing $-prefixed keys changes behavior: removing $schema may cause implementations to assume a default dialect or fail schema validation; removing $defs makes referenced subschemas unavailable, causing $ref resolution failures; removing $ref replaces the referenced validation logic with nothing (true schema), invalidating intended constraints. These are core keywords (not applicators in older drafts), so removal affects validity against the declared meta-schema and runtime behavior. Unknown keywords are typically ignored in instance validation but fail schema (meta) validation.

Citations:

1: https://json-schema.org/understanding-json-schema/reference/schema?highlight=dynamicref

2: https://json-schema.org/understanding-json-schema/structuring

3: https://mintlify.com/json-schema-org/json-schema-spec/reference/core/defs

4: https://json-schema.org/understanding-json-schema/structuring?highlight=ref

5: https://mintlify.com/json-schema-org/json-schema-spec/reference/core/ref

6: https://json-schema.org/understanding-json-schema/reference/schema

7: https://github.com/json-schema-org/json-schema-spec/blob/main/specs/jsonschema-core.md

8: https://json-schema.org/draft/2020-12/vocab/core

🏁 Script executed:

# Find calls to stripDollarKeys to understand its usage
rg "stripDollarKeys" services/platform/convex/openai_compat/http_actions.ts -B 5 -A 5

Repository: tale-project/tale

Length of output: 776

🏁 Script executed:

# Check the full file to understand context around the function definition
wc -l services/platform/convex/openai_compat/http_actions.ts

Repository: tale-project/tale

Length of output: 119

🏁 Script executed:

# Read the section around the function and its callers
sed -n '360,400p' services/platform/convex/openai_compat/http_actions.ts

Repository: tale-project/tale

Length of output: 1308

🏁 Script executed:

# Search for any comments or context around Convex and $ prefix
rg "Convex.*\\\$|reserves.*prefix" services/platform/convex/ -B 3 -A 3

Repository: tale-project/tale

Length of output: 3771

🏁 Script executed:

# Look for Convex documentation or configuration files
fd -e md -e txt -e json "convex|README" services/platform/ | head -20

Repository: tale-project/tale

Length of output: 288

🏁 Script executed:

# Search for any related issues or comments about tool parameters and Convex
rg "tool.*parameter|parameter.*schema" services/platform/convex/openai_compat/http_actions.ts -B 3 -A 3

Repository: tale-project/tale

Length of output: 260

🏁 Script executed:

# Check if there's a Convex configuration file that documents constraints
cat services/platform/convex.json 2>/dev/null || cat services/platform/convex/convex.json 2>/dev/null || echo "No convex.json found"

Repository: tale-project/tale

Length of output: 333

Remove stripDollarKeys() from tool parameters—Convex's $ prefix constraint doesn't apply to outgoing OpenAI API requests.

The stripDollarKeys() call removes JSON Schema keywords ($ref, $defs, $schema) that are essential for schema composition and validation. These keywords are required by OpenAI's tool parameter schemas to function correctly. Convex's field name restrictions apply only to data persisted in Convex, not to third-party API request bodies. Tool schemas should be sent to OpenAI exactly as the caller provided them; stripping $-prefixed keys causes parameter validation to fail and breaks schema composition.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@services/platform/convex/openai_compat/http_actions.ts` around lines 366 - 377, stripDollarKeys is removing $-prefixed JSON Schema keywords (like $ref, $defs, $schema) that OpenAI tool parameter schemas rely on; stop stripping them when preparing/sending tool parameter schemas to OpenAI by removing any calls to stripDollarKeys for tool parameter serialization/HTTP request construction and ensure the original object is sent unchanged (leave stripDollarKeys only for Convex-persisted data if needed). Locate usages of stripDollarKeys in the code path that builds OpenAI tool parameter payloads and remove or conditionalize those calls so $-prefixed keys are preserved.
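
To show the effect being discussed, the snippet below runs the `stripDollarKeys` function from the diff against a small schema that uses `$ref` and `$defs`. The sample schema is invented for illustration; only the function body comes from the PR.

```typescript
/** Recursively strip keys starting with '$' (copied from the diff above). */
function stripDollarKeys(obj: unknown): unknown {
  if (Array.isArray(obj)) return obj.map(stripDollarKeys);
  if (typeof obj === 'object' && obj !== null) {
    const result: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(obj)) {
      if (!k.startsWith('$')) result[k] = stripDollarKeys(v);
    }
    return result;
  }
  return obj;
}

// Illustrative tool parameter schema that relies on a shared definition.
const schema = {
  $schema: 'https://json-schema.org/draft/2020-12/schema',
  type: 'object',
  properties: { address: { $ref: '#/$defs/address' } },
  $defs: {
    address: { type: 'object', properties: { city: { type: 'string' } } },
  },
};

const stripped = stripDollarKeys(schema);

// The reference and its target are both gone, so `address` becomes an
// empty schema that accepts any value.
console.log(JSON.stringify(stripped));
// prints {"type":"object","properties":{"address":{}}}
```

Schemas that use no `$`-prefixed keywords pass through unchanged, so the impact is limited to callers that rely on `$ref`, `$defs`, or similar keywords.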

coderabbitai · 2026-04-12T10:44:41Z

    // Track usage for client tool mode (this path bypasses onAgentComplete)
    const usage = await result.usage;
-    if (usage && args.organizationId) {
+    const inputTokens = usage?.inputTokens ?? 0;
+    const outputTokens = usage?.outputTokens ?? 0;
+    const totalTokens = inputTokens + outputTokens;
+
+    if (usage && args.organizationId && (inputTokens > 0 || outputTokens > 0)) {
+      const { estimateCostCents } =
+        await import('../governance/cost_estimation');
+      const costCents = estimateCostCents(modelId, inputTokens, outputTokens);
+      await ctx
+        .runMutation(
+          internal.governance.internal_mutations.incrementUsageLedger,
+          {
+            organizationId: args.organizationId,
+            userId: args.userId ?? 'system',
+            inputTokens,
+            outputTokens,
+            costEstimateCents: costCents,
+            timestamp: Date.now(),
+          },
+        )
+        .catch((error) => {
+          console.error(
+            '[OpenAI-compat:clientTools] Failed to increment usage ledger:',
+            error,
+          );
+        });
+
+      // AI audit log for OpenAI-compat client tool mode
+      await ctx
+        .runMutation(internal.audit_logs.internal_mutations.createAuditLog, {
+          organizationId: args.organizationId,
+          actorId: args.userId ?? 'system',
+          actorType: 'api' as const,
+          action: 'ai.completion',
+          category: 'ai' as const,
+          resourceType: 'agent_completion',
+          resourceId: threadId,
+          status: 'success' as const,
+          metadata: {
+            model: modelId,
+            inputTokens,
+            outputTokens,
+            totalTokens,
+            costEstimateCents: costCents,
+            threadId,
+            agentType: 'openai_compat',
+            toolCallCount: toolCalls.length,
+          },
+        })
+        .catch((error) => {
+          console.error(
+            '[OpenAI-compat:clientTools] Failed to write AI audit log:',
+            error,
+          );
+        });
+    }
+
+    return {
+      threadId,
+      text: text || null,
+      toolCalls: toolCalls.length > 0 ? toolCalls : null,
+      finishReason,
+      inputTokens,
+      outputTokens,
+      totalTokens,
+      resolvedModel: resolved.modelData.modelId,


⚠️ Potential issue | 🟠 Major

Bill and audit against the resolved provider model, not the pre-resolution ID.

This block still prices usage with modelId, which can be an agent alias, a tag/default, or a value that later failed over. You already have resolved.modelData.modelId; using the unresolved value can skew cost estimation and leave the audit trail pointing at the wrong model.

💸 Suggested change

-      const costCents = estimateCostCents(modelId, inputTokens, outputTokens);
+      const costCents = estimateCostCents(
+        resolved.modelData.modelId,
+        inputTokens,
+        outputTokens,
+      );
@@
-            model: modelId,
+            model: resolved.modelData.modelId,

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@services/platform/convex/openai_compat/internal_actions.ts` around lines 388 - 455, The code is billing and auditing using the pre-resolution modelId; update usages to the resolved provider model by passing resolved.modelData.modelId into estimateCostCents and replacing model: modelId in the audit log metadata (and any other places in this block that reference modelId for cost/audit) so costs and audit entries reflect the actual resolved model; ensure you still fall back safely if resolved or resolved.modelData is undefined.
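
The fallback the prompt asks for can be sketched as a small helper. This is a minimal illustration, assuming `resolved` has the `modelData.modelId` shape seen in the diff; `billingModelId`, `estimateCostCentsSketch`, and the price table (including its placeholder numbers) are hypothetical and do not reflect the real `cost_estimation` module.

```typescript
interface ResolvedModel {
  modelData?: { modelId?: string };
}

// Prefer the provider model that actually served the request;
// fall back to the requested ID only when resolution data is missing.
function billingModelId(
  resolved: ResolvedModel | undefined,
  requestedModelId: string,
): string {
  return resolved?.modelData?.modelId ?? requestedModelId;
}

// Placeholder prices in cents per million tokens (made-up values for illustration).
const PRICE_TABLE: Record<string, { input: number; output: number } | undefined> = {
  'provider/model-a': { input: 100, output: 200 },
};

// Hypothetical cost estimator: unknown IDs (for example an alias) price at zero,
// which is how billing against an unresolved ID can silently skew the ledger.
function estimateCostCentsSketch(
  modelId: string,
  inputTokens: number,
  outputTokens: number,
): number {
  const price = PRICE_TABLE[modelId];
  if (!price) return 0;
  return (inputTokens * price.input + outputTokens * price.output) / 1_000_000;
}

const resolvedExample: ResolvedModel = { modelData: { modelId: 'provider/model-a' } };
const costViaAlias = estimateCostCentsSketch('default', 1_000_000, 500_000);
const costViaResolved = estimateCostCentsSketch(
  billingModelId(resolvedExample, 'default'),
  1_000_000,
  500_000,
);
```

With these placeholder prices the alias lookup yields 0 cents while the resolved lookup yields 200 cents, which is the kind of divergence the review comment is warning about.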

coderabbitai · 2026-04-12T10:44:41Z

+    // Build generation params
+    // oxlint-disable-next-line typescript/no-unsafe-type-assertion -- generationParams is v.any() from Convex validator; shape is controlled by http_actions.ts buildGenerationParams
+    const genParams = (args.generationParams ?? {}) as Record<string, unknown>;
+
+    // Build messages — use full conversation if provided, otherwise single message
+    const hasConversation =
+      args.conversationMessages &&
+      Array.isArray(args.conversationMessages) &&
+      args.conversationMessages.length > 0;
+
+    const messages: ModelMessage[] = hasConversation
+      ? // oxlint-disable-next-line typescript/no-unsafe-type-assertion -- conversationMessages is built by convertToModelMessages in http_actions.ts; shape matches ModelMessage[]
+        (args.conversationMessages as ModelMessage[])
+      : [{ role: 'user' as const, content: message }];
+
+    const result = streamText({
+      model: resolved.languageModel,
+      system: systemPrompt,
+      messages,
+      ...(aiTools && { tools: aiTools }),
+      ...(args.toolChoice != null && {
+        toolChoice: mapToolChoice(args.toolChoice),
+      }),
+      ...(genParams.temperature != null && {
+        temperature: Number(genParams.temperature),
+      }),
+      ...(genParams.maxTokens != null && {
+        maxTokens: Number(genParams.maxTokens),
+      }),
+      ...(genParams.topP != null && { topP: Number(genParams.topP) }),
+      ...(genParams.frequencyPenalty != null && {
+        frequencyPenalty: Number(genParams.frequencyPenalty),
+      }),
+      ...(genParams.presencePenalty != null && {
+        presencePenalty: Number(genParams.presencePenalty),
+      }),
+      ...(Array.isArray(genParams.stopSequences) && {
+        stopSequences: genParams.stopSequences,
+      }),
+    });


⚠️ Potential issue | 🟠 Major

response_format is accepted here but never applied.

The new direct-model path receives responseFormat from http_actions.ts, then drops it before calling streamText. Requests asking for {"type":"json_object"} will silently behave like plain text completions on this route.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@services/platform/convex/openai_compat/internal_actions.ts` around lines 543 - 582, The code builds genParams and then calls streamText but never forwards the requested responseFormat, so JSON-object responses are lost; update the streamText call (where streamText is invoked) to include the responseFormat coming from the request (e.g. prefer args.responseFormat if present otherwise genParams.responseFormat) and pass it through or map it to the streamText parameter name (responseFormat or response_format) used by streamText; ensure you coerce/validate the value the same way http_actions.ts does (accept the enum/string shape) and add the conditional spread like ...(responseFormat != null && { responseFormat }) so the direct-model path honors {"type":"json_object"} requests.
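
Before the value can be forwarded, it has to be validated into a known shape. The sketch below normalizes an incoming `response_format` into a discriminated union covering the `text`, `json_object`, and `json_schema` variants of the OpenAI request format. `parseResponseFormat` is a hypothetical helper name; how the parsed value is then handed to `streamText` depends on the AI SDK version and provider, which this sketch deliberately does not assume.

```typescript
type ResponseFormat =
  | { type: 'text' }
  | { type: 'json_object' }
  | { type: 'json_schema'; json_schema: { name: string; schema?: unknown } };

// Hypothetical helper: returns undefined when the caller sent nothing,
// a typed value for known variants, and throws for anything else so that
// unsupported formats fail loudly instead of degrading to plain text.
function parseResponseFormat(input: unknown): ResponseFormat | undefined {
  if (input === undefined || input === null) return undefined;
  if (typeof input !== 'object') {
    throw new Error('response_format must be an object');
  }
  const candidate = input as { type?: unknown; json_schema?: unknown };
  if (candidate.type === 'text') return { type: 'text' };
  if (candidate.type === 'json_object') return { type: 'json_object' };
  if (candidate.type === 'json_schema') {
    const spec = candidate.json_schema as
      | { name?: unknown; schema?: unknown }
      | null
      | undefined;
    if (spec === null || spec === undefined || typeof spec.name !== 'string') {
      throw new Error('response_format.json_schema.name is required');
    }
    return {
      type: 'json_schema',
      json_schema: { name: spec.name, schema: spec.schema },
    };
  }
  throw new Error('unsupported response_format.type');
}

const parsedJsonObject = parseResponseFormat({ type: 'json_object' });
const parsedMissing = parseResponseFormat(undefined);
```

Rejecting unknown variants up front turns the silent fallback described above into an explicit client error.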

coderabbitai · 2026-04-12T10:44:41Z

+        "tags": [
+          "OpenAI Compatible"
+        ],
        "summary": "List models",
        "description": "List available agents as OpenAI-compatible models. Only agents with `visibleInChat: true` are returned.",
        "operationId": "listModels",


⚠️ Potential issue | 🟡 Minor

Update /api/v1/models to describe provider models, not agents.

This operation still says it lists “available agents” gated by visibleInChat, which contradicts the rest of this PR and the ModelList schema. That will mislead anyone reading the generated docs for the new direct model gateway.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@services/platform/public/openapi.json` around lines 94 - 99, The OpenAPI operation with operationId "listModels" (endpoint /api/v1/models) incorrectly describes returning "available agents" and referencing `visibleInChat`; update the description to state it lists provider models (e.g., provider-specific model entries) and align it with the ModelList schema — remove mention of agents and `visibleInChat`, clarify it returns provider model metadata for the direct model gateway instead. Ensure the summary/description and any examples reflect "models" not "agents" so generated docs match the new gateway behavior.
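
For reference while rewording the description, the response this endpoint should document looks roughly like the sketch below. The field names follow the OpenAI model-list shape (`id`, `object`, `created`, `owned_by`); deriving `owned_by` from the prefix of a `provider/model` ID is an assumption made for illustration, and `toModelList` is a hypothetical helper, not the PR's implementation.

```typescript
interface ModelEntry {
  id: string;
  object: 'model';
  created: number;
  owned_by: string;
}

interface ModelList {
  object: 'list';
  data: ModelEntry[];
}

// Hypothetical builder: maps provider model IDs to OpenAI-style model entries.
// Assumption: the text before the first "/" names the provider.
function toModelList(modelIds: string[], createdAtSeconds: number): ModelList {
  return {
    object: 'list',
    data: modelIds.map((id): ModelEntry => {
      const slash = id.indexOf('/');
      return {
        id,
        object: 'model',
        created: createdAtSeconds,
        owned_by: slash > 0 ? id.slice(0, slash) : 'unknown',
      };
    }),
  };
}

const exampleList = toModelList(['anthropic/claude-sonnet-4.6', 'local-model'], 0);
```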

coderabbitai · 2026-04-12T10:44:42Z

      "ChatMessage": {
-        "type": "object",
-        "required": ["role"],
-        "properties": {
-          "role": {
-            "type": "string",
-            "enum": ["system", "user", "assistant", "tool"]
+        "oneOf": [
+          {
+            "$ref": "#/components/schemas/ChatMessageSystem"
          },
-          "content": {
-            "oneOf": [
-              {
-                "type": "string"
-              },
-              {
-                "type": "null"
-              }
-            ],
-            "description": "Message content."
+          {
+            "$ref": "#/components/schemas/ChatMessageUser"
          },
-          "tool_calls": {
-            "type": "array",
-            "items": {
-              "$ref": "#/components/schemas/ToolCall"
-            },
-            "description": "Tool calls (assistant messages only)."
+          {
+            "$ref": "#/components/schemas/ChatMessageAssistant"
          },
-          "tool_call_id": {
-            "type": "string",
-            "description": "ID of the tool call this result is for (tool messages only)."
+          {
+            "$ref": "#/components/schemas/ChatMessageTool"
+          }
+        ],
+        "discriminator": {
+          "propertyName": "role",
+          "mapping": {
+            "system": "#/components/schemas/ChatMessageSystem",
+            "user": "#/components/schemas/ChatMessageUser",
+            "assistant": "#/components/schemas/ChatMessageAssistant",
+            "tool": "#/components/schemas/ChatMessageTool"
          }


⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🌐 Web query:

OpenAI Chat Completions API developer message type

💡 Result:

The OpenAI Chat Completions API supports a "developer" message type (role: "developer"). This is a message role for developer-provided instructions that the model must follow, regardless of user messages. It replaces the previous "system" role for o1 models and newer reasoning models.

Key details:
- Structure: ChatCompletionDeveloperMessageParam = { content: string or array of ChatCompletionContentPartText (text only), role: "developer", name?: string }
- Purpose: High-priority instructions (ahead of "user" messages in the chain of command).
- Compatibility: For o1+ models, "system" auto-converts to "developer"; for older models like GPT-4o, "developer" auto-converts to "system".
- Usage example: [ { "role": "developer", "content": "You are a helpful assistant." }, { "role": "user", "content": "Hello!" } ]

Standard roles: "system" (legacy), "developer" (new), "user", "assistant", "tool".

Citations:

1: https://developers.openai.com/api/reference/chat-completions/overview/

2: https://platform.openai.com/docs/api-reference/chat/message-list

3: https://developers.openai.com/api/reference/typescript/resources/chat

4: https://developers.openai.com/api/docs/api-reference/chat/object

5: https://developers.openai.com/api/reference/resources/chat/subresources/completions/methods/create/

6: https://developers.openai.com/api/reference/java/resources/chat/subresources/completions/methods/create/

7: https://developers.openai.com/api/reference/resources/chat

🏁 Script executed:

find . -name "openapi.json" -path "*/services/platform/public/*" | head -5

Repository: tale-project/tale

Length of output: 100

🏁 Script executed:

# Check the content around the specified lines
if [ -f "services/platform/public/openapi.json" ]; then
  wc -l services/platform/public/openapi.json
fi

Repository: tale-project/tale

Length of output: 103

🏁 Script executed:

# Search for ChatMessageDeveloper in the openapi.json file
if [ -f "services/platform/public/openapi.json" ]; then
  grep -n "ChatMessageDeveloper" services/platform/public/openapi.json
fi

Repository: tale-project/tale

Length of output: 43

🏁 Script executed:

# Read the specific lines mentioned in the review (4027-4049)
sed -n '4027,4049p' services/platform/public/openapi.json

Repository: tale-project/tale

Length of output: 804

🏁 Script executed:

# Search for other ChatMessage schema definitions to understand the pattern
grep -n "ChatMessage" services/platform/public/openapi.json | head -20

Repository: tale-project/tale

Length of output: 896

🏁 Script executed:

# Check if there are any other references to "developer" role in the file
grep -n '"developer"' services/platform/public/openapi.json

Repository: tale-project/tale

Length of output: 43

Add developer to the message discriminator mapping and schema.

OpenAI's Chat Completions API supports a developer message type, which is recommended for newer models (o1+ and reasoning models) and replaces the use of system messages. Without including this role in the discriminator mapping and creating the corresponding schema, requests using the developer role will fail validation against this spec, making it incompatible with current OpenAI API usage patterns.

Proposed fix

      "ChatMessage": {
        "oneOf": [
+          {
+            "$ref": "#/components/schemas/ChatMessageDeveloper"
+          },
          {
            "$ref": "#/components/schemas/ChatMessageSystem"
          },
          {
            "$ref": "#/components/schemas/ChatMessageUser"
@@
        "discriminator": {
          "propertyName": "role",
          "mapping": {
+            "developer": "#/components/schemas/ChatMessageDeveloper",
            "system": "#/components/schemas/ChatMessageSystem",
            "user": "#/components/schemas/ChatMessageUser",
            "assistant": "#/components/schemas/ChatMessageAssistant",
            "tool": "#/components/schemas/ChatMessageTool"
          }
        }
      },
+      "ChatMessageDeveloper": {
+        "type": "object",
+        "required": [
+          "role",
+          "content"
+        ],
+        "properties": {
+          "role": {
+            "type": "string",
+            "enum": [
+              "developer"
+            ]
+          },
+          "content": {
+            "type": "string",
+            "description": "Developer instruction content."
+          }
+        }
+      },

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@services/platform/public/openapi.json` around lines 4027 - 4049, The OpenAPI ChatMessage union and discriminator are missing the `developer` role which will cause validation failures for requests using that role; update the "ChatMessage" oneOf to include a new ChatMessageDeveloper schema, add "developer": "#/components/schemas/ChatMessageDeveloper" to the discriminator.mapping alongside system/user/assistant/tool, and create a matching components.schemas.ChatMessageDeveloper definition (mirror the structure of ChatMessageSystem but with role enum/value "developer") so the spec accepts developer-role messages.
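
On the handler side, the same union can be mirrored as a TypeScript discriminated union on `role`. The sketch below is an assumption about how a gateway might treat the new role, not the PR's code: it folds `developer` messages into `system` for providers that only understand the older role. The message shapes are simplified (string content only) and `normalizeDeveloperRole` is a hypothetical name.

```typescript
type ChatMessage =
  | { role: 'developer'; content: string }
  | { role: 'system'; content: string }
  | { role: 'user'; content: string }
  | { role: 'assistant'; content: string | null }
  | { role: 'tool'; content: string; tool_call_id: string };

// Hypothetical normalization: rewrite developer messages as system messages,
// leaving every other role untouched and preserving message order.
function normalizeDeveloperRole(messages: ChatMessage[]): ChatMessage[] {
  return messages.map((message): ChatMessage =>
    message.role === 'developer'
      ? { role: 'system', content: message.content }
      : message,
  );
}

const normalized = normalizeDeveloperRole([
  { role: 'developer', content: 'Answer in German.' },
  { role: 'user', content: 'Hello!' },
]);
```

Whether to rewrite the role or pass it through unchanged would depend on which upstream providers accept `developer` natively.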

coderabbitai · 2026-04-12T10:44:42Z

+      "Citation": {
+        "type": "object",
+        "properties": {
+          "index": {
+            "type": "integer",
+            "description": "Citation index corresponding to [N] markers in text."
+          },
+          "type": {
+            "type": "string",
+            "enum": [
+              "rag",
+              "web"
+            ],
+            "description": "Source type: RAG knowledge base or web search."
+          },
+          "source": {
+            "type": "string",
+            "description": "Source name or title."
+          },
+          "fileId": {
+            "type": "string",
+            "description": "File ID for RAG citations."
+          },
+          "url": {
+            "type": "string",
+            "description": "URL for web citations."
+          },
+          "page": {
+            "type": "integer",
+            "description": "Page number for document citations."
+          },
+          "relevance": {
+            "type": "number",
+            "description": "Relevance score (0-1)."
+          }
+        }
+      },


⚠️ Potential issue | 🟠 Major

Don’t allow empty citation objects.

citations are described as backing [N] markers in the response text, but this schema now makes every field optional. That means {} validates even though a client cannot resolve a marker without at least an index.

Proposed fix

      "Citation": {
        "type": "object",
+        "required": [
+          "index"
+        ],
        "properties": {
          "index": {
            "type": "integer",
            "description": "Citation index corresponding to [N] markers in text."
          },

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@services/platform/public/openapi.json` around lines 4111 - 4147, The Citation schema currently allows empty objects; require at least the index so clients can resolve [N] markers by adding a required field to the Citation schema: add "required": ["index"] under the "Citation" object and also consider adding "additionalProperties": false (or "minProperties": 1) to prevent {} from validating; update the "Citation" schema in openapi.json (the "Citation" object and its "index" property) accordingly.
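
A client consuming citations would need the same guarantee at runtime. The following is a minimal sketch of a type guard that mirrors the tightened schema: `index` must be an integer, and `type`, when present, must be `rag` or `web`. `isCitation` is a hypothetical helper, and the remaining optional fields are not validated here.

```typescript
interface Citation {
  index: number;
  type?: 'rag' | 'web';
  source?: string;
  fileId?: string;
  url?: string;
  page?: number;
  relevance?: number;
}

// Hypothetical guard: rejects {} and anything without an integer index,
// since a [N] marker cannot be resolved without one.
function isCitation(value: unknown): value is Citation {
  if (typeof value !== 'object' || value === null) return false;
  const candidate = value as Record<string, unknown>;
  const index = candidate['index'];
  if (typeof index !== 'number' || !Number.isInteger(index)) return false;
  const kind = candidate['type'];
  if (kind !== undefined && kind !== 'rag' && kind !== 'web') return false;
  return true;
}

const emptyIsValid = isCitation({});
const webIsValid = isCitation({ index: 1, type: 'web', url: 'https://example.com' });
```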

coderabbitai · 2026-04-12T10:44:42Z

      model: {
        type: 'string',
-        description: 'Agent slug (e.g., "chat-agent").',
-        example: 'chat-agent',
+        description:
+          'Model ID (e.g., "claude-sonnet-4-20250514"). Use GET /api/v1/models to list available models.',
+        example: 'claude-sonnet-4-20250514',
      },


⚠️ Potential issue | 🟡 Minor

Keep the model example consistent with the IDs returned by /api/v1/models.

This example now shows an unqualified model ID, while the documentation and provider examples elsewhere in the same generated spec use provider-prefixed IDs like anthropic/claude-sonnet-4.6. Anyone copy-pasting the current example gets a different identifier shape from the one this API advertises elsewhere.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@services/platform/scripts/generate-openapi.ts` around lines 148 - 153, The example value for the OpenAPI "model" schema is inconsistent with the IDs returned by GET /api/v1/models and the provider examples; update the example in the model property inside generate-openapi.ts (the model: { ... } schema) to use the fully-qualified provider-prefixed form (e.g., "anthropic/claude-sonnet-4.6" or whatever exact shape your /api/v1/models returns) so generated docs and provider-examples share the same identifier format.

coderabbitai · 2026-04-12T10:44:42Z

+const BASE_URL = 'http://localhost:3000/api/v1';
+const API_KEY =
+  'taleDsYqAacBOcFDlGBISiORAmxkQHhNEChqBgagAngCaaReIsBGfAREKtZTckLmyeqn';
+
+const provider = createOpenAICompatible({
+  name: 'tale',
+  baseURL: BASE_URL,
+  headers: { Authorization: `Bearer ${API_KEY}` },


⚠️ Potential issue | 🟠 Major

Move the Bearer token out of source control.

This script checks in a live API key and wires it straight into request headers. Even for a local helper, that leaks credentials into the repo and makes rotation/auditing harder.

🔐 Suggested change

-const BASE_URL = 'http://localhost:3000/api/v1';
-const API_KEY =
-  'taleDsYqAacBOcFDlGBISiORAmxkQHhNEChqBgagAngCaaReIsBGfAREKtZTckLmyeqn';
+const BASE_URL =
+  process.env.OPENAI_COMPAT_BASE_URL ?? 'http://localhost:3000/api/v1';
+const API_KEY = process.env.OPENAI_COMPAT_API_KEY;
+
+if (!API_KEY) {
+  throw new Error('OPENAI_COMPAT_API_KEY is required');
+}

As per coding guidelines DO NOT hardcode secrets, API keys, or credentials. Use environment variables instead.

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@services/platform/scripts/test-openai-compat.ts` around lines 13 - 20, the API key is hardcoded in API_KEY and injected into headers when calling createOpenAICompatible; move this secret into an environment variable (e.g., process.env.OPENAI_COMPAT_API_KEY, matching the suggested change above) and update the provider initialization to read that env var instead of the literal string, add a runtime check that throws or logs a clear error if the env var is missing, and remove the hardcoded API_KEY constant so no credential remains in source control.

coderabbitai · 2026-04-12T10:44:42Z

+async function main() {
+  console.log('Testing OpenAI-compatible API with AI SDK v6');
+  console.log(`Base URL: ${BASE_URL}`);
+
+  const modelId = await testListModels();
+  await testGenerateText(modelId);
+  await testStreamText(modelId);
+  await testToolCalling(modelId);
+  await testMultiTurnToolCalling(modelId);
+
+  console.log('\n═══ All tests done ═══\n');
+}
+
+main().catch(console.error);


⚠️ Potential issue | 🟠 Major

Fail the process when a test step throws.

main().catch(console.error) logs the failure but still exits successfully, so CI or wrapper scripts can report a passing integration run after a broken request.

🧪 Suggested change

-main().catch(console.error);
+main().catch((error) => {
+  console.error(error);
+  process.exitCode = 1;
+});

🤖 Prompt for AI Agents

Verify each finding against the current code and only fix it if needed. In `@services/platform/scripts/test-openai-compat.ts` around lines 144 - 157, The test harness currently logs errors from main() but still exits with a zero status; modify the invocation of main() so that any rejection causes the process to exit non‑zero: in services/platform/scripts/test-openai-compat.ts update the main().catch(...) handler used where main() is called to log the error (console.error or processLogger) and then call process.exit(1) so testListModels, testGenerateText, testStreamText, testToolCalling, or testMultiTurnToolCalling failures fail the CI run.

- Remove X-Thread-Id header parameter (stateless direct model mode)
- Delete chatViaOpenAI, chatViaOpenAIWithTools, listVisibleAgents actions
- Delete resolveOrgSlug helper (no longer needed)
- Clean up unused imports (readdir, components, getString, agents/file_utils)
- Rename ToolCallResult to DirectModelResult with requestId field

larryro added 3 commits April 12, 2026 18:30

docs(platform): update OpenAI-compat quick start to use real model IDs

e2742fc

Replace agent slug examples with real provider model IDs in the API documentation and OpenAPI schema examples.

docs(platform): update chat completions endpoint description

6d39aa9

Remove outdated agent mode / client tool mode descriptions. The endpoint is now a direct model gateway.

fix(platform): don't create threads for direct model API calls

63045de

Direct model mode should be stateless — API calls should not appear in the UI thread history. Replace createThreadAndSaveMessage with a transient ID used only for usage tracking and audit log correlation.

coderabbitai Bot requested changes Apr 12, 2026

larryro added 2 commits April 12, 2026 18:53

style(platform): format openapi.json

345a05a

larryro merged commit 303ddb0 into main Apr 12, 2026
24 checks passed

larryro deleted the feat/openai-compat-direct-model-gateway branch April 12, 2026 11:05
