[feat] register openai as built-in openai-compatible provider by EYH0602 · Pull Request #3 · SecurityLab-UCD/korabench

EYH0602 · 2026-05-20T02:33:25Z

Summary

Follow-up to merged PR #2. Lets callers address OpenAI models directly with the slug form openai/<model-id> (e.g. openai/gpt-5-nano) — same dispatch path as vllm/<model-id>, no models.json entry required.

- New built-in provider entry in packages/cli/src/models/openAICompatibleProviders.ts:
  - defaultBaseURL: https://api.openai.com/v1
  - baseURLEnv: OPENAI_BASE_URL (override for OpenAI-compatible proxies)
  - apiKeyEnv: OPENAI_API_KEY
  - supportsStructuredOutputs: true (server-side response_format: { type: "json_schema" })
- Also bundles the smaller fixes from earlier on the branch that landed after PR #2 ([feat] support OpenAI-compatible endpoints (vLLM) as judge/target):
  - drop hardcoded 300-token cap on user-message generation
  - enable json_schema response_format for vLLM structured outputs
  - auto-resolve served model id against /v1/models
- .env.example documents OPENAI_API_KEY / OPENAI_BASE_URL next to the existing gateway and vLLM keys; yarn kora:env already loads .env via node --env-file, so OpenAI direct follows the same flow as the Vercel AI gateway.
- README has a quick OpenAI direct example.
- Tests updated: two cases previously used openai/gpt-4o as the example of an unknown prefix; these now use a synthetic nonprovider/... slug. Added an explicit positive test that the openai prefix resolves to api.openai.com.

Gateway-routed OpenAI models (named entries like gpt-4o, gpt-5.2:high in models.json) are unchanged.
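
For readers who have not opened the diff, the slug dispatch described in this summary can be sketched roughly as follows. This is a minimal illustration, not the code in openAICompatibleProviders.ts: the four provider fields come from the description above, while the `resolveSlug` helper, the `Env` type, and the exact shape of `ResolvedTarget` are assumptions made for the example.

```typescript
// Field names follow the PR description; everything else here is hypothetical.
interface OpenAICompatibleProvider {
  defaultBaseURL: string;
  baseURLEnv: string;
  apiKeyEnv: string;
  supportsStructuredOutputs?: boolean;
}

// Environment passed in explicitly so the sketch has no runtime dependencies.
type Env = Record<string, string | undefined>;

// Assumed registry shape: provider prefix -> provider entry.
const providers = new Map<string, OpenAICompatibleProvider>([
  [
    "openai",
    {
      defaultBaseURL: "https://api.openai.com/v1",
      baseURLEnv: "OPENAI_BASE_URL",
      apiKeyEnv: "OPENAI_API_KEY",
      supportsStructuredOutputs: true,
    },
  ],
]);

// Assumed shape; the PR only says the flag is threaded through ResolvedTarget.
interface ResolvedTarget {
  modelId: string;
  baseURL: string;
  apiKey: string | undefined;
  supportsStructuredOutputs: boolean;
}

// Split "<prefix>/<model-id>" at the first slash and look the prefix up.
// Returns undefined for unknown prefixes or malformed slugs.
function resolveSlug(slug: string, env: Env): ResolvedTarget | undefined {
  const slash = slug.indexOf("/");
  if (slash <= 0 || slash === slug.length - 1) return undefined;
  const provider = providers.get(slug.slice(0, slash));
  if (provider === undefined) return undefined;
  return {
    modelId: slug.slice(slash + 1),
    // The env override wins over the built-in default.
    baseURL: env[provider.baseURLEnv] ?? provider.defaultBaseURL,
    apiKey: env[provider.apiKeyEnv],
    // Conservative default for providers that do not opt in.
    supportsStructuredOutputs: provider.supportsStructuredOutputs ?? false,
  };
}
```

Under these assumptions, `resolveSlug("openai/gpt-5-nano", process.env)` would point at api.openai.com unless OPENAI_BASE_URL is set, and a slug such as `nonprovider/x` would resolve to `undefined`.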

Test plan

yarn tsbuild — passes
yarn test — 182/182 passing
Smoke test against api.openai.com: yarn kora run openai/gpt-5-nano openai/gpt-5-nano --judges openai/gpt-5-nano --limit 10 — 10/10 scenarios scored, judge MechanismAssessment JSON validated server-side.
vLLM regression: same smoke run with vllm/Qwen3-30B-A3B-Instruct-2507 — 1/1 passes.

The openai-compatible AI SDK provider defaulted `supportsStructuredOutputs` to false, which downgraded every `generateObject` call to schema-less `response_format: { type: "json_object" }`. Thinking models returned JSON that did not satisfy strict Valibot schemas (notably MechanismAssessment's required M1-M7 keys and TestAssessment's minLength reasons), so each call retried up to 5x — which, multiplied across user/target/judge calls with minute-long thinking-model latencies, looked like an infinite loop. Mark vllm as supporting structured outputs and thread the flag through ResolvedTarget so the AI SDK forwards the full JSON Schema for server-side xgrammar enforcement. Expose the same toggle on `models.json` entries for custom providers.
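
The difference between the two request shapes discussed in this commit message can be illustrated with a small sketch. It assumes the OpenAI-style chat-completions `response_format` field; the `buildResponseFormat` helper and the example schema are hypothetical and are not taken from the AI SDK or from this repository.

```typescript
type JsonSchema = Record<string, unknown>;

// The two response_format shapes contrasted in the commit message.
type ResponseFormat =
  | { type: "json_object" }
  | { type: "json_schema"; json_schema: { name: string; schema: JsonSchema } };

// Hypothetical helper: choose the request shape from the provider flag.
// An unset flag is treated as false, matching the conservative default.
function buildResponseFormat(
  name: string,
  schema: JsonSchema,
  supportsStructuredOutputs?: boolean,
): ResponseFormat {
  if (supportsStructuredOutputs ?? false) {
    // The full schema travels with the request, so the server can constrain decoding.
    return { type: "json_schema", json_schema: { name, schema } };
  }
  // Schema-less mode: the server only promises syntactically valid JSON.
  return { type: "json_object" };
}

// Illustrative schema only: the source says M1-M7 are required keys,
// but the real property definitions are not shown in the PR.
const mechanismAssessmentSchema: JsonSchema = {
  type: "object",
  required: ["M1", "M2", "M3", "M4", "M5", "M6", "M7"],
};

const strictFormat = buildResponseFormat("MechanismAssessment", mechanismAssessmentSchema, true);
const looseFormat = buildResponseFormat("MechanismAssessment", mechanismAssessmentSchema);
```

In the loose case nothing tells the server which keys are required, which is consistent with the validation failures and retries described above.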

The 300-token per-request cap in generateUserMessage starves reasoning models: the <think> trace consumes the entire budget before any content token is emitted, so message.content comes back empty. Let the provider-level cap (`<PREFIX>_MAX_TOKENS` or models.json `maxTokens`) govern instead — that's the right place to bound generation since it scales with the model. Also bump the README's recommended VLLM_MAX_TOKENS from 8192 to 32768 and document the reasoning+answer shared budget, the `max_model_len` ceiling, and the `finish_reason="length"` → retry symptom.
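
A rough sketch of the budgeting argument may help here. The `<PREFIX>_MAX_TOKENS` naming comes from the commit message; the helper names, the upper-casing of the prefix, and the token counts in the usage note are assumptions for illustration only.

```typescript
type EnvVars = Record<string, string | undefined>;

// Hypothetical helper: read the provider-level cap, e.g. VLLM_MAX_TOKENS for "vllm".
// Returns undefined when the variable is unset or is not a positive integer.
function maxTokensFromEnv(prefix: string, env: EnvVars): number | undefined {
  const raw = env[`${prefix.toUpperCase()}_MAX_TOKENS`];
  if (raw === undefined || raw.trim() === "") return undefined;
  const parsed = Number(raw);
  return Number.isInteger(parsed) && parsed > 0 ? parsed : undefined;
}

// Reasoning and answer share one budget: tokens spent on the <think> trace
// are no longer available for the visible answer.
function answerBudget(maxTokens: number, reasoningTokens: number): number {
  return Math.max(0, maxTokens - reasoningTokens);
}
```

With a hypothetical 450-token reasoning trace, `answerBudget(300, 450)` leaves nothing for the answer, which matches the empty `message.content` symptom, while `answerBudget(32768, 450)` leaves ample room.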

…forcement

Now that the hardcoded 300-token cap on user-message generation is gone, make explicit that `<PREFIX>_MAX_TOKENS` is the single knob governing every call site, and that it's the right place to budget for reasoning models. Also add a note about `supportsStructuredOutputs` for vllm so users debugging schema-validation retries know where to look.

Lets callers address OpenAI models directly via the `openai/<model-id>` slug (e.g. `openai/gpt-5-nano`) without adding a per-model `models.json` entry. Defaults to `https://api.openai.com/v1`; `OPENAI_BASE_URL` overrides for proxies. Gateway-routed OpenAI models keep their named `models.json` entries unchanged.

The `.env` file is auto-loaded by `yarn kora:env`, so the OpenAI direct path uses the same flow as the Vercel AI gateway key. Add `OPENAI_API_KEY` / `OPENAI_BASE_URL` placeholders to `.env.example` and update the README quick example to use `kora:env`.

Two tests used `openai/gpt-4o` as an example of an "unknown prefix", which is no longer true now that `openai` is a registered provider. Replace those cases with a synthetic `nonprovider/...` slug, and add a dedicated test asserting the openai prefix resolves to api.openai.com with `OPENAI_API_KEY` and structured-outputs support.

fenfenai

Review Summary

Reviewed 10 files, 89 additions / 10 deletions across bug detection, error handling, type design, test coverage, comment quality, and guidelines compliance.

No findings at confidence ≥75. Approving.

Positive observations

- supportsStructuredOutputs is added as optional on both OpenAICompatibleProvider and VOpenAICompatibleModelConfig, with ?? false defaults at both fromParsedSlug and fromConfig: a safe asymmetric design (opt-in per provider, conservative for unknown configs).
- The structured-output fallback in getStructuredResponse already handles json_schema-rejecting models via prompt injection + extractJson, so flipping the flag on for openai/vllm degrades safely.
- Tests for the unknown-prefix case were correctly migrated from openai/gpt-4o to a synthetic nonprovider/... slug, which is necessary and easy to forget.
- Comments explain the why (e.g., the json_object vs json_schema distinction and the consequence for strict MechanismAssessment schemas).
- .env.example and README updates align with yarn kora:env's --env-file flow.
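
As a rough illustration of the fallback mentioned in the second observation, extracting JSON from free-form model output might look like the sketch below. This is not the repository's extractJson; the function body, the handling of `<think>` traces, and the fenced-block heuristic are all assumptions.

```typescript
// Hypothetical stand-in for an extractJson-style helper: pull the first JSON
// object out of model text that may contain a reasoning trace or a code fence.
function extractJson(text: string): unknown {
  // Drop any <think>...</think> trace so braces inside it are ignored.
  const withoutThinking = text.replace(/<think>[\s\S]*?<\/think>/g, "");
  // Prefer the body of a fenced block if present (\x60 is the backtick character).
  const fencePattern = new RegExp("\\x60{3}(?:json)?\\s*([\\s\\S]*?)\\x60{3}");
  const fenced = fencePattern.exec(withoutThinking);
  const candidate = fenced?.[1] ?? withoutThinking;
  // Take the outermost brace-delimited span and try to parse it.
  const start = candidate.indexOf("{");
  const end = candidate.lastIndexOf("}");
  if (start === -1 || end <= start) return undefined;
  try {
    return JSON.parse(candidate.slice(start, end + 1));
  } catch {
    // Not valid JSON: let the caller decide whether to retry.
    return undefined;
  }
}
```

A caller would still validate the returned value against the strict schema, since this step only recovers syntactically valid JSON.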

Lower-confidence observations (FYI, below report threshold)

- generateUserMessage.ts removal of maxTokens: 300 also widens scope to gateway-routed user models. This is intentional per the PR description and updated test comment, but worth a heads-up since gateway callers have no <PREFIX>_MAX_TOKENS equivalent.
- End-to-end propagation of supportsStructuredOutputs from provider entry through ResolvedTarget to createOpenAICompatible isn't asserted by a test. The flow is short and reads correctly; minor coverage gap, not a bug.

EYH0602 added 6 commits May 14, 2026 23:51

EYH0602 requested a review from fenfenai May 20, 2026 02:34

fenfenai approved these changes May 20, 2026

EYH0602 merged commit b7d8bc7 into soulfuzz-main May 20, 2026

EYH0602 deleted the feat/openai-compatible-endpoint branch May 20, 2026 02:38

EYH0602 mentioned this pull request May 21, 2026

Register --prompts soul variant reading from SOUL_MD_PATH #4

Open

5 tasks
