Scope
Ship the second ModelBackend implementation in core: an openai backend that talks to the OpenAI API (or any OpenAI-compatible endpoint via baseUrl override). Validates the Phase 1 interface against a real provider with native tool-call support and a different streaming shape from Ollama, catching design flaws before more backends land.
Also lands toolMode: 'return' end-to-end: caller passes tools: ToolDef[], model returns tool-call requests in the GenerateResult or GenerateChunk, caller resolves externally. This is the trivial half of the tool-call story; toolMode: 'auto' orchestration is split out to #612.
What ships:
- openai backend class implementing ModelBackend with embed, generate, generateStream.
- Native tool-call support: GenerateOpts.tools passed to the model; tool-call deltas surface in GenerateChunk.deltaToolCalls and finalize in GenerateResult.toolCalls.
- responseFormat mapping: text / JSON / JSON Schema modes.
- Streaming via the OpenAI SDK's chunked completion API → AsyncIterable<GenerateChunk> per Phase 1's contract.
- AbortSignal propagation into SDK calls.
- Per-call accounting (including prompt_tokens, completion_tokens, latency) written through hdb_model_calls.
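A minimal sketch of the responseFormat mapping, assuming the Phase 1 type is 'text' | 'json' | { schema } as the acceptance criteria state it. The optional name field, the 'response' fallback, and the function name toOpenAIResponseFormat are hypothetical; the three output variants follow OpenAI's documented response_format parameter, where json_schema requires a name.

```typescript
type JsonSchema = Record<string, unknown>;

// Assumed Phase 1 shape; `name` is a hypothetical addition for illustration.
type ResponseFormat = 'text' | 'json' | { schema: JsonSchema; name?: string };

// Subset of the response_format variants accepted by the chat completions API.
type OpenAIResponseFormat =
  | { type: 'text' }
  | { type: 'json_object' }
  | { type: 'json_schema'; json_schema: { name: string; schema: JsonSchema } };

function toOpenAIResponseFormat(
  format: ResponseFormat | undefined,
): OpenAIResponseFormat | undefined {
  // Leaving the parameter out lets the API fall back to plain text.
  if (format === undefined) return undefined;
  if (format === 'text') return { type: 'text' };
  if (format === 'json') return { type: 'json_object' };
  return {
    type: 'json_schema',
    // OpenAI requires a schema name; 'response' is an arbitrary placeholder.
    json_schema: { name: format.name ?? 'response', schema: format.schema },
  };
}
```

Returning undefined for a missing format keeps the request body free of a response_format key, which may matter for OpenAI-compatible endpoints that reject unknown or unsupported parameters.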
API surface
Implements the interface from Phase 1 (#628). Capability negotiation:
capabilities(): ModelCapabilities {
  return {
    embed: true,
    generate: true,
    stream: true,
    tools: true, // first backend with native tool-call support
    adapters: false, // OpenAI does not surface LoRA adapters externally
  };
}
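How callers consume these flags is not spelled out here. As one possible reading, a facade could compare the features a request needs against the backend's capabilities before dispatching. The helper missingCapabilities below is hypothetical, and the ModelCapabilities interface is re-declared locally only so the sketch is self-contained.

```typescript
// Local copy of the capability flags returned by capabilities() above.
interface ModelCapabilities {
  embed: boolean;
  generate: boolean;
  stream: boolean;
  tools: boolean;
  adapters: boolean;
}

type Feature = keyof ModelCapabilities;

// Hypothetical helper: lists the required features a backend does not offer.
function missingCapabilities(caps: ModelCapabilities, required: Feature[]): Feature[] {
  return required.filter((feature) => !caps[feature]);
}

const openaiCaps: ModelCapabilities = {
  embed: true,
  generate: true,
  stream: true,
  tools: true,
  adapters: false,
};

// A request that needs tools and a LoRA adapter would be rejected for 'adapters' only.
const missing = missingCapabilities(openaiCaps, ['tools', 'adapters']);
```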
Configuration:
models:
  embedding:
    high-quality:
      backend: openai
      model: <your-embedding-model> # e.g. text-embedding-3-large
      apiKey: ${OPENAI_API_KEY} # env-var expansion via existing config pattern
  generative:
    default:
      backend: openai
      model: <your-generative-model> # e.g. gpt-4o or gpt-4-turbo
      apiKey: ${OPENAI_API_KEY}
      baseUrl: https://api.openai.com/v1 # optional; supports OpenAI-compatible endpoints (Azure OpenAI, Together AI, OpenRouter, vLLM, etc.)
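A rough sketch of how one config entry might be turned into SDK client options. Harper's existing env-var expansion is not shown in this issue, so expandEnv is a stand-in written for illustration, and OpenAIModelConfig and toClientOptions are hypothetical names. The one detail taken from the openai npm package is that its client option is spelled baseURL, while the config key here is baseUrl.

```typescript
// Hypothetical shape of one entry under models.embedding.* or models.generative.*.
interface OpenAIModelConfig {
  backend: 'openai';
  model: string;
  apiKey: string;
  baseUrl?: string;
}

// Stand-in for the existing config pattern: replaces ${NAME} with env[NAME].
function expandEnv(value: string, env: Record<string, string | undefined>): string {
  return value.replace(/\$\{([A-Za-z_][A-Za-z0-9_]*)\}/g, (_match: string, name: string) => {
    const resolved = env[name];
    if (resolved === undefined) throw new Error(`Missing environment variable: ${name}`);
    return resolved;
  });
}

// Maps the config entry to the two client options this backend needs.
function toClientOptions(
  config: OpenAIModelConfig,
  env: Record<string, string | undefined>,
): { apiKey: string; baseURL: string } {
  return {
    apiKey: expandEnv(config.apiKey, env),
    // Fall back to the public endpoint when no override is configured.
    baseURL: config.baseUrl ?? 'https://api.openai.com/v1',
  };
}
```

Failing fast on a missing variable is a design choice of this sketch; it surfaces a misconfigured key at startup rather than as a 401 on the first call.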
toolMode: 'return' semantics
Caller passes tools: ToolDef[] and toolMode: 'return' (default). The backend:
- Translates ToolDef[] to OpenAI's tools parameter on the API call.
- Receives the model's response, which may include tool_calls.
- Returns those on GenerateResult.toolCalls (non-streaming) or yields them via GenerateChunk.deltaToolCalls (streaming).
- Does not invoke any tool; the caller decides what to do with the tool-call requests.
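The two translation steps above could look roughly like this. The ToolDef fields (name, description, parameters) follow the smoke-test example; the ToolCall shape and both function names are assumptions, since Phase 1's types are not reproduced in this issue. The OpenAI-side shapes follow the documented chat completions format, where each tool is wrapped as { type: 'function', function: {...} } and arguments arrive as a JSON string.

```typescript
interface ToolDef {
  name: string;
  description?: string;
  parameters: Record<string, unknown>; // JSON Schema
}

// Assumed Phase 1 result shape; arguments kept as the raw JSON string.
interface ToolCall {
  id: string;
  name: string;
  arguments: string;
}

interface OpenAITool {
  type: 'function';
  function: { name: string; description?: string; parameters: Record<string, unknown> };
}

interface OpenAIToolCall {
  id: string;
  type: 'function';
  function: { name: string; arguments: string };
}

function toOpenAITools(tools: ToolDef[] | undefined): OpenAITool[] | undefined {
  // Omit the parameter entirely when there is nothing to offer the model.
  if (!tools || tools.length === 0) return undefined;
  return tools.map((tool) => ({
    type: 'function' as const,
    function: { name: tool.name, description: tool.description, parameters: tool.parameters },
  }));
}

function fromOpenAIToolCalls(calls: OpenAIToolCall[] | undefined): ToolCall[] | undefined {
  if (!calls || calls.length === 0) return undefined;
  return calls.map((call) => ({
    id: call.id,
    name: call.function.name,
    arguments: call.function.arguments,
  }));
}
```

Keeping arguments as a string defers JSON parsing to the caller, which fits 'return' mode: the backend never needs to interpret the payload.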
Caller code looks like:
const result = await scope.models.generate(messages, { tools, toolMode: 'return' });
if (result.toolCalls?.length) {
  // caller executes the tool, builds a tool-response message, calls generate again
}
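To make the comment in that snippet concrete, here is a sketch of one caller-side round trip. Everything in it is illustrative: the Message and ToolCall shapes are guesses at the Phase 1 types, generate is passed in as a plain function so the example runs without a backend, and resolveToolCallsOnce is a hypothetical helper living in caller code, not something this phase ships.

```typescript
interface ToolCall { id: string; name: string; arguments: string }

// Assumed message shapes; the real Phase 1 types may differ.
type Message =
  | { role: 'system' | 'user'; content: string }
  | { role: 'assistant'; content: string; toolCalls?: ToolCall[] }
  | { role: 'tool'; toolCallId: string; content: string };

interface GenerateResult { content: string; finishReason: string; toolCalls?: ToolCall[] }

type Generate = (messages: Message[]) => Promise<GenerateResult>;
type ToolImpl = (args: unknown) => unknown | Promise<unknown>;

async function resolveToolCallsOnce(
  generate: Generate,
  messages: Message[],
  impls: Record<string, ToolImpl>,
): Promise<GenerateResult> {
  const first = await generate(messages);
  if (!first.toolCalls?.length) return first; // model answered directly

  // Echo the assistant turn so the model sees its own tool-call request.
  const followUp: Message[] = [
    ...messages,
    { role: 'assistant', content: first.content, toolCalls: first.toolCalls },
  ];
  for (const call of first.toolCalls) {
    const impl = impls[call.name];
    const output = impl
      ? await impl(JSON.parse(call.arguments))
      : { error: `unknown tool: ${call.name}` };
    followUp.push({ role: 'tool', toolCallId: call.id, content: JSON.stringify(output) });
  }
  return generate(followUp);
}
```

The helper performs a single round only; repeating until the model stops requesting tools is the orchestration that toolMode: 'auto' is meant to cover.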
toolMode: 'auto' (orchestrator-driven loop) is #612's scope; this phase only ships the type-level field acceptance and the 'return' path.
Implementation notes
- SDK: the official openai npm package.
- SDK pinning: lock to a specific minor version, bump on a deliberate cadence per the existing Harper third-party trust model (a few days after upstream release).
- Streaming: SDK's stream: true path yields delta events; backend translates them to GenerateChunk.
- Token-count fields from result.usage (prompt_tokens, completion_tokens, total_tokens) map directly to TokenUsage.
- gpu_ms is not reported by OpenAI's API; left undefined in the accounting record.
- inputType: 'document' | 'query' on EmbedOpts: OpenAI's embedding models don't currently distinguish; field is ignored when present.
- baseUrl override allows OpenAI-compatible third parties (Azure OpenAI, Together, OpenRouter, vLLM's own OpenAI shim, etc.); this sets the foundation for the FAB-503 fabric backend to also speak OpenAI-shape if it wants to.
- signal: AbortSignal passed into the SDK's signal option; client disconnect → SDK aborts the underlying request.
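The streaming note is where the shape differs most from a non-streaming call: OpenAI sends each tool call in fragments keyed by index, with the id and function name typically on the first fragment and the arguments JSON split across later ones. Below is a sketch of folding those fragments into final tool calls; ToolCallAccumulator and the ToolCall shape are hypothetical, while the delta fields mirror choices[0].delta.tool_calls in the streamed chunks.

```typescript
// One entry of choices[0].delta.tool_calls in a streamed chunk.
interface OpenAIToolCallDelta {
  index: number;
  id?: string;
  function?: { name?: string; arguments?: string };
}

// Assumed final shape for GenerateResult.toolCalls.
interface ToolCall { id: string; name: string; arguments: string }

class ToolCallAccumulator {
  // Sparse array keyed by the delta's index; holes are dropped in finalize().
  private readonly calls: Array<ToolCall | undefined> = [];

  add(deltas: OpenAIToolCallDelta[] | undefined): void {
    for (const delta of deltas ?? []) {
      const call = this.calls[delta.index] ?? { id: '', name: '', arguments: '' };
      if (delta.id) call.id = delta.id;
      if (delta.function?.name) call.name = delta.function.name;
      // Argument fragments are partial JSON and must be concatenated in order.
      if (delta.function?.arguments) call.arguments += delta.function.arguments;
      this.calls[delta.index] = call;
    }
  }

  finalize(): ToolCall[] {
    return this.calls.filter((call): call is ToolCall => call !== undefined);
  }
}
```

A backend would call add() for every chunk while also forwarding the raw fragments as GenerateChunk.deltaToolCalls, then call finalize() once the stream ends. The accumulated arguments string is only valid JSON after the last fragment, so parsing it mid-stream is likely to fail.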
Files
| Path | Change |
| --- | --- |
| resources/models/backends/openai.ts | new: OpenAIBackend class |
| resources/models/backends/index.ts | extended: register openai factory |
| package.json | new dep: openai (pinned version) |
| test/models/openai.test.ts | new: integration tests (mocked HTTP in CI; live test against real API behind an env-gated flag) |
Smoke test
# Prerequisites:
# - OPENAI_API_KEY set in env
# - Harper config has models.generative.default = { backend: openai, model: gpt-4o-mini, apiKey: ${OPENAI_API_KEY} }
# In a Resource method:
class ChatTest extends Resource {
  async post(_target, body, _request) {
    return await scope.models.generate([{ role: 'user', content: body.q }]);
  }
}

curl -X POST http://localhost:9926/ChatTest/ \
  -H 'Content-Type: application/json' \
  -d '{"q": "what is 2+2?"}'
# Expected: { content: "4" (or similar), finishReason: "stop", ... }
# Verify: SELECT * FROM system.hdb_model_calls WHERE backend = 'openai' ORDER BY $createdtime DESC LIMIT 1
# shows method='generate', model='gpt-4o-mini', prompt_tokens, completion_tokens, latency_ms, success=true.
# Tool-call smoke test (return mode):
const tools = [{ name: 'get_weather', description: '...', parameters: {/* JSON Schema */} }];
const result = await scope.models.generate(messages, { tools, toolMode: 'return' });
// Expected: if the model decides to call get_weather, result.toolCalls has one entry; result.content may be empty.
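The accounting row checked in the first smoke test could be produced by a wrapper along these lines. This is a sketch under assumptions: the column names come from the verification comment above, withAccounting and the write callback are hypothetical stand-ins for the Phase 1 hdb_model_calls writer, and gpu_ms is simply never set because OpenAI does not report it.

```typescript
interface OpenAIUsage { prompt_tokens: number; completion_tokens: number; total_tokens: number }

// Columns named after the smoke-test query; the real table may have more.
interface ModelCallRecord {
  backend: string;
  method: 'embed' | 'generate' | 'generateStream';
  model: string;
  prompt_tokens?: number;
  completion_tokens?: number;
  latency_ms: number;
  success: boolean;
}

type CallMeta = Pick<ModelCallRecord, 'backend' | 'method' | 'model'>;

// Times the call, copies token counts from usage, and records failures too.
async function withAccounting<T extends { usage?: OpenAIUsage }>(
  meta: CallMeta,
  call: () => Promise<T>,
  write: (record: ModelCallRecord) => void,
): Promise<T> {
  const started = Date.now();
  try {
    const result = await call();
    write({
      ...meta,
      prompt_tokens: result.usage?.prompt_tokens,
      completion_tokens: result.usage?.completion_tokens,
      latency_ms: Date.now() - started,
      success: true,
    });
    return result;
  } catch (err) {
    write({ ...meta, latency_ms: Date.now() - started, success: false });
    throw err; // accounting must not swallow the original error
  }
}
```

Writing a row on failure as well keeps success=false calls visible in hdb_model_calls, which seems to be the point of having a success column.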
Acceptance criteria
- OpenAIBackend implements ModelBackend per Phase 1's interface.
- scope.models.embed() with backend: openai produces a vector from OpenAI's embedding endpoint.
- scope.models.generate() with backend: openai produces a completion.
- scope.models.generateStream() yields content deltas; tool-call deltas surface via GenerateChunk.deltaToolCalls when the model uses tools.
- toolMode: 'return' end-to-end: caller passes tools, backend translates to OpenAI's tools param, model's tool-call requests reach the caller via GenerateResult.toolCalls / GenerateChunk.deltaToolCalls.
- responseFormat: 'text' | 'json' | { schema } correctly maps to OpenAI's response_format parameter.
- […] backend: 'openai', model, token counts, latency, success.
- AbortSignal from BackendOpts cancels in-flight SDK calls.
- baseUrl override works against an OpenAI-compatible third party (test fixture or mock).
- […] (ToolDef[] ↔ OpenAI tools, responseFormat mapping, streaming delta shape).
- […] OPENAI_API_KEY env gate.
Out of scope
- toolMode: 'auto' orchestration: split to #612 (agent-loop orchestration).
- ModelCallResult.pending is reserved in the interface but no backend emits it in v1.
Stacks on
- […] ModelBackend interface, backend registry, and hdb_model_calls writer.
Independently shippable after Phase 1; parallel-able with Phase 2.
Hard prerequisites
- […] (request.signal exposed on Resource methods): Models facade reads ctx.signal from ALS; this phase consumes it through BackendOpts.signal into the SDK's signal option.
Branch & PR conventions
- […] feat/models-openai-backend
- […] main (after Phase 1 merges).
- […] Closes #<self>; references #510 via Tracking: #510.
Tracking
Part of #510. Sub-issue 3 of 6.
🤖 Generated with Claude Code