
feat: add OpenAI-compatible LLM provider #307

Open
fatinghenji wants to merge 2 commits into rohitg00:main from fatinghenji:feat/openai-llm-provider-v2

Conversation

@fatinghenji

@fatinghenji fatinghenji commented May 12, 2026

Summary

Adds a new openai LLM provider that uses raw fetch to call any OpenAI-compatible /v1/chat/completions endpoint.

Changes

  • src/types.ts: Add openai to ProviderType union
  • src/providers/openai.ts: New OpenAIProvider class using raw fetch (no SDK dependency)
    • Supports any /v1/chat/completions endpoint
    • Respects OPENAI_API_KEY, OPENAI_BASE_URL, OPENAI_MODEL env vars
    • New: OPENAI_REASONING_EFFORT passthrough for thinking models (Ollama Cloud, etc.)
    • Fallback: returns reasoning if content is empty
  • src/providers/index.ts: Wire openai case into createBaseProvider()
  • src/config.ts:
    • Add OPENAI_API_KEY detection to detectProvider() (with OPENAI_API_KEY_FOR_LLM opt-out)
    • Add OPENAI_API_KEY to detectLlmProviderKind()
    • Add openai to VALID_PROVIDERS set
    • Update no-key warning message
  • README.md: Add OpenAI to LLM provider table and document all env vars
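As a rough sketch of what such a raw-fetch provider sends, the request body for /v1/chat/completions might be built like this (names such as `buildChatRequest` are illustrative, not the PR's actual symbols):

```typescript
// Illustrative request builder for an OpenAI-compatible /v1/chat/completions
// call. Names and defaults are hypothetical, not the PR's actual code.
interface ChatRequest {
  model: string;
  messages: { role: "system" | "user"; content: string }[];
  max_tokens?: number;
  reasoning_effort?: string;
}

function buildChatRequest(
  system: string,
  user: string,
  model: string,
  maxTokens?: number,
  reasoningEffort?: string,
): ChatRequest {
  const body: ChatRequest = {
    model,
    messages: [
      { role: "system", content: system },
      { role: "user", content: user },
    ],
  };
  if (maxTokens !== undefined) body.max_tokens = maxTokens;
  // Only attach reasoning_effort when configured; standard chat models
  // reject the unknown field.
  if (reasoningEffort !== undefined) body.reasoning_effort = reasoningEffort;
  return body;
}
```

A provider's private call() would then JSON.stringify this body and POST it with an `Authorization: Bearer` header.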

Supported Endpoints

| Service | OPENAI_BASE_URL | Notes |
| --- | --- | --- |
| OpenAI official | https://api.openai.com | (default) |
| DeepSeek | https://api.deepseek.com/v1 | |
| SiliconFlow | https://api.siliconflow.cn/v1 | |
| Azure OpenAI | https://{resource}.openai.azure.com/openai/deployments/{deployment} | |
| vLLM / LM Studio | http://localhost:8000/v1 or http://localhost:1234/v1 | |
| Ollama | http://localhost:11434/v1 | With --enable-openai |
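The PR does not show how the endpoint path is joined onto OPENAI_BASE_URL. Purely as an illustration, a trailing-slash-tolerant helper might look like this (note the table's default of https://api.openai.com suggests the real implementation may also insert a /v1 segment when absent; this sketch assumes the base URL already carries it where required):

```typescript
// Hypothetical helper: normalize an OPENAI_BASE_URL value into the full
// chat-completions endpoint, tolerating a trailing slash. Not the PR's code.
function completionsUrl(baseUrl: string): string {
  const trimmed = baseUrl.replace(/\/+$/, "");
  return `${trimmed}/chat/completions`;
}
```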

Configuration Example

```sh
# Embedding + LLM both via SiliconFlow
OPENAI_API_KEY=sk-......
OPENAI_BASE_URL=https://api.siliconflow.cn/v1
OPENAI_MODEL=deepseek-ai/DeepSeek-V3

# For Ollama Cloud thinking models, set reasoning effort to ensure content is populated
OPENAI_REASONING_EFFORT=none
```

Backwards Compatibility

  • Default behavior unchanged: OPENAI_API_KEY is now checked first in detectProvider(), but only activates when the key is present
  • Users who only use OPENAI_API_KEY for embedding and prefer another LLM provider can set OPENAI_API_KEY_FOR_LLM=false to skip auto-detection
  • No breaking changes to existing provider configurations
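The opt-out described above can be sketched as follows; `shouldUseOpenAiLlm` is a hypothetical name for illustration, not the PR's actual code:

```typescript
// Illustrative sketch of the detection gate: OPENAI_API_KEY selects the
// openai LLM provider only when OPENAI_API_KEY_FOR_LLM has not been set
// to "false". Names are assumptions, not the PR's symbols.
type Env = Record<string, string | undefined>;

function shouldUseOpenAiLlm(env: Env): boolean {
  const hasKey = !!env["OPENAI_API_KEY"]?.trim();
  const optedOut = env["OPENAI_API_KEY_FOR_LLM"] === "false";
  return hasKey && !optedOut;
}
```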

Testing

  • npm run build passes
  • npm test passes

Checklist

  • Build passes
  • Tests pass
  • No new dependencies
  • Backwards compatible
  • Follows existing provider patterns (raw fetch, env var config)
  • README updated

Closes #185
Supersedes #240

Summary by CodeRabbit

  • New Features

    • Added OpenAI as a supported provider for memory compression and summarization with configurable model and base URL.
    • Provider detection and fallback logic updated to recognize OpenAI environment variables.
  • Documentation

    • README updated with OpenAI environment variables (API key, base URL, model, reasoning effort) and guidance for reasoning-model settings.

Review Change Stack

- Add OpenAIProvider using raw fetch (no SDK dependency)
- Supports any /v1/chat/completions endpoint: OpenAI, DeepSeek,
  SiliconFlow, Azure OpenAI, vLLM, LM Studio, Ollama
- Auto-detects OPENAI_API_KEY with OPENAI_API_KEY_FOR_LLM opt-out
- Add OPENAI_REASONING_EFFORT passthrough for thinking models
  (e.g. Ollama Cloud kimi-k2.6) to ensure content is populated
- Update README with OpenAI provider table, env vars, and reasoning config
@vercel

vercel Bot commented May 12, 2026

@fatinghenji is attempting to deploy a commit to rohitg00's projects Team on Vercel.

A member of the Team first needs to authorize it.

@coderabbitai

coderabbitai Bot commented May 12, 2026

No actionable comments were generated in the recent review. 🎉

ℹ️ Recent review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: cc3b39fd-d4d9-427f-811c-8654f3a8d775

📥 Commits

Reviewing files that changed from the base of the PR and between 4deeaa4 and 691d47c.

📒 Files selected for processing (2)
  • README.md
  • src/config.ts
✅ Files skipped from review due to trivial changes (1)
  • README.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • src/config.ts

📝 Walkthrough

Walkthrough

Adds OpenAI as a supported LLM provider alongside existing ones. Updates the ProviderType union, extends configuration detection to recognize OPENAI_API_KEY, implements OpenAIProvider to handle OpenAI-compatible /v1/chat/completions calls, integrates it into the provider factory, and documents the new environment variables.

Changes

OpenAI Provider

| Layer / File(s) | Summary |
| --- | --- |
| Provider type and detection (src/types.ts, src/config.ts) | Adds "openai" to the ProviderType union. detectProvider() recognizes OPENAI_API_KEY (guarded by OPENAI_API_KEY_FOR_LLM !== "false") and returns an openai config using OPENAI_MODEL and OPENAI_BASE_URL. The warning message, detectLlmProviderKind(), and the loadFallbackConfig() whitelist are updated to support OpenAI. |
| OpenAI provider implementation (src/providers/openai.ts) | New OpenAIProvider class implements MemoryProvider, exposing compress() and summarize() public methods. Both delegate to a private call() method that POSTs to /v1/chat/completions with system/user messages and optional reasoning_effort, validates HTTP responses, and extracts message.content or message.reasoning from the response. |
| Provider factory integration (src/providers/index.ts) | Imports OpenAIProvider and adds an "openai" case to createBaseProvider that reads OPENAI_API_KEY, throws if it is missing, and instantiates the provider with config.model, config.maxTokens, and config.baseURL. |
| Configuration documentation (README.md) | The .env example documents OPENAI_API_KEY, OPENAI_BASE_URL, OPENAI_MODEL, OPENAI_REASONING_EFFORT, and OPENAI_API_KEY_FOR_LLM, with guidance for reasoning models. |
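The content/reasoning fallback mentioned in the walkthrough can be pictured as a small pure function (the types and names here are assumptions for illustration, not the PR's actual code):

```typescript
// Sketch of the extraction fallback: prefer message.content, fall back to
// message.reasoning when content is empty, and fail loudly when both are
// missing. Field names follow the PR description; types are assumed.
interface ChatMessage {
  content?: string;
  reasoning?: string;
}

function extractText(message: ChatMessage): string {
  if (message.content && message.content.trim() !== "") {
    return message.content;
  }
  if (message.reasoning && message.reasoning.trim() !== "") {
    return message.reasoning;
  }
  throw new Error("Empty completion: neither content nor reasoning present");
}
```

This covers thinking models (e.g. on Ollama Cloud) that populate only the reasoning field when content comes back empty.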

Estimated code review effort

🎯 3 (Moderate) | ⏱️ ~20 minutes

Possibly related PRs

  • rohitg00/agentmemory#103: Adds a different LLM provider implementation using the same detection and factory wiring pattern across src/config.ts, src/providers/index.ts, and src/types.ts.

Poem

🐰 I hopped through code to add a brand-new key,

OPENAI joins the provider family.
Messages posted, responses parsed with care,
Compress and summarize now dance in the air.
🌿✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 0.00%, below the required threshold of 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped: CodeRabbit's high-level summary is enabled. |
| Title check | ✅ Passed | The title 'feat: add OpenAI-compatible LLM provider' directly and accurately summarizes the main change: adding a new OpenAI provider implementation. |
| Linked Issues check | ✅ Passed | The PR fully implements requirements from #185: new OpenAIProvider for OpenAI-compatible endpoints, detection logic with OPENAI_API_KEY_FOR_LLM opt-out, configurable baseURL/model via env vars, and documentation. |
| Out of Scope Changes check | ✅ Passed | All changes relate directly to implementing the OpenAI provider: new provider class, wiring into config/detection logic, type updates, and documentation; no unrelated changes detected. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 ESLint

If the error stems from missing dependencies, add them to the package.json file. For unrecoverable errors (e.g., due to private dependencies), disable the tool in the CodeRabbit configuration.

ESLint skipped: no ESLint configuration detected in root package.json. To enable, add eslint to devDependencies.


@fatinghenji
Author

@rohitg00 This is the cleaned-up version of #240. All review feedback has been addressed:

  1. Timeout scope creep removed: rebased onto latest main, no hook script changes
  2. README updated: OpenAI provider table + env vars documented
  3. Ollama Cloud thinking models: added the OPENAI_REASONING_EFFORT passthrough per @flamerged's testing (thanks!). When set to `none`, the request body includes `reasoning_effort: "none"`, preventing empty `content` on models like kimi-k2.6. Also added a fallback: if `content` is empty but `reasoning` is present, we return the reasoning instead of throwing.

Ready for review.


@coderabbitai coderabbitai Bot left a comment

Actionable comments posted: 2

🧹 Nitpick comments (2)
src/config.ts (1)

53-53: ⚡ Quick win

Prefer naming over WHAT comments in provider detection.

Please remove/reword this branch comment and rely on clear naming/structure instead.

As per coding guidelines, "Avoid code comments explaining WHAT — use clear naming instead".

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/config.ts` at line 53, Remove the inline "// OpenAI-compatible: supports
OpenAI, DeepSeek, SiliconFlow, Azure, vLLM, LM Studio" comment and instead
express that intent in code by renaming the related symbol(s) (e.g., a
boolean/branch, array, or function) to a descriptive name such as
openAICompatibleProviders or isOpenAICompatible(provider); update the provider
detection branch to use that renamed identifier and adjust any usages
accordingly so the code reads self-documentingly without the WHAT-style comment.
src/providers/openai.ts (1)

7-30: ⚡ Quick win

Remove WHAT-style comments and keep only intent/constraints.

This block/inline comment content is mostly descriptive of implementation details already clear from code.

As per coding guidelines, "Avoid code comments explaining WHAT — use clear naming instead".

Also applies to: 90-90

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/providers/openai.ts` around lines 7 - 30, The header block comment for
the OpenAI-compatible provider is written in WHAT-style/descriptive details;
remove or shrink it to a concise intent and constraints note (e.g.,
"OpenAI-compatible LLM provider; requires OPENAI_API_KEY; supports configurable
OPENAI_BASE_URL, OPENAI_MODEL, MAX_TOKENS, OPENAI_REASONING_EFFORT") and drop
implementation-specific examples and long prose so the top-of-file comment only
states purpose and configuration constraints referenced by the module (symbols:
the module header comment, OPENAI_API_KEY, OPENAI_BASE_URL, OPENAI_MODEL,
MAX_TOKENS, OPENAI_REASONING_EFFORT).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@src/config.ts`:
- Around line 169-170: The condition in detectLlmProviderKind currently treats
any presence of OPENAI_API_KEY as opting into the "llm" provider; change the
check that uses hasRealValue(env["OPENAI_API_KEY"]) so it also respects
OPENAI_API_KEY_FOR_LLM being explicitly disabled. Specifically, update the logic
around hasRealValue(env["OPENAI_API_KEY"]) (in detectLlmProviderKind) to require
that OPENAI_API_KEY_FOR_LLM is not set to a falsey/disabled value (e.g., treat
"false" case-insensitively as disabling) — for example, only consider
OPENAI_API_KEY when hasRealValue(env["OPENAI_API_KEY_FOR_LLM"]) is false or
env["OPENAI_API_KEY_FOR_LLM"].toLowerCase() !== "false" (or use a helper
parseBoolean) so that an explicit OPENAI_API_KEY_FOR_LLM=false prevents
reporting "llm".

In `@src/providers/openai.ts`:
- Around line 68-75: The outbound fetch to the OpenAI-compatible endpoint (the
call creating `response` with `fetch(url, { method: "POST", headers: { ...
Authorization: Bearer ${this.apiKey} }, body: JSON.stringify(body) })`) needs a
timeout using an AbortController: create an AbortController, pass its signal
into fetch, set a timer to call controller.abort() after a configurable timeout
(e.g., DEFAULT_TIMEOUT_MS), clear the timer once fetch completes, and handle the
abort error to surface a clear timeout error instead of letting the call hang;
update the method that performs this request to use the controller and ensure
the timer is cleaned up on success or error.

---

Nitpick comments:
In `@src/config.ts`:
- Line 53: Remove the inline "// OpenAI-compatible: supports OpenAI, DeepSeek,
SiliconFlow, Azure, vLLM, LM Studio" comment and instead express that intent in
code by renaming the related symbol(s) (e.g., a boolean/branch, array, or
function) to a descriptive name such as openAICompatibleProviders or
isOpenAICompatible(provider); update the provider detection branch to use that
renamed identifier and adjust any usages accordingly so the code reads
self-documentingly without the WHAT-style comment.

In `@src/providers/openai.ts`:
- Around line 7-30: The header block comment for the OpenAI-compatible provider
is written in WHAT-style/descriptive details; remove or shrink it to a concise
intent and constraints note (e.g., "OpenAI-compatible LLM provider; requires
OPENAI_API_KEY; supports configurable OPENAI_BASE_URL, OPENAI_MODEL, MAX_TOKENS,
OPENAI_REASONING_EFFORT") and drop implementation-specific examples and long
prose so the top-of-file comment only states purpose and configuration
constraints referenced by the module (symbols: the module header comment,
OPENAI_API_KEY, OPENAI_BASE_URL, OPENAI_MODEL, MAX_TOKENS,
OPENAI_REASONING_EFFORT).
ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 3e6eb155-a046-45d9-9423-c61876ecf37b

📥 Commits

Reviewing files that changed from the base of the PR and between 292e9f6 and 4deeaa4.

📒 Files selected for processing (5)
  • README.md
  • src/config.ts
  • src/providers/index.ts
  • src/providers/openai.ts
  • src/types.ts

Comment thread src/config.ts Outdated
Comment thread src/providers/openai.ts
Comment on lines +68 to +75
```typescript
const response = await fetch(url, {
  method: "POST",
  headers: {
    "Content-Type": "application/json",
    Authorization: `Bearer ${this.apiKey}`,
  },
  body: JSON.stringify(body),
});
```

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Add a timeout to outbound OpenAI-compatible requests.

These fetch calls can hang indefinitely on network/upstream stalls, which can block provider operations.

Suggested fix
```diff
-    const response = await fetch(url, {
-      method: "POST",
-      headers: {
-        "Content-Type": "application/json",
-        Authorization: `Bearer ${this.apiKey}`,
-      },
-      body: JSON.stringify(body),
-    });
+    const controller = new AbortController();
+    const timeout = setTimeout(() => controller.abort(), 30_000);
+    let response: Response;
+    try {
+      response = await fetch(url, {
+        method: "POST",
+        headers: {
+          "Content-Type": "application/json",
+          Authorization: `Bearer ${this.apiKey}`,
+        },
+        body: JSON.stringify(body),
+        signal: controller.signal,
+      });
+    } finally {
+      clearTimeout(timeout);
+    }
```
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@src/providers/openai.ts` around lines 68 - 75, The outbound fetch to the
OpenAI-compatible endpoint (the call creating `response` with `fetch(url, {
method: "POST", headers: { ... Authorization: Bearer ${this.apiKey} }, body:
JSON.stringify(body) })`) needs a timeout using an AbortController: create an
AbortController, pass its signal into fetch, set a timer to call
controller.abort() after a configurable timeout (e.g., DEFAULT_TIMEOUT_MS),
clear the timer once fetch completes, and handle the abort error to surface a
clear timeout error instead of letting the call hang; update the method that
performs this request to use the controller and ensure the timer is cleaned up
on success or error.

@justmenyou

any update?

`detectProvider()` correctly gates OpenAI auto-detection on
OPENAI_API_KEY_FOR_LLM !== "false", but `detectLlmProviderKind()` did
not — so users who set OPENAI_API_KEY only for embeddings (via the
existing OPENAI_BASE_URL + OPENAI_EMBEDDING_MODEL flow from rohitg00#186)
would see /agentmemory/config/flags report `provider: llm` even
though detectProvider() routed them to the noop provider.

Also clarify in the README that OPENAI_REASONING_EFFORT is honored
only by reasoning models (o1, o3, gpt-*-reasoning) and providers
that mirror that schema (Ollama Cloud thinking models). Standard
chat models reject the field with 400.

Verified:
- OPENAI_API_KEY=sk-... + OPENAI_API_KEY_FOR_LLM=false now returns
  "noop" from detectLlmProviderKind (was "llm" before the fix).
- OPENAI_API_KEY=sk-... alone still returns "llm" (intended default).
- npm run build clean.

Note: 10 pre-existing test failures on test/mcp-standalone.test.ts
are a stale-branch artefact — this branch is 10 commits behind main
and is missing the MCP shim fixes that landed via rohitg00#311 / rohitg00#327.
Recommend rebasing on main (or "Update branch" via the GitHub UI)
before merge.
@rohitg00
Owner

@fatinghenji — pushed two small fixes to your branch via maintainer-edit access. Please review the diff and shout if anything looks wrong.

  1. src/config.ts:170: detectLlmProviderKind() was reading OPENAI_API_KEY without the OPENAI_API_KEY_FOR_LLM !== "false" gate that detectProvider() already honors at line 54. Users who set OPENAI_API_KEY only for embeddings (via the OPENAI_BASE_URL + OPENAI_EMBEDDING_MODEL flow from #186, "feat: support OPENAI_BASE_URL and OPENAI_EMBEDDING_MODEL for OpenAI-compatible endpoints") would see /agentmemory/config/flags report provider: llm even though detectProvider() correctly returned noop. The fix mirrors the existing gate. Verified:

    • OPENAI_API_KEY=sk-... OPENAI_API_KEY_FOR_LLM=false now returns noop (was llm).
    • OPENAI_API_KEY=sk-... alone still returns llm (intended default).
  2. README.md: clarified that OPENAI_REASONING_EFFORT is honored only by reasoning models (o1, o3, gpt-*-reasoning) and providers that mirror that schema (Ollama Cloud thinking models). Standard chat models reject the field with a 400. The existing fallback (return message.reasoning if message.content is empty) covers the Ollama Cloud case.

Skipped findings (won't block merge):

  • Outbound fetch timeout — src/providers/{anthropic,gemini,openrouter,minimax}.ts all lack AbortController; this is a same-pattern repo-wide concern that deserves its own follow-up rather than gating this PR.
  • CodeRabbit WHAT-style comment nitpicks — consistent with existing provider files, low ROI.

Stale-branch note: this branch is currently 10 commits behind main. test/mcp-standalone.test.ts has 10 failures on bare HEAD that disappear after merging main (the MCP shim was reworked in #311 / #327). Please use the GitHub "Update branch" button (or git pull --rebase origin main) before final merge — your src/providers/openai.ts doesn't touch anything that conflicts.

Will land this PR + close superseded issues (#232 Ollama LLM is fully covered by OPENAI_BASE_URL=http://localhost:11434/v1 per your table) once the branch is up to date and CI is green.

Thanks for cleaning #240 up into this shape.

anthony-spruyt added a commit to anthony-spruyt/spruyt-labs that referenced this pull request May 15, 2026
Disable AUTO_COMPRESS, GRAPH_EXTRACTION_ENABLED, and
CONSOLIDATION_ENABLED — all three call Gemini Flash for
summarization/compression. Keys retained for future use when
agentmemory ships an OpenAI-compatible provider (rohitg00/agentmemory#307)
to target local vLLM/Gemma4.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

3 participants