
feat: switch to OpenAI-compatible /v1/ endpoints #23

Merged
raphasouthall merged 2 commits into main from feat/openai-compatible-endpoints
Mar 16, 2026

Conversation

@raphasouthall
Owner

Summary

  • Replace all Ollama-native API calls (/api/generate, /api/embed) with OpenAI-compatible /v1/chat/completions and /v1/embeddings endpoints across 11 call sites in 8 files
  • Add reasoning_effort: "none" as the standard replacement for Ollama's think: false — works with thinking models like qwen3.5
  • Strip markdown fences from JSON responses for cross-provider compatibility
  • Add llm_api_key / embed_api_key config fields with Bearer auth headers for cloud providers
  • Update neurostack init wizard to optionally prompt for API keys (auto-prompts for cloud URLs, skips for localhost)

What this unlocks

Users can point llm_url/embed_url at any OpenAI-compatible provider by changing only the URL and API key in config:

  • Ollama (unchanged — serves both protocols)
  • vLLM, llama.cpp server, LM Studio (self-hosted)
  • Together AI, Groq, OpenRouter (cloud)
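For illustration, a hypothetical config.toml fragment (the `llm_url`/`embed_url` and `llm_api_key`/`embed_api_key` field names come from this PR; the values are invented examples):

```toml
llm_url = "https://api.groq.com/openai"   # or http://localhost:11434 for Ollama
llm_api_key = "gsk-example"               # omit for localhost providers
embed_url = "http://localhost:11434"      # providers can be mixed freely
```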

Test plan

  • uv run ruff check — lint clean
  • uv run pytest — 160 passed, 6 skipped
  • Live triple extraction via Ollama /v1/ endpoint verified
  • Manual: neurostack init with cloud URL prompts for API key
  • Manual: test against a cloud provider (Together AI / Groq)

Closes #21

🤖 Generated with Claude Code

Replace all Ollama-native API calls (/api/generate, /api/embed) with
OpenAI-compatible /v1/chat/completions and /v1/embeddings endpoints.
This enables any OpenAI-compatible provider (Ollama, vLLM, llama.cpp,
Together AI, Groq, OpenRouter) with just a URL change.

- Convert 11 call sites across 8 files to OpenAI request/response format
- Replace Ollama "think: false" with standard "reasoning_effort: none"
- Strip markdown fences from JSON responses for provider compatibility
- Add llm_api_key / embed_api_key config fields with Bearer auth headers
- Support config.toml and NEUROSTACK_LLM_API_KEY / NEUROSTACK_EMBED_API_KEY env vars

Closes #21
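The fence-stripping mentioned above can be sketched as follows; some providers wrap JSON replies in ```` ```json ```` blocks, so responses are normalized before parsing (the function name is illustrative, not neurostack's actual helper):

```python
import re

# Matches an optional language tag after the opening fence, then
# captures everything up to the closing fence.
_FENCE = re.compile(r"^```[\w-]*\s*\n(.*?)\n?```\s*$", re.DOTALL)

def strip_fences(text: str) -> str:
    """Return the body of a fenced block, or the input unchanged."""
    m = _FENCE.match(text.strip())
    return m.group(1) if m else text.strip()
```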

The setup wizard now prompts for API keys when needed:
- Auto-prompts when LLM URL is a cloud provider (non-localhost)
- Offers optional key config for local users via confirm prompt
- Shares LLM key for embeddings when both use the same URL
- Writes keys to config.toml only when set
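The cloud-vs-local decision described above might look like this (a sketch; the helper name and exact host list are assumptions, not the wizard's real code):

```python
from urllib.parse import urlparse

def needs_api_key_prompt(url: str) -> bool:
    """Auto-prompt for an API key unless the URL points at localhost."""
    host = urlparse(url).hostname or ""
    return host not in ("localhost", "127.0.0.1", "::1")
```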
@raphasouthall raphasouthall merged commit ffffa4b into main Mar 16, 2026
5 checks passed
@raphasouthall raphasouthall deleted the feat/openai-compatible-endpoints branch March 16, 2026 23:30


Development

Successfully merging this pull request may close these issues.

Switch from Ollama-native to OpenAI-compatible /v1/ endpoints
