docs(readme): document Ollama / LM Studio / vLLM local-model setup (#671) by rohitg00 · Pull Request #697 · rohitg00/agentmemory

rohitg00 · 2026-05-28T09:19:31Z

Summary

Discussion #671 — user asked "why do all the tools run on paid LLMs? we have hardware to run them locally". This is a docs gap: agentmemory already speaks OpenAI-API-compatible to any local server (Ollama, LM Studio, vLLM, llama.cpp), but the path was buried in a single-line comment inside the env-example block at README:1255. Users missed it.

Changes (README.md only)

New row in the LLM Providers table for Local (Ollama / LM Studio / vLLM / llama.cpp) with the exact env to set.
New row for explicit OpenAI API (was supported via auto-detect but never listed alongside Anthropic / Gemini / etc.).
New subsection "Local models (Ollama / LM Studio / vLLM)" under "LLM Providers" with:
- Copy-pasteable Ollama config (OPENAI_BASE_URL=http://localhost:11434/v1)
- Copy-pasteable LM Studio config (OPENAI_BASE_URL=http://localhost:1234/v1)
- Model-pick table for memory compression workloads: qwen2.5-coder:7b, llama3.2:3b, mistral:7b-instruct, deepseek-r1:7b — sized at 2-5GB so they fit on consumer hardware
- Reasoning-class empty-content pitfall callout (the <think> block + missing content field problem that bites local users of o1-style models)
- Note that EMBEDDING_PROVIDER=local (the default) gives you BGE-small on-device, so fully-local stacks need no extra config

Test plan

Visual: rendered the new section locally; tables align, anchor link #local-models-ollama-lm-studio-vllm works.
No source changes — runtime already supports these endpoints via OpenAIProvider + OPENAI_BASE_URL. Just naming the path so users find it.

Closes #671.

Summary by CodeRabbit

Documentation
- Enhanced LLM provider configuration guide with OpenAI API integration details and model override options.
- Added setup instructions for local models (Ollama, LM Studio, vLLM) with memory recommendations and behavior notes.
- Documented default local embeddings support and configuration parameters.

Users were missing the OpenAI-compatible local-model path because it was buried in a comment inside the env-example block. xerudro asked "why all the tools runs on paid LMs" in #671 — agentmemory already supports any OpenAI-API-compatible server; the docs just didn't say so clearly. Adds a dedicated "Local models (Ollama / LM Studio / vLLM)" subsection under "LLM Providers" with concrete config examples for both Ollama (port 11434) and LM Studio (port 1234), plus a model-pick table (qwen2.5-coder:7b, llama3.2:3b, mistral:7b-instruct, deepseek-r1:7b) sized for memory compression workloads. Calls out the reasoning-class empty-content pitfall and the EMBEDDING_PROVIDER=local default for fully-local stacks. Also adds OpenAI as an explicit row in the provider table (it was already supported via OPENAI_API_KEY auto-detection but never listed alongside Anthropic / Gemini / OpenRouter / MiniMax).

vercel · 2026-05-28T09:19:36Z

The latest updates on your projects. Learn more about Vercel for GitHub.

Project	Deployment	Actions	Updated (UTC)
agentmemory	Ready	Preview, Comment	May 28, 2026 9:19am

coderabbitai · 2026-05-28T09:19:45Z

📝 Walkthrough

Walkthrough

Documentation-only PR expanding the README configuration guide with explicit OpenAI API provider configuration and a new subsection explaining local LLM model setup via Ollama, LM Studio, and vLLM, including environment variable examples and model recommendations.

Changes

LLM Provider Configuration

Layer / File(s)	Summary
LLM providers table and local models configuration `README.md`	LLM Providers table adds OpenAI API row with default model and `OPENAI_MODEL` override. New "Local models" section documents Ollama, LM Studio, and vLLM as OpenAI-compatible servers with example `OPENAI_BASE_URL`/`OPENAI_MODEL` configuration, recommended model choices for memory compression/summarization, reasoning-class model behavior, and confirmation that local embeddings are shipped by default via `@xenova/transformers`.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~3 minutes

Poem

🐰 A guide for our friends with models at hand,
From OpenAI's cloud to Ollama's own land,
With Transformers local, embeddings fly free,
Configuration whispers, now clear for all to see! ✨

🚥 Pre-merge checks | ✅ 5

✅ Passed checks (5 passed)

Check name	Status	Explanation
Description Check	✅ Passed	Check skipped - CodeRabbit’s high-level summary is enabled.
Title check	✅ Passed	The title clearly and specifically describes the main change: documenting local model setup for Ollama, LM Studio, and vLLM in the README, which aligns with the changeset.
Docstring Coverage	✅ Passed	No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check.
Linked Issues check	✅ Passed	Check skipped because no linked issues were found for this pull request.
Out of Scope Changes check	✅ Passed	Check skipped because no linked issues were found for this pull request.

_{✏️ Tip: You can configure your own custom pre-merge checks in the settings.}

✨ Finishing Touches

🧪 Generate unit tests (beta)

Create PR with unit tests
Commit unit tests in branch docs/local-models

Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.

❤️ Share

_{Comment @coderabbitai help to get the list of available commands and usage tips.}

coderabbitai

Actionable comments posted: 2

🤖 Prompt for all review comments with AI agents

Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@README.md`:
- Around line 1185-1186: The README incorrectly states the local embedding model
is "BGE-small" while the code and providers table use "all-MiniLM-L6-v2"; update
the documentation so the "Local embeddings ship out of the box…" paragraph, the
EMBEDDING_PROVIDER=local default mention, and the providers table consistently
reference "all-MiniLM-L6-v2" (or if you intend to change the implementation,
update the provider implementation and table to use "BGE-small"); ensure any
mention of default provider or example env var aligns with the chosen model.
- Line 1140: The table link to the "Local models (Ollama / LM Studio / vLLM)"
section uses the wrong fragment; update the link target in the table row that
contains the `OPENAI_API_KEY=local` entry so it matches the header slug
generated for `### Local models (Ollama / LM Studio / vLLM)` (i.e., change
`#local-models-ollama-lm-studio-vllm` to
`#local-models-ollama--lm-studio--vllm`) so the anchor fragment matches
markdownlint-cli2/MD051 expectations.

🪄 Autofix (Beta)

Fix all unresolved CodeRabbit comments on this PR:

Push a commit to this branch (recommended)
Create a new PR with the fixes

ℹ️ Review info

⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 87e2824c-f03e-4a63-973f-51349ff5c3cb

📥 Commits

Reviewing files that changed from the base of the PR and between a0da02b and a60adc0.

📒 Files selected for processing (1)

README.md

coderabbitai · 2026-05-28T09:26:47Z

 | Gemini | `GEMINI_API_KEY` | Also enables embeddings |
 | OpenRouter | `OPENROUTER_API_KEY` | Any model |
+| OpenAI API | `OPENAI_API_KEY` | Default `gpt-4o-mini`, override with `OPENAI_MODEL` |
+| **Local (Ollama / LM Studio / vLLM / llama.cpp)** | `OPENAI_API_KEY=local` + `OPENAI_BASE_URL=http://localhost:11434/v1` (Ollama) or `http://localhost:1234/v1` (LM Studio) + `OPENAI_MODEL=<your model>` | Anything OpenAI-API-compatible. Zero cost, runs on your hardware. See [Local models](#local-models-ollama-lm-studio-vllm) below. |


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Description: Verify anchor link resolution for the local models section # Check what anchor would be generated from the header echo "Section header at line 1143:" sed -n '1143p' README.md echo -e "\nExpected anchor (GitHub/most processors):" # GitHub lowercases, replaces spaces with -, strips special chars except - sed -n '1143p' README.md | sed 's/^###* //' | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9 -]//g' | tr -s ' ' '-' echo -e "\nLink fragment used at line 1140:" rg -n '`#local-models`' README.md

Repository: rohitg00/agentmemory

Length of output: 592

🏁 Script executed:

#!/bin/bash set -euo pipefail # Show potential markdownlint-cli2 configuration echo "== Repo markdownlint-cli2 config files =="; ls -la . | rg -n "markdownlint|markdown-lint" || true fd -t f -a "markdownlint" . || true fd -t f -a "markdownlint-cli2" . || true fd -t f -a ".markdownlint" . || true echo echo "== Search for MD051 configuration =="; rg -n "MD051" . || true echo echo "== package.json (scripts/devDeps) =="; if [ -f package.json ]; then cat package.json | sed -n '1,200p' else echo "No package.json found." fi echo echo "== Run markdownlint-cli2 for MD051 (best effort) =="; # Prefer local bin if present; fall back to npx. If neither works, we still get output/errors. if [ -d node_modules ]; then if [ -x node_modules/.bin/markdownlint-cli2 ]; then node_modules/.bin/markdownlint-cli2 --help >/dev/null 2>&1 || true node_modules/.bin/markdownlint-cli2 README.md -r "MD051" || true fi fi npx -y markdownlint-cli2 README.md -r "MD051" || true

Repository: rohitg00/agentmemory

Length of output: 53195

🏁 Script executed:

#!/bin/bash set -euo pipefail # Capture only MD051-related lines mentioning the local models fragment # (limit output to avoid truncation) set +e npx -y markdownlint-cli2 README.md -r "MD051" 2>&1 | rg "README\.md:|MD051|local-models" | rg -i "local-models|`#local-models`|1140" | head -n 50 exit 0

Repository: rohitg00/agentmemory

Length of output: 206

🏁 Script executed:

#!/bin/bash set -euo pipefail orig="README.md" tmpdir="$(mktemp -d)" trap 'rm -rf "$tmpdir"' EXIT # Extract current local-models fragment from the Local models link current="$(rg -o --no-filename '\`#local-models-`[a-zA-Z0-9\-]+' "$orig" | head -n 1)" echo "Current fragment: $current" candidates=( "local-models-ollama-lm-studio-vllm" # current "local-models-ollama--lm-studio--vllm" # double-dash as proposed "local-models--ollama--lm-studio--vllm" # also includes leading double dash "local-models-ollama--lm-studio-vllm" # single dash before last "local-models-ollama-lm-studio--vllm" # single dash between first two "local-models--ollama-lm-studio--vllm" # mix "local-models-ollama-lm-studio-vllm-" # trailing hyphen "local-models-ollama-lm-studio-vllm--" # trailing double hyphen "local-models-ollama-lm-studio-vllm--vllm" # unlikely (but cheap) "local-models-ollama--lm-studio--vllm-" # trailing hyphen after double-dash ) # Ensure we only swap the specific fragment in the Local models link # (safer than global replace) for slug in "${candidates[@]}"; do frag="#$slug" tmp="$tmpdir/README.md" cp "$orig" "$tmp" # Replace the exact old fragment string in the file # If current doesn't match, skip if rg -q --fixed-string "$frag" "$tmp"; then echo "Candidate already present: $frag" fi sed -i "s/\#$(${current} | sed 's/.*//')/$frag/g" "$tmp" 2>/dev/null || true # More robust: replace the exact current fragment text with the candidate fragment sed -i "s/\#$current/$frag/g" "$tmp" out="$(npx -y markdownlint-cli2 "$tmp" -r "MD051" 2>&1 || true)" # Check whether the specific error still exists if ! echo "$out" | rg -q "MD051.*`#local-models`"; then echo "✅ Valid fragment per MD051: $frag" echo "$out" | head -n 20 exit 0 else echo "❌ Still invalid: $frag" echo "$out" | rg -n "MD051.*local-models" || true fi done echo "No candidate removed the MD051 local-models error. Need further investigation."

Repository: rohitg00/agentmemory

Length of output: 2485

🏁 Script executed:

#!/bin/bash set -euo pipefail npx -y markdownlint-cli2 README.md -r "MD051" 2>&1 | rg "MD051" | rg "`#local-models`" || true

Repository: rohitg00/agentmemory

Length of output: 206

🏁 Script executed:

#!/bin/bash set -euo pipefail # Resolve markdownlint package location node -e "console.log(require.resolve('markdownlint'))" 2>/dev/null || true # Try to locate md051/link-fragments implementation # Use npx to ensure markdownlint is available, then locate the source file npx -y markdownlint-cli2 -v >/dev/null 2>&1 || true npx -y markdownlint --version >/dev/null 2>&1 || true # Get the markdownlint package root via node root="$(node -e "console.log(require('module').createRequire(process.cwd() + '/').resolve('markdownlint/package.json'))" 2>/dev/null | xargs dirname)" echo "markdownlint package dir: $root" # Search for "link-fragments" and "MD051" in markdownlint source rg -n "link-fragments|MD051" "$root" || true # Search for slugger/anchor generation helpers rg -n "slug|anchor|github|remark|gitlab|github-slug|githubSlug|Slugger|toSlug" "$root" || true

Repository: rohitg00/agentmemory

Length of output: 134

🏁 Script executed:

#!/bin/bash set -euo pipefail node - <<'NODE' const path = require('path'); let p; try { p = require.resolve('markdownlint'); console.log('markdownlint main:', p); console.log('markdownlint dir:', path.dirname(p)); } catch (e) { console.error('Could not resolve markdownlint:', e.message); process.exit(1); } NODE root="$(node - <<'NODE' const path = require('path'); try { const main = require.resolve('markdownlint'); console.log(path.dirname(main)); } catch (e) { process.exit(1); } NODE )" echo "Using root: $root" echo "== Search for link-fragments / MD051 rules ==" rg -n "link-fragments|MD051|Link fragments should be valid" "$root" || true echo "== Search for heading slug / anchor generation ==" rg -n "anchor|slug|github|github-slug|toSlug|heading.*id|id.*heading" "$root" || true echo "== Search for GitHub-flavored slugger helper (if any) ==" rg -n "github|remark|slugger|GithubSlugger|githubSlugger|github.com" "$root" || true

Repository: rohitg00/agentmemory

Length of output: 174

🏁 Script executed:

#!/bin/bash set -euo pipefail orig="README.md" tmpdir="$(mktemp -d)" trap 'rm -rf "$tmpdir"' EXIT current_slug="$(rg -o --no-filename '\`#local-models-`[a-zA-Z0-9\-]+' "$orig" | head -n 1 | sed 's/^`#//`')" echo "Current fragment slug: $current_slug" candidates=( "local-models-ollama-lm-studio-vllm" "local-models-ollama--lm-studio--vllm" "local-models--ollama--lm-studio--vllm" "local-models-ollama-lm-studio--vllm" "local-models-ollama--lm-studio-vllm" "local-models-ollama-lm-studio-vllm/" "local-models-ollama-lm-studio-vllm-" "local-models-ollama-lm-studio-vllm--" ) for cand in "${candidates[@]}"; do tmp="$tmpdir/README.md" cp "$orig" "$tmp" # Replace only the exact current fragment with the candidate perl -pi -e "s/\#${current_slug}/\#${cand}/g" "$tmp" # Check if MD051 still flags this fragment out="$(npx -y markdownlint-cli2 "$tmp" -r "MD051" 2>&1 || true)" if ! echo "$out" | rg -q "MD051.*\`#local-models`"; then echo "✅ MD051 no longer flags for: #$cand" echo "$out" | rg -n "MD051.*local-models" || true exit 0 else echo "❌ Still flagged for: #$cand" echo "$out" | rg -n "MD051.*local-models" | head -n 1 || true fi done echo "No candidate removed the MD051 local-models error."

Repository: rohitg00/agentmemory

Length of output: 420

Fix the broken “Local models” anchor fragment in README.md.

README.md:1140 links to #local-models-ollama-lm-studio-vllm, but the ### Local models (Ollama / LM Studio / vLLM) header slug expected by markdownlint-cli2 is #local-models-ollama--lm-studio--vllm, so MD051/link-fragments fires.

Proposed fix

-| **Local (Ollama / LM Studio / vLLM / llama.cpp)** | `OPENAI_API_KEY=local` + `OPENAI_BASE_URL=http://localhost:11434/v1` (Ollama) or `http://localhost:1234/v1` (LM Studio) + `OPENAI_MODEL=<your model>` | Anything OpenAI-API-compatible. Zero cost, runs on your hardware. See [Local models](`#local-models-ollama-lm-studio-vllm`) below. | +| **Local (Ollama / LM Studio / vLLM / llama.cpp)** | `OPENAI_API_KEY=local` + `OPENAI_BASE_URL=http://localhost:11434/v1` (Ollama) or `http://localhost:1234/v1` (LM Studio) + `OPENAI_MODEL=<your model>` | Anything OpenAI-API-compatible. Zero cost, runs on your hardware. See [Local models](`#local-models-ollama--lm-studio--vllm`) below. |

📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change

| **Local (Ollama / LM Studio / vLLM / llama.cpp)** | `OPENAI_API_KEY=local` + `OPENAI_BASE_URL=http://localhost:11434/v1` (Ollama) or `http://localhost:1234/v1` (LM Studio) + `OPENAI_MODEL=<your model>` | Anything OpenAI-API-compatible. Zero cost, runs on your hardware. See [Local models](#local-models-ollama-lm-studio-vllm) below. |

| **Local (Ollama / LM Studio / vLLM / llama.cpp)** | `OPENAI_API_KEY=local` + `OPENAI_BASE_URL=http://localhost:11434/v1` (Ollama) or `http://localhost:1234/v1` (LM Studio) + `OPENAI_MODEL=<your model>` | Anything OpenAI-API-compatible. Zero cost, runs on your hardware. See [Local models](`#local-models-ollama--lm-studio--vllm`) below. |

🧰 Tools

🪛 markdownlint-cli2 (0.22.1)

[warning] 1140-1140: Link fragments should be valid

(MD051, link-fragments)

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@README.md` at line 1140, The table link to the "Local models (Ollama / LM Studio / vLLM)" section uses the wrong fragment; update the link target in the table row that contains the `OPENAI_API_KEY=local` entry so it matches the header slug generated for `### Local models (Ollama / LM Studio / vLLM)` (i.e., change `#local-models-ollama-lm-studio-vllm` to `#local-models-ollama--lm-studio--vllm`) so the anchor fragment matches markdownlint-cli2/MD051 expectations.

coderabbitai · 2026-05-28T09:26:48Z

+Local embeddings ship out of the box via `@xenova/transformers` — `EMBEDDING_PROVIDER=local` (default) gives you BGE-small entirely on-device. No extra config needed.
+


⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash # Description: Find the actual model name used by the local embedding provider # Search for local embedding provider implementation echo "Searching for local embedding provider model specification:" rg -nC3 'all-MiniLM-L6-v2|BGE-small|bge-small' --type=ts # Also check for `@xenova/transformers` usage echo -e "\nSearching for `@xenova/transformers` model initialization:" rg -nC5 '`@xenova/transformers`' --type=ts | rg -C5 'model.*=|AutoModel|pipeline'

Repository: rohitg00/agentmemory

Length of output: 4718

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "README.md around 858-862:" nl -ba README.md | sed -n '850,870p' echo echo "README.md around 1184-1188:" nl -ba README.md | sed -n '1178,1192p'

Repository: rohitg00/agentmemory

Length of output: 151

🏁 Script executed:

#!/bin/bash set -euo pipefail echo "README.md around 858-862 (with line numbers):" awk 'NR>=850 && NR<=870 {printf "%d:%s\n", NR, $0}' README.md echo echo "README.md around 1184-1188 (with line numbers):" awk 'NR>=1178 && NR<=1192 {printf "%d:%s\n", NR, $0}' README.md

Repository: rohitg00/agentmemory

Length of output: 2757

Fix README local embedding model mismatch (BGE-small vs all-MiniLM-L6-v2).

EMBEDDING_PROVIDER=local is documented as BGE-small at the “Local embeddings ship out of the box…” paragraph, but the local embedding provider (and the providers table) uses all-MiniLM-L6-v2.

Suggested patch

-Local embeddings ship out of the box via `@xenova/transformers` — `EMBEDDING_PROVIDER=local` (default) gives you BGE-small entirely on-device. No extra config needed. +Local embeddings ship out of the box via `@xenova/transformers` — `EMBEDDING_PROVIDER=local` (default) gives you all-MiniLM-L6-v2 entirely on-device. No extra config needed.

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@README.md` around lines 1185 - 1186, The README incorrectly states the local embedding model is "BGE-small" while the code and providers table use "all-MiniLM-L6-v2"; update the documentation so the "Local embeddings ship out of the box…" paragraph, the EMBEDDING_PROVIDER=local default mention, and the providers table consistently reference "all-MiniLM-L6-v2" (or if you intend to change the implementation, update the provider implementation and table to use "BGE-small"); ensure any mention of default provider or example env var aligns with the chosen model.

Bumps version across 9 files + adds CHANGELOG entry summarizing the 18 commits since v0.9.22. Highlights: - GitHub Copilot CLI first-class support (#534) — plugin + hooks + MCP with LSP-style Content-Length framing on the standalone stdio transport. - Five new MCP adapters: Warp, Cline, Continue, Zed, Droid (#677); ADAPTERS count 11 → 17. - Three silent DX bugs fixed: graph extraction never fired on session end (#666 / #698), status reported zero memories (#666), consolidation defaulted off even with an LLM provider configured (#612 / #696). - Nine telemetry hooks switched to fire-and-forget so they don't block Claude Code's next-prompt boundary (#573 / #688). - Hook project field now sends repo basename instead of full filesystem path so auto-injected context isn't silently filtered out (#474 / #687). - Local-LLM docs: Ollama / LM Studio / vLLM section added (#671 / #697). Version-bump files: package.json, plugin/.claude-plugin/plugin.json, plugin/plugin.json, plugin/.codex-plugin/plugin.json, packages/mcp/package.json, src/version.ts, src/types.ts, src/functions/export-import.ts, test/export-import.test.ts.

* chore(release): v0.9.23 Bumps version across 9 files + adds CHANGELOG entry summarizing the 18 commits since v0.9.22. Highlights: - GitHub Copilot CLI first-class support (#534) — plugin + hooks + MCP with LSP-style Content-Length framing on the standalone stdio transport. - Five new MCP adapters: Warp, Cline, Continue, Zed, Droid (#677); ADAPTERS count 11 → 17. - Three silent DX bugs fixed: graph extraction never fired on session end (#666 / #698), status reported zero memories (#666), consolidation defaulted off even with an LLM provider configured (#612 / #696). - Nine telemetry hooks switched to fire-and-forget so they don't block Claude Code's next-prompt boundary (#573 / #688). - Hook project field now sends repo basename instead of full filesystem path so auto-injected context isn't silently filtered out (#474 / #687). - Local-LLM docs: Ollama / LM Studio / vLLM section added (#671 / #697). Version-bump files: package.json, plugin/.claude-plugin/plugin.json, plugin/plugin.json, plugin/.codex-plugin/plugin.json, packages/mcp/package.json, src/version.ts, src/types.ts, src/functions/export-import.ts, test/export-import.test.ts. * chore(release): add #701 + #709 to v0.9.23 CHANGELOG

vercel Bot deployed to Preview May 28, 2026 09:19 View deployment

coderabbitai Bot reviewed May 28, 2026

View reviewed changes

rohitg00 mentioned this pull request May 28, 2026

Status shows Memories/Observations as 0. Graph is always empty #666

Closed

rohitg00 merged commit bfde288 into main May 28, 2026
3 checks passed

rohitg00 deleted the docs/local-models branch May 28, 2026 09:39

rohitg00 mentioned this pull request May 28, 2026

chore(release): v0.9.23 #702

Merged

5 tasks

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

docs(readme): document Ollama / LM Studio / vLLM local-model setup (#671)#697

docs(readme): document Ollama / LM Studio / vLLM local-model setup (#671)#697
rohitg00 merged 1 commit into
mainfrom
docs/local-models

rohitg00 commented May 28, 2026 •

edited by coderabbitai Bot

Loading

Uh oh!

vercel Bot commented May 28, 2026 •

edited

Loading

Uh oh!

coderabbitai Bot commented May 28, 2026 •

edited

Loading

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

coderabbitai Bot left a comment

Uh oh!

coderabbitai Bot May 28, 2026

Uh oh!

coderabbitai Bot May 28, 2026

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

	\| Local (Ollama / LM Studio / vLLM / llama.cpp) \| `OPENAI_API_KEY=local` + `OPENAI_BASE_URL=http://localhost:11434/v1` (Ollama) or `http://localhost:1234/v1` (LM Studio) + `OPENAI_MODEL=<your model>` \| Anything OpenAI-API-compatible. Zero cost, runs on your hardware. See [Local models](#local-models-ollama-lm-studio-vllm) below. \|
	\| Local (Ollama / LM Studio / vLLM / llama.cpp) \| `OPENAI_API_KEY=local` + `OPENAI_BASE_URL=http://localhost:11434/v1` (Ollama) or `http://localhost:1234/v1` (LM Studio) + `OPENAI_MODEL=<your model>` \| Anything OpenAI-API-compatible. Zero cost, runs on your hardware. See [Local models](`#local-models-ollama--lm-studio--vllm`) below. \|

		Local embeddings ship out of the box via `@xenova/transformers` — `EMBEDDING_PROVIDER=local` (default) gives you BGE-small entirely on-device. No extra config needed.

Conversation

rohitg00 commented May 28, 2026 • edited by coderabbitai Bot Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Summary

Changes (README.md only)

Test plan

Summary by CodeRabbit

Uh oh!

vercel Bot commented May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

coderabbitai Bot commented May 28, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Walkthrough

Changes

Estimated code review effort

Poem

Uh oh!

coderabbitai Bot left a comment

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

coderabbitai Bot May 28, 2026

Choose a reason for hiding this comment

Uh oh!

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

rohitg00 commented May 28, 2026 •

edited by coderabbitai Bot

Loading

vercel Bot commented May 28, 2026 •

edited

Loading

coderabbitai Bot commented May 28, 2026 •

edited

Loading