docs(readme): document Ollama / LM Studio / vLLM local-model setup (#671) #697
Users were missing the OpenAI-compatible local-model path because it was buried in a comment inside the env-example block. xerudro asked "why all the tools runs on paid LMs" in #671 — agentmemory already supports any OpenAI-API-compatible server; the docs just didn't say so clearly. Adds a dedicated "Local models (Ollama / LM Studio / vLLM)" subsection under "LLM Providers" with concrete config examples for both Ollama (port 11434) and LM Studio (port 1234), plus a model-pick table (qwen2.5-coder:7b, llama3.2:3b, mistral:7b-instruct, deepseek-r1:7b) sized for memory compression workloads. Calls out the reasoning-class empty-content pitfall and the EMBEDDING_PROVIDER=local default for fully-local stacks. Also adds OpenAI as an explicit row in the provider table (it was already supported via OPENAI_API_KEY auto-detection but never listed alongside Anthropic / Gemini / OpenRouter / MiniMax).
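As a rough sketch of the setup this PR documents, a fully local configuration might look like the following. The variable names, the `local` key placeholder, and the Ollama URL come from the provider table row; the model is one of the four suggested picks. Exporting them in a shell (rather than writing them to an env file) is only an assumption made for illustration.

```shell
# Sketch of a fully local stack pointed at Ollama.
# Values are taken from the README text in this PR; exporting them in the
# current shell is an assumption, an env file would work the same way.
export OPENAI_API_KEY=local                        # placeholder value from the provider table
export OPENAI_BASE_URL=http://localhost:11434/v1   # Ollama; LM Studio would be http://localhost:1234/v1
export OPENAI_MODEL=qwen2.5-coder:7b               # one of the suggested local models
export EMBEDDING_PROVIDER=local                    # documented default, embeddings stay on-device
```

Switching to LM Studio should only require changing `OPENAI_BASE_URL` and `OPENAI_MODEL`.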
📝 Walkthrough
Documentation-only PR expanding the README configuration guide with explicit OpenAI API provider configuration and a new subsection explaining local LLM model setup via Ollama, LM Studio, and vLLM, including environment variable examples and model recommendations.
Changes: LLM Provider Configuration
Estimated code review effort: 🎯 1 (Trivial) | ⏱️ ~3 minutes
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@README.md`:
- Around line 1185-1186: The README incorrectly states the local embedding model
is "BGE-small" while the code and providers table use "all-MiniLM-L6-v2"; update
the documentation so the "Local embeddings ship out of the box…" paragraph, the
EMBEDDING_PROVIDER=local default mention, and the providers table consistently
reference "all-MiniLM-L6-v2" (or if you intend to change the implementation,
update the provider implementation and table to use "BGE-small"); ensure any
mention of default provider or example env var aligns with the chosen model.
- Line 1140: The table link to the "Local models (Ollama / LM Studio / vLLM)"
section uses the wrong fragment; update the link target in the table row that
contains the `OPENAI_API_KEY=local` entry so it matches the header slug
generated for `### Local models (Ollama / LM Studio / vLLM)` (i.e., change
`#local-models-ollama-lm-studio-vllm` to
`#local-models-ollama--lm-studio--vllm`) so the anchor fragment matches
markdownlint-cli2/MD051 expectations.
| Gemini | `GEMINI_API_KEY` | Also enables embeddings |
| OpenRouter | `OPENROUTER_API_KEY` | Any model |
| OpenAI API | `OPENAI_API_KEY` | Default `gpt-4o-mini`, override with `OPENAI_MODEL` |
| **Local (Ollama / LM Studio / vLLM / llama.cpp)** | `OPENAI_API_KEY=local` + `OPENAI_BASE_URL=http://localhost:11434/v1` (Ollama) or `http://localhost:1234/v1` (LM Studio) + `OPENAI_MODEL=<your model>` | Anything OpenAI-API-compatible. Zero cost, runs on your hardware. See [Local models](#local-models-ollama-lm-studio-vllm) below. |
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Verify anchor link resolution for the local models section
# Check what anchor would be generated from the header
echo "Section header at line 1143:"
sed -n '1143p' README.md
echo -e "\nExpected anchor (GitHub/most processors):"
# GitHub lowercases, replaces spaces with -, strips special chars except -
sed -n '1143p' README.md | sed 's/^###* //' | tr '[:upper:]' '[:lower:]' | sed 's/[^a-z0-9 -]//g' | tr -s ' ' '-'
echo -e "\nLink fragment used at line 1140:"
rg -n '#local-models' README.md
Repository: rohitg00/agentmemory
Length of output: 592
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Show potential markdownlint-cli2 configuration
echo "== Repo markdownlint-cli2 config files ==";
ls -la . | rg -n "markdownlint|markdown-lint" || true
fd -t f -a "markdownlint" . || true
fd -t f -a "markdownlint-cli2" . || true
fd -t f -a ".markdownlint" . || true
echo
echo "== Search for MD051 configuration ==";
rg -n "MD051" . || true
echo
echo "== package.json (scripts/devDeps) ==";
if [ -f package.json ]; then
cat package.json | sed -n '1,200p'
else
echo "No package.json found."
fi
echo
echo "== Run markdownlint-cli2 for MD051 (best effort) ==";
# Prefer local bin if present; fall back to npx. If neither works, we still get output/errors.
if [ -d node_modules ]; then
if [ -x node_modules/.bin/markdownlint-cli2 ]; then
node_modules/.bin/markdownlint-cli2 --help >/dev/null 2>&1 || true
node_modules/.bin/markdownlint-cli2 README.md -r "MD051" || true
fi
fi
npx -y markdownlint-cli2 README.md -r "MD051" || true
Repository: rohitg00/agentmemory
Length of output: 53195
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Capture only MD051-related lines mentioning the local models fragment
# (limit output to avoid truncation)
set +e
npx -y markdownlint-cli2 README.md -r "MD051" 2>&1 | rg "README\.md:|MD051|local-models" | rg -i "local-models|#local-models|1140" | head -n 50
exit 0
Repository: rohitg00/agentmemory
Length of output: 206
🏁 Script executed:
#!/bin/bash
set -euo pipefail
orig="README.md"
tmpdir="$(mktemp -d)"
trap 'rm -rf "$tmpdir"' EXIT
# Extract current local-models fragment from the Local models link
current="$(rg -o --no-filename '#local-models-[a-zA-Z0-9-]+' "$orig" | head -n 1)"
echo "Current fragment: $current"
candidates=(
"local-models-ollama-lm-studio-vllm" # current
"local-models-ollama--lm-studio--vllm" # double-dash as proposed
"local-models--ollama--lm-studio--vllm" # also includes leading double dash
"local-models-ollama--lm-studio-vllm" # single dash before last
"local-models-ollama-lm-studio--vllm" # single dash between first two
"local-models--ollama-lm-studio--vllm" # mix
"local-models-ollama-lm-studio-vllm-" # trailing hyphen
"local-models-ollama-lm-studio-vllm--" # trailing double hyphen
"local-models-ollama-lm-studio-vllm--vllm" # unlikely (but cheap)
"local-models-ollama--lm-studio--vllm-" # trailing hyphen after double-dash
)
# Ensure we only swap the specific fragment in the Local models link
# (safer than global replace)
for slug in "${candidates[@]}"; do
frag="#$slug"
tmp="$tmpdir/README.md"
cp "$orig" "$tmp"
# Note whether the candidate fragment is already present in the file
if rg -q --fixed-strings "$frag" "$tmp"; then
echo "Candidate already present: $frag"
fi
# Replace the exact current fragment text (including its leading '#') with the candidate fragment
sed -i "s/$current/$frag/g" "$tmp"
out="$(npx -y markdownlint-cli2 "$tmp" -r "MD051" 2>&1 || true)"
# Check whether the specific error still exists
if ! echo "$out" | rg -q "MD051.*#local-models"; then
echo "✅ Valid fragment per MD051: $frag"
echo "$out" | head -n 20
exit 0
else
echo "❌ Still invalid: $frag"
echo "$out" | rg -n "MD051.*local-models" || true
fi
done
echo "No candidate removed the MD051 local-models error. Need further investigation."
Repository: rohitg00/agentmemory
Length of output: 2485
🏁 Script executed:
#!/bin/bash
set -euo pipefail
npx -y markdownlint-cli2 README.md -r "MD051" 2>&1 | rg "MD051" | rg "#local-models" || true
Repository: rohitg00/agentmemory
Length of output: 206
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# Resolve markdownlint package location
node -e "console.log(require.resolve('markdownlint'))" 2>/dev/null || true
# Try to locate md051/link-fragments implementation
# Use npx to ensure markdownlint is available, then locate the source file
npx -y markdownlint-cli2 -v >/dev/null 2>&1 || true
npx -y markdownlint --version >/dev/null 2>&1 || true
# Get the markdownlint package root via node
root="$(node -e "console.log(require('module').createRequire(process.cwd() + '/').resolve('markdownlint/package.json'))" 2>/dev/null | xargs dirname)"
echo "markdownlint package dir: $root"
# Search for "link-fragments" and "MD051" in markdownlint source
rg -n "link-fragments|MD051" "$root" || true
# Search for slugger/anchor generation helpers
rg -n "slug|anchor|github|remark|gitlab|github-slug|githubSlug|Slugger|toSlug" "$root" || true
Repository: rohitg00/agentmemory
Length of output: 134
🏁 Script executed:
#!/bin/bash
set -euo pipefail
node - <<'NODE'
const path = require('path');
let p;
try {
p = require.resolve('markdownlint');
console.log('markdownlint main:', p);
console.log('markdownlint dir:', path.dirname(p));
} catch (e) {
console.error('Could not resolve markdownlint:', e.message);
process.exit(1);
}
NODE
root="$(node - <<'NODE'
const path = require('path');
try {
const main = require.resolve('markdownlint');
console.log(path.dirname(main));
} catch (e) { process.exit(1); }
NODE
)"
echo "Using root: $root"
echo "== Search for link-fragments / MD051 rules =="
rg -n "link-fragments|MD051|Link fragments should be valid" "$root" || true
echo "== Search for heading slug / anchor generation =="
rg -n "anchor|slug|github|github-slug|toSlug|heading.*id|id.*heading" "$root" || true
echo "== Search for GitHub-flavored slugger helper (if any) =="
rg -n "github|remark|slugger|GithubSlugger|githubSlugger|github.com" "$root" || true
Repository: rohitg00/agentmemory
Length of output: 174
🏁 Script executed:
#!/bin/bash
set -euo pipefail
orig="README.md"
tmpdir="$(mktemp -d)"
trap 'rm -rf "$tmpdir"' EXIT
current_slug="$(rg -o --no-filename '#local-models-[a-zA-Z0-9-]+' "$orig" | head -n 1 | sed 's/^#//')"
echo "Current fragment slug: $current_slug"
candidates=(
"local-models-ollama-lm-studio-vllm"
"local-models-ollama--lm-studio--vllm"
"local-models--ollama--lm-studio--vllm"
"local-models-ollama-lm-studio--vllm"
"local-models-ollama--lm-studio-vllm"
"local-models-ollama-lm-studio-vllm/"
"local-models-ollama-lm-studio-vllm-"
"local-models-ollama-lm-studio-vllm--"
)
for cand in "${candidates[@]}"; do
tmp="$tmpdir/README.md"
cp "$orig" "$tmp"
# Replace only the exact current fragment with the candidate
perl -pi -e "s/\#${current_slug}/\#${cand}/g" "$tmp"
# Check if MD051 still flags this fragment
out="$(npx -y markdownlint-cli2 "$tmp" -r "MD051" 2>&1 || true)"
if ! echo "$out" | rg -q "MD051.*#local-models"; then
echo "✅ MD051 no longer flags for: #$cand"
echo "$out" | rg -n "MD051.*local-models" || true
exit 0
else
echo "❌ Still flagged for: #$cand"
echo "$out" | rg -n "MD051.*local-models" | head -n 1 || true
fi
done
echo "No candidate removed the MD051 local-models error."
Repository: rohitg00/agentmemory
Length of output: 420
Fix the broken “Local models” anchor fragment in README.md.
README.md:1140 links to #local-models-ollama-lm-studio-vllm, but the ### Local models (Ollama / LM Studio / vLLM) header slug expected by markdownlint-cli2 is #local-models-ollama--lm-studio--vllm, so MD051/link-fragments fires.
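The double hyphens follow from how heading slugs are typically derived: punctuation such as `(`, `)` and `/` is dropped, and every remaining space becomes its own hyphen, so the two spaces left around each removed `/` turn into `--`. The sketch below approximates that rule for ASCII headings; the `slugify` helper is hypothetical, and the real GitHub and markdownlint implementations also handle Unicode and duplicate headings, which this does not.

```shell
#!/bin/bash
# Approximate heading-to-anchor conversion for ASCII headings (assumption:
# lowercase, strip punctuation, then map EACH space to one hyphen).
slugify() {
  printf '%s' "$1" \
    | tr '[:upper:]' '[:lower:]' \
    | sed 's/[^a-z0-9 _-]//g' \
    | tr ' ' '-'   # no -s here: squeezing would collapse "--" into "-"
}

slug="$(slugify 'Local models (Ollama / LM Studio / vLLM)')"
echo "#$slug"
```

Note the contrast with a pipeline that ends in `tr -s ' ' '-'`: squeezing repeated hyphens yields the single-hyphen form, which is the fragment MD051 rejects here.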
Proposed fix
-| **Local (Ollama / LM Studio / vLLM / llama.cpp)** | `OPENAI_API_KEY=local` + `OPENAI_BASE_URL=http://localhost:11434/v1` (Ollama) or `http://localhost:1234/v1` (LM Studio) + `OPENAI_MODEL=<your model>` | Anything OpenAI-API-compatible. Zero cost, runs on your hardware. See [Local models](#local-models-ollama-lm-studio-vllm) below. |
+| **Local (Ollama / LM Studio / vLLM / llama.cpp)** | `OPENAI_API_KEY=local` + `OPENAI_BASE_URL=http://localhost:11434/v1` (Ollama) or `http://localhost:1234/v1` (LM Studio) + `OPENAI_MODEL=<your model>` | Anything OpenAI-API-compatible. Zero cost, runs on your hardware. See [Local models](#local-models-ollama--lm-studio--vllm) below. |
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 1140-1140: Link fragments should be valid
(MD051, link-fragments)
Local embeddings ship out of the box via `@xenova/transformers` — `EMBEDDING_PROVIDER=local` (default) gives you BGE-small entirely on-device. No extra config needed.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Description: Find the actual model name used by the local embedding provider
# Search for local embedding provider implementation
echo "Searching for local embedding provider model specification:"
rg -nC3 'all-MiniLM-L6-v2|BGE-small|bge-small' --type=ts
# Also check for `@xenova/transformers` usage
echo -e "\nSearching for @xenova/transformers model initialization:"
rg -nC5 '@xenova/transformers' --type=ts | rg -C5 'model.*=|AutoModel|pipeline'
Repository: rohitg00/agentmemory
Length of output: 4718
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "README.md around 858-862:"
nl -ba README.md | sed -n '850,870p'
echo
echo "README.md around 1184-1188:"
nl -ba README.md | sed -n '1178,1192p'
Repository: rohitg00/agentmemory
Length of output: 151
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "README.md around 858-862 (with line numbers):"
awk 'NR>=850 && NR<=870 {printf "%d:%s\n", NR, $0}' README.md
echo
echo "README.md around 1184-1188 (with line numbers):"
awk 'NR>=1178 && NR<=1192 {printf "%d:%s\n", NR, $0}' README.md
Repository: rohitg00/agentmemory
Length of output: 2757
Fix README local embedding model mismatch (BGE-small vs all-MiniLM-L6-v2).
EMBEDDING_PROVIDER=local is documented as BGE-small at the “Local embeddings ship out of the box…” paragraph, but the local embedding provider (and the providers table) uses all-MiniLM-L6-v2.
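A small check along these lines could catch this kind of drift before review. It is only a sketch: `find_stale_model` is a hypothetical helper, and treating `BGE-small` as the stale name rests on this review's finding that the provider uses `all-MiniLM-L6-v2`.

```shell
#!/bin/bash
# Hypothetical helper: list lines in a doc file that still mention a stale
# embedding model name. Returns 0 when the file is clean, 1 otherwise.
find_stale_model() {
  local file="$1"
  local stale="${2:-BGE-small}"   # default stale name is an assumption from this review
  if grep -n -- "$stale" "$file"; then
    return 1
  fi
  return 0
}
```

Running `find_stale_model README.md` prints each offending line with its line number, which makes it easy to confirm that both the paragraph and the providers table agree.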
Suggested patch
-Local embeddings ship out of the box via `@xenova/transformers` — `EMBEDDING_PROVIDER=local` (default) gives you BGE-small entirely on-device. No extra config needed.
+Local embeddings ship out of the box via `@xenova/transformers` — `EMBEDDING_PROVIDER=local` (default) gives you all-MiniLM-L6-v2 entirely on-device. No extra config needed.
* chore(release): v0.9.23

Bumps version across 9 files + adds CHANGELOG entry summarizing the 18 commits since v0.9.22.

Highlights:
- GitHub Copilot CLI first-class support (#534): plugin + hooks + MCP with LSP-style Content-Length framing on the standalone stdio transport.
- Five new MCP adapters: Warp, Cline, Continue, Zed, Droid (#677); ADAPTERS count 11 → 17.
- Three silent DX bugs fixed: graph extraction never fired on session end (#666 / #698), status reported zero memories (#666), consolidation defaulted off even with an LLM provider configured (#612 / #696).
- Nine telemetry hooks switched to fire-and-forget so they don't block Claude Code's next-prompt boundary (#573 / #688).
- Hook project field now sends repo basename instead of full filesystem path so auto-injected context isn't silently filtered out (#474 / #687).
- Local-LLM docs: Ollama / LM Studio / vLLM section added (#671 / #697).

Version-bump files: package.json, plugin/.claude-plugin/plugin.json, plugin/plugin.json, plugin/.codex-plugin/plugin.json, packages/mcp/package.json, src/version.ts, src/types.ts, src/functions/export-import.ts, test/export-import.test.ts.

* chore(release): add #701 + #709 to v0.9.23 CHANGELOG
Summary
Discussion #671 — user asked "why do all the tools run on paid LLMs? we have hardware to run them locally". This is a docs gap: agentmemory already speaks OpenAI-API-compatible to any local server (Ollama, LM Studio, vLLM, llama.cpp), but the path was buried in a single-line comment inside the env-example block at README:1255. Users missed it.
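One quick way to confirm that a local server exposes the OpenAI-compatible API is to list its models. The sketch below only builds the URL; `models_url` is a hypothetical helper, and the assumption is that the server implements the standard `/v1/models` listing route under the documented base URL.

```shell
#!/bin/bash
# Build the model-listing URL for an OpenAI-compatible server.
# Assumption: the server serves the standard "models" route under its base URL.
models_url() {
  local base="${1%/}"   # drop a single trailing slash, if present
  printf '%s/models\n' "$base"
}

# Example call, left commented out because it needs a running server:
# curl -fsS "$(models_url "${OPENAI_BASE_URL:-http://localhost:11434/v1}")"
```

If the request succeeds, the model names it returns are the values to use for `OPENAI_MODEL`.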
Changes (README.md only)
- Ollama config example (`OPENAI_BASE_URL=http://localhost:11434/v1`)
- LM Studio config example (`OPENAI_BASE_URL=http://localhost:1234/v1`)
- Model-pick table: `qwen2.5-coder:7b`, `llama3.2:3b`, `mistral:7b-instruct`, `deepseek-r1:7b`, sized at 2-5GB so they fit on consumer hardware
- Reasoning-class empty-content pitfall (the `<think>` block + missing `content` field problem that bites local users of o1-style models)
- `EMBEDDING_PROVIDER=local` (the default) gives you BGE-small on-device, so fully-local stacks need no extra config

Test plan
- Anchor link `#local-models-ollama-lm-studio-vllm` works.
- `OpenAIProvider` + `OPENAI_BASE_URL`. Just naming the path so users find it.

Closes #671.
Summary by CodeRabbit