A self-contained RAG toolkit for AI assistants — vector database, document ingestion, and semantic search in one package
Tools • Quickstart • Embedding Providers • Client Configuration • Configuration • Docker • Troubleshooting
Qdrant MCP Server gives AI assistants the ability to store, search, and manage documents using a built-in Qdrant vector database. It implements the Model Context Protocol (MCP), which means any MCP-compatible client — Claude Desktop, Claude Code, Claude.ai, or third-party tools — can use it as a knowledge base through natural conversation.
How it works: You ingest documents (PDFs, text files, or raw text). The server chunks them, generates embeddings via your choice of provider, and stores the vectors in Qdrant. When you ask a question, it embeds your query and returns the most relevant passages with source attribution and similarity scores.
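The similarity score attached to each result is the vector-space closeness of query and chunk. A pure-Python sketch of cosine similarity, the default distance metric used here (Qdrant computes this internally, far faster):

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity: 1.0 = identical direction, 0.0 = orthogonal."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

# Toy 3-dim vectors for illustration; real embeddings have hundreds of dims.
query_vec = [0.2, 0.9, 0.1]
chunk_vec = [0.25, 0.85, 0.05]
score = cosine_similarity(query_vec, chunk_vec)
# Scores closer to 1.0 mean the chunk is semantically closer to the query.
```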
What's included:
- Qdrant vector database — bundled in Docker, starts automatically with `docker compose up`. No separate installation needed. (Can also point at an existing Qdrant instance if you have one.)
- 6 MCP tools — search, ingest text, ingest files (PDF/text/markdown/HTML), list collections, collection details, and delete — all accessible through conversation
- Provider-agnostic embeddings — works with any OpenAI-compatible API: local (Ollama, llama.cpp) or cloud (OpenAI, Anthropic/Voyage, Google, Azure). Switch providers by changing one URL.
- Dual transport — stdio for Claude Desktop and Claude Code, streamable HTTP for Claude.ai or any HTTP-capable MCP client
- Cross-platform — runs on Linux, macOS, and Windows via native Python or Docker
Built with FastMCP (Python).
Once the server is running and connected to your MCP client, interact through natural conversation:
You: Search my documents for contract termination clauses
Search results for: contract termination clauses
Collection: documents | Results: 3
[1] 0.8234 — services-agreement.pdf p.12 (chunk 23)
Either party may terminate this Agreement upon thirty (30) days written
notice to the other party. In the event of a material breach...
[2] 0.7891 — employee-handbook.pdf p.45 (chunk 67)
Employment may be terminated at will by either party with or without
cause, subject to the notice requirements in Section 8.2...
You: Ingest this PDF: /home/user/Documents/quarterly-report.pdf
{
"status": "ok",
"file": "/home/user/Documents/quarterly-report.pdf",
"collection": "documents",
"pages_extracted": 24,
"chunks_created": 47,
"points_upserted": 47,
"text_length": 38291
}

You: Ingest this text as "meeting-notes-mar-25": We discussed the Q1 roadmap and agreed to prioritize the API redesign. Budget was approved for two additional headcount...
{
"status": "ok",
"source": "meeting-notes-mar-25",
"collection": "documents",
"chunks_created": 1,
"points_upserted": 1,
"text_length": 142
}

You: What collections do I have?
Qdrant Collections (2 total)
| Collection | Points | Dimension | Distance |
|------------|--------|-----------|----------|
| documents | 198 | 768 | Cosine |
| research | 43 | 768 | Cosine |
You: Delete all chunks from source "old-report.pdf"
The server requires explicit confirmation for destructive operations — the LLM must pass confirm=true:
{
"status": "ok",
"collection": "documents",
"points_before": 198,
"points_after": 185,
"points_deleted": 13,
"filter_source": "/home/user/Documents/old-report.pdf"
}

Files must be in a directory listed in ALLOWED_INGEST_PATHS (configured in config.py). By default, this is set to your home directory — narrow it for your environment:
# config.py — Linux / macOS
ALLOWED_INGEST_PATHS = [
"/home/youruser/Documents/",
"/home/youruser/projects/",
"/data/shared/",
]
# config.py — Windows
ALLOWED_INGEST_PATHS = [
"C:\\Users\\youruser\\Documents\\",
"C:\\Users\\youruser\\projects\\",
"D:\\shared\\",
]

Any path outside these directories will be rejected with an error. This is a security measure to prevent unintended file access. Paths are validated using os.path.abspath(), so both forward slashes and backslashes work on Windows.
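One way such a whitelist check can be implemented — an illustrative sketch, not the server's actual function; `is_path_allowed` is a hypothetical helper:

```python
import os

ALLOWED_INGEST_PATHS = ["/home/youruser/Documents/", "/data/shared/"]

def is_path_allowed(path: str) -> bool:
    """True if `path` resolves to a location inside an allowed directory."""
    resolved = os.path.abspath(path)  # normalizes separators and ".." segments
    return any(
        resolved.startswith(os.path.abspath(allowed) + os.sep)
        for allowed in ALLOWED_INGEST_PATHS
    )

is_path_allowed("/home/youruser/Documents/report.pdf")   # allowed
is_path_allowed("/etc/passwd")                           # rejected
is_path_allowed("/home/youruser/Documents/../../etc/x")  # rejected: ".." resolved first
```

Because `abspath` runs before the prefix check, `..` traversal out of an allowed directory is caught.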
| Format | Extensions | Notes |
|---|---|---|
| PDF | `.pdf` | Text extracted via pymupdf; page boundaries preserved in metadata |
| Plain text | `.txt` | Direct ingestion |
| Markdown | `.md` | Ingested as plain text (markup preserved in chunks) |
| CSV | `.csv` | Ingested as raw text — rows become part of chunks |
| JSON | `.json` | Ingested as raw text |
| HTML | `.html`, `.htm` | Raw HTML ingested (tags included); pre-process for cleaner results |
| Log files | `.log` | Direct ingestion |
| reStructuredText | `.rst` | Ingested as plain text |
Files with unrecognized extensions are attempted as plain text.
Not supported: Word documents (`.docx`), Excel (`.xlsx`), images, audio, or scanned PDFs without OCR text layers. Convert these to PDF or text before ingesting.
Before ingesting:
- Clean your PDFs. Scanned documents need an OCR text layer — check by trying to select text in a PDF viewer. If you can't select text, the PDF contains only images and will extract as empty.
- Remove boilerplate. Headers, footers, page numbers, and legal disclaimers in every page create noise. If you can pre-process them out, your search quality improves.
- Split large documents by topic. A 500-page manual ingested as one file works, but searching is more precise if you split it into logical sections (one file per chapter or topic).
- Use descriptive source names. The `source` parameter is searchable via `source_filter`. Names like `q1-2026-financial-review` are more useful than `doc1.pdf`.
Tuning chunk settings:
- Default chunk size (512 words) works well for most documents. Larger chunks (1000+) preserve more context but reduce precision. Smaller chunks (200) are more precise but may lose context.
- Overlap (64 words) ensures concepts that span chunk boundaries aren't lost. Increase to 100-128 for technical documents with long multi-sentence concepts.
- Re-ingesting a source with the same name overwrites previous chunks (deterministic IDs). You can safely re-ingest after updating a document.
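The chunk/overlap mechanics above can be sketched in a few lines. This is a simplification, not the bundled chunker.py, and the name-based UUID shown for deterministic point IDs is an assumption about how re-ingest overwriting could work:

```python
import uuid

def chunk_words(text: str, chunk_size: int = 512, overlap: int = 64) -> list[str]:
    """Split text into word-count chunks; consecutive chunks share `overlap` words."""
    words = text.split()
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + chunk_size]))
        if start + chunk_size >= len(words):
            break  # last chunk reached the end of the document
    return chunks

def point_id(source: str, index: int) -> str:
    """Deterministic ID: re-ingesting the same source produces the same IDs,
    so the upsert overwrites the old points instead of duplicating them."""
    return str(uuid.uuid5(uuid.NAMESPACE_URL, f"{source}#{index}"))

chunks = chunk_words("word " * 1000, chunk_size=512, overlap=64)
# 1000 words with step 448 → chunks starting at word 0, 448, 896 (3 chunks)
```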
Organizing with collections:
- Use the default `documents` collection for general use
- Create separate collections for different domains: `qdrant_ingest_file` accepts a `collection` parameter
- Example: `legal` for contracts, `technical` for specs, `research` for papers — then search within a specific collection for focused results
| Tool | Description | readOnly | destructive | idempotent |
|---|---|---|---|---|
| `qdrant_search` | Semantic search with score threshold and source filtering | yes | no | yes |
| `qdrant_ingest_text` | Chunk, embed, and store raw text | no | no | no |
| `qdrant_ingest_file` | Extract text from PDF/text files, then ingest | no | no | no |
| `qdrant_list_collections` | List all collections with point counts and vector config | yes | no | yes |
| `qdrant_collection_info` | Detailed stats on a specific collection | yes | no | yes |
| `qdrant_delete` | Remove points by ID or source filter (requires `confirm=true`) | no | yes | yes |
All tools support both markdown and json response formats via the response_format parameter. Tool annotations follow the MCP specification and inform clients about tool behavior for permission and safety decisions.
┌──────────────────────────────────────────────────┐
│ MCP Clients │
│ Claude Code (stdio) │ Claude.ai / OpenClaw │
│ │ (streamable HTTP :8090) │
└──────────┬────────────┴──────────┬────────────────┘
│ │
▼ ▼
┌──────────────────────────────────────────────────┐
│ qdrant_mcp (FastMCP) │
│ │
│ embeddings.py ──→ Any OpenAI-compatible API │
│ tools/ ──→ Qdrant client │
│ rag/lib/ ──→ Chunker, PDF extractor │
└──────────┬───────────────────────┬────────────────┘
│ │
▼ ▼
Embedding Provider Qdrant :6333
(Ollama, OpenAI, (vector store)
llama.cpp, etc.)
- Python 3.10+ (tested on 3.10 and 3.11) — or Docker (which bundles everything)
- Qdrant — bundled in `docker compose up`, or install separately for native Python
- An embedding provider — any OpenAI-compatible `/v1/embeddings` endpoint (see Embedding Providers)
| Resource | Minimum | Recommended | Notes |
|---|---|---|---|
| RAM | 512 MB | 1 GB+ | MCP server uses ~50 MB. Qdrant uses 100-500 MB depending on collection size. Local embedding models add 500 MB–4 GB. |
| Disk | 500 MB | 1 GB+ | Project + Python dependencies (~200 MB), Qdrant data (varies with ingested documents), pymupdf (~25 MB). |
| GPU | None | Optional | Only needed for local GPU-accelerated embeddings (Ollama). Cloud providers (OpenAI, Voyage) require no GPU. CPU embedding via llama.cpp works without a GPU. |
| OS | Linux, macOS, Windows | Linux | All platforms supported for native Python and Docker. |
| Docker | Not required | Optional | docker compose up starts both Qdrant and the MCP server. Native Python works without Docker (requires separate Qdrant install). |
| Network | localhost | LAN | Qdrant and embedding provider must be reachable. HTTP transport exposes port 8090 on the configured bind address. |
Scaling note: RAM and disk grow with the amount of ingested data. Each 768-dimension vector (nomic-embed-text) uses ~3 KB in Qdrant. 10,000 document chunks ≈ 30 MB of vector data plus payload text storage. For most personal/team use, this is negligible.
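The estimate works out as simple arithmetic (float32 vectors; Qdrant index and payload overhead excluded):

```python
dims = 768                     # nomic-embed-text output dimension
bytes_per_float32 = 4
vector_bytes = dims * bytes_per_float32      # 3072 bytes ≈ 3 KB per vector
chunks = 10_000
total_mb = chunks * vector_bytes / 1024**2   # ≈ 29.3 MB of raw vector data
```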
Linux / macOS:
git clone https://github.com/ncasfl/Qdrant-MCP.git
cd Qdrant-MCP
# Create and activate a virtual environment (recommended)
python3 -m venv .venv
source .venv/bin/activate
# Install dependencies
pip install -r requirements.txt
pip install pymupdf  # for PDF support

Ubuntu/Debian note: If `python3 -m venv` fails with "ensurepip is not available", install the venv package first: `sudo apt install python3.10-venv` (adjust version to match your Python).
Windows (PowerShell):
git clone https://github.com/ncasfl/Qdrant-MCP.git
cd Qdrant-MCP
# Create and activate a virtual environment
python -m venv .venv
.venv\Scripts\Activate.ps1
# Install dependencies
pip install -r requirements.txt
pip install pymupdf

Windows note: Use `python` (not `python3`) — Windows Python installers register as `python`. If you get a "not recognized" error, ensure Python is in your system PATH.
Why a virtual environment? The server depends on `mcp`, `qdrant-client`, `pydantic`, and their transitive dependencies. A venv keeps these isolated from your system Python. The Claude Code MCP config and Docker image both handle this automatically — the venv is only needed for native/manual installs.
Edit qdrant_mcp/config.py or set environment variables:
Linux / macOS (bash):
export EMBED_BASE_URL=http://localhost:11434/v1
export EMBED_MODEL=nomic-embed-text
# For cloud providers:
export EMBED_BASE_URL=https://api.openai.com/v1
export EMBED_MODEL=text-embedding-3-small
export EMBED_API_KEY=sk-...

Windows (PowerShell):
$env:EMBED_BASE_URL = "http://localhost:11434/v1"
$env:EMBED_MODEL = "nomic-embed-text"
# For cloud providers:
$env:EMBED_BASE_URL = "https://api.openai.com/v1"
$env:EMBED_MODEL = "text-embedding-3-small"
$env:EMBED_API_KEY = "sk-..."

Linux / macOS:
# stdio (for Claude Code / Claude Desktop):
PYTHONPATH="$(pwd)" python3 -m qdrant_mcp --transport stdio
# HTTP (for Claude.ai or other HTTP clients):
PYTHONPATH="$(pwd)" python3 -m qdrant_mcp --transport streamable-http --port 8090

Windows (PowerShell):
# stdio:
$env:PYTHONPATH = (Get-Location).Path
python -m qdrant_mcp --transport stdio
# HTTP:
$env:PYTHONPATH = (Get-Location).Path
python -m qdrant_mcp --transport streamable-http --port 8090

See Client Configuration below.
The server uses the OpenAI-compatible /v1/embeddings API format. This works with a wide range of providers out of the box.
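A sketch of what a round-trip in this format involves — payload construction and response parsing only; the function names here are illustrative, not the server's embeddings.py API:

```python
import json

def build_embed_request(texts: list[str], model: str, api_key: str = ""):
    """Headers and JSON body for an OpenAI-compatible /v1/embeddings call."""
    headers = {"Content-Type": "application/json"}
    if api_key:
        headers["Authorization"] = f"Bearer {api_key}"
    return headers, json.dumps({"model": model, "input": texts})

def parse_embed_response(body: dict) -> list[list[float]]:
    """Vectors in input order; response shape is
    {"data": [{"embedding": [...], "index": n}, ...]}."""
    return [item["embedding"] for item in sorted(body["data"], key=lambda d: d["index"])]

headers, payload = build_embed_request(["hello world"], "nomic-embed-text")
# POST `payload` to f"{EMBED_BASE_URL}/embeddings" with these headers (via
# urllib.request, httpx, etc.), then call parse_embed_response on the JSON body.
```

Because every provider listed below speaks this same request/response shape, switching providers really is just a URL/model/key change.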
Ollama (local GPU/CPU)
EMBED_BASE_URL=http://localhost:11434/v1
EMBED_MODEL=nomic-embed-text
# No API key needed

Install Ollama and pull a model:

ollama pull nomic-embed-text

llama-server / llama.cpp (local CPU/GPU)
EMBED_BASE_URL=http://localhost:8080/v1
EMBED_MODEL=your-model-name
# No API key needed

Start llama-server with an embedding model:

llama-server -m your-embedding-model.gguf --embedding --port 8080

OpenAI
EMBED_BASE_URL=https://api.openai.com/v1
EMBED_MODEL=text-embedding-3-small
EMBED_API_KEY=sk-...
EMBED_QUERY_PREFIX=
EMBED_DOCUMENT_PREFIX=

Models: text-embedding-3-small (1536-dim), text-embedding-3-large (3072-dim)
Anthropic (Claude) via Voyage AI
Anthropic recommends Voyage AI for embeddings:
EMBED_BASE_URL=https://api.voyageai.com/v1
EMBED_MODEL=voyage-3
EMBED_API_KEY=pa-...
EMBED_QUERY_PREFIX=
EMBED_DOCUMENT_PREFIX=

Models: voyage-3 (1024-dim), voyage-3-lite (512-dim), voyage-code-3 (code-optimized)
Google Gemini (via OpenAI-compatible proxy)
Use LiteLLM as a proxy to expose Gemini's embedding API in OpenAI format:
# Start LiteLLM proxy
litellm --model gemini/text-embedding-004 --port 4000
# Configure qdrant_mcp
EMBED_BASE_URL=http://localhost:4000/v1
EMBED_MODEL=gemini/text-embedding-004
EMBED_API_KEY=your-gemini-key
EMBED_QUERY_PREFIX=
EMBED_DOCUMENT_PREFIX=

Azure OpenAI
EMBED_BASE_URL=https://your-resource.openai.azure.com/openai/deployments/your-deployment
EMBED_MODEL=text-embedding-3-small
EMBED_API_KEY=your-azure-key
EMBED_QUERY_PREFIX=
EMBED_DOCUMENT_PREFIX=

Asymmetric vs symmetric models: Models like `nomic-embed-text` use different prefixes for queries (`search_query:`) and documents (`search_document:`) to improve retrieval quality. OpenAI, Voyage, and most cloud models are symmetric — set both prefixes to empty strings.
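Applying the prefixes is plain string concatenation before each embedding call. A sketch, assuming the default values carry a trailing space (the nomic-embed-text convention):

```python
EMBED_QUERY_PREFIX = "search_query: "        # asymmetric model defaults;
EMBED_DOCUMENT_PREFIX = "search_document: "  # set both to "" for symmetric models

def prepare_query(text: str) -> str:
    """Text sent to the embedding API when searching."""
    return EMBED_QUERY_PREFIX + text

def prepare_document(chunk: str) -> str:
    """Text sent to the embedding API when ingesting a chunk."""
    return EMBED_DOCUMENT_PREFIX + chunk

prepare_query("contract termination clauses")
# "search_query: contract termination clauses"
```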
Add to your Claude Desktop config file:
| Platform | Config path |
|---|---|
| macOS | ~/Library/Application Support/Claude/claude_desktop_config.json |
| Windows | %APPDATA%\Claude\claude_desktop_config.json |
| Linux | ~/.config/Claude/claude_desktop_config.json |
macOS / Linux:
{
"mcpServers": {
"qdrant": {
"command": "/path/to/Qdrant-MCP/.venv/bin/python3",
"args": ["-m", "qdrant_mcp", "--transport", "stdio"],
"env": {
"PYTHONPATH": "/path/to/Qdrant-MCP"
}
}
}
}

Windows:
{
"mcpServers": {
"qdrant": {
"command": "C:\\path\\to\\Qdrant-MCP\\.venv\\Scripts\\python.exe",
"args": ["-m", "qdrant_mcp", "--transport", "stdio"],
"env": {
"PYTHONPATH": "C:\\path\\to\\Qdrant-MCP",
"PYTHONIOENCODING": "utf-8"
}
}
}
}

Important: Claude Desktop requires a full restart (quit and reopen, not just close the window) to pick up config changes. Use absolute paths — Claude Desktop does not inherit your shell's PATH or working directory. On Windows, add `PYTHONIOENCODING: utf-8` to avoid encoding issues.
Debugging Claude Desktop connection
Check the MCP logs if the server doesn't appear:
| Platform | Log path |
|---|---|
| macOS | ~/Library/Logs/Claude/mcp.log and mcp-server-qdrant.log |
| Windows | %APPDATA%\Claude\logs\ |
Common issues: JSON syntax errors in config (silent failure), wrong Python path, missing PYTHONPATH.
Add to your ~/.mcp.json (or project-level .mcp.json):
Linux / macOS:
{
"mcpServers": {
"qdrant": {
"command": "/path/to/Qdrant-MCP/.venv/bin/python3",
"args": ["-m", "qdrant_mcp", "--transport", "stdio"],
"cwd": "/path/to/Qdrant-MCP",
"env": {
"PYTHONPATH": "/path/to/Qdrant-MCP"
}
}
}
}

Windows:
{
"mcpServers": {
"qdrant": {
"command": "C:\\path\\to\\Qdrant-MCP\\.venv\\Scripts\\python.exe",
"args": ["-m", "qdrant_mcp", "--transport", "stdio"],
"cwd": "C:\\path\\to\\Qdrant-MCP",
"env": {
"PYTHONPATH": "C:\\path\\to\\Qdrant-MCP"
}
}
}
}

Note: Point `command` to the venv Python binary so Claude Code uses the isolated environment. If you installed dependencies globally or use Docker, `python3` (or `python` on Windows) works instead.
Restart Claude Code to pick up the new server.
- Start the server in HTTP mode (see Quickstart or Docker)
- In Claude.ai settings, add a remote MCP server:
http://your-host:8090/mcp
For MCP clients that only support stdio, use mcp-remote as a bridge:
{
"mcpServers": {
"qdrant": {
"command": "npx",
"args": ["mcp-remote", "http://127.0.0.1:8090/mcp"]
}
}
}

The HTTP endpoint at /mcp speaks the standard MCP streamable HTTP protocol. Any MCP-compatible client can connect.
All settings can be configured in qdrant_mcp/config.py or overridden with environment variables.
| Env Var | Default | Description |
|---|---|---|
| `EMBED_BASE_URL` | `http://localhost:11434/v1` | OpenAI-compatible embeddings API base URL |
| `EMBED_MODEL` | `nomic-embed-text` | Embedding model name |
| `EMBED_API_KEY` | (empty) | API key for cloud providers |
| `EMBED_QUERY_PREFIX` | `search_query:` | Prefix prepended to search queries |
| `EMBED_DOCUMENT_PREFIX` | `search_document:` | Prefix prepended to document chunks |
| Setting | Default | Description |
|---|---|---|
| `QDRANT_HOST` | `localhost` | Qdrant server host. Docker compose sets this to `qdrant` (container name) automatically. Override with the `QDRANT_HOST` env var. |
| `QDRANT_PORT` | `6333` | Qdrant REST API port. Override with the `QDRANT_PORT` env var. |
| Setting | Default | Description |
|---|---|---|
| `MCP_HTTP_PORT` | `8090` | HTTP transport port |
| `MCP_HTTP_HOST` | `0.0.0.0` | HTTP bind address |
| `DEFAULT_COLLECTION` | `documents` | Default Qdrant collection |
| `DEFAULT_CHUNK_SIZE` | `512` | Words per chunk |
| `DEFAULT_CHUNK_OVERLAP` | `64` | Word overlap between chunks |
| Setting | Default | Description |
|---|---|---|
| `ALLOWED_INGEST_PATHS` | (configure per deployment) | Whitelist of directories for `qdrant_ingest_file` |

Important: Update `ALLOWED_INGEST_PATHS` in `config.py` to match your environment. The default restricts file ingestion to specific directories.
The docker-compose.yaml bundles both Qdrant and the MCP server — one command starts everything:
docker compose build
docker compose up -d

This starts:
- Qdrant (vector database) on port 6333 with persistent storage
- qdrant-mcp (MCP server) on port 8090, pre-configured to connect to the bundled Qdrant
Verify:
# Qdrant health
curl -s http://localhost:6333/healthz
# MCP server
curl -s http://localhost:8090/mcp

Qdrant data is stored in a named Docker volume (qdrant_data). Your vectors and collections persist across:
- Container restarts (`docker compose restart`)
- Container rebuilds (`docker compose build && docker compose up -d`)
- `docker compose down` (stops containers, keeps volumes)
Data is only destroyed by explicitly removing the volume: docker volume rm qdrant-mcp_qdrant_data
The MCP server needs an embedding provider. By default, the compose file points at Ollama on your host machine via host.docker.internal:
environment:
- EMBED_BASE_URL=http://host.docker.internal:11434/v1

For cloud providers (no local Ollama needed), override in the compose file or a .env file:
environment:
- EMBED_BASE_URL=https://api.openai.com/v1
- EMBED_MODEL=text-embedding-3-small
- EMBED_API_KEY=${OPENAI_API_KEY}

Cloud providers work on all platforms (Linux, macOS, Windows) with no additional configuration — embedding calls go out over the internet, not to localhost.
Linux
Works out of the box. host.docker.internal resolves to the host via the extra_hosts directive in the compose file. Ollama and other host services are reachable.
macOS / Windows (Docker Desktop)
Works out of the box. Docker Desktop natively supports host.docker.internal. Ollama and other host services are reachable.
If you prefer not to use Docker, the native Python approach works well on macOS and Windows:
PYTHONPATH="$(pwd)" python3 -m qdrant_mcp --transport streamable-http --port 8090

Note: you'll still need Qdrant running separately (via Docker or Qdrant Cloud).
If you already have Qdrant running, you can skip the bundled one. Create a docker-compose.override.yaml:
services:
qdrant:
# Replace bundled Qdrant with a no-op
image: busybox
command: ["true"]
restart: "no"
ports: !reset []
volumes: !reset []
deploy: !reset {}
qdrant-mcp:
environment:
# Point at your existing Qdrant
- QDRANT_HOST=your-qdrant-host
- QDRANT_PORT=6333

Tip: Add `docker-compose.override.yaml` to your `.gitignore` — it's deployment-specific.
For native Python installs, set QDRANT_HOST in your environment or edit config.py directly.
| Component | Image | Size |
|---|---|---|
| Qdrant | `qdrant/qdrant:v1.13.6` | ~100 MB |
| qdrant-mcp | Custom (`python:3.11-slim` + deps) | ~250 MB |
Both containers have resource limits (Qdrant: 1 GB RAM, MCP: 512 MB RAM).
Qdrant-MCP/
├── qdrant_mcp/
│ ├── __init__.py
│ ├── __main__.py # Entry point: --transport stdio|streamable-http
│ ├── server.py # FastMCP server, lifespan, logging setup
│ ├── config.py # All configuration constants
│ ├── embeddings.py # Provider-agnostic OpenAI-compatible embedding client
│ └── tools/
│ ├── search.py # qdrant_search
│ ├── ingest.py # qdrant_ingest_text, qdrant_ingest_file
│ ├── collections.py # qdrant_list_collections, qdrant_collection_info
│ └── delete.py # qdrant_delete
├── rag/
│ └── lib/
│ ├── chunker.py # Text chunking with overlap + page tracking
│ ├── extract.py # PDF/text extraction via pymupdf
│ └── store.py # Qdrant CRUD + search operations
├── Dockerfile
├── docker-compose.yaml
├── requirements.txt
├── LICENSE
└── README.md
The server includes shared modules for text chunking (chunker.py), PDF extraction (extract.py), and Qdrant storage operations (store.py). These are bundled in the repository under rag/lib/ and require no additional setup — they're included in PYTHONPATH automatically when you follow the quickstart instructions.
| Mode | Destination | Notes |
|---|---|---|
| stdio | stderr | Keeps stdout clean for MCP protocol |
| HTTP | Log file (configurable) | Default: ~/logs/qdrant-mcp.log |
Log format: 2026-03-25 19:38:11 INFO qdrant_mcp.search Embedding query: ...
Each tool module has its own logger (qdrant_mcp.search, qdrant_mcp.ingest, qdrant_mcp.delete, qdrant_mcp.collections).
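A minimal sketch of this wiring using the standard library — an assumption about the setup, not a copy of server.py:

```python
import logging
import sys

# stdout must stay clean for the MCP stdio protocol, so log to stderr.
handler = logging.StreamHandler(sys.stderr)
handler.setFormatter(logging.Formatter(
    "%(asctime)s %(levelname)s %(name)s %(message)s",
    datefmt="%Y-%m-%d %H:%M:%S",
))

root = logging.getLogger("qdrant_mcp")
root.addHandler(handler)
root.setLevel(logging.INFO)

# Each tool module gets a child logger that inherits the handler and level:
log = logging.getLogger("qdrant_mcp.search")
log.info("Embedding query: %s", "contract termination clauses")
```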
All tools return actionable error messages rather than raw tracebacks:
| Failure | Message |
|---|---|
| Embedding provider unreachable | "Ollama is not reachable (embedding failed). Is it running on port 11434?" |
| Qdrant unreachable | "cannot reach Qdrant" |
| Collection not found | "collection 'X' not found" |
| File not found | "file not found: /path/to/file" |
| Path outside whitelist | "path is outside allowed directories" |
| Invalid parameters | Specific range/constraint messages |
| PDF extraction failed | "failed to extract text from /path — {details}" |
Server not found / ENOENT error
Symptom: spawn python3 ENOENT or server doesn't appear in client.
Cause: The MCP client can't find the Python executable. Claude Desktop and some editors launch servers with a minimal PATH that doesn't include your shell's customizations (pyenv, nvm, conda, etc.).
Fix: Use the absolute path to the venv Python in your config:
# Find your venv Python path
which python3 # system
/path/to/Qdrant-MCP/.venv/bin/python3 # venv (Linux/macOS)
/path/to/Qdrant-MCP/.venv/Scripts/python.exe  # venv (Windows)

ModuleNotFoundError: No module named 'qdrant_mcp'
Cause: PYTHONPATH is not set or points to the wrong directory.
Fix: Ensure your client config includes PYTHONPATH pointing to the project root:
"env": {
"PYTHONPATH": "/path/to/Qdrant-MCP"
}

If you also use rag/lib/ modules, include the parent directory:
"env": {
"PYTHONPATH": "/path/to/Qdrant-MCP:/path/to/parent-containing-rag"
}

ModuleNotFoundError: No module named 'mcp' (or qdrant_client, etc.)
Cause: The command in your config points to system Python, but dependencies are installed in a virtual environment.
Fix: Point command to the venv Python binary, not python3:
"command": "/path/to/Qdrant-MCP/.venv/bin/python3"

Embedding failed / Ollama not reachable
Cause: The embedding provider is not running or the URL is wrong.
Fix:
- Verify your provider is running: `curl http://localhost:11434/v1/embeddings -d '{"model":"nomic-embed-text","input":["test"]}'`
- Check `EMBED_BASE_URL` matches your provider's actual URL
- For cloud providers, verify `EMBED_API_KEY` is set (check the env var, not just config.py)
Windows: npx not found for mcp-remote bridge
Cause: Windows spawn doesn't invoke a shell by default, and npx is a shell script.
Fix: Use cmd /c as the command wrapper:
{
"command": "cmd",
"args": ["/c", "npx", "mcp-remote", "http://127.0.0.1:8090/mcp"]
}

Connection works from CLI but not from Claude Desktop
Cause: Claude Desktop inherits a minimal environment. Variables like PATH, HOME, PYTHONPATH, and anything set in .bashrc/.zshrc are not available.
Fix: Set all required environment variables explicitly in the env block of your config. Use absolute paths everywhere — no ~, $HOME, or relative paths.
Silent config errors in Claude Desktop
Cause: JSON syntax errors in claude_desktop_config.json fail silently — no error dialog.
Fix: Validate your JSON before saving:
python3 -m json.tool < ~/Library/Application\ Support/Claude/claude_desktop_config.json

Test your server directly without a client using the MCP Inspector:
# stdio mode
npx @modelcontextprotocol/inspector python3 -m qdrant_mcp --transport stdio
# HTTP mode (start server first, then point inspector at it)
npx @modelcontextprotocol/inspector --url http://localhost:8090/mcp

The Inspector provides an interactive UI to list tools, call them with parameters, and inspect responses.
| Mode | Location |
|---|---|
| stdio | stderr (visible in terminal or client logs) |
| HTTP / Docker | ~/logs/qdrant-mcp.log (configurable via LOG_FILE in config.py) |
| Claude Desktop | ~/Library/Logs/Claude/mcp-server-qdrant.log (macOS) |
- The server runs with user-level permissions. Any file accessible to the user running the server can be read by `qdrant_ingest_file` (within `ALLOWED_INGEST_PATHS`). Configure the path whitelist carefully.
- No authentication on HTTP transport. The streamable HTTP endpoint on port 8090 has no auth. If you expose it beyond localhost, anyone who can reach the port can search, ingest, and delete data. Bind to `127.0.0.1` (change `MCP_HTTP_HOST` in config.py) if you don't need LAN access, or use a reverse proxy with auth.
- API keys in environment variables. `EMBED_API_KEY` is read from the environment or config.py. Avoid committing API keys to config.py — use environment variables or a `.env` file (already in `.gitignore`).
- Qdrant access. The server connects to Qdrant with no authentication by default. If your Qdrant instance contains sensitive data, enable Qdrant API key authentication.
- No streaming for large ingestions. Ingesting a large PDF (hundreds of pages) blocks until complete. The server remains responsive for other tool calls (async), but the calling client will wait.
- No OCR. Scanned PDFs without a text layer extract as empty. Pre-process with an OCR tool (e.g., `ocrmypdf`) before ingesting.
- No `.docx`/`.xlsx` support. Convert to PDF or text first. Adding support via `python-docx`/`openpyxl` is straightforward if needed.
- Chunk size is word-based, not token-based. The chunker splits on whitespace, which approximates but doesn't exactly match model tokenization. This is sufficient for retrieval but not exact for token budgets.
- Single embedding model per server instance. All collections use the same embedding model/dimension. To use different models for different collections, run separate server instances with different configs.
This started as a personal infrastructure project and is shared as a reference implementation. Forks and adaptations are welcome. If you build something interesting on top of it, feel free to open an issue to share.