A high-performance MCP (Model Context Protocol) server that indexes your codebase into a local PostgreSQL + pgvector database and exposes semantic search to AI assistants (Claude Desktop, VS Code Copilot, etc.) — no cloud services required, no per-query cost.
| | rust-mcp (this) | Zilliz Cloud + OpenAI |
|---|---|---|
| Embeddings | Local (FastEmbed / ONNX) | OpenAI API — ~$0.0001/1k tokens |
| Vector storage | Self-hosted PostgreSQL | Zilliz Cloud — $25+/month |
| Privacy | 100% local — code never leaves your machine | Code sent to OpenAI + Milvus |
| Index speed | ~12 000 chunks / 30 s | Network-bound |
| Offline use | Yes | No |
- Semantic search — find code by meaning, not just keywords (`"find payment logic"` finds `process_transaction()`)
- Multi-provider embeddings — FastEmbed (default), Local ONNX, Ollama, OpenAI, HuggingFace
- Rich search filters — file extensions, path pattern (SQL LIKE), tags, date ranges, similarity threshold
- Auto-tagging — chunks auto-tagged as `api_endpoint`, `database_model`, `authentication`, etc.
- Multi-project — index multiple repos into one DB, search per-project or across all
- pgvector indexes — IVFFlat (fast build) or HNSW (better recall) cosine similarity; see the DDL sketch below the list
- MCP protocol — works natively with Claude Desktop and VS Code Copilot Chat
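The two index types map to pgvector DDL along these lines. This is a sketch only: the actual table and column names live in `schema.sql` and may differ.

```sql
-- Assumed table/column names (see schema.sql for the real ones)

-- IVFFlat: fast to build; 'lists' should scale with dataset size
CREATE INDEX ON code_chunks
    USING ivfflat (embedding vector_cosine_ops) WITH (lists = 100);

-- HNSW: slower build, better recall
CREATE INDEX ON code_chunks
    USING hnsw (embedding vector_cosine_ops);
```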
Install pgvector:

```bash
# Ubuntu / Debian
sudo apt install postgresql-16-pgvector

# macOS (Homebrew)
brew install pgvector

# Windows: see https://github.com/pgvector/pgvector#windows
```

Create the database:

```sql
CREATE DATABASE claude_context;
```

Apply the schema (or let the server auto-initialize on first run):

```bash
psql -d claude_context -f schema.sql
```

Configure the environment:

```bash
cp .env.example .env
# Edit .env — set POSTGRES_URL at minimum
```

Build the server:

```bash
cargo build --release --bin rust-mcp-server
```

Claude Desktop — add to `claude_desktop_config.json`:
```json
{
  "mcpServers": {
    "vector-database": {
      "command": "C:/path/to/rust-mcp-server.exe",
      "args": ["local-onnx"],
      "env": {
        "POSTGRES_URL": "postgresql://postgres:password@localhost:5432/claude_context",
        "RUST_LOG": "info"
      }
    }
  }
}
```

VS Code Copilot — add to `mcp.json` (User or workspace):
```json
{
  "servers": {
    "vector-database": {
      "type": "stdio",
      "command": "C:/path/to/rust-mcp-server.exe",
      "args": ["local-onnx"],
      "env": {
        "POSTGRES_URL": "postgresql://postgres:password@localhost:5432/claude_context",
        "RUST_LOG": "info"
      }
    }
  }
}
```

The `local-onnx` argument selects the embedded ONNX model (`mxbai-embed-large-v1`, 1024 dimensions). No additional setup is needed.
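To sanity-check the binary outside of an MCP client, you can drive it over stdio by hand. The message below is the standard MCP `initialize` request (newline-delimited JSON-RPC); the exact response shape depends on the server's MCP library version:

```bash
# Send an MCP initialize request and read the server's reply on stdout
echo '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"smoke-test","version":"0.1.0"}}}' \
  | ./target/release/rust-mcp-server local-onnx
```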
Once connected, the following tools are available to your AI assistant:
| Tool | Description |
|---|---|
| `vector-index_codebase_incremental` | Index a project (or update an existing one); creates the project if it does not exist. |
| `vector-reindex_local_onnx` | Re-embed all chunks for an existing project using the local ONNX model. |
| `vector-remove_project` | Delete a project and all its chunks from the database. |
| `vector-list_projects` | List all indexed projects with chunk counts and metadata. |
| `vector-get_project_stats` | Detailed stats for one project (chunk count, file types, tags, etc.). |
| Tool | Description |
|---|---|
| `vector-advanced_search_code` | Semantic search with full filter support (the main search tool). |
| `vector-search_code` | Simple semantic search — query + limit only. |
| `vector-search_by_tags` | Find chunks matching specific auto-tags. |
| `vector-find_similar_code` | Find code similar to a given snippet. |
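For example, `vector-find_similar_code` takes a code snippet instead of a natural-language query. A hypothetical call (the exact parameter names come from the server's tool schema and may differ):

```json
{
  "path": "/home/user/myrepo",
  "code_snippet": "fn validate_token(token: &str) -> Result<Claims, AuthError>",
  "limit": 5
}
```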
| Tool | Description |
|---|---|
| `vector-get_chunk_by_id` | Retrieve a specific chunk by its database ID. |
| `vector-get_file_chunks` | Get all chunks for a specific file path. |
| `vector-database_health` | Check PostgreSQL + pgvector connectivity and index status. |
Ask your AI assistant:
"Index my project at
/home/user/myrepoasmy-api"
The assistant calls `vector-index_codebase_incremental`:
```json
{
  "path": "/home/user/myrepo",
  "project_name": "my-api"
}
```

"Find authentication middleware in my-api"
```json
{
  "path": "/home/user/myrepo",
  "query": "authentication middleware token validation",
  "limit": 10
}
```

By file extension:
```json
{
  "path": "/home/user/myrepo",
  "query": "database connection pool",
  "file_extensions": ["py", "rs"],
  "limit": 10
}
```

By path pattern (SQL LIKE syntax):
```json
{
  "path": "/home/user/myrepo",
  "query": "surge pricing calculation",
  "path_pattern": "%pricing%",
  "file_extensions": ["py"],
  "limit": 10
}
```

Narrow to a specific module:
```json
{
  "path": "/home/user/myrepo",
  "query": "A/B test variant selection",
  "path_pattern": "%smart_engine%",
  "limit": 10
}
```

By auto-tags:
```json
{
  "path": "/home/user/myrepo",
  "query": "payment processing",
  "tags": ["payment_system", "api_endpoint"],
  "tag_logic": "AND",
  "limit": 10
}
```

With a similarity threshold (filter out weak matches):
```json
{
  "path": "/home/user/myrepo",
  "query": "order state machine transitions",
  "min_similarity": 0.15,
  "limit": 20
}
```

- No results? Try a broader query. Semantic search matches concepts, not exact keywords.
- Too many irrelevant results? Add `path_pattern` to scope the search to a module, or raise `min_similarity`.
- Large codebase (50k+ chunks)? Always use `path_pattern` or `file_extensions` to reduce noise.
- `path_pattern` uses SQL LIKE syntax: `%auth%` matches any path containing "auth", `%modules/api/%` matches that subdirectory.
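Putting these tips together, a scoped query against a large repo might combine several of the filters shown above:

```json
{
  "path": "/home/user/myrepo",
  "query": "token refresh flow",
  "path_pattern": "%auth%",
  "file_extensions": ["rs"],
  "min_similarity": 0.15,
  "limit": 10
}
```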
| Provider | Quality | Speed | Requires |
|---|---|---|---|
| `local-onnx` (recommended) | High | Fast | ONNX model in `models/` dir |
| `fastembed` | High | Very fast | Nothing — bundled model |
| `ollama` | High | Medium | Ollama running locally |
| `openai` | Very high | Network-bound | `OPENAI_API_KEY` |
| `huggingface` | High | Network-bound | `HUGGINGFACE_API_KEY` |
Default model: `mixedbread-ai/mxbai-embed-large-v1` (1024 dimensions).
Pass the provider as the first argument to the server binary, e.g.:

```bash
rust-mcp-server.exe local-onnx
rust-mcp-server.exe ollama
rust-mcp-server.exe openai
```
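If you pick a network provider, pass its key through the `env` block of your MCP client config, alongside `POSTGRES_URL` (the key value here is an illustrative placeholder):

```json
"env": {
  "POSTGRES_URL": "postgresql://postgres:password@localhost:5432/claude_context",
  "OPENAI_API_KEY": "sk-...",
  "RUST_LOG": "info"
}
```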
During indexing, each chunk is automatically tagged based on its content and path. You can filter searches by tag:

| Tag | Detected when |
|---|---|
| `api_endpoint` | Flask/FastAPI route decorators, Express handlers |
| `api_layer` | Files named `api_*`, `routes_*`, `views_*` |
| `database_model` | SQLAlchemy models, ORM classes, schema definitions |
| `database_schema` | SQL DDL statements, migration files |
| `authentication` | Login, token, JWT, OAuth patterns |
| `business_logic` | Core logic files not matching other categories |
| `order_management` | Order, booking, ride, dispatch patterns |
| `payment_system` | Payment, invoice, billing, transaction patterns |
| `configuration` | Config files, settings, environment loading |
| `test_code` | Files in `tests/`, `test_*` prefix/suffix |
| `documentation` | Markdown, docstrings, README files |
| `filetype_{ext}` | One tag per file extension (e.g. `filetype_py`) |
| `module_{dir}` | Top-level directory name (e.g. `module_src`) |
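For example, a `vector-search_by_tags` call to pull every payment-related endpoint might look like this (parameter names assumed to mirror `advanced_search_code`):

```json
{
  "path": "/home/user/myrepo",
  "tags": ["payment_system", "api_endpoint"],
  "tag_logic": "AND",
  "limit": 20
}
```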
Benchmarked on an E5-2680 v2 / 32 GB RAM:
- Indexing: ~12 000 chunks in ~30 seconds (local-onnx, CPU)
- Search: < 50 ms cosine similarity over 500k+ chunks (IVFFlat index)
Indexing throughput depends heavily on file count and chunk size. A 43k-file Python project (49k chunks) indexes in ~143 seconds.
Server fails to start — "extension vector does not exist"
```bash
# Enable pgvector in your database
psql -d claude_context -c "CREATE EXTENSION IF NOT EXISTS vector;"
```

Search returns unrelated results
- Add `path_pattern` to narrow the search scope
- Increase `min_similarity` (try `0.1` or `0.2`)
- Use more specific query terms — describe the behavior, not just names
"Project not found" on reindex
- Use `vector-index_codebase_incremental` first to create the project, then `vector-reindex_local_onnx` for subsequent updates.
Verify pgvector is installed
```bash
psql -d claude_context -c "SELECT * FROM pg_extension WHERE extname = 'vector';"
```

Verify Ollama is running (if using the Ollama provider)
```bash
curl http://localhost:11434/api/tags
```

MIT
PRs welcome. Run `cargo clippy -- -D warnings` and `cargo test` before submitting.