diff --git a/README.md b/README.md
index e1f2e05..9d23da4 100644
--- a/README.md
+++ b/README.md
@@ -1,15 +1,26 @@
-# semble
-Instant local code search for agents.
+
+ 
+ Fast Code Search for Agents
+
-## Install
+
+
+[Quickstart](#quickstart) •
+[Main Features](#main-features) •
+[MCP Server](#mcp-server) •
+[Benchmarks](#benchmarks)
+
+
+
+Semble is a fast code search library for local and remote repositories. It combines static [Model2Vec](https://github.com/MinishLab/model2vec) embeddings from [potion-code-16M](https://huggingface.co/minishlab/potion-code-16M) with [BM25](https://github.com/xhluca/bm25s) and a specialized hybrid reranking stack to deliver near-transformer accuracy at a fraction of the cost. As an [MCP server](#mcp-server), it gives agents (Claude Code, Cursor, Codex, OpenCode, etc.) instant access to any codebase: repos are cloned and indexed on demand.
+
+## Quickstart
```bash
pip install semble
```
-## Python API
-
```python
from semble import SembleIndex
@@ -19,19 +30,29 @@ index = SembleIndex.from_path("./my-project")
# Index a remote git repository
index = SembleIndex.from_git("https://github.com/MinishLab/model2vec")
-results = index.search("load a pretrained model", top_k=3)
-for result in results:
- print(result)
+results = index.search("save model to disk", top_k=3)
+
+# Each result exposes the matched chunk
+result = results[0]
+result.chunk.file_path # "model2vec/model.py"
+result.chunk.start_line # 127
+result.chunk.end_line # 150
+result.chunk.content # "def save_pretrained(self, path: PathLike, ..."
```
-## MCP server
+## Main Features
-Semble can run as an MCP server so agents (Claude Code, Cursor, etc.) can search any codebase directly.
+- **Fast**: indexes a repo in ~250 ms and answers queries in ~1.5 ms, all on CPU.
+- **Accurate**: NDCG@10 of 0.854 on our benchmarks, on par with code-specialized transformer models, at a fraction of the size and cost.
+- **Local and remote**: pass a local path or a git URL.
+- **MCP server**: drop-in tool for Claude Code, Cursor, Codex, OpenCode, and any other MCP-compatible agent.
+- **Lightweight**: CPU-only, minimal dependencies.
-The agent will clone and index repos on demand as you ask questions. Indexes are cached for the lifetime of the session.
+## MCP Server
+Semble can run as an MCP server so agents can search any codebase directly. Repos are cloned and indexed on demand, and indexes are cached for the lifetime of the session.
-### Installation
+### Setup
#### Claude Code
```bash
@@ -59,7 +80,43 @@ Add to `~/.opencode/config.json`:
}
```
+### Tools
+
| Tool | Description |
|------|-------------|
| `search` | Search a codebase with a natural-language or code query. Pass `repo` as a git URL or local path. |
| `find_related` | Given a file path and line number, return chunks semantically similar to the code at that location. |
+
+## Benchmarks
+
+Quality and speed across all methods on ~1,250 queries over 63 repositories in 19 languages.
+
+| Method | NDCG@10 | Index time | Query p50 |
+|--------|--------:|-----------:|----------:|
+| ripgrep | 0.126 | — | 12 ms |
+| ColGREP | 0.693 | 5.8 s | 124 ms |
+| CodeRankEmbed | 0.765 | 57 s | 16 ms |
+| semble | 0.854 | **263 ms** | **1.5 ms** |
+| CodeRankEmbed Hybrid | **0.862** | 57 s | 16 ms |
+
+The 137M-parameter CodeRankEmbed Hybrid leads on NDCG@10 by 0.008 (0.862 vs. 0.854), but Semble indexes 218x faster and answers queries 11x faster than it does. See [benchmarks](benchmarks/README.md) for per-language results, ablations, and methodology.
+
+## License
+
+MIT
+
+## Citing
+
+If you use Semble in your research, please cite the following:
+
+```bibtex
+@software{minishlab2026semble,
+ author = {{van Dongen}, Thomas and Tulkens, St{\'e}phan},
+ title = {Semble: Fast Code Search for Agents},
+ year = {2026},
+ publisher = {Zenodo},
+ doi = {10.5281/zenodo.XXXXXXX},
+ url = {https://github.com/MinishLab/semble},
+ license = {MIT}
+}
+```
diff --git a/assets/images/semble_logo.png b/assets/images/semble_logo.png
new file mode 100644
index 0000000..92ade0e
Binary files /dev/null and b/assets/images/semble_logo.png differ
diff --git a/pyproject.toml b/pyproject.toml
index 22c5ec3..620610f 100644
--- a/pyproject.toml
+++ b/pyproject.toml
@@ -4,7 +4,7 @@ build-backend = "setuptools.build_meta"
[project]
name = "semble"
-description = "Instant local code search for agents."
+description = "Fast Code Search for Agents"
authors = [{name = "Thomas van Dongen", email = "thomas123@live.nl"}, { name = "Stéphan Tulkens", email = "stephantul@gmail.com"}]
readme = { file = "README.md", content-type = "text/markdown" }
dynamic = ["version"]
diff --git a/uv.lock b/uv.lock
index 9ec6f74..40218a0 100644
--- a/uv.lock
+++ b/uv.lock
@@ -3084,7 +3084,7 @@ mcp = [
[package.metadata]
requires-dist = [
{ name = "bm25s", specifier = ">=0.2.0" },
- { name = "chonkie", extras = ["code"], specifier = ">=1.6.2" },
+ { name = "chonkie", extras = ["code"], specifier = "==1.6.2" },
{ name = "einops", marker = "extra == 'benchmark'", specifier = ">=0.8.2" },
{ name = "matplotlib", marker = "extra == 'benchmark'", specifier = ">=3.7" },
{ name = "mcp", marker = "extra == 'mcp'", specifier = ">=1.0" },