diff --git a/README.md b/README.md
index 28dfd5b..2e0c25e 100644
--- a/README.md
+++ b/README.md
@@ -11,38 +11,16 @@ pip install semble
 ## Python API
 
 ```python
-from semble import SearchMode, SembleIndex
+from semble import SembleIndex
 
 index = SembleIndex.from_path("./my-project")
 
-# Hybrid search (semantic + BM25, default)
 results = index.search("how does authentication work?", top_k=5)
-for r in results:
-    print(r.chunk.location, f"score={r.score:.3f}")
-    print(r.chunk.content[:200])
-
-# Keyword-only
-results = index.search("JWT token", mode=SearchMode.BM25)
-```
-
-## Search modes
-
-| Mode | Description |
-|------|-------------|
-| `hybrid` | Semantic + BM25, normalized and combined (default) |
-| `semantic` | Embedding similarity only |
-| `bm25` | Keyword search only |
-
-## Disk embedding cache
-
-Embeddings are cached to `~/.cache/semble` by default so re-indexing unchanged files is instant. When using a custom encoder, pass `model_name` to enable caching:
-
-```python
-index = SembleIndex.from_path("./my-project", model=my_model, model_name="my-org/my-model")
+for result in results:
+    print(result.chunk.location, f"score={result.score:.3f}")
+    print(result.chunk.content[:200])
 ```
 
-Only embeddings are cached; BM25 and the ANNS index are always rebuilt fresh.
-
 ## MCP server
 
 Semble can run as an MCP server so agents (Claude Code, Cursor, etc.) can search your codebase directly.
@@ -63,5 +41,5 @@ This indexes the directory at startup and exposes two tools:
 
 | Tool | Description |
 |------|-------------|
-| `search` | Search with a natural-language or code query. Supports `hybrid` (default), `semantic`, and `bm25` modes. |
+| `search` | Search your codebase with a natural-language or code query. |
 | `find_related` | Given a file path and line number, return chunks semantically similar to the code at that location. |