Indexed grep for codebases. Build a trigram index once, then search 2 to 3x faster than ripgrep on literal queries.

    sift index build .    # one-time index
    sift "PM_RESUME"      # instant results

Install:

    curl -fsSL https://raw.githubusercontent.com/botirk38/sift/master/scripts/install.sh | sh

Or from source: `cargo build --release -p sift-grep`

Upgrade: `sift update`

Install the agent skill so your agent knows how to use sift:

    npx skills add botirk38/sift

This works with Claude Code, Cursor, Codex, Devin, and other agents that support the SKILL.md format. The skill teaches the agent when and how to build indexes, run queries, and interpret results.
See skills/ for details.
- Build: extract overlapping 3-byte trigrams from every file, persist as memory-mapped tables.
- Plan: decompose the regex into trigram terms, intersect posting lists to produce a candidate set.
- Search: scan only candidate files with the full regex engine.
Queries with index hits skip most of the corpus. Queries that yield no usable trigrams (e.g. `\p{Greek}`) fall back to a full scan, which runs roughly on par with ripgrep.
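
The three steps above can be illustrated with a small in-memory sketch. This is not sift's implementation: it assumes literal queries only, keeps posting lists in a `HashMap` rather than memory-mapped tables, and confirms candidates with a substring check where sift runs the full regex engine. The names `trigrams`, `build_index`, `candidates`, and `search` are hypothetical.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical file identifier: the position of the file in the corpus slice.
type FileId = usize;
/// Hypothetical in-memory index: trigram -> sorted set of files containing it.
type TrigramIndex = HashMap<[u8; 3], BTreeSet<FileId>>;

/// Collect every overlapping 3-byte window of `bytes`.
/// Inputs shorter than 3 bytes produce an empty set.
fn trigrams(bytes: &[u8]) -> BTreeSet<[u8; 3]> {
    bytes.windows(3).map(|w| [w[0], w[1], w[2]]).collect()
}

/// Build: record, for each trigram, which files contain it.
fn build_index(files: &[&str]) -> TrigramIndex {
    let mut index = TrigramIndex::new();
    for (id, text) in files.iter().enumerate() {
        for t in trigrams(text.as_bytes()) {
            index.entry(t).or_default().insert(id);
        }
    }
    index
}

/// Plan: intersect the posting lists of every trigram in the literal.
/// Returns `None` when the literal has no trigram, so the caller must scan everything.
fn candidates(index: &TrigramIndex, literal: &str) -> Option<BTreeSet<FileId>> {
    let mut result: Option<BTreeSet<FileId>> = None;
    for t in trigrams(literal.as_bytes()) {
        // A trigram missing from the index means no file can match.
        let posting = index.get(&t).cloned().unwrap_or_default();
        result = Some(match result {
            None => posting,
            Some(acc) => acc.intersection(&posting).copied().collect(),
        });
    }
    result
}

/// Search: verify each candidate file (substring check stands in for the regex scan).
fn search(files: &[&str], index: &TrigramIndex, literal: &str) -> Vec<FileId> {
    let ids: Vec<FileId> = match candidates(index, literal) {
        Some(set) => set.into_iter().collect(),
        None => (0..files.len()).collect(), // full-scan fallback
    };
    ids.into_iter()
        .filter(|&id| files[id].contains(literal))
        .collect()
}
```

The verification pass in `search` matters because the intersection is only a filter: a file can contain every trigram of the query without containing the query itself, so candidates are always confirmed by a real match.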
Linux kernel source tree, 79K files, 1.3 GB. End-to-end CLI wall-clock (includes startup, index open, daemon).
| Query type | rg | sift | Speedup |
|---|---|---|---|
| Literal | 1.17s | 0.42s | 2.8x |
| Word match | 1.15s | 0.41s | 2.8x |
| Regex with literal | 1.17s | 0.44s | 2.7x |
| Alternation | 0.88s | 0.48s | 1.8x |
| Unicode (full scan) | 2.05s | 2.40s | 0.9x |
| No-literal (full scan) | 2.31s | 2.42s | 1.0x |
11/11 benchmarks produce identical line counts. See benchsuite/.
The search engine itself runs in ~18 ms for indexed literals (`--stats`). Wall-clock is dominated by process startup and daemon coordination.
Sift is built around composable on-disk indexes. The Indexes registry opens all configured indexes under a .sift directory and intersects their candidate sets at query time.

    pattern --> Planner --> [ngram-3]  [Future Index B]  [Future Index C]
                                 \            |            /
                                      intersect / union
                                              |
                                      candidate set --> regex scan

Today the default configured index is ngram-3, a runtime-width N-gram index. Adding a new index family means adding its configured identity and its opened runtime dispatch; the planner, search engine, and CLI flow stay unchanged.
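
One way to picture the composition described here is a registry that asks each opened index for a candidate set and intersects the answers. The sketch below is an assumption about the shape of such a design, not sift's actual API: the `Index` trait, the `Candidates` enum, `Registry`, and the toy `StaticIndex` are all hypothetical names, and only intersection (not union) is shown.

```rust
use std::collections::BTreeSet;

/// Hypothetical file identifier.
type FileId = usize;

/// What a single index can say about a query (hypothetical type).
#[derive(Debug, PartialEq)]
enum Candidates {
    /// The index cannot narrow this query; every file stays a candidate.
    All,
    /// Only these files can possibly match.
    Only(BTreeSet<FileId>),
}

/// Hypothetical interface that each index family would implement.
trait Index {
    fn name(&self) -> &str;
    fn candidates(&self, pattern: &str) -> Candidates;
}

/// Hypothetical registry holding every opened index.
struct Registry {
    indexes: Vec<Box<dyn Index>>,
}

impl Registry {
    /// Intersect the candidate sets reported by all indexes.
    /// An index answering `All` leaves the running set untouched.
    fn candidates(&self, pattern: &str) -> Candidates {
        let mut acc = Candidates::All;
        for index in &self.indexes {
            acc = match (acc, index.candidates(pattern)) {
                (Candidates::All, other) => other,
                (kept, Candidates::All) => kept,
                (Candidates::Only(a), Candidates::Only(b)) => {
                    Candidates::Only(a.intersection(&b).copied().collect())
                }
            };
        }
        acc
    }
}

/// Toy index used only to exercise the registry: it ignores the pattern
/// and always reports the same answer.
struct StaticIndex {
    label: &'static str,
    files: Option<Vec<FileId>>,
}

impl Index for StaticIndex {
    fn name(&self) -> &str {
        self.label
    }

    fn candidates(&self, _pattern: &str) -> Candidates {
        match &self.files {
            None => Candidates::All,
            Some(ids) => Candidates::Only(ids.iter().copied().collect()),
        }
    }
}
```

Under this shape, a new index family is one more `impl Index` pushed into the registry; the code that consumes the combined candidate set does not need to know which indexes contributed.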
| Path | Role |
|---|---|
| `crates/core/` | Index registry, query planner, search engine |
| `crates/cli/` | `sift` binary (ripgrep-compatible flags) |
| `benchsuite/` | Comparative rg vs sift benchmarks + chart generation |
| `fuzz/` | Cargo-fuzz targets |
- Requires `sift index build` before searching (async via daemon by default, `--wait` for blocking).
- Search paths must sit under the indexed corpus root.
- `SIFT_NO_DAEMON=1` disables background indexing.
See docs/rg-compat-matrix.md for the full flag matrix.

    cargo fmt --all -- --check
    cargo clippy --workspace --all-targets --all-features -- -D warnings
    cargo test --workspace --all-features

Dual-licensed under MIT or Apache 2.0, at your option.

