Releases: johunsang/semble_rs

v0.9.1 — deps/impact --tree (ASCII dependency tree)

15 May 23:14

deps --tree and impact --tree

deps and impact each gain a --tree flag that renders dependency relationships as an ASCII tree, with no Graphviz required.

semble_rs deps   src/lib.rs --tree              # what this file imports (transitive)
semble_rs deps   src/lib.rs --tree --max-depth 2
semble_rs impact src/types.rs --tree            # who depends on this file (reverse)

Example output:

src/lib.rs
├── src/encoder.rs
├── src/search.rs
│   ├── src/bm25.rs
│   ├── src/encoder.rs  (cycle)
│   └── src/tokens.rs
├── src/index/mod.rs
│   └── src/index/create.rs
│       └── src/chunking.rs  …
└── src/digest.rs
  • Cycle detection. Repeated nodes in the same path are marked (cycle) instead of expanding.
  • Depth cap. --max-depth N truncates with … when more children exist.
  • Existing --json / --dot / human output paths are untouched.
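
The cycle marking and depth cap described above can be sketched as a small recursive renderer. This is a minimal illustration, not the semble_rs implementation: the `Graph` alias, `render_tree`, and `walk` are hypothetical names, and the sketch assumes "cycle" means a node that already appears among its own ancestors on the current path.

```rust
use std::collections::BTreeMap;

/// Hypothetical adjacency map: file -> files it imports.
type Graph = BTreeMap<String, Vec<String>>;

/// Render `root` and its transitive dependencies as an ASCII tree.
fn render_tree(graph: &Graph, root: &str, max_depth: usize) -> String {
    let mut out = String::new();
    out.push_str(root);
    out.push('\n');
    let mut path = vec![root.to_string()];
    let mut prefix = String::new();
    walk(graph, root, 1, max_depth, &mut path, &mut prefix, &mut out);
    out
}

fn walk(
    graph: &Graph,
    node: &str,
    depth: usize,
    max_depth: usize,
    path: &mut Vec<String>,
    prefix: &mut String,
    out: &mut String,
) {
    let children = match graph.get(node) {
        Some(c) => c,
        None => return,
    };
    for (i, child) in children.iter().enumerate() {
        let last = i + 1 == children.len();
        out.push_str(prefix.as_str());
        out.push_str(if last { "└── " } else { "├── " });
        out.push_str(child);
        // Cycle detection: the child is already an ancestor on this path.
        if path.iter().any(|p| p == child) {
            out.push_str("  (cycle)\n");
            continue;
        }
        let has_children = graph.get(child).map_or(false, |c| !c.is_empty());
        // Depth cap: stop expanding and mark that more children exist.
        if depth >= max_depth {
            if has_children {
                out.push_str("  …");
            }
            out.push('\n');
            continue;
        }
        out.push('\n');
        let push = if last { "    " } else { "│   " };
        prefix.push_str(push);
        path.push(child.clone());
        walk(graph, child, depth + 1, max_depth, path, prefix, out);
        path.pop();
        // Remove exactly the bytes appended above (byte-safe for "│").
        prefix.truncate(prefix.len() - push.len());
    }
}
```

Tracking the current path rather than a global visited set means a file reachable through two different branches is still expanded in both; only true back-edges are cut off.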

Bug fix — multibyte indentation in tree

A box-drawing glyph such as │ is 3 bytes in UTF-8. The previous loop called prefix.pop() once per byte of the pushed segment, but pop() removes a whole character, so it removed more characters than had been added and garbled indentation past depth ~3 on semble_rs tree --symbols. Switched to prefix.truncate(prefix.len() - push.len()). Both tree and the new dep-tree renderer are now byte-safe.
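
To make the difference concrete, here is a small sketch contrasting the two approaches. The function names are hypothetical and `pop_per_byte` is an assumed reconstruction of the old loop based on the description above, not the original code.

```rust
/// Assumed reconstruction of the old behaviour: one `pop()` per *byte* of
/// the pushed segment. `String::pop` removes a whole char, so a segment
/// containing a 3-byte glyph removes too many chars.
fn pop_per_byte(prefix: &mut String, push: &str) {
    for _ in 0..push.len() {
        prefix.pop();
    }
}

/// The fix named in the release note: drop exactly the bytes that were
/// appended. Safe because `push` was appended whole, so the cut lands on a
/// char boundary.
fn truncate_by_len(prefix: &mut String, push: &str) {
    prefix.truncate(prefix.len() - push.len());
}
```

With a segment of "│" plus three spaces (4 characters, 6 bytes), the per-byte loop removes 6 characters instead of 4, so each level of unwinding eats into the parent's indentation.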

Notes

  • Additive release; no flag removals.
  • 19/19 sanity checks pass on this repo and a 6,693-file Python codebase.

v0.9.0 — tree command, model2vec-rs integration, README rewrite

15 May 22:44

tree — codebase map without the ls -R token explosion

New semble_rs tree command prints a gitignore-aware codebase tree. Measured on real repos:

Project                       tree       ls -R     Reduction
this repo (Rust + target/)    533 B      398 KB    747×
6,693-file Python backend     3,950 B    254 KB    64×

Options: -d, --max-depth N, --symbols (top-level fn / struct / class per file), --lang rust,python.

encode — embedding model as a CLI

semble_rs encode "<text>" returns the Model2Vec embedding as JSON. Useful for scripting, debugging, and external pipelines.

semble_rs encode "search result scoring"
echo -e "auth\nlogin\ntoken" | semble_rs encode
semble_rs encode "x" --model minishlab/potion-multilingual-128M

model2vec-rs integration

The hand-rolled safetensors loader is replaced with model2vec-rs's StaticModel. SIF token weights are now applied (previously silently ignored). Drops 4 direct dependencies (tokenizers, safetensors, hf-hub, half) — encoder code shrinks from ~140 lines to ~60.

--model option on search / find-related / plan

semble_rs search "auth flow" . --model minishlab/potion-multilingual-128M

Priority: --model > SEMBLE_MODEL_PATH env > default (minishlab/potion-code-16M).
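
The priority order can be expressed as a simple fallback chain. This is a sketch only: `resolve_model` and `DEFAULT_MODEL` are hypothetical names, and how semble_rs treats an empty value is not stated in these notes, so the sketch does not special-case it.

```rust
/// Default model named in the release notes.
const DEFAULT_MODEL: &str = "minishlab/potion-code-16M";

/// Pick the model: CLI flag first, then the environment value, then the
/// default. Both inputs are passed in so the function stays pure.
fn resolve_model(cli_model: Option<&str>, env_model: Option<&str>) -> String {
    cli_model.or(env_model).unwrap_or(DEFAULT_MODEL).to_string()
}
```

A caller could pass `std::env::var("SEMBLE_MODEL_PATH").ok().as_deref()` as the second argument; keeping the environment lookup outside the function makes the precedence easy to test.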

README rewrite

README.md and README.ko.md rewritten in upstream MinishLab/semble style: tagline + nav + feature sections + collapsible ranking signals + benchmark tables + acknowledgements.

Notes

  • All commands keep their existing flags; this release is additive.
  • Default model unchanged (minishlab/potion-code-16M).
  • 100-query self-benchmark unchanged.

v0.8.0 — Ruby / PHP / Swift dependency graph

14 May 17:56

deps and impact now work for Ruby (require_relative + class/module/method),
PHP (namespace + use + require/include), and Swift (module imports +
struct/class/protocol/extension/enum).

Languages with full AST chunking + dependency graph:
Rust, Python, JavaScript, TypeScript, Go, Java, C, C++, Kotlin,
Ruby, PHP, Swift.

89 tests pass; 50-query self-benchmark unchanged.

v0.7.0 — Ruby, PHP, Swift AST chunking

14 May 17:46

Ruby / PHP / Swift now get proper AST-aligned chunks (function /
class / module boundaries) instead of line-based fallback. 86 tests
pass; 50-query self-benchmark unchanged.

Language support as of this release:

  • AST chunking + dependency graph: Rust, Python, JavaScript, TypeScript, Go, Java, C, C++, Kotlin
  • AST chunking only (new in this release): Ruby, PHP, Swift

v0.6.0 — DOT graph viz + ast-grep wrapper + HTML/CSS indexing

14 May 17:35

  • deps --dot / impact --dot: Graphviz DOT output (pipe into dot -Tpng > graph.png)
  • find-pattern "fn $name($$$)" --lang rust: thin wrapper around ast-grep for structural patterns
  • HTML / CSS / SCSS / Vue / Svelte are now indexed (line-based fallback)
  • README: "How does semble_rs compare to ripgrep / ast-grep / IDE built-in" comparison section
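
For readers unfamiliar with DOT, the sketch below shows how a list of dependency edges maps onto Graphviz syntax. It is illustrative only: `to_dot`, `quote`, and the graph name `deps` are hypothetical, and the exact text semble_rs emits may differ.

```rust
/// Quote a node id for DOT, escaping backslashes and double quotes.
fn quote(id: &str) -> String {
    format!("\"{}\"", id.replace('\\', "\\\\").replace('"', "\\\""))
}

/// Serialize (importer, imported) edges as a DOT digraph.
fn to_dot(edges: &[(&str, &str)]) -> String {
    let mut out = String::from("digraph deps {\n");
    for (from, to) in edges {
        out.push_str(&format!("  {} -> {};\n", quote(from), quote(to)));
    }
    out.push_str("}\n");
    out
}
```

File paths contain characters such as `/` and `.` that are not valid in bare DOT identifiers, which is why every node id is quoted.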

50-query self-benchmark unchanged. 83 tests passed.

v0.5.0 — plan + language coverage + minified filter

14 May 15:21

  • New plan subcommand: token-efficient exploration flow recommender.
    Runs a small search, ranks likely files, prints the next outline /
    compact / deps / impact commands. Reports confidence (high/medium/
    low) so agents treat low-confidence candidates as leads, not facts.
  • file_walker: bash, csharp, dart, elixir, haskell, json, lua,
    markdown, scala, sql, swift, and others now discovered.
  • Skip *.min.js / *.bundle.js / *.min.css / -min.js by filename
    pattern (cuts vendor noise in JS repos).
  • README slimmed; Korean split into README.ko.md; measured
    performance table added.
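
The filename-based skip rule for minified and bundled assets can be sketched as a suffix check. `is_minified` and `SUFFIXES` are hypothetical names; the sketch assumes a case-sensitive match on the final path component, which the notes do not specify.

```rust
use std::path::Path;

/// Suffixes listed in the release note.
const SUFFIXES: [&str; 4] = [".min.js", ".bundle.js", ".min.css", "-min.js"];

/// Return true when the file name (not the directory part) ends with one
/// of the minified/bundled suffixes.
fn is_minified(path: &str) -> bool {
    let name = Path::new(path)
        .file_name()
        .and_then(|n| n.to_str())
        .unwrap_or("");
    SUFFIXES.iter().any(|s| name.ends_with(*s))
}
```

Matching on the name alone avoids reading file contents, so the filter costs almost nothing during the directory walk.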

50-query self-benchmark unchanged. 83 tests passed.

v0.4.0 — digest subcommand (build/test/CI output compression)

14 May 11:10

A new subcommand that targets the second-largest agent-token sink
(after raw file reads): build, test, and CI output. Output from cargo,
pnpm/npm/yarn/bun, tsc, pytest, go test, gradle, ruff, mypy,
clang/cmake/make/swift, and GitHub Actions is auto-detected and compressed,
while every error, traceback, and failure context is preserved.

Highlights (measured on 15 real-world fixtures):

  • cargo build: -99.2%
  • pnpm install: -73.6%
  • GitHub Actions log (3.3 MB): -98.9%
  • TOTAL across 15 fixtures: -98.7%

Usage:
cargo build 2>&1 | semble_rs digest
gh run view --log-failed | semble_rs digest

Tests: 75 passed (30 new digest tests). 50-query search eval unchanged.

v0.3.0 — Kotlin support + experimental model swap

14 May 09:24

  • Kotlin first-class support (AST chunking + dependency graph), plus
    dotted-import false-positive fix affecting Kotlin/Java/Python
    (PR #1 by @clroot)
  • New SEMBLE_MODEL_PATH env var: point the encoder at a local model2vec
    output directory to swap in any distilled embedding without recompiling
  • README: 50-query ground-truth eval published for the default model
    (R@1 70%, R@5 98%, R@10 100%, Korean R@5 80%) + guidance on which
    CoIR teachers to consider for further distillation
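
For reference, the R@k figures above are recall-at-k scores. The sketch below shows one common way to compute them, assuming each query has exactly one ground-truth file; that assumption, and the name `recall_at_k`, are illustrative and not taken from the published eval.

```rust
/// Fraction of queries whose expected file appears in the top `k` results.
/// `ranked[i]` holds the ranked result paths for query `i`, and
/// `expected[i]` is that query's single ground-truth path.
fn recall_at_k(ranked: &[Vec<&str>], expected: &[&str], k: usize) -> f64 {
    if expected.is_empty() {
        return 0.0;
    }
    let mut hits = 0usize;
    for (results, want) in ranked.iter().zip(expected.iter()) {
        if results.iter().take(k).any(|got| got == want) {
            hits += 1;
        }
    }
    hits as f64 / expected.len() as f64
}
```

Under this definition, R@5 98% on 50 queries would mean the expected file was in the top five results for 49 of them.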

v0.2.0 — --outline and --group modes

14 May 07:14

New output modes for further agent-token reduction beyond --compact:

  • --outline: one signature line per chunk (100% well-formed, no truncation)
  • --group: directory grouping + match lines capped at 3 per chunk

Measured on 33 queries against the semble_rs repo:
--outline: -47.1% vs --compact
--group: -46.7% vs --compact

Recommended workflow:

  1. --outline for first-pass structural scan
  2. --compact for matching-line context
  3. --json --strip when chunk bodies are required