Releases · johunsang/semble_rs

15 May 23:14

johunsang

v0.9.1

0fbee16

v0.9.1 — deps/impact --tree (ASCII dependency tree) Latest

`deps --tree` and `impact --tree`

The new --tree flag on deps and impact renders dependency relationships as an ASCII tree, with no Graphviz required.

semble_rs deps   src/lib.rs --tree              # what this file imports (transitive)
semble_rs deps   src/lib.rs --tree --max-depth 2
semble_rs impact src/types.rs --tree            # who depends on this file (reverse)

Example output:

src/lib.rs
├── src/encoder.rs
├── src/search.rs
│   ├── src/bm25.rs
│   ├── src/encoder.rs  (cycle)
│   └── src/tokens.rs
├── src/index/mod.rs
│   └── src/index/create.rs
│       └── src/chunking.rs  …
└── src/digest.rs

Cycle detection. Repeated nodes in the same path are marked (cycle) instead of expanding.
Depth cap. --max-depth N truncates with … when more children exist.
Existing --json / --dot / human output paths are untouched.
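
The cycle marking and depth cap described above can be sketched as a small recursive renderer. This is a minimal illustration, not the code in semble_rs: the `Graph` type, `sample_graph`, `render`, and `walk` are hypothetical names, the graph is made up, and the sketch assumes that `max_depth` counts levels below the root.

```rust
use std::collections::BTreeMap;

/// Hypothetical adjacency list: file -> files it imports.
type Graph<'a> = BTreeMap<&'a str, Vec<&'a str>>;

/// Made-up graph: a imports b and c, b imports a (a cycle),
/// c imports d, d imports e.
fn sample_graph() -> Graph<'static> {
    let mut g: Graph<'static> = BTreeMap::new();
    g.insert("a", vec!["b", "c"]);
    g.insert("b", vec!["a"]);
    g.insert("c", vec!["d"]);
    g.insert("d", vec!["e"]);
    g
}

/// Render `root` and its transitive children as an ASCII tree.
fn render<'a>(graph: &Graph<'a>, root: &'a str, max_depth: usize) -> String {
    let mut out = format!("{root}\n");
    let mut path = vec![root]; // nodes on the current root-to-node path
    let mut prefix = String::new();
    walk(graph, root, 1, max_depth, &mut path, &mut prefix, &mut out);
    out
}

fn walk<'a>(
    graph: &Graph<'a>,
    node: &str,
    depth: usize,
    max_depth: usize,
    path: &mut Vec<&'a str>,
    prefix: &mut String,
    out: &mut String,
) {
    let children = match graph.get(node) {
        Some(c) => c,
        None => return,
    };
    for (i, &child) in children.iter().enumerate() {
        let last = i + 1 == children.len();
        out.push_str(prefix.as_str());
        out.push_str(if last { "└── " } else { "├── " });
        out.push_str(child);
        // A node already on the current path is marked, not expanded.
        if path.contains(&child) {
            out.push_str("  (cycle)\n");
            continue;
        }
        let has_children = graph.get(child).map_or(false, |c| !c.is_empty());
        // At the depth cap, signal hidden children with an ellipsis.
        if depth >= max_depth {
            if has_children {
                out.push_str("  …");
            }
            out.push('\n');
            continue;
        }
        out.push('\n');
        let seg = if last { "    " } else { "│   " };
        prefix.push_str(seg);
        path.push(child);
        walk(graph, child, depth + 1, max_depth, path, prefix, out);
        path.pop();
        // Remove exactly the bytes that were pushed for this level.
        let new_len = prefix.len() - seg.len();
        prefix.truncate(new_len);
    }
}

fn main() {
    print!("{}", render(&sample_graph(), "a", 2));
}
```

Tracking only the current path (rather than a global visited set) means a file reachable through two different branches is printed under both, and only true ancestors get the (cycle) marker.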

Bug fix — multibyte indentation in `tree`

The │ glyph is 3 bytes in UTF-8. The previous loop called prefix.pop() once per byte of the pushed segment, but pop() removes a whole character, so it removed more characters than had been added and garbled indentation past depth ~3 on semble_rs tree --symbols. The fix is prefix.truncate(prefix.len() - push.len()). Both tree and the new dep-tree renderer are now byte-safe.
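
The difference between the two approaches can be reproduced in isolation. The sketch below is an illustration under assumptions, not the project's code: `pop_per_byte` and `truncate_by_len` are hypothetical helper names, and the indentation segment is assumed to be "│" followed by three spaces.

```rust
/// Buggy variant: loops once per *byte* of the pushed segment,
/// but `String::pop` removes one whole `char` each time.
fn pop_per_byte(prefix: &mut String, pushed: &str) {
    for _ in 0..pushed.len() {
        prefix.pop();
    }
}

/// Fixed variant: cut off exactly the bytes that were pushed.
fn truncate_by_len(prefix: &mut String, pushed: &str) {
    let new_len = prefix.len() - pushed.len();
    prefix.truncate(new_len);
}

fn main() {
    // "│" is 3 bytes, so this segment is 4 chars but 6 bytes.
    let seg = "│   ";
    let base = "│   │   ";

    let mut buggy = String::from(base);
    buggy.push_str(seg);
    pop_per_byte(&mut buggy, seg); // pops 6 chars, although only 4 were pushed

    let mut fixed = String::from(base);
    fixed.push_str(seg);
    truncate_by_len(&mut fixed, seg);

    println!("buggy: {:?}", buggy);
    println!("fixed: {:?}", fixed);
}
```

Because the truncation length is computed from the same segment that was pushed, the cut always lands on a character boundary, which `String::truncate` requires.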

Notes

Additive release; no flag removals.
19/19 sanity checks pass on this repo and a 6,693-file Python codebase.


15 May 22:44

johunsang

v0.9.0

fba1700

v0.9.0 — tree command, model2vec-rs integration, README rewrite

`tree` — codebase map without the `ls -R` token explosion

The new semble_rs tree command prints a gitignore-aware codebase tree. Output sizes measured on real repos:

| Project | `tree` | `ls -R` | Reduction |
| --- | --- | --- | --- |
| this repo (Rust + `target/`) | 533 B | 398 KB | 747× |
| 6,693-file Python backend | 3,950 B | 254 KB | 64× |
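
The Reduction column is the `ls -R` size divided by the `tree` size. A quick check, assuming "KB" here means 1,000 bytes (that reading reproduces the 747× figure; 1,024-byte units would give roughly 765×). The function name `reduction_ratio` is hypothetical.

```rust
/// Size ratio between a baseline output and a compact output,
/// rounded to the nearest whole number.
fn reduction_ratio(baseline_bytes: u64, compact_bytes: u64) -> u64 {
    (baseline_bytes as f64 / compact_bytes as f64).round() as u64
}

fn main() {
    // Assumption: 1 KB = 1,000 bytes.
    println!("this repo: {}x", reduction_ratio(398_000, 533));
    println!("Python backend: {}x", reduction_ratio(254_000, 3_950));
}
```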

Options: -d, --max-depth N, --symbols (top-level fn / struct / class per file), --lang rust,python.

`encode` — embedding model as a CLI

semble_rs encode "<text>" returns the Model2Vec embedding as JSON. Useful for scripting, debugging, and external pipelines.

semble_rs encode "search result scoring"
echo -e "auth\nlogin\ntoken" | semble_rs encode
semble_rs encode "x" --model minishlab/potion-multilingual-128M

`model2vec-rs` integration

The hand-rolled safetensors loader is replaced with model2vec-rs's StaticModel. SIF token weights are now applied (previously silently ignored). Drops 4 direct dependencies (tokenizers, safetensors, hf-hub, half) — encoder code shrinks from ~140 lines to ~60.

`--model` option on `search` / `find-related` / `plan`

semble_rs search "auth flow" . --model minishlab/potion-multilingual-128M

Priority: --model > SEMBLE_MODEL_PATH env > default (minishlab/potion-code-16M).
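
The precedence above amounts to a two-step fallback. A minimal sketch, assuming the flag and environment values arrive as optional strings; `resolve_model` is a hypothetical name and this is not necessarily how semble_rs implements it (for instance, it does not treat an empty string as unset).

```rust
const DEFAULT_MODEL: &str = "minishlab/potion-code-16M";

/// Pick the model: CLI flag first, then the environment value,
/// then the built-in default.
fn resolve_model(cli_flag: Option<&str>, env_value: Option<&str>) -> String {
    cli_flag.or(env_value).unwrap_or(DEFAULT_MODEL).to_string()
}

fn main() {
    // Read the environment variable named in the release notes.
    let env_value = std::env::var("SEMBLE_MODEL_PATH").ok();
    // No --model flag is passed in this example.
    let model = resolve_model(None, env_value.as_deref());
    println!("{model}");
}
```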

README rewrite

README.md and README.ko.md rewritten in upstream MinishLab/semble style: tagline + nav + feature sections + collapsible ranking signals + benchmark tables + acknowledgements.

Notes

All commands keep their existing flags; this release is additive.
Default model unchanged (minishlab/potion-code-16M).
100-query self-benchmark unchanged.


14 May 17:56

johunsang

v0.8.0

e15fedb

v0.8.0 — Ruby / PHP / Swift dependency graph

deps and impact now work for Ruby (require_relative + class/module/method), PHP (namespace + use + require/include), and Swift (module imports + struct/class/protocol/extension/enum).

Languages with full AST chunking + dependency graph: Rust, Python, JavaScript, TypeScript, Go, Java, C, C++, Kotlin, Ruby, PHP, Swift.

89 tests pass; 50-query self-benchmark unchanged.


14 May 17:46

johunsang

v0.7.0

f375cec

v0.7.0 — Ruby, PHP, Swift AST chunking

Ruby / PHP / Swift now get proper AST-aligned chunks (function / class / module boundaries) instead of the line-based fallback. 86 tests pass; 50-query self-benchmark unchanged.

Languages with full AST chunking + dependency graph:

Rust, Python, JavaScript, TypeScript, Go, Java, C, C++, Kotlin
Ruby, PHP, Swift (new in this release; AST chunking only)


14 May 17:35

johunsang

v0.6.0

12903cb

v0.6.0 — DOT graph viz + ast-grep wrapper + HTML/CSS indexing

deps --dot / impact --dot: Graphviz DOT output (pipe into dot -Tpng > graph.png)
find-pattern "fn $name($$$)" --lang rust: thin wrapper around ast-grep for structural patterns
HTML / CSS / SCSS / Vue / Svelte are now indexed (line-based fallback)
README: "How does semble_rs compare to ripgrep / ast-grep / IDE built-in" comparison section
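
For readers unfamiliar with DOT, the --dot idea can be sketched as turning dependency edges into a `digraph` that Graphviz can render. This is an illustration only: the exact output that semble_rs emits is not shown in these notes, `to_dot` is a hypothetical name, and the sketch does not escape quotes inside paths.

```rust
/// Emit a Graphviz DOT digraph from (importer, imported) edges.
fn to_dot(edges: &[(&str, &str)]) -> String {
    let mut out = String::from("digraph deps {\n");
    for (from, to) in edges {
        out.push_str(&format!("    \"{from}\" -> \"{to}\";\n"));
    }
    out.push_str("}\n");
    out
}

fn main() {
    let edges = [
        ("src/lib.rs", "src/search.rs"),
        ("src/search.rs", "src/bm25.rs"),
    ];
    print!("{}", to_dot(&edges));
}
```

Text in this shape can be piped into dot -Tpng in the same way as the command's own output.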

50-query self-benchmark unchanged. 83 tests passed.


14 May 15:21

johunsang

v0.5.0

d2ae951

v0.5.0 — plan + language coverage + minified filter

New plan subcommand: token-efficient exploration flow recommender. Runs a small search, ranks likely files, and prints the next outline / compact / deps / impact commands. Reports confidence (high/medium/low) so agents treat low-confidence candidates as leads, not facts.
file_walker: bash, csharp, dart, elixir, haskell, json, lua, markdown, scala, sql, swift, and others are now discovered.
Skip *.min.js / *.bundle.js / *.min.css / -min.js by filename pattern (cuts vendor noise in JS repos).
README slimmed; Korean split into README.ko.md; measured performance table added.
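
The minified-file filter in the list above is a filename suffix check. A minimal sketch, assuming a case-sensitive match on the four patterns exactly as listed; `is_minified` and `SKIP_SUFFIXES` are hypothetical names, not identifiers from the project.

```rust
/// Suffixes taken from the release notes; files ending in any of
/// these are treated as vendor/minified output and skipped.
const SKIP_SUFFIXES: [&str; 4] = [".min.js", ".bundle.js", ".min.css", "-min.js"];

/// Returns true when the file name matches one of the skip patterns.
fn is_minified(file_name: &str) -> bool {
    SKIP_SUFFIXES
        .iter()
        .any(|suffix| file_name.ends_with(*suffix))
}

fn main() {
    for name in ["jquery.min.js", "app.bundle.js", "main.js", "admin.js"] {
        println!("{name}: skip = {}", is_minified(name));
    }
}
```

Matching on the separator as well as the word (".min.js", "-min.js") keeps ordinary names such as admin.js from being skipped.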

50-query self-benchmark unchanged. 83 tests passed.


14 May 11:10

johunsang

v0.4.0

06f6d27

v0.4.0 — digest subcommand (build/test/CI output compression)

A new subcommand that targets the second-largest agent-token sink (after raw file reads): build, test, and CI output. Output from cargo, pnpm/npm/yarn/bun, tsc, pytest, go test, gradle, ruff, mypy, clang/cmake/make/swift, and GitHub Actions is auto-detected and compressed, while every error, traceback, and failure context is preserved.

Highlights (measured on 15 real-world fixtures):

cargo build: -99.2%
pnpm install: -73.6%
GitHub Actions log (3.3 MB): -98.9%
TOTAL across 15 fixtures: -98.7%

Usage:
cargo build 2>&1 | semble_rs digest
gh run view --log-failed | semble_rs digest

Tests: 75 passed (30 new digest tests). 50-query search eval unchanged.


14 May 09:24

johunsang

v0.3.0

238fd83

v0.3.0 — Kotlin support + experimental model swap

Kotlin first-class support (AST chunking + dependency graph), plus a fix for dotted-import false positives affecting Kotlin/Java/Python (PR #1 by @clroot)
New SEMBLE_MODEL_PATH env var: point the encoder at a local model2vec output directory to swap in any distilled embedding without recompiling
README: 50-query ground-truth eval published for the default model (R@1 70%, R@5 98%, R@10 100%, Korean R@5 80%) + guidance on which CoIR teachers to consider for further distillation
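
The R@k figures above are recall-at-k scores. As a reminder of what such a metric measures, here is a minimal sketch that assumes each query has exactly one ground-truth file; how the project's eval actually defines ground truth is not stated in these notes, and `recall_at_k` and the sample data are hypothetical.

```rust
/// Fraction of queries whose ground-truth item appears in the
/// top `k` ranked results. Each entry is (ranked results, truth).
fn recall_at_k(results: &[(Vec<&str>, &str)], k: usize) -> f64 {
    if results.is_empty() {
        return 0.0;
    }
    let hits = results
        .iter()
        .filter(|(ranked, truth)| ranked.iter().take(k).any(|r| r == truth))
        .count();
    hits as f64 / results.len() as f64
}

fn main() {
    // Made-up rankings for two queries.
    let data = vec![
        (vec!["a.rs", "b.rs", "c.rs"], "b.rs"),
        (vec!["x.rs", "y.rs"], "z.rs"),
    ];
    println!("R@1 = {:.0}%", recall_at_k(&data, 1) * 100.0);
    println!("R@2 = {:.0}%", recall_at_k(&data, 2) * 100.0);
}
```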

Contributors

clroot


14 May 07:14

johunsang

v0.2.0

7931b1d

v0.2.0 — --outline and --group modes

New output modes for further agent-token reduction beyond --compact:

--outline: one signature line per chunk (100% well-formed, no truncation)
--group: directory grouping + match lines capped at 3 per chunk

Measured on 33 queries against the semble_rs repo:
--outline: -47.1% vs --compact
--group: -46.7% vs --compact

Recommended workflow:

--outline for first-pass structural scan
--compact for matching-line context
--json --strip when chunk bodies are required

