v1.5.0
New
- Guard-aware normalization: in `:abstract` mode, all calls inside
  `when` guard clauses are abstracted so that functions differing only in
  guard predicates are detected as clones. Covers Kernel guards, Erlang
  BIFs, `defguard` macros, and library guards like `Integer.is_even/1`.
- Boolean operator canonicalization: `&&`/`||`/`!` are rewritten to
  `and`/`or`/`not` so the stylistic choice between short-circuit and
  keyword operators doesn’t prevent clone matching.
- Sigil `~w` expansion: `~w(foo bar)a` is expanded to `[:foo, :bar]`
  so sigil word lists match their literal equivalents.
- MinHash-accelerated fuzzy detection: large posting lists (>50
  entries) now use MinHash signatures for O(k) approximate Jaccard instead
  of O(|A|+|B|) exact set operations. Removes the hard posting-list cap,
  improving recall for large monorepos without sacrificing precision.
- HTML report syntax highlighting via Makeup: proper Elixir
  tokenization with dark/light theme support, replacing the regex-based
  highlighter.
- Configurable detection tuning: previously hardcoded constants are
  now available as config options and CLI flags:
  - `max_window_size` (`--max-window-size`, default: 4): max consecutive
    sibling functions combined into a single fingerprint for cross-module
    clone detection.
  - `mass_tolerance` (`--mass-tolerance`, default: 0.3): max relative size
    difference for Type-III comparison.
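
To illustrate the MinHash item above, here is a minimal sketch of how a fixed-length signature can stand in for exact set operations. It is not the library's implementation: the module name `MinHashSketch`, the signature length of 64, and the use of `:erlang.phash2/1` with a seed as the hash family are all assumptions made for illustration.

```elixir
defmodule MinHashSketch do
  # Hypothetical module for illustration only.
  # Assumption: 64 seeded hash functions; the real signature length is not stated.
  @num_hashes 64

  # Builds a signature: for each seeded hash function, keep the minimum
  # hash value seen across the set. Raises on an empty collection.
  def signature(items, k \\ @num_hashes) do
    for seed <- 1..k do
      items
      |> Enum.map(&:erlang.phash2({seed, &1}))
      |> Enum.min()
    end
  end

  # Estimates Jaccard similarity as the fraction of signature positions
  # that agree. Cost depends on k, not on the sizes of the original sets.
  def similarity(sig_a, sig_b) do
    matches =
      sig_a
      |> Enum.zip(sig_b)
      |> Enum.count(fn {a, b} -> a == b end)

    matches / length(sig_a)
  end

  # Exact Jaccard similarity, shown for comparison with the estimate.
  def exact_jaccard(a, b) do
    set_a = MapSet.new(a)
    set_b = MapSet.new(b)

    MapSet.size(MapSet.intersection(set_a, set_b)) /
      MapSet.size(MapSet.union(set_a, set_b))
  end
end
```

Signatures are computed once per posting list and then compared pairwise, which is where the saving over repeated exact intersections would come from. The estimate is approximate, so a real detector would presumably still confirm candidates above the threshold.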
Changed
- `ignored_attributes` default: derived from
  `Module.reserved_attributes/0` instead of a hardcoded list. Picks up
  5 previously missing attributes and stays current with future Elixir
  versions automatically.
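
For reference, the attribute names can be inspected directly. This snippet only shows what `Module.reserved_attributes/0` returns (available since Elixir 1.12); whether the library uses every key or filters the list further is not stated here, so treat the `ignored` variable as a hypothetical stand-in.

```elixir
# Module.reserved_attributes/0 returns a map keyed by attribute name.
reserved = Module.reserved_attributes()

# Hypothetical: a default ignore list built from the map's keys.
ignored = reserved |> Map.keys() |> Enum.sort()

IO.inspect(length(ignored), label: "reserved attributes")
IO.inspect(:moduledoc in ignored, label: "includes @moduledoc")
```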
Performance
- Fused normalizer: metadata stripping, boolean canonicalization, sigil
  expansion, pipe normalization, and variable renaming run in a single AST
  walk instead of 4 separate traversals. On Ash (572 files), this is ~14%
  faster.
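
The idea of fusing passes can be sketched with one `Macro.postwalk/2` call whose callback handles several rewrites at once. This is a simplified illustration under assumptions, not the library's normalizer: `FusedNormalizeSketch` is a hypothetical module, it covers only metadata stripping, boolean canonicalization, and `~w(...)a` expansion, and it leaves out pipe normalization and variable renaming.

```elixir
defmodule FusedNormalizeSketch do
  # Hypothetical module for illustration only.

  # One traversal; every node passes through rewrite/1 exactly once.
  def normalize(ast), do: Macro.postwalk(ast, &rewrite/1)

  # Boolean canonicalization: &&, ||, ! become and, or, not.
  defp rewrite({:&&, _meta, [left, right]}), do: {:and, [], [left, right]}
  defp rewrite({:||, _meta, [left, right]}), do: {:or, [], [left, right]}
  defp rewrite({:!, _meta, [expr]}), do: {:not, [], [expr]}

  # Sigil expansion: ~w(foo bar)a becomes the literal list [:foo, :bar].
  # Only the atom modifier without interpolation is handled here.
  defp rewrite({:sigil_w, _meta, [{:<<>>, _, [words]}, [?a]]}) when is_binary(words) do
    words |> String.split() |> Enum.map(&String.to_atom/1)
  end

  # Metadata stripping for every remaining AST node.
  defp rewrite({form, _meta, args}), do: {form, [], args}

  # Literals, lists, and two-element tuples pass through unchanged.
  defp rewrite(other), do: other
end

short_circuit = Code.string_to_quoted!("x && !y")
keyword_ops = Code.string_to_quoted!("x and not y")

IO.inspect(
  FusedNormalizeSketch.normalize(short_circuit) ==
    FusedNormalizeSketch.normalize(keyword_ops),
  label: "same normalized AST"
)
```

Because the rewrites are clauses of a single function, adding another normalization step means adding a clause rather than another full pass over the tree.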

Benchmarked on real-world projects with full Type-I/II/III detection
(`literal_mode: :abstract`, `min_similarity: 0.85`, `normalize_pipes: true`):

| Project | Files | Clones | Time |
|---|---|---|---|
| Broadway | 22 | 1 | 45ms |
| Nx | 42 | 12 | 674ms |
| Nerves | 50 | 2 | 172ms |
| Ecto | 56 | 19 | 525ms |
| Commanded | 63 | 8 | 147ms |
| Oban | 66 | 16 | 193ms |
| Phoenix | 74 | 14 | 607ms |
| Elixir stdlib | 105 | 84 | 1.6s |
| Surface | 109 | 31 | 513ms |
| Absinthe | 263 | 63 | 590ms |
| Livebook | 265 | 62 | 2.1s |
| Plausible | 465 | 80 | 2.4s |
| Ash | 572 | 535 | 5.8s |