Code duplication detector for Elixir, inspired by jscpd but built on Elixir's native AST instead of token matching. Because ExDNA understands code structure rather than just text, `fn(a, b) -> a + b end` and `fn(x, y) -> x + y end` are recognized as the same code. It also tells you how to fix each clone: extract a function, a macro, or a behaviour callback.
- Three clone types: exact copies (I), renamed variables / changed literals (II), and near-miss clones via structural similarity (III)
- Multi-clause awareness: consecutive `def`/`defp` clauses with the same name/arity are analyzed as a single unit, catching duplicated pattern-matching functions that individual clauses are too small to flag
- Delegation pattern detection: `def foo(x), do: foo(x, [])` followed by `def foo(x, opts)` is grouped as one unit, catching duplicated wrapper+body pairs across modules
- Sibling window detection: adjacent functions copied between modules are caught even when the surrounding code differs
- Refactoring suggestions: extract function, extract macro, extract behaviour with `@callback`
- Smart naming: suggestions are named after the dominant struct, call, or pattern (`build_changeset`, `contact_step`) instead of `extracted_function`
- Pipe normalization: `x |> f()` and `f(x)` match as the same code
- Field order normalization: `%User{name: x, age: y}` and `%User{age: y, name: x}` match in Type-II mode
- Cross-file grouping: `actions/ <-> tools/ (6 clones, 298 nodes)` instead of listing each pair
- `@no_clone` annotation: suppress known/intentional duplicates
- Incremental `Mix.Task.Compiler`: only re-analyzes changed files
- LSP server: pushes clone diagnostics to your editor alongside Expert or ElixirLS
- Credo integration: drop-in replacement for `DuplicatedCode`, reuses Credo's parsed ASTs
- CI-ready: exits with code 1 when clones are found, or use `--max-clones` for a clone budget
- Four output formats: Credo-style console, JSON, self-contained HTML, and SARIF for GitHub Code Scanning
- Fast: parallel file parsing; Plausible (465 files) in ~1 second, Ash (554 files) in ~6 seconds with full Type-I/II/III detection
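Pipe normalization can be illustrated with nothing but the standard library: `Macro.unpipe/1` and `Macro.pipe/3` flatten a pipe into a plain call whose AST equals the unpiped form. A minimal sketch of the idea, not ExDNA's actual implementation:

```elixir
piped = quote do: x |> f()
plain = quote do: f(x)

# Flatten the pipe: splice the left-hand side in as the call's first argument.
[{lhs, 0}, {call, 0}] = Macro.unpipe(piped)
flattened = Macro.pipe(lhs, call, 0)

flattened == plain  #=> true
```

Once pipes are flattened this way, the two spellings produce identical fingerprints.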
Add the dependency in `mix.exs`:

```elixir
def deps do
  [{:ex_dna, "~> 1.3", only: [:dev, :test], runtime: false}]
end
```

```shell
mix ex_dna                          # scan lib/
mix ex_dna lib/accounts lib/admin   # specific paths
mix ex_dna --literal-mode abstract  # enable Type-II (renamed vars)
mix ex_dna --min-similarity 0.85    # enable Type-III (near-miss)
mix ex_dna --min-mass 50            # fewer, larger clones
mix ex_dna --max-clones 10          # fail only above budget
mix ex_dna --format json            # machine-readable
mix ex_dna --format html            # browsable report
mix ex_dna --format sarif           # GitHub Code Scanning
```

Deep-dive into a specific clone:

```shell
mix ex_dna.explain 3
```

Shows the full anti-unification breakdown: common structure, divergence points, and the suggested extraction with call sites.
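Since the task exits non-zero when clones exceed the budget, a `mix` alias is a convenient CI entry point. A sketch; the `check` alias name is illustrative, not something ExDNA defines:

```elixir
# In mix.exs -- the `check` alias name is just an example.
defp aliases do
  [
    # Fails the build on warnings or when the clone count exceeds the budget.
    check: ["compile --warnings-as-errors", "ex_dna --max-clones 10"]
  ]
end
```

Run it in CI with `mix check`; the build fails as soon as the clone budget is exceeded.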
```elixir
report = ExDNA.analyze("lib/")
report = ExDNA.analyze(["lib/", "test/"])
report = ExDNA.analyze(paths: ["lib/"], min_mass: 20, literal_mode: :abstract)

report.clones  #=> [%ExDNA.Detection.Clone{}, ...]
report.stats   #=> %{files_analyzed: 42, total_clones: 3, ...}
```

Options are layered: defaults → `.ex_dna.exs` → CLI flags.
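The layering behaves like a right-biased map merge, with later layers winning. ExDNA's actual merge code may differ; the precedence can be sketched as:

```elixir
# Later layers win: defaults < .ex_dna.exs < CLI flags.
defaults = %{min_mass: 30, literal_mode: :keep}
file_cfg = %{min_mass: 25}             # from .ex_dna.exs
cli      = %{literal_mode: :abstract}  # from --literal-mode abstract

merged = defaults |> Map.merge(file_cfg) |> Map.merge(cli)
#=> %{min_mass: 25, literal_mode: :abstract}
```

So a `--min-mass` flag always wins over the config file, which in turn wins over the built-in defaults.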
Create `.ex_dna.exs` in your project root:
```elixir
%{
  min_mass: 25,
  ignore: ["lib/my_app_web/templates/**"],
  excluded_macros: [:schema, :pipe_through, :plug],
  normalize_pipes: true
}
```

| Option | CLI flag | Default | Description |
|---|---|---|---|
| `min_mass` | `--min-mass` | `30` | Minimum AST nodes for a fragment |
| `min_similarity` | `--min-similarity` | `1.0` | Threshold for Type-III (set < 1.0 to enable) |
| `literal_mode` | `--literal-mode` | `keep` | `keep` = Type-I only, `abstract` = also Type-II |
| `normalize_pipes` | `--normalize-pipes` | `false` | Treat `x \|> f()` the same as `f(x)` |
| `excluded_macros` | `--exclude-macro` | `[]` | Macro calls to skip entirely |
| `ignored_attributes` | `--ignore-attribute` | (see below) | Module attribute names to skip |
| `parse_timeout` | – | `5000` | Max ms per file (kills hung parses) |
| `ignore` | `--ignore` | `[]` | Glob patterns to exclude |
| – | `--max-clones` | – | Clone budget (exit 1 only above this) |
| – | `--format` | `console` | `console`, `json`, `html`, or `sarif` |
Default ignored attributes: `moduledoc`, `doc`, `typedoc`, `type`, `typep`, `opaque`, `spec`, `callback`, `macrocallback`, `impl`, `behaviour`, `optional_callbacks`, `deprecated`, `derive`, `enforce_keys`, `before_compile`, `after_compile`, `after_verify`, `compile`, `dialyzer`, `external_resource`, `on_load`, `on_definition`, `vsn`, `no_clone`.

Custom module attributes such as `@extensions`, `@timeout`, or `@fields` are fingerprinted and will be reported as duplicates when they appear with the same value in multiple modules.
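For example, these two hypothetical modules would be reported as a pair, because `@timeout` carries the same value in both:

```elixir
defmodule MyApp.Mailer do
  # Same attribute, same value as MyApp.Worker below:
  # ExDNA fingerprints the attribute and reports the duplicate.
  @timeout 5_000
  def timeout, do: @timeout
end

defmodule MyApp.Worker do
  @timeout 5_000
  def timeout, do: @timeout
end
```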
```elixir
@no_clone true
def validate(params) do
  # intentional duplication, won't be flagged
end
```

Add ExDNA as a compiler for automatic detection on `mix compile`:
```elixir
def project do
  [compilers: Mix.compilers() ++ [:ex_dna]]
end
```

Only changed files are re-analyzed. The cache is stored in `.ex_dna_cache` (add it to `.gitignore`).
ExDNA ships an LSP server that pushes warnings inline on every save. It runs alongside your primary Elixir LSP.

```shell
mix ex_dna.lsp
```

Neovim configuration:

```lua
vim.lsp.config('ex_dna', {
  cmd = { 'mix', 'ex_dna.lsp' },
  root_markers = { 'mix.exs' },
  filetypes = { 'elixir' },
})
```

ExDNA ships a Credo check that replaces the built-in `DuplicatedCode` with full Type-I/II/III detection and refactoring suggestions. It reuses Credo's already-parsed ASTs, so there is no double parsing.
Use as a Credo plugin (recommended); this automatically registers the check and disables the built-in `DuplicatedCode`:

```elixir
# .credo.exs
%{
  configs: [
    %{
      name: "default",
      plugins: [{ExDNA.Credo, []}]
    }
  ]
}
```

Or add the check directly to the `:enabled` checks list:

```elixir
{ExDNA.Credo, []}
```

And disable the built-in check:

```elixir
{Credo.Check.Design.DuplicatedCode, false}
```

All ExDNA options are available as check/plugin params. By default the Credo check uses the same path scope as `mix ex_dna` (`lib/`); pass `paths: ["lib/", "test/"]` if you want Credo to include test files too.
```elixir
{ExDNA.Credo, [
  paths: ["lib/", "test/"],
  min_mass: 40,
  literal_mode: :abstract,
  excluded_macros: [:schema, :pipe_through],
  normalize_pipes: true,
  min_similarity: 0.85
]}
```

- Parse: `Code.string_to_quoted/2` on every `.ex`/`.exs` file (parallel, with a per-file timeout)
- Normalize: strip line/column metadata → rename variables to positional placeholders (`$0`, `$1`) → optionally abstract literals → optionally flatten pipes → sort struct/map fields
- Fingerprint: walk every subtree above `min_mass` nodes and hash it with BLAKE2b; also generate sliding windows over module-level sibling sequences and compute structural sub-hashes for fuzzy candidate pruning
- Detect: group by hash (Types I/II); use an inverted index on sub-hashes plus Jaccard similarity and tree edit distance for Type III
- Filter: prune nested clones, keeping the largest match per location
- Suggest: anti-unify each clone pair to compute the common structure, then generate extract-function/macro/behaviour suggestions
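The Normalize and Fingerprint steps can be sketched in a few lines of plain Elixir. This is an illustration of the idea, not ExDNA's actual code; BLAKE2b is available through Erlang's `:crypto` module:

```elixir
defmodule FingerprintSketch do
  # Strip metadata and rename variables to positional placeholders,
  # so renamed-but-identical fragments hash to the same value.
  def normalize(ast) do
    {normalized, _seen} =
      Macro.prewalk(ast, %{}, fn
        # A variable is a {name, meta, context} triple with an atom context.
        {name, _meta, ctx}, seen when is_atom(name) and is_atom(ctx) ->
          seen = Map.put_new(seen, name, map_size(seen))
          {{:"$#{seen[name]}", [], nil}, seen}

        node, seen ->
          # Drop line/column metadata from every other node.
          {Macro.update_meta(node, fn _ -> [] end), seen}
      end)

    normalized
  end

  def fingerprint(ast) do
    :crypto.hash(:blake2b, :erlang.term_to_binary(normalize(ast)))
  end
end
```

With this sketch, `fn a, b -> a + b end` and `fn x, y -> x + y end` produce the same fingerprint even when parsed from different lines of different files, which is exactly what makes Type-II grouping a plain hash lookup.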