Semantic data reconciliation for the command line.
Validate structured datasets using declarative YAML rules and produce deterministic JSON reconciliation reports suitable for CI/CD pipelines.
Fully local. No data leaves your machine.
Typical use cases:
- ETL validation
- Data migration verification
- Financial transaction reconciliation
- CI pipeline dataset checks
- Log comparison with normalization rules
source.csv
txn_id,amount
1,100
2,200
target.csv
txn_id,amount
1,100
2,210
config.yaml
type: tabular
source: source.csv
target: target.csv
keys:
- txn_id
reconlify run config.yaml
Exit code: 1 (differences found). Report highlights:
rows_with_mismatches: 1
missing_in_source: 0
missing_in_target: 0
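The key-based matching behind these counts can be approximated in a few lines of Python. This is only an illustrative sketch using the standard library, not Reconlify's actual implementation (which is DuckDB-backed); the function name `reconcile` and the inlined CSV strings are assumptions made for the example.

```python
import csv
import io

# The two files from the example above, inlined so the sketch is self-contained.
SOURCE = "txn_id,amount\n1,100\n2,200\n"
TARGET = "txn_id,amount\n1,100\n2,210\n"


def reconcile(source_text, target_text, key):
    """Match rows by key and count differences (hypothetical helper)."""
    src = {row[key]: row for row in csv.DictReader(io.StringIO(source_text))}
    tgt = {row[key]: row for row in csv.DictReader(io.StringIO(target_text))}
    shared = src.keys() & tgt.keys()
    return {
        # Rows present on both sides whose non-key values differ.
        "rows_with_mismatches": sum(1 for k in shared if src[k] != tgt[k]),
        # Keys that exist only in the target file.
        "missing_in_source": len(tgt.keys() - src.keys()),
        # Keys that exist only in the source file.
        "missing_in_target": len(src.keys() - tgt.keys()),
    }


result = reconcile(SOURCE, TARGET, "txn_id")
print(result)
```

Row 2 differs in `amount` (200 vs 210), which accounts for the single mismatch; both keys exist on each side, so nothing is missing.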
| | diff / fc | Excel (VLOOKUP) | Beyond Compare | Reconlify CLI |
|---|---|---|---|---|
| Comparison model | Line-by-line text | Manual formulas | Visual file diff | Key-based semantic matching |
| Key-based matching | No | Manual | No | Declarative keys |
| Missing-row detection | No concept of rows | Manual | Manual inspection | Automatic, both directions |
| Column-level comparison | No | Manual per-cell | Limited | Automatic with include/exclude |
| Normalization rules | No | No | Limited | Trim, case, nulls, regex, virtual columns |
| Deterministic JSON output | No | No | No | Yes |
| CI/CD integration | Barely | No | No | Exit codes 0/1/2 + JSON report |
| Local execution | Yes | Yes | Yes | Yes — zero network calls |
- Key-based dataset reconciliation (single or composite keys)
- Automatic missing-row detection (both directions)
- Column-level mismatch detection with include/exclude control
- Numeric tolerance support (per-column absolute tolerance)
- Normalization rules (trim, case-insensitive, null handling, regex, virtual columns)
- Row filters and exclusions
- Deterministic JSON reconciliation reports
- Machine-readable exit codes (0 / 1 / 2)
- Two engines: tabular (CSV/TSV) and text (line-by-line / unordered)
- CI/CD pipeline friendly
- Fully local execution — no network calls
Reconlify uses DuckDB-backed tabular processing, streaming text comparison, and a local-first architecture. It processes large datasets locally without requiring a database or external service.
| Dataset | Rows / Lines | Mode | Time |
|---|---|---|---|
| CSV reconciliation (exact match) | 200k rows | tabular | ~2 s |
| CSV reconciliation (high mismatch) | 200k rows | tabular | ~12 s |
| Log comparison (positional diffs) | 500k lines | line_by_line | ~3 s |
| Log comparison (unordered) | 250k lines | unordered_lines | < 1 s |
Benchmarks were executed on a MacBook (Apple Silicon / Python 3.11) with default fixture settings.
Performance depends on dataset structure, rule complexity, and system hardware.
Full benchmark methodology and results: PERF_TESTING.md
Requires Python 3.11+.
pip install reconlify-cli
Or with pipx for isolated installs:
pipx install reconlify-cli
Package name on PyPI: reconlify-cli
For development:
git clone https://github.com/testuteab/reconlify-cli.git && cd reconlify-cli
make install # runs: poetry install
reconlify --help
reconlify run <config.yaml> # default output: report.json
reconlify run <config.yaml> --out out.json # custom output path
| Option | Default | Description |
|---|---|---|
| `--out PATH` | `report.json` | Output path for the JSON report |
| `--include-line-numbers` / `--no-include-line-numbers` | on | Include original line numbers in text report samples |
| `--max-line-numbers N` | `0` (unlimited) | Max line numbers per distinct line in unordered mode |
| `--debug-report` | off | Include processed line numbers in text report samples |
| Code | Meaning |
|---|---|
| 0 | No differences found |
| 1 | Differences found |
| 2 | Error (config validation, file not found, runtime failure) |
reconlify --version
# reconlify 0.1.0
Reconlify follows Semantic Versioning: MAJOR.MINOR.PATCH.
A finance team exports transactions from two systems (ERP and bank ledger) and needs to reconcile them nightly. Here's how Reconlify handles it.
cat <<'EOF' > source.csv
txn_id,booking_date,account,amount,currency,counterparty,status,memo
TXN-001,2026-01-15,4100,1500.00,USD,Acme Corp,BOOKED,Invoice 9201
TXN-002,2026-01-16,4100,320.50,EUR,Globex Inc,BOOKED,Wire transfer
TXN-003,2026-01-17,4200,75.00,USD,Jane Doe,BOOKED,Expense report
TXN-004,2026-01-18,4100,10000.00,USD,Initech,CANCELLED,Reversed
TXN-005,2026-01-19,4300,249.99,USD,Umbrella Ltd,BOOKED,Subscription
TXN-006,2026-01-20,4100,,USD,Soylent Corp,BOOKED,Pending allocation
EOF
cat <<'EOF' > target.csv
txn_id,booking_date,account,amount,currency,counterparty,status,memo
TXN-001,2026-01-15,4100,1500.00,USD, acme corp ,BOOKED,Invoice 9201
TXN-002,2026-01-16,4100,320.75,EUR,Globex Inc,BOOKED,Wire transfer
TXN-003,2026-01-17,4200,75.00,USD,Jane Doe,BOOKED,Expense report
TXN-005,2026-01-19,4300,249.99,USD,Umbrella Ltd,BOOKED,Subscription
TXN-006,2026-01-20,4100,NULL,USD,Soylent Corp,BOOKED,Pending allocation
TXN-007,2026-01-21,4100,430.00,USD,Wayne Ent,BOOKED,New deposit
EOF
What's different between the two files:
| Scenario | Row | Detail |
|---|---|---|
| Matches after normalization | TXN-001 | counterparty has extra spaces + lowercase in target — should match with trim + case rules |
| Value mismatch | TXN-002 | amount is 320.50 vs 320.75 (no tolerance is configured, so the 0.25 difference is reported) |
| Exact match | TXN-003 | Identical |
| Filtered out | TXN-004 | Status CANCELLED — excluded by row filter |
| Exact match | TXN-005 | Identical |
| NULL normalization | TXN-006 | Source has blank amount, target has NULL string — should match |
| Missing in source | TXN-007 | Exists only in target |
cat <<'EOF' > recon.yaml
type: tabular
source: source.csv
target: target.csv
keys:
  - txn_id
csv:
  delimiter: ","
  header: true
  encoding: utf-8
compare:
  trim_whitespace: true
  case_insensitive: true
  normalize_nulls: ["", "NULL", "null"]
  exclude_columns:
    - memo
filters:
  row_filters:
    both:
      - column: status
        op: equals
        value: CANCELLED
EOF
Key config choices:
- `keys: [txn_id]` matches rows by transaction ID
- `trim_whitespace` + `case_insensitive` ignore formatting noise
- `normalize_nulls` treats blank, `NULL`, and `null` as equivalent
- `exclude_columns: [memo]` skips free-text fields
- `row_filters` drops `CANCELLED` transactions before comparison
reconlify run recon.yaml --out report.json
echo $?
Expected exit code: 1 (differences found).
The report will contain:
- missing_in_target: 0 (TXN-004 was filtered out, so it is not counted)
- missing_in_source: 1 (TXN-007 exists only in target)
- rows_with_mismatches: 1 (TXN-002 has an `amount` mismatch: `320.50` vs `320.75`)
TXN-001 and TXN-006 match cleanly thanks to normalization rules.
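To see why those two rows match, the three normalization rules can be sketched as a single value-level function. This is an assumption-laden illustration, not Reconlify's code: the helper name `normalize`, the order in which the rules are applied, and the choice to represent a normalized null as `None` are all hypothetical.

```python
# Lowercased forms of the configured null tokens "", "NULL", and "null".
NULL_TOKENS = {"", "null"}


def normalize(value):
    """Trim whitespace, fold case, and map null-like tokens to None."""
    cleaned = value.strip().lower()
    return None if cleaned in NULL_TOKENS else cleaned


# TXN-001: the counterparty differs only by padding and case.
counterparty_matches = normalize(" acme corp ") == normalize("Acme Corp")
# TXN-006: a blank amount in source vs the literal string NULL in target.
null_matches = normalize("") == normalize("NULL")
# TXN-002: a real value difference survives normalization.
amount_matches = normalize("320.50") == normalize("320.75")

print(counterparty_matches, null_matches, amount_matches)
```

Only the TXN-002 comparison stays unequal, which is consistent with the single mismatch in the report.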
The report is written to report.json. It is deterministic — identical inputs and config always produce identical output (except the generated_at timestamp).
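One way to check this determinism in a test is to compare two reports after removing the timestamp. The sketch below assumes only that the field is named `generated_at`; where it sits in the report is not specified here, so the hypothetical helper `strip_key` removes it at any nesting level. The sample reports are made up for illustration.

```python
import json


def strip_key(node, key="generated_at"):
    """Return a copy of a parsed JSON value with every `key` entry removed."""
    if isinstance(node, dict):
        return {k: strip_key(v, key) for k, v in node.items() if k != key}
    if isinstance(node, list):
        return [strip_key(item, key) for item in node]
    return node


def reports_equal(report_a, report_b):
    """Compare two reports while ignoring the generated_at timestamp."""
    return strip_key(report_a) == strip_key(report_b)


# Two made-up reports that differ only in their timestamp.
first = json.loads(
    '{"generated_at": "2026-01-21T00:00:00Z", "summary": {"rows_with_mismatches": 1}}'
)
second = json.loads(
    '{"generated_at": "2026-01-22T00:00:00Z", "summary": {"rows_with_mismatches": 1}}'
)
same = reports_equal(first, second)
print(same)
```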
Reconlify also compares text files line-by-line or as unordered line sets:
type: text
source: expected.log
target: actual.log
mode: unordered_lines
normalize:
  trim_lines: true
  ignore_blank_lines: true
reconlify run text_recon.yaml
In unordered_lines mode, line order is ignored: Reconlify compares occurrence counts of each distinct line. In line_by_line mode (default), lines are compared positionally.
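The occurrence-count comparison used by `unordered_lines` is a multiset difference, which `collections.Counter` expresses directly. The following is a minimal sketch of that idea with `trim_lines` and `ignore_blank_lines` applied, not Reconlify's implementation; the function name and the `only_in_source` / `only_in_target` result keys are hypothetical and do not reflect the report schema.

```python
from collections import Counter


def unordered_diff(source_lines, target_lines):
    """Compare two line lists as multisets, ignoring order."""
    # Trim each line and drop blank ones before counting occurrences.
    src = Counter(line.strip() for line in source_lines if line.strip())
    tgt = Counter(line.strip() for line in target_lines if line.strip())
    return {
        # Counter subtraction keeps only positive surplus counts.
        "only_in_source": dict(src - tgt),
        "only_in_target": dict(tgt - src),
    }


expected = ["start", "job ok", "job ok", "", "done"]
actual = ["done", "job ok ", "start", "extra"]
diff = unordered_diff(expected, actual)
print(diff)
```

Here `job ok` appears twice in the source but once in the target, so one occurrence is reported as surplus on the source side even though the line exists in both files.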
See the examples/ directory for config samples.
- Key-based reconciliation: single or composite keys. Detects missing rows on either side and cell-level value mismatches.
- Column control: include or exclude specific columns from comparison.
- Numeric tolerance: absolute tolerance per column (e.g. `amount: 0.01`).
- String rules: per-column normalization with `trim`, `case_insensitive`, `contains`, `regex_extract`.
- Source-side normalization: virtual columns via ops `map`, `concat`, `substr`, `add`, `sub`, `mul`, `div`, `coalesce`, `date_format`, `upper`, `lower`, `trim`, `round`.
- Row filters: exclude specific key values or filter rows by column rules.
- TSV support: set `csv.delimiter: "\t"`.

- Two comparison modes: `line_by_line` (positional) and `unordered_lines` (multiset).
- Normalization: `trim_lines`, `collapse_whitespace`, `case_insensitive`, `ignore_blank_lines`, `normalize_newlines`.
- Regex rules: `drop_lines_regex` to remove lines, `replace_regex` to transform lines before comparison.
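As an illustration of the per-column absolute tolerance mentioned in the tabular features, the check amounts to comparing the absolute difference against a threshold. This sketch uses `Decimal` to avoid binary floating-point surprises with values such as 0.01; the helper name is hypothetical, and treating a difference exactly equal to the tolerance as a match is an assumption rather than documented behavior.

```python
from decimal import Decimal


def within_tolerance(source_value, target_value, tolerance):
    """Return True when |source - target| <= tolerance (inclusive by assumption)."""
    difference = abs(Decimal(source_value) - Decimal(target_value))
    return difference <= Decimal(tolerance)


# A 0.25 difference exceeds a 0.01 tolerance and would count as a mismatch.
large_gap = within_tolerance("320.50", "320.75", "0.01")
# A 0.005 difference is inside a 0.01 tolerance and would count as a match.
small_gap = within_tolerance("100.00", "100.005", "0.01")

print(large_gap, small_gap)
```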
Every run produces a JSON report with a consistent structure:
| Section | Description |
|---|---|
| `summary` | Aggregate counts (rows, mismatches, missing). Zero differences = exit code 0 |
| `details` | Metadata: keys used, columns compared, filters applied, per-column mismatch stats |
| `samples` | Concrete examples of differences (tabular: missing_in_target, missing_in_source, value_mismatches, excluded; text: flat list or samples_agg) |
| `error` | Present only on exit code 2. Machine-readable code, human-readable message, and details |
| `warnings` | Optional list of warning strings (e.g. large line-number arrays in unordered mode) |
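A downstream script might condense the report into a one-line status. The sketch below is hedged: it assumes the tabular counters `rows_with_mismatches`, `missing_in_source`, and `missing_in_target` live under `summary`, and that the `error` section uses the keys `code` and `message`. Check those names against the Report Schema before relying on them; the helper `summarize` and the sample payloads are hypothetical.

```python
import json


def summarize(report):
    """Build a one-line status string from a parsed report (assumed layout)."""
    if "error" in report:
        err = report["error"]
        return f"error: {err.get('code')}: {err.get('message')}"
    summary = report.get("summary", {})
    fields = ("rows_with_mismatches", "missing_in_source", "missing_in_target")
    # Missing counters default to 0 so a partial summary still prints.
    return ", ".join(f"{name}={summary.get(name, 0)}" for name in fields)


# Made-up payload mirroring the counts from the finance example.
sample = json.loads(
    '{"summary": {"rows_with_mismatches": 1, "missing_in_source": 1, "missing_in_target": 0}}'
)
line = summarize(sample)
print(line)
```

In practice the dictionary would come from `json.load` on the file written by `--out`.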
reconlify run recon.yaml --out report.json
rc=$?
if [ $rc -eq 2 ]; then
echo "ERROR: config or runtime failure" >&2
exit 1
elif [ $rc -eq 1 ]; then
echo "WARN: differences found — see report.json" >&2
fi
Exit code 1 means differences were found; your pipeline decides whether that is a warning or a failure. Exit code 2 is always an error.
- name: Reconcile data
  run: |
    # GitHub Actions runs bash with -e by default, so capture the exit
    # code without letting a non-zero status abort the step early.
    exit_code=0
    reconlify run recon.yaml --out report.json || exit_code=$?
    if [ $exit_code -eq 2 ]; then
      echo "::error::Reconciliation failed with error"
      exit 1
    elif [ $exit_code -eq 1 ]; then
      echo "::warning::Differences found, see report.json"
    fi
- name: Upload report
  if: always()
  uses: actions/upload-artifact@v4
  with:
    name: recon-report
    path: report.json

- User Guide: in-depth guide covering both engines and best practices
- YAML Config Schema: full reference for all configuration options
- Report Schema: complete specification of the JSON report format
- Performance Testing: benchmark methodology and baseline results
Reconlify Desktop is a graphical interface for Reconlify CLI. It allows users to:
- Visually build YAML reconciliation configs
- Run reconciliations without using the terminal
- Inspect reconciliation reports interactively
Reconlify CLI remains the core reconciliation engine.
make install # install dependencies
make test # unit + integration tests (excludes e2e and perf)
make e2e # end-to-end CLI tests
make lint # ruff linter
make format # auto-fix lint + format
make clean # remove build artifacts and caches
make perf # generate fixtures + run full benchmark suite
make perf-smoke # lightweight perf smoke tests
make perf-clean # remove generated fixtures
See Performance Testing for details and baseline results.
See CHANGELOG.md for release history.
Reconlify sits between simple file diff tools and heavy enterprise reconciliation systems, providing a deterministic, developer-friendly workflow for validating structured data locally.
Released under the MIT License. Copyright (c) 2026 Testute AB.