Development

Prerequisites

Node.js 20.19.x, 22.12+, or 24+ and npm
Rust stable
Tauri prerequisites for your platform
On Windows: Microsoft C++ Build Tools and WebView2
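
As a quick way to check a local toolchain against the Node.js ranges above, the comparison can be sketched in TypeScript. The helper name isSupportedNode is hypothetical and not part of the repository. It reads "20.19.x" literally as any patch of 20.19, and "22.12+" as 22.12 or later within the 22 line; both readings are assumptions.

```typescript
// Hypothetical helper: checks a Node.js version string against the ranges
// listed in Prerequisites (20.19.x, 22.12+, or 24+).
function isSupportedNode(version: string): boolean {
  // Accepts "v22.12.0" (the form of process.version) or "22.12.0".
  const match = /^v?(\d+)\.(\d+)\.(\d+)/.exec(version);
  if (!match) return false;
  const major = Number(match[1]);
  const minor = Number(match[2]);
  if (major >= 24) return true;          // 24+
  if (major === 22) return minor >= 12;  // 22.12+ (assumed to stay within 22)
  if (major === 20) return minor === 19; // 20.19.x (assumed literal)
  return false;
}
```

One way to use it would be to call isSupportedNode(process.version) from a setup script before running npm install.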

Install

npm install

Run

Native app:

npm run tauri dev

Browser mock mode:

npm run dev

Browser mode does not use the Rust backend. It loads src/data/mockSession.ts after you click Open a repository.

Test

npm run test:rust
npm run build
npm run test:unit
npm run test:e2e:web

test:rust includes integration tests in src-tauri/tests/ that spawn the built diff-drift binary. test:unit runs the vitest component tests in tests/unit/.

Native E2E:

npm run test:e2e:tauri

Native E2E builds a debug Tauri app and launches it with isolated environment variables:

DIFF_DRIFT_E2E_REPO
DIFF_DRIFT_E2E_EXPORT_PATH
DIFF_DRIFT_E2E_STATE_FILE
DIFF_DRIFT_E2E_BIN
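
When debugging a native E2E run, it can help to confirm that all four variables actually reached the launched process. The following is a minimal sketch that assumes nothing about what each variable contains: the helper missingE2EVars is hypothetical, not part of the repository, and only checks presence.

```typescript
// Variable names are taken from the list above; the helper is hypothetical.
const E2E_ENV_VARS = [
  "DIFF_DRIFT_E2E_REPO",
  "DIFF_DRIFT_E2E_EXPORT_PATH",
  "DIFF_DRIFT_E2E_STATE_FILE",
  "DIFF_DRIFT_E2E_BIN",
] as const;

type EnvLike = Record<string, string | undefined>;

// Returns the names of any E2E variables that are unset or empty in `env`.
function missingE2EVars(env: EnvLike): string[] {
  return E2E_ENV_VARS.filter((name) => !env[name]);
}
```

In a Node script, passing process.env as the argument would report which of the four names are absent.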

Benchmarks

Rust analyzer benchmarks are local-only and are not part of CI:

npm run bench

For optimization work, capture a Criterion baseline before changing code, then compare after:

cargo bench --manifest-path src-tauri/Cargo.toml -- --save-baseline pre-opt
cargo bench --manifest-path src-tauri/Cargo.toml -- --baseline pre-opt

Evaluation Harness

The deterministic product benchmark is part of CI:

npm run eval:engine

It builds temporary git repos from eval/cases/*.case.mjs, runs diff-drift check --json, and scores the result against each case oracle. Generated repos, reports, packets, answers, and the latest score JSON live under .eval/ and are ignored by git.

Generate blind-review packets for another agent or human:

npm run eval:packets

Packets are written to .eval/packets/<case-id>/ with a prompt, the Diff Drift markdown report, the raw git diff, and metadata, but without the oracle. Save blind-agent JSON answers under .eval/answers/<case-id>.json, including an evaluator: { id, kind: "model" | "human", note } field so the scorecard can attribute them, then score:

npm run eval:score-agent
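
The evaluator field described above can be sketched as a TypeScript type with a small validator. Only the three field names and the two kind values come from this page. Treating id and note as strings, requiring note, and the helper name parseEvaluator are assumptions; the rest of the answer file is not modelled here.

```typescript
// Shape taken from the text above; field types other than `kind` are assumptions.
type EvaluatorKind = "model" | "human";

interface Evaluator {
  id: string;
  kind: EvaluatorKind;
  note: string; // assumed free text; the page does not say whether it is optional
}

// Hypothetical validator: returns the evaluator if it has the expected shape, else null.
function parseEvaluator(value: unknown): Evaluator | null {
  if (typeof value !== "object" || value === null) return null;
  const { id, kind, note } = value as { id?: unknown; kind?: unknown; note?: unknown };
  if (typeof id !== "string" || id.length === 0) return null;
  if (kind !== "model" && kind !== "human") return null;
  if (typeof note !== "string") return null;
  return { id, kind, note };
}
```

A check like this could run over each answers file before scoring, so that unattributed answers are caught early.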

Blind-agent scoring writes .eval/results/agents/latest.json, .eval/results/agents/latest.md, and .eval/results/agents/latest.html. This scorecard is advisory, not a blocker: use it to see whether reviewers reach the right decisions and cite the right evidence, and where the report or rubric is confusing. Engine eval remains the CI gate; blind-agent scorecards are local product-quality telemetry with no network calls.

Working outputs under .eval/ stay uncommitted. The one exception is published benchmark snapshots in eval/benchmarks/<version>/ (answers + scorecard), which exist so anyone can rescore the headline number from a fresh clone:

npm run eval:score-agent -- eval/benchmarks/v4/answers

The scorecard reports decision accuracy, severity-weighted recall, localization, precision, and per-rule recall, lists every evaluator, and shows an "independent external validation pending" banner until a non-project human evaluator has contributed. The rubric, aliases, and accepted decisions are frozen before answers are generated; scorer changes start a new benchmark version instead of rewriting older scores. See Eval Methodology for the contract and A/B Study Design for the planned packet-vs-raw-diff study.

[Image: Diff Drift blind-agent benchmark scorecard]

Use --case <case-id> with eval:engine or eval:packets to narrow a run while developing a fixture. Use --keep to preserve the generated temp repo path printed by the script for debugging.

Measure flag noise on real repos you choose (nothing bundled, nothing uploaded):

npm run eval:fp-replay

It reads fp-replay.config.json (copy it from fp-replay.config.example.json) and writes per-rule flag counts to .eval/results/fp-replay/latest.md. See Eval Methodology.

Visual Baselines

Visual regression checks are local-only and are not run in CI. They cover the browser mock onboarding, loaded session, and dismissed states.

npm run test:visual

Regenerate baselines intentionally after a reviewed UI change:

npm run test:visual:update
npm run test:visual

CI

.github/workflows/ci.yml runs on every push and PR:

cargo test and cargo clippy -- -D warnings on Windows
npm run build, unit tests, and the web E2E on Ubuntu
A native Tauri build smoke (debug, no bundle) plus an advisory native E2E on Windows
cargo audit and npm audit dependency gates on Ubuntu

Keep clippy clean: warnings fail the build.

Headless Check

The CLI has a dedicated console bin: cargo run --bin diff-drift-cli -- check <path> --json (or run the built diff-drift-cli.exe check …). The app binary serves the same subcommand (cargo run -- check …), but its release build targets the Windows GUI subsystem, so PowerShell and cmd do not wait for it to exit. Anything scripted should use diff-drift-cli. The check is read-only by design; see the User Guide for flags and exit codes.
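
For scripted callers, the two console invocations quoted above can be sketched as an argument builder. The helper checkCommand and the Command type are hypothetical, not part of the repository. Only the program names and arguments are taken from this page, and nothing is assumed about the JSON output or exit codes.

```typescript
// Hypothetical types and helper; only the command lines come from the page.
interface Command {
  program: string;
  args: string[];
}

// Builds the argv for a headless check, either through cargo or the built CLI.
function checkCommand(repoPath: string, viaCargo: boolean): Command {
  const checkArgs = ["check", repoPath, "--json"];
  return viaCargo
    ? { program: "cargo", args: ["run", "--bin", "diff-drift-cli", "--", ...checkArgs] }
    : { program: "diff-drift-cli", args: checkArgs };
}
```

Keeping the arguments in an array rather than one string avoids shell quoting problems with repository paths that contain spaces. The result can be handed to Node's child_process.spawnSync(program, args).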

Fixtures

Rust unit tests use src-tauri/src/test_fixture.rs to build temporary git repos with git2. They should not depend on demo/ or a globally installed git binary.

The optional demo repo can be seeded with:

npm run seed:demo

Expected PR Scope

Keep changes narrow:

Backend analyzer changes need Rust tests.
Frontend copy or interaction changes need web E2E updates when assertions change.
Native behavior changes should run native E2E locally when practical.
Documentation changes should avoid repeating the README in every page.

Useful Files

src-tauri/src/rules.rs: rule predicates and tests.
src-tauri/src/session.rs: analysis orchestration and counts.
src-tauri/src/watcher.rs: live updates.
src/App.tsx: main frontend state.
src/types.ts and src-tauri/src/model.rs: shared contract.
