pr-2120/mmontalbo/mm/structural-diff-backend-clean-v4
tagged this
14 Jun 18:59
Language-aware diff tools (e.g., Difftastic) and format-specific analyzers can produce better line matching than Git's builtin diff algorithm, but diff.<driver>.command replaces Git's output entirely, losing downstream features like word diff, function context, color, and blame.

This series adds diff.<driver>.process, a long-running subprocess protocol that lets an external tool control which lines Git considers changed while Git handles all output formatting.

The protocol follows filter.<driver>.process: pkt-line over stdin/stdout, capability negotiation, one process per Git invocation. The tool receives both file versions and returns changed regions (line ranges in the old and new file). Git validates and feeds them into the xdiff pipeline in place of the builtin diff algorithm. When the tool returns no hunks, Git treats the files as having no changes.

 * Patch 1: xdiff plumbing for externally supplied hunks.
 * Patch 2: diff.<driver>.process config key.
 * Patch 3: refactor subprocess API to separate process lifecycle from hashmap management, since the diff process stores its subprocess on the userdiff driver rather than in a hashmap.
 * Patch 4: the main feature.
 * Patch 5: bypass knobs (--no-ext-diff, format-patch).
 * Patch 6: blame integration so the tool can declare commits as having no changes.

Changes since v3:

 * Replaced Python test backend with C test-tool helper (thanks to Johannes Schindelin).
 * Added test coverage cases for deleted file, malformed hunk line, and missing capability.
 * Fixed potential overflow in synchronization invariant check by counting from changed[] arrays instead of accumulating.
 * Accept start=0 with count=0 in the hunk protocol, matching what git diff itself emits for empty file sides.
 * Warn on external hunk validation failure with specific reasons (range exceeded, overlap, sync mismatch) to help tool authors debug their implementations.
 * Test backend follows the same convention (start=0 when count=0 for empty file sides).
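
The validation failures named above (range exceeded, overlap, sync mismatch) can be illustrated with a small pre-check a tool author might run before handing hunks to Git. This is only a sketch under stated assumptions, not the code in this series: the tuple layout, the function names, and the reason strings are hypothetical, line numbers are assumed to be 1-based as in a unified diff hunk header, and "sync mismatch" is assumed to mean that the number of unchanged lines differs between the old and new side.

```python
from typing import List, Optional, Tuple

# Hypothetical representation of one changed region:
# (old_start, old_count, new_start, new_count), 1-based line numbers.
Hunk = Tuple[int, int, int, int]


def check_side(start: int, count: int, length: int, prev_end: int) -> Optional[str]:
    """Check one side of a hunk; return a reason string, or None if it is fine."""
    if start < 0 or count < 0:
        return "negative value"
    if count == 0:
        # An empty range only names a position, so start=0 is accepted here.
        if start > length:
            return "range exceeded"
        if start < prev_end:
            return "overlap"
        return None
    if start == 0:
        return "start=0 requires count=0"
    if start + count - 1 > length:
        return "range exceeded"
    if start <= prev_end:
        return "overlap"
    return None


def validate_hunks(hunks: List[Hunk], old_len: int, new_len: int) -> Optional[str]:
    """Return None if the hunks look consistent, else the first failure reason."""
    if not hunks:
        # No hunks means "no changes"; this sketch assumes nothing is checked then.
        return None
    old_end = new_end = 0          # last line already covered on each side
    old_changed = new_changed = 0  # total changed lines on each side
    for old_start, old_count, new_start, new_count in hunks:
        reason = (check_side(old_start, old_count, old_len, old_end)
                  or check_side(new_start, new_count, new_len, new_end))
        if reason:
            return reason
        old_end = max(old_end, old_start + max(old_count, 1) - 1)
        new_end = max(new_end, new_start + max(new_count, 1) - 1)
        old_changed += old_count
        new_changed += new_count
    # Assumed sync invariant: both sides keep the same number of unchanged lines.
    if old_len - old_changed != new_len - new_changed:
        return "sync mismatch"
    return None
```

For example, `validate_hunks([(0, 0, 1, 2)], 0, 2)` accepts a hunk that adds two lines to an empty old file, following the start=0 with count=0 convention, while `validate_hunks([(3, 2, 3, 2)], 3, 3)` reports a range that runs past the end of a three-line file.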
Michael Montalbo (6):
  xdiff: support external hunks via xpparam_t
  userdiff: add diff.<driver>.process config
  sub-process: separate process lifecycle from hashmap management
  diff: add long-running diff process via diff.<driver>.process
  diff: bypass diff process with --no-ext-diff and in format-patch
  blame: consult diff process for no-hunk detection

 Documentation/config/diff.adoc           |   5 +
 Documentation/diff-algorithm-option.adoc |   3 +
 Documentation/diff-options.adoc          |   4 +-
 Documentation/gitattributes.adoc         | 143 ++++++
 Makefile                                 |   2 +
 blame.c                                  |  40 +-
 builtin/log.c                            |   7 +
 diff-process.c                           | 297 ++++++++++++
 diff-process.h                           |  39 ++
 diff.c                                   |  29 +-
 diff.h                                   |   5 +
 meson.build                              |   1 +
 sub-process.c                            |  28 +-
 sub-process.h                            |   9 +-
 t/helper/meson.build                     |   1 +
 t/helper/test-diff-process-backend.c     | 299 ++++++++++++
 t/helper/test-tool.c                     |   1 +
 t/helper/test-tool.h                     |   1 +
 t/meson.build                            |   1 +
 t/t4080-diff-process.sh                  | 553 +++++++++++++++++++++++
 userdiff.c                               |   7 +
 userdiff.h                               |   5 +
 xdiff-interface.c                        |   7 +-
 xdiff/xdiff.h                            |  14 +
 xdiff/xdiffi.c                           | 123 ++++-
 xdiff/xprepare.c                         |  10 +
 xdiff/xprepare.h                         |   1 +
 27 files changed, 1614 insertions(+), 21 deletions(-)
 create mode 100644 diff-process.c
 create mode 100644 diff-process.h
 create mode 100644 t/helper/test-diff-process-backend.c
 create mode 100755 t/t4080-diff-process.sh

base-commit: ea97ad8d017de0c9037451a78008a0fd60abea0c

Submitted-As: https://lore.kernel.org/git/pull.2120.v4.git.1781463564.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.2120.git.1779415884.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.2120.v2.git.1779733799.gitgitgadget@gmail.com
In-Reply-To: https://lore.kernel.org/git/pull.2120.v3.git.1780087700.gitgitgadget@gmail.com