Google's Diff Match and Patch library, packaged for modern Python.
diff-match-patch supports Python 2.7 and Python 3.4 or newer. You can install it from PyPI:
    python -m pip install diff-match-patch
Generating a patchset (analogous to unified diff) between two texts:
    from diff_match_patch import diff_match_patch

    dmp = diff_match_patch()
    patches = dmp.patch_make(text1, text2)
    diff = dmp.patch_toText(patches)
Applying a patchset to a text can then be done with:
    from diff_match_patch import diff_match_patch

    dmp = diff_match_patch()
    patches = dmp.patch_fromText(diff)
    new_text, _ = dmp.patch_apply(patches, text)
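As a concrete round trip, the sketch below runs both steps on two sample sentences (illustrative values chosen here, not taken from the library's documentation). It assumes the patchset is applied to the same text it was generated from. `patch_apply` returns the patched text together with a list of booleans, one per patch, indicating whether each patch could be applied.

```python
from diff_match_patch import diff_match_patch

# Sample inputs (illustrative only).
text1 = "The quick brown fox jumps over the lazy dog."
text2 = "The quick red fox leaps over the lazy dog."

dmp = diff_match_patch()

# Generate the patchset and serialize it to text.
patches = dmp.patch_make(text1, text2)
diff = dmp.patch_toText(patches)

# Later, or elsewhere: parse the serialized patchset and apply it.
restored = dmp.patch_fromText(diff)
new_text, results = dmp.patch_apply(restored, text1)

print(diff)
print(new_text)
print(results)  # one boolean per patch
```

Checking the second return value instead of discarding it is a simple way to detect patches that could not be applied.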
The Diff Match and Patch libraries offer robust algorithms to perform the operations required for synchronizing plain text.
- Diff: Compare two blocks of plain text and efficiently return a list of differences.
  - Diff Demo
- Match: Given a search string, find its best fuzzy match in a block of plain text. The match is weighted for both accuracy and location.
  - Match Demo
- Patch: Apply a list of patches onto plain text, on a best-effort basis even when the underlying text doesn't match exactly.
  - Patch Demo
- API - Common API across all languages.
- Line or Word Diffs - Less detailed diffs.
- Plain Text vs. Structured Content - How to deal with data like XML.
- Unidiff - The patch serialization format.
- Support - Newsgroup for developers.
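
The diff and match operations described in this section can be sketched with the Python port as follows. This is a minimal illustration, not a complete tour of the API: the input strings are made up for the example, and the match call searches for an exact substring, which is the simplest case of the fuzzy matcher.

```python
from diff_match_patch import diff_match_patch

dmp = diff_match_patch()

# Diff: the result is a list of (operation, text) tuples.
diffs = dmp.diff_main("Good dog", "Bad dog")
labels = {
    dmp.DIFF_DELETE: "delete",
    dmp.DIFF_INSERT: "insert",
    dmp.DIFF_EQUAL: "equal",
}
for op, chunk in diffs:
    print(labels[op], repr(chunk))

# Both input texts can be rebuilt from the diff.
source = dmp.diff_text1(diffs)
target = dmp.diff_text2(diffs)

# Match: find the pattern near an expected location (here, index 0).
# The return value is the index of the best match, or -1 if none is found.
text = "The quick brown fox jumps over the lazy dog."
position = dmp.match_main(text, "fox", 0)
print(position)  # 16
```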
Although each language port of Diff Match Patch uses the same API, there are some language-specific notes.
A standardized speed test tracks the relative performance of diffs in each language.
This library implements Myers' diff algorithm, which is generally considered to be the best general-purpose diff. A layer of pre-diff speedups and post-diff cleanups surrounds the diff algorithm, improving both performance and output quality.
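
As a rough sketch of how the post-diff cleanup layer is used from Python: `diff_main` produces the raw diff, and `diff_cleanupSemantic` then rewrites that list in place to make it more human-readable. The sample sentences and the timeout value are assumptions chosen for illustration.

```python
from diff_match_patch import diff_match_patch

dmp = diff_match_patch()
# Upper bound, in seconds, on how long a diff may run (0 means no limit).
dmp.Diff_Timeout = 1.0

old = "The cat sat on the mat."
new = "The dog sat on the log."

# Raw diff, which can contain many very small edits.
diffs = dmp.diff_main(old, new)
print(diffs)

# Post-diff cleanup: modifies the list in place and returns nothing.
dmp.diff_cleanupSemantic(diffs)
print(diffs)

# The cleaned-up diff still describes the same pair of texts.
distance = dmp.diff_levenshtein(diffs)
print(distance)
```

`diff_cleanupEfficiency` is the alternative cleanup pass, aimed at machine processing rather than human readability. Which of the two suits a given application depends on how the diff will be consumed.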