Diff Match Patch is a high-performance library in multiple languages that manipulates plain text.
Clone or download
Pull request Compare This branch is 13 commits ahead, 34 commits behind google:master.
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
Type Name Latest commit message Commit time
Failed to load latest commit information.
diff_match_patch
.gitignore
.travis.yml
AUTHORS
CODE_OF_CONDUCT.md
CONTRIBUTING.md
LICENSE
MANIFEST.in
README.md
makefile
setup.py

README.md

diff-match-patch

Google's Diff Match and Patch library, packaged for modern Python.

build status version license

Install

diff-match-patch is supported on Python 2.7 or Python 3.4 or newer. You can install it from PyPI:

python -m pip install diff-match-patch

Usage

Generating a patchset (analogous to unified diff) between two texts:

from diff_match_patch import diff_match_patch

dmp = diff_match_patch()
patches = dmp.patch_make(text1, text2)
diff = dmp.patch_toText(patches)

Applying a patchset to a text can then be done with:

from diff_match_patch import diff_match_patch

dmp = diff_match_patch()
patches = dmp.patch_fromText(diff)
new_text, _ = dmp.patch_apply(patches, text)

Original README

The Diff Match and Patch libraries offer robust algorithms to perform the operations required for synchronizing plain text.

  1. Diff:
    • Compare two blocks of plain text and efficiently return a list of differences.
    • Diff Demo
  2. Match:
    • Given a search string, find its best fuzzy match in a block of plain text. Weighted for both accuracy and location.
    • Match Demo
  3. Patch:
    • Apply a list of patches onto plain text. Use best-effort to apply patch even when the underlying text doesn't match.
    • Patch Demo

Originally built in 2006 to power Google Docs, this library is now available in C++, C#, Dart, Java, JavaScript, Lua, Objective C, and Python.

Reference

Languages

Although each language port of Diff Match Patch uses the same API, there are some language-specific notes.

A standardized speed test tracks the relative performance of diffs in each language.

Algorithms

This library implements Myer's diff algorithm which is generally considered to be the best general-purpose diff. A layer of pre-diff speedups and post-diff cleanups surround the diff algorithm, improving both performance and output quality.

This library also implements a Bitap matching algorithm at the heart of a flexible matching and patching strategy.