Skip to content

gpakosz/Nested-Diff.py

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

92 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Nested-Diff.py

Recursive diff for python nested structures, implementation of Nested-Diff

Build Status Coverage Status Supported Python versions License

Diff format

Diff is a dict and may contain following keys:

  • A stands for 'added', it's value - added item.
  • D means 'different' and contains subdiff.
  • E diffed entity (optional), value - empty instance of entity's class.
  • I index for sequence item, used only when prior item was omitted.
  • N is a new value for changed item.
  • O is a changed item's old value.
  • R key used for removed item.
  • U represent unchanged item.

Diff metadata alternates with actual data; simple types specified as is, dicts, lists and tuples contain subdiffs for their items with native for such types addressing: indexes for lists and tuples and keys for dictionaries. Each status type, except D. E and I, may be omitted during diff computation. E tag is used with D when entity unable to contain diff by itself (set, frozenset); D contain a list of subdiffs in this case.

Annotated example:

a:  {"one": [5,7]}
b:  {"one": [5], "two": 2}
opts: U=False  # omit unchanged items

diff:
{"D": {"one": {"D": [{"I": 1, "R": 7}]}, "two": {"A": 2}}}
| |   |  |    | |   || |   |   |   |       |    | |   |
| |   |  |    | |   || |   |   |   |       |    | |   +- with value 2
| |   |  |    | |   || |   |   |   |       |    | +- key 'two' was added
| |   |  |    | |   || |   |   |   |       |    +- subdiff for it
| |   |  |    | |   || |   |   |   |       +- another key from top-level
| |   |  |    | |   || |   |   |   +- what it was (item's value: 7)
| |   |  |    | |   || |   |   +- what happened to item (removed)
| |   |  |    | |   || |   +- list item's actual index
| |   |  |    | |   || +- prior item was omitted
| |   |  |    | |   |+- subdiff for list item
| |   |  |    | |   +- it's value - list
| |   |  |    | +- it is deeply changed
| |   |  |    +- subdiff for key 'one'
| |   |  +- it has key 'one'
| |   +- top-level thing is a dict
| +- changes somewhere deeply inside
+- diff is always a dict

Examples

>>> from nested_diff import diff, patch
>>>
>>> a = {'one': 1, 'two': 2, 'three': 3}
>>> b = {'one': 1, 'two': 42}
>>>
>>> diff(a, b)
{'D': {'three': {'R': 3}, 'two': {'O': 2, 'N': 42}, 'one': {'U': 1}}}
>>>
>>> diff(a, b, O=False, U=False)
{'D': {'two': {'N': 42}, 'three': {'R': 3}}}
>>>
>>>
>>> c = [0,1,2,3]
>>> d = [  1,2,4,5]
>>>
>>> diff(c, d, O=False, U=False)
{'D': [{'R': 0}, {'I': 3, 'N': 4}, {'A': 5}]}
>>>
>>>
>>> c = patch(c, diff(c, d))
>>> assert c == d

Subclassing

from nested_diff import Differ


class CustomDiffer(Differ):
    """
    Use custom precision for floats.

    """
    def __init__(self, float_precision=2, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.set_differ(float, self.diff_float)
        self.float_precision = float_precision

    def diff_float(self, a, b):
        if round(a, self.float_precision) == round(b, self.float_precision):
            return {'U': a} if self.op_u else {}

        return super().diff__default(a, b)


differ = CustomDiffer(float_precision=1, U=False)

a = [0.001, 0.01, 0.1]
b = [0.002, 0.02, 0.2]

assert {'D': [{'I': 2, 'N': 0.2, 'O': 0.1}]} == differ.diff(a, b)

Formatting diffs

>>> from nested_diff import diff
>>> from nested_diff.fmt import TextFormatter
>>>
>>> a = {'one': 1, 'two': 'some\ntext\ninside'}
>>> b = {'one': 0, 'two': 'some\ntext'}
>>>
>>> d = diff(a, b, U=False, multiline_diff_context=3)
>>> print(TextFormatter().format(d))
  {'two'}
    @@ -1,3 +1,2 @@
    some
    text
-   inside
  {'one'}
-   1
+   0
>>>

Command line tools

mixas@mixas-laptop:~/github/Nested-Diff.py$ cat a.json b.json
[0, [1],    3]
[0, [1, 2], 3]
mixas@mixas-laptop:~/github/Nested-Diff.py$ nested_diff a.json b.json
  [1]
+   [1]
+     2
mixas@mixas-laptop:~/github/Nested-Diff.py$

License

Licensed under the terms of the Apache License, Version 2.0.

See Also

deepdiff, jsondiff, jsonpatch, json-delta

About

Recursive diff for nested python structures

Resources

License

Stars

Watchers

Forks

Packages

No packages published

Languages

  • Python 100.0%