dictdiffer: pragmatic merge #53

mvesper · 2015-04-13T14:56:41Z

As discussed with @jirikuncar a while ago, here is the pull request for a more pragmatic dictdiffer.merge implementation.

Dictdiffer v0.5.0: merging

Alters the dictdiffer.diff function to allow expanding patches and to utilize a path_limit parameter, which stops the extraction process at a given point
Adds the following files to handle the merging process:
- merge.py
- conflict.py
- resolve.py
- unify.py

The merge is executed in the following steps:

- The differences between a latest common ancestor (lca) and two other data structures are extracted
- The two resulting lists of differences are checked for conflicts
- Those conflicts are resolved
- The patches are unified
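
The four steps above can be sketched in plain Python for flat dictionaries. This is only an illustration under simplifying assumptions, not the code added in this pull request: the helpers `flat_diff`, `find_conflicts` and `merge_flat`, as well as the "preferred side wins" resolution rule, are hypothetical.

```python
def flat_diff(lca, other):
    """Step 1: map each changed key to ('set', value) or ('remove', None)."""
    patches = {}
    for key in set(lca) | set(other):
        if key not in other:
            patches[key] = ('remove', None)
        elif key not in lca or lca[key] != other[key]:
            patches[key] = ('set', other[key])
    return patches


def find_conflicts(first_patches, second_patches):
    """Step 2: keys touched by both sides with different outcomes."""
    return sorted(
        key for key in set(first_patches) & set(second_patches)
        if first_patches[key] != second_patches[key]
    )


def merge_flat(lca, first, second, prefer='first'):
    """Steps 3 and 4: resolve conflicts, then apply the unified patches."""
    first_patches = flat_diff(lca, first)
    second_patches = flat_diff(lca, second)
    conflicts = find_conflicts(first_patches, second_patches)

    # Resolution rule (an assumption): the preferred side wins every conflict.
    unified = dict(second_patches)
    unified.update(first_patches)
    if prefer == 'second':
        for key in conflicts:
            unified[key] = second_patches[key]

    # Apply the unified patches to a copy of the common ancestor.
    merged = dict(lca)
    for key, (action, value) in unified.items():
        if action == 'remove':
            merged.pop(key, None)
        else:
            merged[key] = value
    return merged, conflicts


lca = {'title': 'Draft', 'year': 2014, 'pages': 10}
first = {'title': 'Final', 'year': 2015, 'pages': 10}
second = {'title': 'Draft', 'year': 2016}

merged, conflicts = merge_flat(lca, first, second)
print(merged)     # both sides changed 'year', so it is reported as a conflict
print(conflicts)
```

Non-conflicting changes from both sides (the new title, the removed `pages` key) end up in the result, while the conflicting `year` is settled by the chosen rule.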

Signed-off-by: Martin Vesper martin.vesper@cern.ch

coveralls · 2015-04-14T13:16:09Z

Coverage remained the same at 100.0% when pulling 08db2ee on mvesper:m/wip/pragmatic_merge into 1210935 on inveniosoftware:master.

jirikuncar · 2015-04-15T17:25:03Z

dictdiffer/utils.py

+def get_path(patch):
+    """Return the path for a given dictdiffer.diff patch."""
+    if patch[1] != '':
+        _str = (str, unicode)


string_types?
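
For context, `string_types` presumably refers to a single tuple of string classes that works on both Python 2 and Python 3, like the one the `six` library provides. A minimal sketch without `six` follows; the `split_path` helper is hypothetical and only illustrates the kind of check `get_path` performs, it is not the function from this diff.

```python
import sys

# One tuple of string classes for both major Python versions,
# mirroring what six.string_types provides.
if sys.version_info[0] >= 3:
    string_types = (str,)
else:
    string_types = (basestring,)  # noqa: F821, name exists on Python 2 only


def split_path(path):
    """Return path keys as a tuple (hypothetical helper).

    A dotted string such as 'a.b' is split on dots, a list or tuple
    of keys is kept as is, and an empty string means the root.
    """
    if path == '':
        return ()
    if isinstance(path, string_types):
        return tuple(path.split('.'))
    return tuple(path)


print(split_path('a.b'))
print(split_path(['a', 0]))
```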

coveralls · 2015-04-16T07:37:57Z

Coverage remained the same at 100.0% when pulling 062dfaa on mvesper:m/wip/pragmatic_merge into 1210935 on inveniosoftware:master.

jirikuncar · 2015-04-16T07:43:24Z

dictdiffer/__init__.py

@@ -38,17 +32,47 @@ def diff(first, second, node=None, ignore=None):
        >>> list(result)
        [('change', 'a', ('b', 'c'))]

+    PathLimit:
+    list(diff({}, {'a': {'b': 'c'}}))


can you re-format the docstring?

"""
Path Limit:

    >>> list(...)
    [...]
"""

PS: check how it looks when you compile the docs locally.
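
To illustrate what a path limit does, here is a minimal, self-contained sketch. It is not dictdiffer's implementation: `simple_diff` and its `path_limit` argument (a set of key tuples at which recursion stops) are hypothetical, and the patch format is simplified.

```python
def simple_diff(first, second, path=(), path_limit=frozenset()):
    """Yield (action, path, value) tuples for two nested dicts.

    Recursion stops at any path in `path_limit`; a difference at or
    below such a path is reported as one change of the whole subtree.
    """
    both_dicts = isinstance(first, dict) and isinstance(second, dict)
    if both_dicts and path not in path_limit:
        for key in sorted(set(first) | set(second)):
            if key not in second:
                yield ('remove', path + (key,), first[key])
            elif key not in first:
                yield ('add', path + (key,), second[key])
            else:
                for patch in simple_diff(first[key], second[key],
                                         path + (key,), path_limit):
                    yield patch
    elif first != second:
        yield ('change', path, (first, second))


old = {'a': {'b': 1, 'c': 2}}
new = {'a': {'b': 1, 'c': 3, 'd': 4}}

# Without a limit, every leaf-level difference is reported separately.
print(list(simple_diff(old, new)))

# With a limit at ('a',), the subtree under 'a' is treated as one value.
print(list(simple_diff(old, new, path_limit={('a',)})))
```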

coveralls · 2015-04-16T09:18:04Z

Coverage remained the same at 100.0% when pulling 2e721f2 on mvesper:m/wip/pragmatic_merge into 1210935 on inveniosoftware:master.

jirikuncar · 2015-04-16T11:07:20Z

s/addtion/addition/ see 8a8d5bd
please re-format the bullet points (one empty line between, fill blocks, etc.)

coveralls · 2015-04-16T12:00:22Z

Coverage remained the same at 100.0% when pulling 3a2affb on mvesper:m/wip/pragmatic_merge into 1210935 on inveniosoftware:master.

jirikuncar · 2015-05-07T19:38:32Z

@mvesper I've run a simple test on two random files from CDS.

Data

$ wget http://cds.cern.ch/record/2014647?of=recjson -O a.json
$ wget http://cds.cern.ch/record/2014586?of=recjson -O b.json

Code

import json, dictdiffer
# Load both records; the with block closes the files afterwards.
with open('a.json') as fa, open('b.json') as fb:
    a, b = json.load(fa), json.load(fb)
%timeit d = list(dictdiffer.diff(a, b))

Results `master`

1000 loops, best of 3: 1.09 ms per loop

%prun d = list(dictdiffer.diff(a, b))
         2330 function calls (1967 primitive calls) in 0.004 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   427/64    0.002    0.000    0.004    0.000 __init__.py:32(diff)
      917    0.001    0.000    0.001    0.000 {isinstance}
      134    0.000    0.000    0.001    0.000 {map}
      431    0.000    0.000    0.001    0.000 __init__.py:53(<lambda>)
      110    0.000    0.000    0.000    0.000 {method 'encode' of 'unicode' objects}
      110    0.000    0.000    0.001    0.000 __init__.py:62(check)
      134    0.000    0.000    0.000    0.000 {all}
        1    0.000    0.000    0.004    0.004 <string>:1(<module>)
       24    0.000    0.000    0.000    0.000 {range}
       24    0.000    0.000    0.000    0.000 {min}
       16    0.000    0.000    0.000    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {method 'join' of 'str' objects}

Results `#53`

1000 loops, best of 3: 1.15 ms per loop

%prun d = list(dictdiffer.diff(a, b))
         2330 function calls (1967 primitive calls) in 0.004 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   427/64    0.002    0.000    0.004    0.000 __init__.py:26(diff)
      917    0.000    0.000    0.000    0.000 {isinstance}
      134    0.000    0.000    0.001    0.000 {map}
      431    0.000    0.000    0.001    0.000 __init__.py:83(<lambda>)
      110    0.000    0.000    0.000    0.000 __init__.py:92(check)
      110    0.000    0.000    0.000    0.000 {method 'encode' of 'unicode' objects}
      134    0.000    0.000    0.000    0.000 {all}
        1    0.000    0.000    0.004    0.004 <string>:1(<module>)
       24    0.000    0.000    0.000    0.000 {range}
       24    0.000    0.000    0.000    0.000 {min}
       16    0.000    0.000    0.000    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'join' of 'str' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

Conclusion

No major performance regression spotted.

jirikuncar · 2015-05-07T20:14:52Z

@mvesper

add yourself to AUTHORS file
for new files use only current year in copyright header

* Alters the dictdiffer.diff function to allow expanding of patches (`expand` parameter) and to utilize a `path_limit` parameter, which stops the extraction process at a given point. Signed-off-by: Martin Vesper <martin.vesper@cern.ch>

* Adds the following files to handle the merging process: merge.py, conflict.py, resolve.py and unify.py.

* The merge process is executed in the following steps: The differences between a latest common ancestor (lca) and two other data structures are extracted. Then the two occurring differences lists are checked for conflicts, which in turn are resolved. The patches are then unified.

Signed-off-by: Martin Vesper <martin.vesper@cern.ch>

Signed-off-by: Martin Vesper <martin.vesper@cern.ch>

coveralls · 2015-05-08T09:26:37Z

Coverage remained the same at 100.0% when pulling eb5dca0 on mvesper:m/wip/pragmatic_merge into 1210935 on inveniosoftware:master.

mvesper force-pushed the m/wip/pragmatic_merge branch 2 times, most recently from dbd7f63 to 08db2ee Compare April 14, 2015 12:35

jirikuncar reviewed Apr 15, 2015

jirikuncar added this to the v0.5.0 milestone Apr 15, 2015

jirikuncar added the Type: enhancement label Apr 15, 2015

mvesper force-pushed the m/wip/pragmatic_merge branch 2 times, most recently from 1231d87 to 062dfaa Compare April 16, 2015 07:35

jirikuncar reviewed Apr 16, 2015

mvesper force-pushed the m/wip/pragmatic_merge branch from 062dfaa to 2e721f2 Compare April 16, 2015 09:12

mvesper force-pushed the m/wip/pragmatic_merge branch from 2e721f2 to 3a2affb Compare April 16, 2015 11:31

jirikuncar self-assigned this May 7, 2015

tiborsimko and others added 3 commits May 8, 2015 11:10

diff: addition of path_limit and expand parameter

a1e2333

* Alters the dictdiffer.diff function to allow expanding of patches (`expand` parameter) and to utilize a `path_limit` parameter, which stops the extraction process at a given point. Signed-off-by: Martin Vesper <martin.vesper@cern.ch>

global: version bump to 0.5.0.dev20150414

eb5dca0

Signed-off-by: Martin Vesper <martin.vesper@cern.ch>

mvesper force-pushed the m/wip/pragmatic_merge branch from 3a2affb to eb5dca0 Compare May 8, 2015 09:24

jirikuncar merged commit eb5dca0 into inveniosoftware:master May 8, 2015

This was referenced May 8, 2015

WIP dictdiffer: diff methods and merging #46

Closed

RFC Detailed patches and additional difference algorithms #42

Closed
