Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dictdiffer: pragmatic merge #53

Merged
merged 3 commits into from
May 8, 2015

Conversation

mvesper
Copy link
Collaborator

@mvesper mvesper commented Apr 13, 2015

As discussed with @jirikuncar a while ago, here the pull request of a more pragmatic dictdiffer.merge implementation.

Dictdiffer v0.5.0: merging

  • Alters the dictdiffer.diff function to allow expanding patches and to utilize a path_limit parameter, which stops the extraction process at a given point
  • Adds the following files to handle the merging process:
    • merge.py
    • conflict.py
    • resolve.py
    • unify.py

Which is executed in the following steps:

  • The differences between a latest common ancestor (lca) and two other data structures are extracted
  • The two occuring differences lists are checked for conflicts
  • Those conficts are resolved
  • The patches are unified

Signed-off-by: Martin Vesper martin.vesper@cern.ch

@mvesper mvesper force-pushed the m/wip/pragmatic_merge branch 2 times, most recently from dbd7f63 to 08db2ee Compare April 14, 2015 12:35
@coveralls
Copy link

Coverage Status

Coverage remained the same at 100.0% when pulling 08db2ee on mvesper:m/wip/pragmatic_merge into 1210935 on inveniosoftware:master.

1 similar comment
@coveralls
Copy link

Coverage Status

Coverage remained the same at 100.0% when pulling 08db2ee on mvesper:m/wip/pragmatic_merge into 1210935 on inveniosoftware:master.

@coveralls
Copy link

Coverage Status

Coverage remained the same at 100.0% when pulling 08db2ee on mvesper:m/wip/pragmatic_merge into 1210935 on inveniosoftware:master.

def get_path(patch):
"""Return the path for a given dictdiffer.diff patch."""
if patch[1] != '':
_str = (str, unicode)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

string_types?

@jirikuncar jirikuncar added this to the v0.5.0 milestone Apr 15, 2015
@mvesper mvesper force-pushed the m/wip/pragmatic_merge branch 2 times, most recently from 1231d87 to 062dfaa Compare April 16, 2015 07:35
@coveralls
Copy link

Coverage Status

Coverage remained the same at 100.0% when pulling 062dfaa on mvesper:m/wip/pragmatic_merge into 1210935 on inveniosoftware:master.

@@ -38,17 +32,47 @@ def diff(first, second, node=None, ignore=None):
>>> list(result)
[('change', 'a', ('b', 'c'))]

PathLimit:
list(diff({}, {'a': {'b': 'c'}}))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can you re-format the docstring?

    """
    Path Limit:

        >>> list(...)
        [...]
    """"

PS: check how does it look like if you compile them locally.

@coveralls
Copy link

Coverage Status

Coverage remained the same at 100.0% when pulling 2e721f2 on mvesper:m/wip/pragmatic_merge into 1210935 on inveniosoftware:master.

@jirikuncar
Copy link
Member

  • s/addtion/addition/ see 8a8d5bd
  • please re-format the bullet points (one empty line between, fill blocks, etc.)

@coveralls
Copy link

Coverage Status

Coverage remained the same at 100.0% when pulling 3a2affb on mvesper:m/wip/pragmatic_merge into 1210935 on inveniosoftware:master.

1 similar comment
@coveralls
Copy link

Coverage Status

Coverage remained the same at 100.0% when pulling 3a2affb on mvesper:m/wip/pragmatic_merge into 1210935 on inveniosoftware:master.

@jirikuncar jirikuncar self-assigned this May 7, 2015
@jirikuncar
Copy link
Member

@mvesper I've run some simple test on two random files from CDS.

Data
$ wget http://cds.cern.ch/record/2014647?of=recjson -O a.json
$ wget http://cds.cern.ch/record/2014586?of=recjson -O b.json
Code
import json, dictdiffer
a, b = json.loads(open('a.json').read()), json.loads(open('b.json').read())
%timeit d = list(dictdiffer.diff(a, b))
Results master

1000 loops, best of 3: 1.09 ms per loop

%prun d = list(dictdiffer.diff(a, b))
         2330 function calls (1967 primitive calls) in 0.004 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   427/64    0.002    0.000    0.004    0.000 __init__.py:32(diff)
      917    0.001    0.000    0.001    0.000 {isinstance}
      134    0.000    0.000    0.001    0.000 {map}
      431    0.000    0.000    0.001    0.000 __init__.py:53(<lambda>)
      110    0.000    0.000    0.000    0.000 {method 'encode' of 'unicode' objects}
      110    0.000    0.000    0.001    0.000 __init__.py:62(check)
      134    0.000    0.000    0.000    0.000 {all}
        1    0.000    0.000    0.004    0.004 <string>:1(<module>)
       24    0.000    0.000    0.000    0.000 {range}
       24    0.000    0.000    0.000    0.000 {min}
       16    0.000    0.000    0.000    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
        1    0.000    0.000    0.000    0.000 {method 'join' of 'str' objects}
Results #53

1000 loops, best of 3: 1.15 ms per loop

%prun d = list(dictdiffer.diff(a, b))
         2330 function calls (1967 primitive calls) in 0.004 seconds

   Ordered by: internal time

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
   427/64    0.002    0.000    0.004    0.000 __init__.py:26(diff)
      917    0.000    0.000    0.000    0.000 {isinstance}
      134    0.000    0.000    0.001    0.000 {map}
      431    0.000    0.000    0.001    0.000 __init__.py:83(<lambda>)
      110    0.000    0.000    0.000    0.000 __init__.py:92(check)
      110    0.000    0.000    0.000    0.000 {method 'encode' of 'unicode' objects}
      134    0.000    0.000    0.000    0.000 {all}
        1    0.000    0.000    0.004    0.004 <string>:1(<module>)
       24    0.000    0.000    0.000    0.000 {range}
       24    0.000    0.000    0.000    0.000 {min}
       16    0.000    0.000    0.000    0.000 {len}
        1    0.000    0.000    0.000    0.000 {method 'join' of 'str' objects}
        1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}
Conclusion

No major performance regression spotted.

@jirikuncar
Copy link
Member

@mvesper

  • add yourself to AUTHORS file
  • for new files use only current year in copyright header

tiborsimko and others added 3 commits May 8, 2015 11:10
* Alters the dictdiffer.diff function to allow expanding of patches
  (`expand` parameter) and to utilize a `path_limit` parameter, which
  stops the extraction process at a given point.

Signed-off-by: Martin Vesper <martin.vesper@cern.ch>
* Adds the following files to handle the merging process: merge.py,
  conflict.py, resolve.py and unify.py.

* The merge process is executed in the following steps: The differences
  between a latest common ancestor (lca) and two other data structures are
  extracted. Then the two occurring differences lists are checked for
  conflicts, which in turn are resolved. The patches are then unified.

Signed-off-by: Martin Vesper <martin.vesper@cern.ch>
Signed-off-by: Martin Vesper <martin.vesper@cern.ch>
@coveralls
Copy link

Coverage Status

Coverage remained the same at 100.0% when pulling eb5dca0 on mvesper:m/wip/pragmatic_merge into 1210935 on inveniosoftware:master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants