Better control with swap, select which of (change/add/remove) to swap. #128

Open
oppsig opened this issue May 30, 2019 · 6 comments

oppsig commented May 30, 2019

I want to swap only the add/remove entries of a diff, without swapping the positions of the values in change entries.
Or, as another example, I might want to swap only the change entries.
Is there any way to implement this?

data_old contains more keys than data_new.
I want to keep these old keys in the patched result, but update the values that changed to the newer values.

result = diff(data_old, data_new)
result = swap(result, 'remove') # I want to only swap remove to add, and not swap all changes.
patched = patch(result, data_old)
assert patched == data_old
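
A minimal sketch of what such a selective swap could look like. `swap_only` is a hypothetical helper, not part of dictdiffer, and it assumes diff entries have the `(action, node, changes)` shape shown in the diff output later in this thread:

```python
def swap_only(diff_result, actions):
    """Swap only the entries whose action is listed in `actions`.

    Hypothetical helper, not part of dictdiffer. Selected 'add' and
    'remove' entries trade places, selected 'change' entries get their
    value pair reversed, and all other entries pass through unchanged.
    """
    flipped = {"add": "remove", "remove": "add"}
    for action, node, changes in diff_result:
        if action not in actions:
            yield action, node, changes
        elif action == "change":
            first, second = changes
            yield action, node, (second, first)
        else:
            yield flipped[action], node, changes


# Sample entries in the (action, node, changes) shape.
example = [
    ("change", "price", (4990000, 5120000)),
    ("add", "", [("roms", 4), ("floor", 1)]),
    ("remove", "", [("beds", 2), ("sqf", 314)]),
]

# Only 'remove' entries are turned into 'add'; the rest stay as they are.
result = list(swap_only(example, {"remove"}))
for entry in result:
    print(entry)
```

Passing `{"change"}` instead would reverse only the value pairs of change entries and leave add/remove alone.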
@oppsig oppsig changed the title Only swap add / remove, not changes. Only swap add / remove, not changes, or only swap changes. May 30, 2019
@jirikuncar
Member

@oppsig can you provide example data for data_old and data_new?

@oppsig
Author

oppsig commented May 31, 2019

Ok, I will update this later with some examples.

@oppsig
Author

oppsig commented Jun 1, 2019

OK, sorry for the lengthy response.
I have provided some bogus data for old and new, and avoided nested dictionaries for simplicity.
As you can see from diffx, the new data contains some new keys with values, some changed values, and some removed keys.

My goal here is to merge the old dict with the new one: updating keys whose values changed in the new data, adding new keys, and also keeping keys that were in old but not in new.
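
For flat dictionaries like the ones in this example, the merge described above can be sketched with plain dict unpacking, without dictdiffer. This is only an illustration under that assumption: nested dictionaries would need a recursive merge, and the sample data below is a shortened version of the bogus data.

```python
old = {"price": 4990000, "beds": 2, "phone": None}
new = {"price": 5120000, "floor": 1, "phone": "+4711223344"}

# Later mappings win on key conflicts, so values from `new` override
# `old`, while keys that exist only in `old` are kept.
merged = {**old, **new}

print(merged)
```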

When I try to use swap, it does not give me enough control over the changes.
Maybe swap could take a parameter selecting which of (change, add, remove) to swap. Currently:
It swaps positions for all changes. (I don't want this change)
It changes add -> remove. (I don't want this change)
It changes remove -> add. (I want this change)

I don't really understand what patch does either, because the patched result is just a copy of the new data.
You can see this from bool(new == patched).
Maybe you can explain patch a little more.
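
One way to picture this: a diff from old to new records the steps that turn old into new, so applying those steps to old reproduces new. The sketch below is a simplified stand-in for patch, not dictdiffer's implementation. `apply_flat_diff` is a hypothetical name, and it handles only top-level keys of a flat dict.

```python
import copy


def apply_flat_diff(diff_result, destination):
    """Apply (action, node, changes) entries to a copy of a flat dict.

    Simplified illustration of patching: only top-level keys are
    supported, which is all the example in this thread needs.
    """
    result = copy.deepcopy(destination)
    for action, node, changes in diff_result:
        if action == "change":
            # node is the key; changes is (old_value, new_value).
            result[node] = changes[1]
        elif action == "add":
            for key, value in changes:
                result[key] = value
        elif action == "remove":
            for key, _value in changes:
                del result[key]
    return result


old = {"price": 4990000, "beds": 2}
new = {"price": 5120000, "floor": 1}
diff_entries = [
    ("change", "price", (4990000, 5120000)),
    ("add", "", [("floor", 1)]),
    ("remove", "", [("beds", 2)]),
]

patched = apply_flat_diff(diff_entries, old)
print(patched == new)
```

Because the result is built on a copy, old itself stays untouched.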

Thanks in advance. See the code and output below, and just ask if you need any more information.

Code:

import dictdiffer
import json


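def json_dump_sort(data):
    # Sort lists in place so the printed output is stable.
    if isinstance(data, list):
        data.sort()
    try:
        return json.dumps(data, indent=2, sort_keys=True, ensure_ascii=False)
    except (TypeError, ValueError):
        # Data that cannot be serialized is reported as None.
        return None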

def main():
    old = {
        "addresse": "Bogusstreet 1, 0154 Bogus, Bogusland",
        "distance": 16.5,
        "duration": 20.9,
        "price": 4990000,
        "last_changed": "2018-01-17 11:17:00",
        "beds": 2,
        "phone": None,
        "sqf": 314,
        "date_observed": "2019-03-28 11:20:34",
        "kode": 444444326,
        "hash": "0fbc5df6412d8d5d1db6caa131a6788c",
        "idx": 84,
        "sold": False,
        "update": None,
        "url": "https://www.somebogusurl.com/?id=444444326"
    }
    new = {
        "addresse": "Bogusstreet 1, 0154 Bogus, Bogusland",
        "floor": 1,
        "price": 5120000,
        "roms": 4,
        "last_changed": "2019-03-19 21:29:00",
        "phone": "+4711223344",
        "date_observed": "2019-03-28 11:20:34",
        "kode": 444444326,
        "hash": "391af0187560a648498978f6fb68f99a",
        "idx": 84,
        "sold": False,
        "update": "2019-06-01 07:38:24",
        "url": "https://www.somebogusurl.com/?id=444444326"
    }
    print("old")
    print(json_dump_sort(old))
    print("\nnew")
    print(json_dump_sort(new))

    print("\ndiffx")
    diffx = dictdiffer.diff(old, new)
    for entry in diffx:
        print(entry)

    print("\ndiffx patched")
    diffx = dictdiffer.diff(old, new)
    patched = dictdiffer.patch(diffx, old)
    print(json_dump_sort(patched))
    print("is new dict same as patched: %s" % bool(new == patched))

    print("\ndiffx with swap")
    diffy = dictdiffer.diff(old, new)
    diffyswap = dictdiffer.swap(diffy)
    for entry in diffyswap:
        print(entry)
    # The loop above consumed the swap() generator, so it is created again here.
    diffyswap = dictdiffer.swap(diffy)
    patched = dictdiffer.patch(diffyswap, old)
    print("\nswapped and patched")
    print(json_dump_sort(patched))


if __name__ == '__main__':
    main()

Output from this:

old
{
  "addresse": "Bogusstreet 1, 0154 Bogus, Bogusland",
  "beds": 2,
  "date_observed": "2019-03-28 11:20:34",
  "distance": 16.5,
  "duration": 20.9,
  "hash": "0fbc5df6412d8d5d1db6caa131a6788c",
  "idx": 84,
  "kode": 444444326,
  "last_changed": "2018-01-17 11:17:00",
  "phone": null,
  "price": 4990000,
  "sold": false,
  "sqf": 314,
  "update": null,
  "url": "https://www.somebogusurl.com/?id=444444326"
}

new
{
  "addresse": "Bogusstreet 1, 0154 Bogus, Bogusland",
  "date_observed": "2019-03-28 11:20:34",
  "floor": 1,
  "hash": "391af0187560a648498978f6fb68f99a",
  "idx": 84,
  "kode": 444444326,
  "last_changed": "2019-03-19 21:29:00",
  "phone": "+4711223344",
  "price": 5120000,
  "roms": 4,
  "sold": false,
  "update": "2019-06-01 07:38:24",
  "url": "https://www.somebogusurl.com/?id=444444326"
}

diffx
('change', 'update', (None, '2019-06-01 07:38:24'))
('change', 'phone', (None, '+4711223344'))
('change', 'last_changed', ('2018-01-17 11:17:00', '2019-03-19 21:29:00'))
('change', 'hash', ('0fbc5df6412d8d5d1db6caa131a6788c', '391af0187560a648498978f6fb68f99a'))
('change', 'price', (4990000, 5120000))
('add', '', [('roms', 4), ('floor', 1)])
('remove', '', [('beds', 2), ('distance', 16.5), ('duration', 20.9), ('sqf', 314)])

diffx patched
{
  "addresse": "Bogusstreet 1, 0154 Bogus, Bogusland",
  "date_observed": "2019-03-28 11:20:34",
  "floor": 1,
  "hash": "391af0187560a648498978f6fb68f99a",
  "idx": 84,
  "kode": 444444326,
  "last_changed": "2019-03-19 21:29:00",
  "phone": "+4711223344",
  "price": 5120000,
  "roms": 4,
  "sold": false,
  "update": "2019-06-01 07:38:24",
  "url": "https://www.somebogusurl.com/?id=444444326"
}
is new dict same as patched: True

diffx with swap
('change', 'update', ('2019-06-01 07:38:24', None))
('change', 'phone', ('+4711223344', None))
('change', 'last_changed', ('2019-03-19 21:29:00', '2018-01-17 11:17:00'))
('change', 'hash', ('391af0187560a648498978f6fb68f99a', '0fbc5df6412d8d5d1db6caa131a6788c'))
('change', 'price', (5120000, 4990000))
('remove', '', [('floor', 1), ('roms', 4)])
('add', '', [('beds', 2), ('distance', 16.5), ('duration', 20.9), ('sqf', 314)])

swapped and patched
{
  "addresse": "Bogusstreet 1, 0154 Bogus, Bogusland",
  "beds": 2,
  "date_observed": "2019-03-28 11:20:34",
  "distance": 16.5,
  "duration": 20.9,
  "hash": "0fbc5df6412d8d5d1db6caa131a6788c",
  "idx": 84,
  "kode": 444444326,
  "last_changed": "2018-01-17 11:17:00",
  "phone": null,
  "price": 4990000,
  "sold": false,
  "sqf": 314,
  "update": null,
  "url": "https://www.somebogusurl.com/?id=444444326"
}
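
A possible reason why "swapped and patched" is identical to old: if diff() and swap() return one-shot iterators, as the comment in the script suggests, then printing the swapped diff also consumes diffy. The second swap(diffy) would then yield nothing, and patching with an empty diff returns an unchanged copy. The sketch below shows that effect with plain generators. `fake_diff` and `fake_swap` are stand-ins, not dictdiffer functions.

```python
def fake_diff():
    """Stand-in for diff(): yields its entries once, then is exhausted."""
    yield ("add", "", [("floor", 1)])
    yield ("remove", "", [("beds", 2)])


def fake_swap(diff_result):
    """Stand-in for swap(): lazily flips add and remove."""
    flipped = {"add": "remove", "remove": "add"}
    for action, node, changes in diff_result:
        yield flipped[action], node, changes


diffy = fake_diff()
first_pass = list(fake_swap(diffy))   # consumes diffy as well
second_pass = list(fake_swap(diffy))  # diffy is already empty

print(len(first_pass), len(second_pass))  # prints: 2 0

# Materializing the diff once as a list makes it reusable.
entries = list(fake_diff())
reusable_first = list(fake_swap(entries))
reusable_second = list(fake_swap(entries))
```

With a list, both passes see the same two entries.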

@oppsig oppsig changed the title Only swap add / remove, not changes, or only swap changes. Better control with swap, select which of (change/add/remove) to swap. Jun 3, 2019
@jirikuncar
Member

I am not sure about the line patched = dictdiffer.patch(diffyswap, old). Shouldn't it be patched = dictdiffer.patch(diffyswap, new)?

My goal here is to merge the old dict with the new, changing keys changed in new data, adding new keys, but also adding keys that were in old but not new.

Have you tried the Merger class from https://github.com/inveniosoftware/dictdiffer/blob/master/dictdiffer/merge.py ?

@oppsig
Author

oppsig commented Jun 20, 2019

Thanks, will do some more tests and update this issue.

@oppsig
Author

oppsig commented Jul 9, 2019

I am just wondering how to use the Merger class.
merged = merge.Merger(lca={}, first=old, second=new, actions={})
I am not sure what to put in lca in this case.
