Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better diff for asserts on dicts #1531

Closed
SuperDoxin opened this issue Apr 22, 2016 · 15 comments · Fixed by #11571
Closed

Better diff for asserts on dicts #1531

SuperDoxin opened this issue Apr 22, 2016 · 15 comments · Fixed by #11571
Labels
topic: rewrite related to the assertion rewrite mechanism type: enhancement new feature or API change, should be merged into features branch

Comments

@SuperDoxin
Copy link

Currently in some edge cases py.test returns utterly useless diffs for asserts on dicts, it doesn't provide the context needed to find which parts are differing in deeply nested dicts.

A better, and simpler, solution would be to use a json serializer to serialize the dicts and then use difflibs unified diff to generate the final diff. this solution will produce a diff in an easy to read format that people are familiar with and provides enough context to find the right place in the dict where changes need to be made.

any objects not json serializable in the dict can be adequately dealt with by using repr as serializer.

this whole function could be replaced with ~5 lines of code.

@RonnyPfannschmidt
Copy link
Member

would you like to turn that into a pull request?

also note, that for "proper" diff of nested structures one needs a path to the object being diffed as well

@SuperDoxin
Copy link
Author

I'd love to, but I can't seem to figure out the correct way to run pytest from source.

@The-Compiler
Copy link
Member

I usually do something like:

$ virtualenv .venv
$ ./.venv/bin/pip install -e .
$ ./.venv/bin/python -m pytest ...

And to run the tests/linters, simply run tox.

@TimDaub
Copy link

TimDaub commented Sep 1, 2016

I'd also be interested in this :)

@nicoddemus nicoddemus added the type: enhancement new feature or API change, should be merged into features branch label Sep 1, 2016
@SuperDoxin
Copy link
Author

I've had a good look at this recently, and it turns out to be a way bigger project than I had initially estimated. I'm afraid I can't quite find the time to get this done. sorry.

@nicoddemus
Copy link
Member

No problem, thanks for the heads up.

miracle2k added a commit to miracle2k/pytest that referenced this issue Apr 13, 2018
@sabrown89
Copy link

I am interested in tackling this. Just now starting to look into it.

@Zac-HD Zac-HD added the topic: rewrite related to the assertion rewrite mechanism label Oct 21, 2018
@four43
Copy link

four43 commented Oct 15, 2020

Just to add to this old (but important to me!) issue - it seems if you call assert {} == {} from within a function that is called from a test, pytest doesn't really help you. If I move my assertions to within the main test body they work great, but if they are called via a function it doesn't help.

@The-Compiler
Copy link
Member

@four43 That is unrelated to this issue and intended - pytest's assertion introspection/rewriting is only active for test_*.py files (and a few more cases) by default, to avoid changing the behavior of your code under test.

See the docs for details:

@four43
Copy link

four43 commented Oct 15, 2020

Hey okay that's pretty slick. Thanks for the tip, I bet that solves my problem. I appreciate it.

@m-aciek
Copy link

m-aciek commented Nov 5, 2020

Just for the record (and inspiration) there are two pytest plugins that prints better dict comparisons:

Source: https://eshlox.net/2020/04/07/better-diffs-in-pytest

@nicoddemus
Copy link
Member

Thanks @m-aciek,

I had a few minutes so I tried pytest-dictsdiff, here's the output using the example on the README:

E               Failed:  => CHANGE: at key 'dob.age' values are different | Left: 44 | Right: 34
E                => CHANGE: at key 'dob.date' values are different | Left: '1983-11-04T01:21:14Z' | Right: '1953-11-04T01:21:04Z'
E                => CHANGE: at key 'location.city' values are different | Left: 'tholen' | Right: 'Tholen'
E                => CHANGE: at key 'location.postcode' values are different | Left: 64504 | Right: 64509
E                => CHANGE: at key 'name.first' values are different | Left: 'zeyneb' | Right: 'Zeyneb'
E                => CHANGE: at key 'name.last' values are different | Left: 'elfring' | Right: 'Elfring'

Style aside, the diff is readable to me.

While at it decided to take a stab at @SuperDoxin's idea of using json+difflib (using the same data sample from pytest-dictsdiff):

def test_compare_dicts():
    import json
    left = json.dumps(RESULT, indent=2, sort_keys=True)
    right = json.dumps(EXPECTED_DATA, indent=2, sort_keys=True)

    import difflib
    diff = difflib.unified_diff(left.splitlines(True), right.splitlines(True), fromfile="left", tofile="right")
    assert 0, "\n" + "".join(diff)

Here's the output:

E       AssertionError:
E         --- left
E         +++ right
E         @@ -1,8 +1,8 @@
E          {
E            "cell": "(056)-022-8631",
E            "dob": {
E         -    "age": 44,
E         -    "date": "1983-11-04T01:21:14Z"
E         +    "age": 34,
E         +    "date": "1953-11-04T01:21:04Z"
E            },
E            "email": "zeyneb.elfring@example.com",
E            "gender": "female",
E         @@ -11,12 +11,12 @@
E              "value": "36180866"
E            },
E            "location": {
E         -    "city": "tholen",
E         +    "city": "Tholen",
E              "coordinates": {
E                "latitude": "46.8823",
E                "longitude": "175.8856"
E              },
E         -    "postcode": 64504,
E         +    "postcode": 64509,
E              "state": "groningen",
E              "street": "2074 adriaen van ostadelaan",
E              "timezone": {
E         @@ -34,8 +34,8 @@
E              "uuid": "37e30c59-bc79-4172-aac6-e2c640e165fa"
E            },
E            "name": {
E         -    "first": "zeyneb",
E         -    "last": "elfring",
E         +    "first": "Zeyneb",
E         +    "last": "Elfring",
E              "title": "mrs"
E            },
E            "nat": "NL",
E
E       assert 0

I like it too, but this has the problematic downside that json doesn't handle custom objects, and AFAIK there's no easy way to generalize it.

LanceEa pushed a commit to emissary-ingress/emissary that referenced this issue Jan 26, 2023
When unit testing diagd most of the tests us the `econf_compile` helper
which by default will assert that no aconf errors existed. This is good
but it requires puttng a debugger on it to grab the errors.

This prints out the object so that when the test is run outside of
an IDE with a debugger the errors can be seen. Altough its not neatly diffed/printed
due to being a nested structure. There is an issue to improve this but looks
to be still open: pytest-dev/pytest#1531.

Signed-off-by: Lance Austin <laustin@datawire.io>
LanceEa pushed a commit to emissary-ingress/emissary that referenced this issue Feb 2, 2023
* testing: provide assertion message for aconf errors

When unit testing diagd most of the tests us the `econf_compile` helper
which by default will assert that no aconf errors existed. This is good
but it requires puttng a debugger on it to grab the errors.

This prints out the object so that when the test is run outside of
an IDE with a debugger the errors can be seen. Altough its not neatly diffed/printed
due to being a nested structure. There is an issue to improve this but looks
to be still open: pytest-dev/pytest#1531.

* fix: envoy config generate when Host.hostname includes port

When trying to expose a Host with a hostname that includes a port such
as 'example.com:8500', causes 404 NR's due to the way envoy was
configured.

In the v1.14, we used to group everything under a wildcard virtual_host
domain "*" and did not include sni matching  on the Filter Chain. This
allowed that configuration to work but had the downside of causing
large memory usage and slower route matching due to lumping all
routes into a single virtual host.

If the v2.Y and v3.Y, series this was addressed by creating seperate
Filter Chains for each host. A non-tls Host would get a single Filter
Chain with multiple virtual_hosts per Host. A Host with TLS would
produce a 1-1 FilterChain and Virtualhost. This works fine in most
cases when downstream clients are connecting on standard ports
(80/443) but when a client needs to connect on something like
example.com:8500 this would effectively generate a Filter Chain
that could never match on an incoming request.

In the non-tls scenario there are no changes to what gets generated. In
the TLS scenario we now parse the hostname into two entites, sni and
virtual_host.domain. So, example.com:8500 would have an sni of
example.com and a virtual_host.domain of example.com:8500.
We create a Filter Chain for the sni value and add a virtual host for the
domain value. If a second Host with just `example.com` is provided as
well then we will attempt to merge these into a single Filter Chain with
multiple virtual_host. We can only do this when the same TLSContext
is used because we wouldn't know which attributes to take from which
host since all the transport_socket settings are at the Filter Chain level.

This restores existing behavior in a backwards compatiabile way and
doesn't try to solve the Developer Experience (DX) with the way the
Host is currently designed.

Signed-off-by: Lance Austin <laustin@datawire.io>
@BenjaminSchubert
Copy link
Contributor

@nicoddemus I believe the reason to use json dumping here is that the json output will format the diff with consistent { on their own lines and all of those small tweaks?

If so, I believe it should be doable to extend/override the pretty printer used by pytest to have the same behavior. It might be quite some code due to how the PrettyPrinter is implemented, but would continue to work as today for everything else.

I'd be happy to attempt a solution with prettyprinter if no-one is actively working on this and pytest would be fine with that added code

@nicoddemus
Copy link
Member

@nicoddemus I believe the reason to use json dumping here is that the json output will format the diff with consistent { on their own lines and all of those small tweaks?

Exactly. 👍

If so, I believe it should be doable to extend/override the pretty printer used by pytest to have the same behavior. It might be quite some code due to how the PrettyPrinter is implemented, but would continue to work as today for everything else.

That would be amazing!

I'd be happy to attempt a solution with prettyprinter if no-one is actively working on this and pytest would be fine with that added code

Oh yes, please go ahead, thanks a lot.

@BenjaminSchubert
Copy link
Contributor

#11537 would be my proposed solution

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
topic: rewrite related to the assertion rewrite mechanism type: enhancement new feature or API change, should be merged into features branch
Projects
None yet
10 participants