Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add new checker ListEquals #174

Open
wants to merge 6 commits into
base: master
Choose a base branch
from
Open

Conversation

barrettj12
Copy link

@barrettj12 barrettj12 commented May 31, 2024

The current DeepEquals checker is not great at comparing lists and maps, and if they are not equal, DeepEquals generally fails to give a complete and concise description of the difference between them.

This is WIP to add a new checker which compares two lists for equality. In the case they are different, we essentially use a "diff" algorithm to find the difference between them. Thus we can give a much more useful description of the difference between two slices, e.g.

... obtained []int32 = []int32{65, 66, 68, 69, 90, 71, 72, 73, 76, 74, 75}
... expected []int32 = []int32{65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75}
... difference:
    - at index 3: missing element 67
    - at index 6: obtained element 90, expected 70
    - at index 9: unexpected element 76

The canonical example of where this could be useful in Juju would be the supported series tests in older versions of Juju, when a new Ubuntu version is released.

Here's the output using gc.DeepEquals (spot the difference):

gc.DeepEquals output

Screenshot from 2024-06-13 15-20-07

Here's the output using jc.DeepEquals (even worse):

jc.DeepEquals output

Screenshot from 2024-06-13 15-20-33

And the output from jc.ListEquals:

jc.ListEquals output

Screenshot from 2024-06-13 15-35-29

@dimaqq
Copy link

dimaqq commented May 31, 2024

I wonder if the following format would be more readable, though it probably only works for reasonably short arrays:

... obtained []int32 = []int32{65, 66,     68, 69, 90, 71, 72, 73, 76, 74, 75}
... expected []int32 = []int32{65, 66, 67, 68, 69, 70, 71, 72, 73,     74, 75}
...                                    ^^^         ^^^             ^^^

@barrettj12
Copy link
Author

barrettj12 commented May 31, 2024

@dimaqq : it's a nice idea. Unfortunately I don't think we have any control over the ... obtained ... expected section, since that is controlled by the go-check package. Nor do I have much interest in spending a whole lot of time getting super fancy with ASCII art and the like :)

@manadart
Copy link
Member

manadart commented Jun 5, 2024

We already have the SameContents checker, which is order independent.

@barrettj12
Copy link
Author

@manadart thanks, I'm aware of that one, this one is different though because it is order dependent.

Add some table driven tests, benchmarks and fuzzing.
- precheck for comparable types in ListEquals
- correctly handle elems added/missing at start
e.g. a []string cannot equal an []any, even if they have the same contents
- fix display of changed diffs
- add some more test cases
@barrettj12 barrettj12 marked this pull request as ready for review June 13, 2024 03:42
Copy link
Member

@tlm tlm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like a cool idea. I would just ask that a best effort is made to explain the data structure more and what is going on. With fresh eyes this took me 20 minutes to understand and debug in my head.

}

// "Traceback" to calculate the diff
var diffs []diff
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems far more simpler to just make diffs a []string. We don't use the information in the diff for anything else other then generating the string at printing time. Can we not just calculate it at the time we make the diff and store the result?

At the moment anything that implements the Stringer interface could be a diff also.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah, you're probably right

func generateDiff(obtained, expected reflect.Value) string {
// lenLCS[m][n] stores the length of the longest common subsequence of
// obtained[:m] and expected[:n]
lenLCS := make([][]int, obtained.Len()+1)
Copy link
Member

@tlm tlm Jun 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we can clean up the memory allocations here. My 20 minute grok of this suggest we are blowing i.j.8 bytes for every check we perform when we can get away with (i+j).8 bytes.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The full table is needed at the end to "trace backwards" and calculate the entire diff. Yes it's O(n*m) in space and time, the onus is on the user to not use this checker for gigantic arrays.

There might be slight optimisations to be made here but I don't think anything more efficient than O(n*m).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants