highlight differences between assertion LHS & RHS #3420

sunaku · 2015-06-22T18:58:03Z

This patch enhances ExUnit.Formatter.format_kind_reason/4 by adding a
new diff: indicator that visually highlights the differences between
the left- and right-hand sides of a failed assertion using git-diff(1).

It also registers a new :diff configuration option, which lets users
specify additional command-line options for the git-diff(1) program.

To see this patch in action, try running the following assertions:

assert "hello world" == "world wide web"
assert %{a: 1, b: 2} == %{b: 2}
assert [1, 2, 3] == [4, 2, 6]
assert :hello == :world
assert 123 == 426

Although these examples are trivial, this functionality is indispensable
when comparing complex data structures found in real-world test suites.

📷 Here are some screenshots of the functionality this patch provides:

See also https://sunaku.github.io/minitest-colordiff.html#bin-gdiff


eksperimental · 2015-06-22T19:10:27Z

this is beautiful.

I would like to see the same examples with the differences highlighted directly in the LHS and RHS lines and the diff line omitted (à la GitHub). I have the feeling that this would be simpler to understand when the results are longer than just a few characters.

josevalim · 2015-06-22T19:30:46Z

> this is beautiful.

Agreed.

I am not a fan of invoking git automatically though and for every assertion. Maybe we can leave git diff running as a port and then write to it command by command and get the output as needed?

sunaku · 2015-06-22T19:36:32Z

@eksperimental Injecting highlighted regions parsed from git-diff(1) output back into the original LHS and RHS strings sounds tricky. 😱 But otherwise, the diff-highlight script provided with Git (at /usr/share/doc/git/contrib/diff-highlight/diff-highlight on my system) seems to fulfill your preferred diffing style. In contrast, this patch implements the alternative, character-wise style of diffing. 😅

In my experience, both styles are useful in their own way and, as you've rightly observed, sometimes one is more useful than the other. But for this patch, I believe we should continue to show the original LHS and RHS values unmodified (in case the user wants to copy them into IEx for quick manipulation) and not inject potentially syntax-breaking fragments into their midst. (I don't mean the ANSI color escapes; those are typically ignored by the terminal's text selection/copy mechanism.) 🎯

Thus I believe having an additional diff: indicator with character-wise diffs (as this patch currently does) is a reasonable alternative to injecting difference highlights back into the LHS and RHS values.
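To make the trade-off above concrete, here is a minimal sketch of rendering a separate diff: line from an edit script while leaving the LHS and RHS strings untouched. It is not the code from this patch: the module name DiffLineSketch, the `[-...-]` / `{+...+}` markers, and the `[eq: ..., del: ..., ins: ...]` script format are assumptions made for illustration.

```elixir
defmodule DiffLineSketch do
  # Hypothetical helper, not part of ExUnit: turns an edit script into one
  # colored line. Deletions are red, insertions are green.
  def render(script) do
    script
    |> Enum.map(fn
      {:eq, text} -> text
      {:del, text} -> IO.ANSI.red() <> "[-" <> text <> "-]" <> IO.ANSI.reset()
      {:ins, text} -> IO.ANSI.green() <> "{+" <> text <> "+}" <> IO.ANSI.reset()
    end)
    |> Enum.join()
  end

  # Rebuild each side from the script, to show that the originals can be
  # printed unmodified next to the diff line.
  def left(script), do: for({op, text} <- script, op in [:eq, :del], into: "", do: text)
  def right(script), do: for({op, text} <- script, op in [:eq, :ins], into: "", do: text)
end

# Hand-written script for: assert "hello world" == "world wide web"
script = [del: "hello ", eq: "world", ins: " wide web"]

IO.puts("lhs:  " <> inspect(DiffLineSketch.left(script)))
IO.puts("rhs:  " <> inspect(DiffLineSketch.right(script)))
IO.puts("diff: " <> DiffLineSketch.render(script))
```

Because the markers live only on the diff: line, the lhs: and rhs: lines stay valid Elixir terms that can be pasted into IEx.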

sunaku · 2015-06-22T19:39:04Z

@josevalim The git diff command is only run when an assertion fails and both LHS and RHS are present. To my knowledge, it isn't possible to have the program stay alive to receive more diffing requests after it handles the initial one: it simply dies afterward and must be launched again for the next request.
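For readers following along, the one-process-per-comparison behavior described above can be sketched like this. It is an assumption-laden illustration, not the code from the patch: the module name GitDiffSketch, the temporary-file layout, and the choice of `--word-diff=plain` with `--word-diff-regex=.` are all hypothetical, and the `extra_opts` argument only stands in for whatever the patch's :diff option passes through.

```elixir
defmodule GitDiffSketch do
  # Hypothetical helper: writes both sides to temporary files and launches
  # a fresh `git diff --no-index` process for this single comparison.
  def diff(left, right, extra_opts \\ []) do
    suffix = System.unique_integer([:positive])
    dir = Path.join(System.tmp_dir!(), "git_diff_sketch_#{suffix}")
    File.mkdir_p!(dir)
    lhs = Path.join(dir, "lhs")
    rhs = Path.join(dir, "rhs")

    try do
      File.write!(lhs, left <> "\n")
      File.write!(rhs, right <> "\n")

      args =
        ["diff", "--no-index", "--word-diff=plain", "--word-diff-regex=."] ++
          extra_opts ++ ["--", lhs, rhs]

      # With --no-index, git exits with status 1 when the files differ,
      # so a non-zero status is not an error here.
      {output, status} = System.cmd("git", args, stderr_to_stdout: true)
      {status, output}
    after
      File.rm_rf!(dir)
    end
  end
end

# Only try it when git is actually installed.
if System.find_executable("git") do
  {_status, output} = GitDiffSketch.diff("hello world", "world wide web")
  IO.puts(output)
end
```

Each call pays for a process launch plus two file writes, which is the cost being debated in this thread.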

josevalim · 2015-06-22T19:40:06Z

Even so, it still imposes a requirement on git, and we certainly should not have that. We could make it configurable but, if it is configurable, nobody will use it.

Maybe the best idea is to port the Myers algorithm for diffing. More info:

lexmag · 2015-06-22T19:54:12Z

As for me, it looks nice and might really be helpful, but git-diff just makes it feel like a first step toward the https://pbs.twimg.com/media/Bmm42jWCMAAEVPS.jpg.

ericmj · 2015-06-22T20:24:27Z

This does indeed look awesome, but I agree with the other commenters: we need to implement the diffing algorithms in Elixir.

sunaku · 2015-06-22T21:43:48Z

I tried to remove the dependency on git diff in commit sunaku@b4d0788 using the tdiff library @josevalim suggested, but the results are too coarse-grained (word-level differences are lost 😱):

Oh well, hopefully I'll get a chance to hack on the tdiff library this weekend or whatever. 😰

sunaku · 2015-06-23T01:50:56Z

Good news! 🎉 It turns out that, in my haste, I made a mistake in using the tdiff library. 😅

I have now updated this pull request with an additional commit that swaps in tdiff correctly.

And the best part 🏆 is that this new git-less approach works better than my original patch! 💪

Thanks @josevalim for suggesting the tdiff library and everyone for your guidance! 😺

Now, the next obstacle I'm facing is that make test_ex_unit in the root folder is failing due to the newly added tdiff dependency inside the lib/ex_unit/mix.exs file:

** (UndefinedFunctionError) undefined function: :tdiff.diff/2 (module :tdiff is not available)

Any suggestions on how to fix this (before I hack away indiscriminately at the Makefile)? 😕

sunaku · 2015-06-23T03:19:44Z

Hurray! 🎉 The last obstacle has been cleared. 😤 Please review and merge!

https://github.com/tomas-abrahamsson/tdiff

bitwalker · 2015-06-23T05:10:56Z

Just as a passive observer: is it OK that ex_unit would be taking on a dependency on a third-party library, a non-Elixir one at that, and especially one that hasn't been updated in 4 years? Not that libs have to be continuously updated if they do what they intend to do well, and the fact that it's Erlang vs Elixir is mostly irrelevant, but just taking on a third-party dep seems like a first (unless I've missed something). I guess my impression was that @sunaku would need to implement the diffing algorithm in Elixir, in ex_unit itself.

josevalim · 2015-06-23T06:36:04Z

@sunaku I didn't mean to use the tdiff library itself but I meant it as an example of how it is possible to write our own. As @bitwalker says, the only way we can get this feature into core is by having no dependencies. Even if it is an Erlang project, it is still code we would have to maintain and there are other complications like licenses (which tdiff doesn't have, so we must always assume we can't use it).

josevalim · 2015-06-23T06:37:15Z

@sunaku in any case, your current patch shows that using a diff implementation in pure erlang/elixir is a viable option, so good job on validating this assumption before we move forward! :D

sunaku · 2015-06-23T13:16:29Z

I see now, thanks for clarifying. 😅 I'll implement a small ExUnit.Diff module next. 👷

sunaku · 2015-06-25T07:25:34Z

💀 After struggling to implement the LCS algorithm, 🚧 building a sequence of edit operations from it, 😱 discovering its abysmal performance, 📝 failing to memoize it after several attempts (it's complicated to pass a HashDict around that can be modified by any level of the recursion in FP 😭), 📚 (re-)reading research papers on LCS / diffing algorithms again and again, 😌 I finally understood the vdelta algorithm well enough to build my own (😇 or so I think) variant that appears to work in O(n+m) time and space! 😤

🏆 Here's the code (I still need to document it properly and complete the test suite):

https://github.com/sunaku/elixir/blob/ex_unit_diff_dev/lib/ex_unit/lib/ex_unit/diff.ex

🔥 Here's an example of how to use it (notice how fast it is! 😊):

ExUnit.Diff.diff(String.graphemes(File.read!("mix.lock")), String.graphemes(File.read!("mix.exs")))

Update: My algorithm made assumptions that don't apply in the general case. 😓 Nice try but no cigar!
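The starting point that the comment above describes, building an edit script from the longest common subsequence, can be sketched as follows. This is the textbook dynamic-programming version with O(n*m) time and space, not the faster variant the author was working on. The module name LcsDiffSketch and the `{:eq | :del | :ins, element}` script format are assumptions made for illustration, and the stepped ranges require Elixir 1.12 or newer.

```elixir
defmodule LcsDiffSketch do
  # Returns an edit script such as [del: 1, ins: 4, eq: 2].
  def diff(left, right) when is_list(left) and is_list(right) do
    # Tuples give constant-time indexed access to the elements.
    l = List.to_tuple(left)
    r = List.to_tuple(right)
    walk(l, r, 0, 0, build_table(l, r), [])
  end

  # table[{i, j}] holds the LCS length of the suffixes starting at l[i] and
  # r[j]. It is filled bottom-up; missing keys count as 0 (past the end).
  defp build_table(l, r) do
    n = tuple_size(l)
    m = tuple_size(r)

    for i <- (n - 1)..0//-1, j <- (m - 1)..0//-1, reduce: %{} do
      table ->
        value =
          if elem(l, i) == elem(r, j) do
            1 + Map.get(table, {i + 1, j + 1}, 0)
          else
            max(Map.get(table, {i + 1, j}, 0), Map.get(table, {i, j + 1}, 0))
          end

        Map.put(table, {i, j}, value)
    end
  end

  # Walk the table from the top-left corner, emitting one operation per step.
  defp walk(l, r, i, j, table, acc) do
    n = tuple_size(l)
    m = tuple_size(r)

    cond do
      i == n and j == m ->
        Enum.reverse(acc)

      i < n and j < m and elem(l, i) == elem(r, j) ->
        walk(l, r, i + 1, j + 1, table, [{:eq, elem(l, i)} | acc])

      j == m or (i < n and Map.get(table, {i + 1, j}, 0) >= Map.get(table, {i, j + 1}, 0)) ->
        walk(l, r, i + 1, j, table, [{:del, elem(l, i)} | acc])

      true ->
        walk(l, r, i, j + 1, table, [{:ins, elem(r, j)} | acc])
    end
  end
end

IO.inspect(LcsDiffSketch.diff([1, 2, 3], [4, 2, 6]))
# → [del: 1, ins: 4, eq: 2, del: 3, ins: 6]

# Character-wise use, as in the mix.lock example above:
IO.inspect(LcsDiffSketch.diff(String.graphemes("hello"), String.graphemes("help")))
```

Building the table bottom-up avoids the problem of threading a memo dictionary through every level of a recursion, at the price of always filling all n*m cells.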

josevalim · 2015-06-25T08:12:44Z

Awesome, good job! In the documentation, don't forget to mention any paper you have used as reference. It would help someone who needs to maintain the code later. :)

josevalim · 2015-06-29T18:02:01Z

Ping! Let us know if you need help moving forward with this.

sunaku · 2015-06-29T20:09:36Z

😞 I spent all my free time last week working on my algorithm, rethinking the problem in different ways (directed graph, sparse adjacency list, clustering contiguous sequences, resolving ties with weights, etc.) but no matter what I tried, the dreaded O(m^2) time complexity was in my face at every turn because I could never quite avoid having to compute the LCS (longest common subsequence) of the inputs.

On the plus side, I found ways to trick the Erlang tdiff library (because it's greedy) 😇 into returning suboptimal results 💩, so porting it as-is to Elixir isn't something I'd recommend we do. And as expected, Git's diff implementation remains superior in terms of speed as well as resilience to such trickery. 😈

⌛ Yesterday, I found some newer research on solving the LCS problem in clever ways. Give me another week's time to absorb it and try things out. If it can't wait, proceed with any Elixir diff implementation. 👍

josevalim · 2015-06-29T20:15:20Z

No rush, just wondering! I would honestly try to port the Myers paper, though I have no idea how complex that would be.
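As a point of reference for what a Myers-style edit script looks like, later Elixir releases ship `List.myers_difference/2` (since 1.4) and `String.myers_difference/2` (since 1.3) in the standard library. The sketch below only calls those two functions on the sample assertions from this thread; the `rebuild` helper is a hypothetical name added for illustration, and nothing here is the code that was under review in this PR.

```elixir
# Both functions return a keyword list of :eq, :del and :ins chunks.
list_script = List.myers_difference([1, 2, 3], [4, 2, 6])
string_script = String.myers_difference("hello world", "world wide web")

# Hypothetical helper: rebuild one side of a list comparison by keeping
# only the chunks tagged with the given operations.
rebuild = fn script, keep ->
  for {op, items} <- script, op in keep, item <- items, do: item
end

IO.inspect(list_script, label: "list script")
IO.inspect(string_script, label: "string script")
IO.inspect(rebuild.(list_script, [:eq, :del]), label: "left side")
IO.inspect(rebuild.(list_script, [:eq, :ins]), label: "right side")
```

Keeping the :eq and :del chunks reproduces the left-hand side and keeping :eq and :ins reproduces the right-hand side, which is the property a formatter needs in order to highlight both values.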

sunaku · 2015-06-29T20:23:58Z

The Myers algorithm is rather complicated (finding "snakes" by traversing from both top and bottom, hoping to meet in the middle). 😅 I'm more hopeful about the newer research on dominant points:

Majid Sazvar, Mahmoud Naghibzadeh, and Nayyereh Saadati. 2012. Quick-MLCS: a new algorithm for the multiple longest common subsequence problem. In Proceedings of the Fifth International C* Conference on Computer Science and Software Engineering (C3S2E '12). ACM, New York, NY, USA, 61-66. DOI=10.1145/2347583.2347591 http://doi.acm.org/10.1145/2347583.2347591

sunaku · 2015-07-07T19:21:53Z

Good news! 🎉 I was able to invent a new algorithm to solve this problem quickly (as fast as git diff but with higher quality results). 😤 I'm refining it now and will soon write an article and then a paper on it. 🎓 Expect an update (with the final code for this PR) in a few weeks' time. 🚀 Thanks for your patience! 😅

williamgueiros · 2015-07-07T20:51:55Z

👍

josevalim · 2015-09-13T10:48:15Z

@sunaku I am closing this because the PR as is won't be merged. However, we are all anxiously waiting for your updates and new PR! Thank you!

sunaku force-pushed the ex_unit_diff branch from 91180fd to a8ba4f6 Compare June 22, 2015 19:04

sunaku force-pushed the ex_unit_diff branch from ba0f689 to 5a68ad5 Compare June 23, 2015 03:17

sunaku force-pushed the ex_unit_diff branch from 5a68ad5 to 6c6cf65 Compare June 23, 2015 03:23

remove git dependency using "tdiff" Erlang library

1a4e054

https://github.com/tomas-abrahamsson/tdiff

sunaku force-pushed the ex_unit_diff branch from 6c6cf65 to 1a4e054 Compare June 23, 2015 03:33

josevalim closed this Sep 13, 2015

sunaku mentioned this pull request Apr 30, 2016

Add difference highlighting to ExUnit #4430

Merged

2 tasks
