-
Notifications
You must be signed in to change notification settings - Fork 3.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
highlight differences between assertion LHS & RHS #3420
Conversation
This patch enhances ExUnit.Formatter.format_kind_reason/4 by adding a new `diff:` indicator that visually highlights the differences between the left- and right-hand sides of a failed assertion using git-diff(1). It also registers a new `:diff` configuration option, which lets users specify _additional_ command-line options for the git-diff(1) program. To see this patch in action, try running the following assertions: assert "hello world" == "world wide web" assert %{a: 1, b: 2} == %{b: 2} assert [1, 2, 3] == [4, 2, 6] assert :hello == :world assert 123 == 426 Although these examples are trivial, this functionality is indispensable when comparing complex data structures found in real-world test suites.
this is beautiful. I would like to see the same examples highlighted in the LHS and RHS lines and omitting the diff line (ala github), I have the feeling that they could be simpler to understand provided we are dealing with results longer than just a few chars |
Agreed. I am not a fan of invoking git automatically though and for every assertion. Maybe we can leave |
@eksperimental Injecting highlighted regions parsed from git-diff(1) output back into the original LHS and RHS strings sounds tricky. 😱 But otherwise, the diff-highlight script provided with Git (at In my experience, both styles are useful in their own way and, as you're rightfully observed, sometimes one is more useful than the other. But in terms of this patch, I believe we should continue to provide the original LHS and RHS values unmodified (in case the user wants to copy the values out into IEX for quick manipulation) and not inject potentially syntax-breaking fragments (I'm not speaking about the ANSI color escapes; those are typically ignored by the terminal text selection/copy mechanism) into their midst. 🎯 Thus I believe having an additional |
@josevalim The |
Even though, it still imposes a requirement on git and we certainly should not have that. We could make it configurable but, if it is configurable, nobody will use it. Maybe the best idea is to port Meyers algorithm for diff-ing. More info: |
As for me, it looks nice and might really be helpful, but |
This does indeed look awesome but I agree with the other commentators; we need to implement the diffing algorithms in Elixir. |
I tried to remove the dependency on Oh well, hopefully I'll get a chance to hack on the tdiff library this weekend or whatever. 😰 |
Good news! 🎉 It turns out that, in my haste, I made a mistake in using the tdiff library. 😅 I have now updated this pull request with an additional commit that swaps in tdiff correctly. And the best part 🏆 is that this new git-less aproach works better than my original patch! 💪 Thanks @josevalim for suggesting the tdiff library and everyone for your guidance! 😺 Now, the next obstacle I'm facing is that
Any suggestions on how to fix this (before I hack away indiscriminately at the Makefile)? 😕 |
Hurray! 🎉 The last obstacle has been cleared. 😤 Please review and merge! |
Just as a passive observer is it ok that ex_unit would be taking on a dependency on a third-party library, a non-Elixir one at that, and especially one that hasn't been updated in 4 years? Not that libs have to be continuously updated if they do what they intend to do well, and the fact that it's Erlang vs Elixir is mostly irrelevant, but just taking on a third-party dep seems like a first (unless I've missed something). I guess my impression was that @sunaku would need to implement the diffing algorithm in Elixir, in ex_unit itself. |
@sunaku I didn't mean to use the tdiff library itself but I meant it as an example of how it is possible to write our own. As @bitwalker says, the only way we can get this feature into core is by having no dependencies. Even if it is an Erlang project, it is still code we would have to maintain and there are other complications like licenses (which tdiff doesn't have, so we must always assume we can't use it). |
@sunaku in any case, your current patch shows that using a diff implementation in pure erlang/elixir is a viable option, so good job on validating this assumption before we move forward! :D |
I see now, thanks for clarifying. 😅 I'll implement a small |
💀 After struggling to implement the LCS algorithm, 🚧 building a sequence of edit operations from it, 😱 discovering its abysmal performance, 📝 failing to memoize it after several attempts (it's complicated to pass a HashDict around that can be modified by any level of the recursion in FP 😭), 📚 (re-)reading research papers on LCS / diffing algorithms again and again, 😌 I finally understood the vdelta algorithm well enough to build my own (:innocent: or so I think) variant that appears to work in O(n+m) time and space! 😤 🏆 Here's the code (I still need to document it properly and complete the test suite): https://github.com/sunaku/elixir/blob/ex_unit_diff_dev/lib/ex_unit/lib/ex_unit/diff.ex 🔥 Here's an example of how to use it (notice how fast it is! 😊): ExUnit.Diff.diff(String.graphemes(File.read!("mix.lock")), String.graphemes(File.read!("mix.exs"))) Update: My algorithm made assumptions that don't apply in the general case. 😓 Nice try but no cigar! |
Awesome, good job! In the documentation, don't forget to mention any paper you have used as reference. It would help someone who needs to maintain the code later. :) |
Ping! Let us know if you need help moving forward with this. |
😞 I spent all my free time last week working on my algorithm, rethinking the problem in different ways (directed graph, sparse adjacency list, clustering contiguous sequences, resolving ties with weights, etc.) but no matter what I tried, the dreaded O(m^2) time complexity was in my face at every turn because I could never quite avoid having to compute the LCS (longest common subsequence) of the inputs. On the plus side, I found ways to trick the Erlang tdiff library (because it's greedy) 😇 into returning suboptimal results 💩, so porting it as-is to Elixir isn't something I'd recommend we do. And as expected, Git's diff implementation remains superior in terms of speed as well as resilience to such trickery. 😈 ⌛ Yesterday, I found some newer research on solving the LCS problem in clever ways. Give me another week's time to absorb it and try things out. If it can't wait, proceed with any Elixir diff implementation. 👍 |
No rush, just wondering! I would honestly try to port myers paper, I have no idea how complex it would be though. |
The Myers algorithm is rather complicated (finding "snakes" by traversing from both top and bottom, hoping to meet in the middle). 😅 I'm more hopeful about the newer research on dominant points:
|
Good news! 🎉 I was able to invent a new algorithm to solve this problem quickly (as fast as |
👍 |
@sunaku I am closing this because the PR as is won't be merged. However, we are all anxiously waiting for your updates and new PR! Thank you! |
This patch enhances ExUnit.Formatter.format_kind_reason/4 by adding a
new
diff:
indicator that visually highlights the differences betweenthe left- and right-hand sides of a failed assertion using git-diff(1).
It also registers a new
:diff
configuration option, which lets usersspecify additional command-line options for the git-diff(1) program.
To see this patch in action, try running the following assertions:
Although these examples are trivial, this functionality is indispensable
when comparing complex data structures found in real-world test suites.
📷 Here are some screenshots of the functionality this patch provides:
See also https://sunaku.github.io/minitest-colordiff.html#bin-gdiff