Add way to compare strings more fully ignoring whitespace. #2440

Closed
Clockwork-Muse opened this issue Dec 13, 2021 · 17 comments

Comments

@Clockwork-Muse

Currently, when comparing strings, options like ignoreLineEndingDifferences and ignoreWhiteSpaceDifferences require at least one instance of the relevant characters on both sides:

// Succeeds
Assert.Equal("    ", " ", ignoreWhiteSpaceDifferences: true);
// Fails
Assert.Equal("    ", String.Empty, ignoreWhiteSpaceDifferences: true);

I'm looking for a way to compare two strings while ignoring whitespace differences entirely, so that the second assertion also succeeds.

@bradwilson
Member

I would not want this to affect the existing behavior (for example, Assert.Equal("A B", "AB", ignoreWhiteSpaceDifferences: true) fails, and this is expected behavior). Instead, I think we should have a different flag (like maybe ignoreAllWhiteSpace), and make sure the XML docs for the two flags make it clear how they differ, and refer to the other flag when appropriate.

@bakermo

bakermo commented Jan 23, 2022

Hello! I'd like to contribute. Would you mind if I took a stab at this?

@mongeon
Contributor

mongeon commented May 11, 2022

@bakermo Did you work on this? If not, can I grab it from you :)

@bakermo

bakermo commented May 12, 2022

@mongeon I did not end up taking it, go right ahead :)

@VictorLlanir

Hello, @mongeon, did you work on this issue? If not, I would like to contribute.

@mongeon
Contributor

mongeon commented Jun 20, 2022

> Hello, @mongeon, did you work on this issue? If not, I would like to contribute.

No, didn't go far on this. Please feel free to contribute to this.

@gy-soft

gy-soft commented Jul 6, 2022

This new behavior seems equivalent to removing all white space from actual and expected before comparing. Am I understanding this correctly?

@Clockwork-Muse
Author

@gy-soft - Yes, that's correct.
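
The equivalence confirmed here can be sketched as a small helper. This is only an illustration of the intended semantics, not xUnit's implementation: `WhiteSpaceSketch`, `StripHorizontalWhiteSpace`, and `EqualIgnoringAllWhiteSpace` are hypothetical names. Using `char.IsWhiteSpace` to decide what counts as white space is an assumption. Carriage returns and line feeds are deliberately kept, because later comments in this thread explain that line endings are handled by a separate flag.

```csharp
using System;
using System.Linq;

public static class WhiteSpaceSketch
{
    // Hypothetical helper: removes every character that char.IsWhiteSpace
    // reports as white space, except CR and LF, which are left in place.
    public static string StripHorizontalWhiteSpace(string value) =>
        new string(value
            .Where(c => !char.IsWhiteSpace(c) || c == '\r' || c == '\n')
            .ToArray());

    // Compares the two strings after stripping, using an ordinal comparison.
    public static bool EqualIgnoringAllWhiteSpace(string expected, string actual) =>
        string.Equals(
            StripHorizontalWhiteSpace(expected),
            StripHorizontalWhiteSpace(actual),
            StringComparison.Ordinal);
}
```

Under this sketch, `"    "` and `String.Empty` compare as equal, and so do `"A B"` and `"AB"`, which is why the thread asks for a new flag rather than a change to ignoreWhiteSpaceDifferences.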

@jhowlett-scottlogic
Contributor

@VictorLlanir did you start working on this?

@jhowlett-scottlogic
Contributor

I have a possible solution for this that also addresses Equal(Span), but I'm waiting on a reply to a workflow query.

@bradwilson
Member

Merged. Thanks!

@wizofaus

wizofaus commented Feb 1, 2024

Doesn't work as I expected - Assert.Equal("<a>\n</a>", "<a> </a>", ignoreAllWhiteSpace: true); still fails?

@bradwilson
Member

@wizofaus Carriage returns and line feeds aren't treated as white space; they're treated as line ending characters. It looks like you're trying to use this to compare HTML, which has its own rules about when line ending characters are equivalent to white space, and that isn't something we directly support. Just one example of a place where this rule is "inconsistent" is when HTML is parsing text which is pre-formatted (i.e., inside of <pre> or <code> or something equivalent) vs. normally formatted.

You may be able to get away with converting line ending characters into white space characters if you know your data doesn't have pre-formatted content, but this is likely to be full of potential edge cases. I would strongly suggest finding a third-party library that's designed to compare HTML equivalence.

@wizofaus

wizofaus commented Feb 1, 2024

Well, XML technically, but in my book linefeeds are whitespace (unless specified otherwise). Anyway, yes, with a replace on \n (and \r) characters first it works as expected. For the purposes of my test, it was sufficient.
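
The workaround described in this comment can be sketched as follows. `LineEndingSketch` and `LineEndingsToSpaces` are hypothetical names introduced for illustration. As noted earlier in the thread, this is only reasonable when the data has no content where line endings are significant.

```csharp
using System;

public static class LineEndingSketch
{
    // Hypothetical helper: turns each CR and LF into a space so that a
    // comparison which ignores (horizontal) white space also ignores them.
    public static string LineEndingsToSpaces(string value) =>
        value.Replace('\r', ' ').Replace('\n', ' ');
}
```

In a test, both arguments would be passed through the helper before the assertion from the earlier comment, for example `Assert.Equal(LineEndingSketch.LineEndingsToSpaces("<a>\n</a>"), "<a> </a>", ignoreAllWhiteSpace: true);`, assuming an xUnit version that includes the ignoreAllWhiteSpace flag.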

@bradwilson
Member

> in my book linefeeds are whitespace

I think the easiest way I can explain the difference is: when we say white space, we mean horizontal white space. We treat line endings (aka vertical white space) as a distinct thing, with its own flag.

I will update the XML documentation for the assertions to make this clearer.

Thanks!

@Clockwork-Muse
Author

@wizofaus - You still shouldn't be doing it with string comparisons, because there are additional transformations that might be made that still result in semantically identical documents. The most likely one being the order of attributes.
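
One way to compare XML without depending on attribute order is sketched below with System.Xml.Linq. This is an illustration under assumptions, not something proposed in the thread: `XmlCompareSketch`, `SortAttributes`, and `SemanticallyEqual` are hypothetical names, and the sketch only normalizes attribute order. Other equivalent forms, such as differing namespace prefixes, are not handled.

```csharp
using System;
using System.Linq;
using System.Xml.Linq;

public static class XmlCompareSketch
{
    // Rebuilds the element with its attributes sorted by name, recursing into
    // child elements, so that attribute order no longer affects the comparison.
    private static XElement SortAttributes(XElement element) =>
        new XElement(
            element.Name,
            element.Attributes().OrderBy(a => a.Name.ToString(), StringComparer.Ordinal),
            element.Nodes().Select(n => n is XElement child ? SortAttributes(child) : n));

    // Parses both fragments and compares the normalized trees node by node.
    public static bool SemanticallyEqual(string expectedXml, string actualXml) =>
        XNode.DeepEquals(
            SortAttributes(XElement.Parse(expectedXml)),
            SortAttributes(XElement.Parse(actualXml)));
}
```

`XElement.Parse` with its default options should also drop whitespace-only text between tags, which would cover the line feed case raised above, but that behavior is worth verifying against the data being tested.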

@wizofaus

wizofaus commented Feb 1, 2024

> @wizofaus - You still shouldn't be doing it with string comparisons, because there are additional transformations that might be made that still result in semantically identical documents. The most likely one being the order of attributes.

Sure, if I really did want to check that two complete XML documents were semantically equivalent despite order of attributes, etc. In my particular case the test is far easier to read and understand as it is, just confirming that an XML fragment has the expected elements and attributes in the given order. Anyway, happy to consider the current behaviour as "by design".
