Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add option to sort attributes in HTML output. #2097

Closed
wants to merge 1 commit into from

Conversation

lmartelli
Copy link

This is helpful to compare documents and consider documents equal if the only difference is attributes order.

@jhy
Copy link
Owner

jhy commented Jan 6, 2024

Hi,

If the goal is to:

consider documents equal if the only difference is attributes order

Why not make this a comparator function of an Element / node? E.g. we have equals() and hasSameValue().

And can you provide some detail on why that's useful / required? What documents do you have that you are testing for this and why is the attribute sort order the only unstable (?) thing about them? Trying to understand the utility of it.

@lmartelli
Copy link
Author

Hi,

If the goal is to:

consider documents equal if the only difference is attributes order

Why not make this a comparator function of an Element / node? E.g. we have equals() and hasSameValue().

And can you provide some detail on why that's useful / required? What documents do you have that you are testing for this and why is the attribute sort order the only unstable (?) thing about them? Trying to understand the utility of it.

I would like to use this for approval tests. The order of attributes can be changed by the developer in a template, or it can change depending on the version of the template engine, but it should have no impact on the outcome of the test. But if the value of an attribute changes, the HTML must validated again, because it could have an impact on the rendered document. Ideally, I would render the HTML, take a snapshot image, and compare the images pixel by pixel, but that would be slower and more complicated. Comparing the HTML while ignoring changes that are known not to have an impact on the rendered document looks like a good compromise.

With the proposed patch, I could do somehting like this :

val actualHtml = Jsoup.parse(actual);
actualHtml.outputSettings().sortAttributes(false);
val expectedHtml = Jsoup.parse(expected);
expectedHtml.outputSettings().sortAttributes(true);
if (actualHtml.outerHtml().equals(expectedHtml.outerHtml()))
	return VerifyResult.SUCCESS;
else
	return VerifyResult.FAILURE;

But it could indeed be a comparator.

@jhy
Copy link
Owner

jhy commented Jul 1, 2024

Thanks for the context, the use case makes sense. On review, I am choosing not to include this feature as-is.

Here's an example of how to implement it in user-space:

import org.jsoup.Jsoup;
import org.jsoup.nodes.Attribute;
import org.jsoup.nodes.Document;

import java.util.Comparator;
import java.util.List;

class Scratch {
    public static void main(String[] args) {
        String html = "<p beta=1 alpha=2>One\n<p foo=3 bar=4>Two";
        Document doc = Jsoup.parse(html);
        System.out.println("Original: ");
        System.out.println(doc.html());

        doc.stream().forEach(el -> {
            List<Attribute> attributes = el.attributes().asList();
            el.clearAttributes();
            attributes.stream().sorted(Comparator.comparing(Attribute::getKey)).forEach(attr -> {
                el.attributes().put(attr);
            });
        });

        System.out.println("\n\nAfter sorting: ");
        System.out.println(doc.html());
    }
}

Giving:

Original: 
<html>
 <head></head>
 <body>
  <p beta="1" alpha="2">One</p>
  <p foo="3" bar="4">Two</p>
 </body>
</html>


After sorting: 
<html>
 <head></head>
 <body>
  <p alpha="2" beta="1">One</p>
  <p bar="4" foo="3">Two</p>
 </body>
</html>

And because this modifies the order in the DOM, the other DOM comparators will then consider the trees equal.

@jhy jhy closed this Jul 1, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants