Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

XML line wrapping of long lines not as expected #723

Closed
wborn opened this issue Oct 21, 2020 · 8 comments
Closed

XML line wrapping of long lines not as expected #723

wborn opened this issue Oct 21, 2020 · 8 comments
Labels

Comments

@wborn
Copy link

wborn commented Oct 21, 2020

We use Spotless to also format long lines in XML files.
It seems to have a strange way of wrapping long lines in XML.

Have a look at this Lorem Ipsum example in original.xml that definitely needs some wrapping.

So I have configured Spotless 2.5.0 to wrap XML at 120 chars (see pom.xml/xml_files.prefs).

After running mvn spotless:apply on original.xml the result is original-spotless.xml.

Though I would have expected it to reformat the XML in original.xml as expected.xml.

Spotless also thinks expected.xml still does not pass the check.
If you run mvn spotless:apply on expected.xml it is transformed into expected-spotless.xml.

I've tested this with Spotless 2.5.0, Maven 3.6.3 on Ubuntu 20.04.

@nedtwigg
Copy link
Member

It looks like the Eclipse XML formatter doesn't "reflow" the text. It inserts a line break if a line is too long, but doesn't remove line breaks if a line is too short.

It also looks like it does the line-length requirement first, and after that it puts the content on its own line, after the <tag>, resulting in a jagged right edge.

I agree that this formatter behavior is not very good, but it's probably not a "bug". I think it would be better if the formatter parsed the XML then pretty-printed it, rather than applying rule after rule, but that's a different spec.

Spotless is just a switchboard out to other formatters that do the real work. You could file a bug report / feature request with Eclipse WTP for their XML formatter, but that would probably not be fruitful.

Better would be to add some other XML formatter to Spotless, either by writing a new one or integrating an existing one.

@wborn
Copy link
Author

wborn commented Oct 21, 2020

Do you know if the XML formatter used in Eclipse IDE is a different one? Because if I open the original.xml file in Eclipse (2020-09 (4.17.0)) and hit CTRL+SHIFT+F to format it, it looks nice and pretty as expected. :-)

@nedtwigg
Copy link
Member

We just released support for 4.17.0. So perhaps if you upgrade to the latest spotless that will fix it. You might need to set the version manually, which we show how to do in the docs.

@fvgh
Copy link
Member

fvgh commented Oct 21, 2020

I just tired with my local version of Photon and with 2020-09. Both producing quite shabby results, as you mentioned in your bug description.

In 2020-09, I started off with:

Wild Web Developer XML tools 0.10.2.202007231627 org.eclipse.wildwebdeveloper.xml.feature.feature.group Eclipse Wild Web Developer project

Afterwards I tried the WTP:

Eclipse Java Web Developer Tools 3.19.0.v202008250115 org.eclipse.jst.web_ui.feature.feature.group Eclipse Web Tools Platform

In the end the Wild Web Developer XML tools are just a port of the XML part of Eclipse Java Web Developer Tools.

But I see no way the tool can behave as you expected. Since no XSD is specified, all nodes are just treated as xs:string.
The stronghold of Eclipse formatter is its good awareness of XSD/DTD.
But in your case I think even a XSD will not help, since you expect that artificial paragraphs (carriage returns) are taken into account.
The only thing you can improve is the usage of a xs:normalizedString. In that case Eclipse will normalize all white spaces first and than redo an indentation as requested. But this will of course also remove your carriage return.
I always declare free text nodes where I expect "ASCII art" as PCDATA.
In that case the Eclipse formatter does not help (it just does not change anything within the node), but you can use an additional Spotless step to check line length.

If you want to check how behaviour change by using xs:normalizedString, you can simply use XML comments instead of the description node. Eclipse formats comments by default as xs:normalizedString.

Sorry, I see no way you used Eclipse WTP XML formatter and produced the result you mentioned in your previous comment.
Are you sure that you haven't installed a different editor plugin for XML in your Eclipse 2020-09?
Maybe we can make a feature request to support some other Eclipse editor?

@wborn
Copy link
Author

wborn commented Oct 21, 2020

Thanks for all your help!

So perhaps if you upgrade to the latest spotless that will fix it.

I updated the version in the pom.xml of the example but unfortunately it doesn't produce better results.

I always declare free text nodes where I expect "ASCII art" as PCDATA.

We have hundreds of XML files with these sometimes long descriptions. So while it might work it will likely introduce a lot of overhead on these descriptions. Usually they are only a couple of lines long. If it would just be a few files being impacted by this it might have been a good option.

Are you sure that you haven't installed a different editor plugin for XML in your Eclipse 2020-09?

I think the editor is:

Eclipse XML Editors and Tools 3.19.0.v202008192217 org.eclipse.wst.xml_ui.feature.feature.group Eclipse Web Tools Platform

Here you can see it in action:

1

The result looks quite good. But upon closer inspection the lines are 130 chars long while it is configured to wrap at 120 chars. 🤔

@fvgh
Copy link
Member

fvgh commented Oct 22, 2020

@wborn When you trigger the formatting twice in Eclipse, you will see that the WTP does the formatting again, this time adding line breaks to the lines which are longer than 120 characters. This is a bug in Eclipse, not in Spotless.

But that bug is still not the source of your problem, since still the WTP only adds line breaks, but does not remove existing line breaks in an xs:string node. So yes, for your particular test data and your particular test configuration it looks nice after the first formatting, but not after the second one.

Anyhow, as @nedtwigg already stated, we are just integrating the Eclipse formatters into Spottless. So we just consider things a Spotless bug if the formatter behaviour changes due to our integration.

So why don't you see the first formatted version when you trigger Spotless only once?
Spotless aims for stable formatter result. That's why it triggers the formatting at least twice. Details are described in PADDEDCELL.md.

@wborn
Copy link
Author

wborn commented Oct 24, 2020

When you trigger the formatting twice in Eclipse, you will see that the WTP does the formatting again
So why don't you see the first formatted version when you trigger Spotless only once?
Spotless aims for stable formatter result. That's why it triggers the formatting at least twice

Thanks for the explanation and I indeed see my pretty XML getting messed up again when I reformat the formatted XML a second time. If it would wrap the lines immediately at the right line length the output will probably become stable and the issue with formatting XML with Spotless would also be solved. I will try to find the right WTP issue tracker for this. If you already know its location please let me know. :-)

@fvgh
Copy link
Member

fvgh commented Oct 26, 2020

I never reported a bug for WTP project. Not sure whether you need an Eclipse account. This WTP guide to the bug workflow describes everything.

@fvgh fvgh closed this as completed Oct 26, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

3 participants