Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fun with line endings #22

Closed
wants to merge 5 commits into from
Closed

Fun with line endings #22

wants to merge 5 commits into from

Conversation

chkpnt
Copy link
Contributor

@chkpnt chkpnt commented May 7, 2016

Recently I've run into trouble using spotless for a project that uses git.

With the default line ending setting PLATFORM_NATIVE spotless works on both Linux and Windows, as long as git checks out the relevant files with CRLF-line-endings under Windows. If someone uses git without its eol-conversion, PLATFORM_NATIVE won't work on Windows.
Setting the line ending to UNIX is only an option in cases, where you can ensure the repository is only used on Linux or checked-out as-is on Windows.

But even without git, the current line ending setting options lack flexibility: Imagine a project, where CRLFs as well as LFs are used for text files. For example shell-scripts are sometimes forced to use LFs even on Windows.

I like your design of having different formatter steps which can be added to the plugin and configured by a FormatExtension 😃. So why not doing the same for the EOL-stuff? I see two thinks to do:

  1. Introduce the possibility to derive the requested line ending from the current file
  2. Rework the normalization into its own step, so line ending setting can be set per format

Removing the normalization from Formatter is non-trivial, as all the currently existing steps rely on getting a String normalized to LFs. But it is necessary, as a LineEndingStep that retrieves already normalized strings doesn't make sense.

I'm attaching an implementation proposal. I have splitted it in 5 (hopefully comprehensible) commits. Please let me know what you think. If there is something I should rework, don't hesitate to bring it up. :-)

The proposal is fully backward compatible to the current version. Instead of PLATFORM_NATIVE, I would prefere having DERIVED as the default, but this kind of API-change should be your decision.


This change is Reviewable

@nedtwigg
Copy link
Member

nedtwigg commented May 7, 2016

First off - imagine if the world deprecated windows newlines forever! Dagen H was the day that Sweden started driving on the right-hand side of the road rather than the left, and my dream is to see a "Dagen Newline" before the end of my career.

But that day isn't today, so thanks for this excellent contribution! I am 100% on board with adding support for .gitattributes and/or DERIVED, and your code is high quality.

I see two issues in the design, both of which are easily addressable.

Formatter implementations should see text that contains only unix newlines

Dagen Newline isn't today for end-users, but it is for authors of FormatterSteps. If we pass the raw text to people's custom regexes and custom steps, then we're pushing the requirement to handle both kinds of newline onto every formatter implementation and custom regex. Inevitably, people will do something that works for their case, but breaks when someone from the other side of the \r debate checks out their code. Right now Spotless fixes this by sanitizing the input to FormatterSteps, and this PR breaks this feature, and requires that every FormatterStep know what a LineEndingService is and how to use it, rather than just knowing what a String is.

If we want to add per-format line endings, the way to do it is for FormatExtension to a have a lineending property. If it's set to DERIVED, then before the FormatterTasks run, we record the current line ending, and set it back to what it needs to be after they have run. I think an old version of Spotless did this, but I removed it because it was more complicated, I didn't need it, and I wasn't sure if anyone else was gonna use spotless or not.

LineEnding.UNDEFINED vs .gitattributes

If I pass this string public class SomeClass { public void someMethod(){} } to the java formatter with DERIVED, how should the formatter resolve the UNDEFINED, since the EclipseFormatterStep will add newlines? The only way for the user to retain full control is to have DERIVED_FALLBACKTOUNIX and DERIVED_FALLBACKTOWINDOWS, which is ugly. I'm trying to be really pedantic about keeping edge-cases out of Spotless, and this DERIVED / UNDEFINED complicates things quite a bit.

An alternative to LineEnding.DERIVED is to add support for parsing a .gitattributes file.

# Set the default behavior, in case people don't have core.autocrlf set.
* text=auto

# Explicitly declare text files you want to always be normalized and converted
# to native line endings on checkout.
*.c text
*.h text

# Declare files that will always have CRLF line endings on checkout.
*.sln text eol=crlf
# Declare files that will always have LF line endings on checkout.
*.sh text eol=lf

DiffPlug's day job is making a git client, so I've already got the infrastructure to do this.

How to procede

I've got a few questions for you:

  1. If spotless supported .gitattributes, would that solve your problem? If so, I'd like to leave LineEnding.DERIVED out of it.

  2. Eclipse 4.6 is scheduled for release on June 22. I'm gonna have to dig in around then to fix some things anyway. How pressing is it for you that Spotless support either DERIVED or .gitattributes before then? If it's not pressing, then I'll wait until 4.6 comes out and do it all at once.

@chkpnt
Copy link
Contributor Author

chkpnt commented May 8, 2016

Thanks for your review. And I can comprehend your concerns.

Using sanitized Strings for the FormatterSteps is a suitable design decission. But I don't think this decission contradicts the usage of FormatterSteps for the normalization-part: If we did the sanitizing-normalization as an implicit first step, our other and the custom steps woudn't need to know the LineEnding and could work as they work today. The destination-normalization could be an implicit or explicit last step, which is configured by the per-format or global line ending setting.

I like the idea of using a .gitattributes-file. Even if the project doesn't use git, writing and understanding such a file is imho very easy. And it allows a fine-grained control over the line endings, more than a simple DERIVED could provide. Only one note: if a file was tagged as binary or -text, what should Spotless do?

It is not pressing for me. In my day-job-project, Spotless is currently only activated in the build process on my local machine; I would like to enable Spotless globally, but a few months won't matter.

PS: For Mac users, Dagen Newline was in 2001 when OS X was released. :-)
PPS: Git in Eclipse doesn't support .gitattributes, this is one reason I'm primary using SourceTree / Git Bash on Windows. But there is hope: the recently released JGit 4.3 introduced .attributes-support :-)

@chkpnt
Copy link
Contributor Author

chkpnt commented May 8, 2016

Initially, I've opend this issue as an issue. But unfortunately, I've converted this issue to an pull request, when I've tried to "attach" my branch. An issue would be more suitable for the future proceed. I've just looked if it's possible to do a reconvertion, but it's not. 😔

@nedtwigg
Copy link
Member

nedtwigg commented May 8, 2016

if a file was tagged as binary or -text, what should Spotless do?

Probably spit a warning that the Spotless config and .gitattributes are in conflict, and ignore these files until the user fixes the conflict.

The destination-normalization could be an implicit or explicit last step, which is configured by the per-format or global line ending setting.

Yep! In order to support DERIVED, you have to read the files before normalization, which means you can't use the standard FormatterStep infrastructure. But with .gitattributes you can still use the infrastructure as-is.

That said, I'm inclined to keep this functionality out of the FormatterSteps. The FormatterSteps currently are guaranteed to receive only unix newlines, and the spec requires that they output only unix newlines. This new gitattributes rule would have to be the last step, or it would break the other steps that were expecting to only receive unix newlines.

By keeping the line handling entirely outside of the FormatterSteps, we preserve the illusion of Dagen Newline.

PS: For Mac users, Dagen Newline was in 2001 when OS X was released.

See, it can be done! For a while I think the rumor was that Windows 8 would be Microsoft's Dagen Newline, but it never came to pass. :'(

An issue would be more suitable for the future proceed

I agree, so I'm closing this PR and invite you to continue the discussion at #23.

@nedtwigg
Copy link
Member

Just wanted to follow up that Spotless 2.0.0 has full support for .gitattributes, core.lf, and all of git's newline handling.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants