Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Strip trailing whitespace #1422

Merged
merged 27 commits into from
Nov 14, 2019
Merged

Strip trailing whitespace #1422

merged 27 commits into from
Nov 14, 2019

Conversation

sjakobi
Copy link
Collaborator

@sjakobi sjakobi commented Oct 16, 2019

Fixes #183.

This supersedes #955.

TODO:

  • Fix all the other places where we use layoutSmart.
  • Accept new test output.
  • Test that dhall diff handles trailing whitespace in multi-line strings correctly.

@sjakobi sjakobi force-pushed the sjakobi/183-trailing-whitespace branch from e3df29c to bb39d0b Compare October 16, 2019 10:24
@quchen
Copy link
Contributor

quchen commented Oct 21, 2019

Prettyprinter 1.4 is released!

@sjakobi
Copy link
Collaborator Author

sjakobi commented Oct 21, 2019

For the update to hnix, see haskell-nix/hnix#518.

@sjakobi sjakobi changed the title WIP: Strip trailing whitespace Strip trailing whitespace Oct 21, 2019
@sjakobi
Copy link
Collaborator Author

sjakobi commented Oct 21, 2019

There's no compatible hnix release yet. Apart from that, this is ready for review.

Copy link
Collaborator

@Gabriella439 Gabriella439 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great! 🙂

@sjakobi
Copy link
Collaborator Author

sjakobi commented Nov 2, 2019

I just noticed that we can have trailing whitespace in multi-line strings, which this patch would remove.

What would be the best way to preserve that whitespace? We could put it into interpolations, like this:

$ dhall format
''
1                                                                           
2                                                
''
<Ctrl-D>
''
1${"                                                                           "}
2${"                                                "}
''

@Gabriella439
Copy link
Collaborator

Gabriella439 commented Nov 2, 2019

@sjakobi: I think the simplest way to fix this is to change this pull request to have dhall lint strip trailing all trailing whitespace (including trailing whitespace in multi-line strings) and leave dhall format unchanged (i.e. dhall format wouldn't strip any trailing whitespace).

My reasoning is that dhall format should never be making any semantically significant changes to the code and stripping trailing whitespace in a multi-line string is definitely a semantically significant change.

However, I could see dhall lint stripping whitespace since it makes no guarantees of preserving the code's semantics and only has a general mandate to "improve things". For example, we already rewrite let example = x === y to let example = assert : x === y, which is not a semantics-preserving change.

In fact, I actually think we could treat it as a feature if dhall lint strips trailing whitespace from a multi-line string (i.e. it caught and fixed trailing whitespace that a user unintentionally introduced), as long as we advertise it as one of the things that linting is intended to do.

@Gabriella439
Copy link
Collaborator

@sjakobi: Note that there is a downside of my proposal which is:

  • Users may be inclined to lint things they otherwise wouldn't just to strip semantically insignificant whitespace
  • Users who only wish to use dhall format would still not be satisfied

... so I'm still not sure what's the best way to fix that. However, I still think that dhall format should definitely not be stripping whitespace in multi-line strings, so if we really want to fix dhall format then we need to find a way to restrict the stripping to the correct lines.

@sjakobi
Copy link
Collaborator Author

sjakobi commented Nov 3, 2019

Right, I agree that dhall format should not make semantic changes.

Since it is the pretty-printing that usually adds the trailing whitespace in the first place, I also believe that any fix to #183 must involve stripping whitespace as a part of dhall format.

But changing e.g. " " to "${" "}" isn't a semantic change, right? At least Dhall offers no functions that could distinguish these two text literals. So I think that's what we should do when we layout multi-line strings that contain whitespace before a line break.

Whether we change dhall lint to "improve" text literals is IMHO a separate topic.

@Gabriella439
Copy link
Collaborator

@sjakobi: It isn't a semantic change, but it is a syntactic change. Up until now dhall format has never made a change that would affect the binary encoding (as far as I know; there should probably be a property test for that).

@sjakobi
Copy link
Collaborator Author

sjakobi commented Nov 3, 2019

It isn't a semantic change, but it is a syntactic change. Up until now dhall format has never made a change that would affect the binary encoding (as far as I know; there should probably be a property test for that).

I see. I thought that dhall encode would normalize an expression before encoding it but that's not the case:

$ dhall encode --json
" "
[
    18,
    " "
]
$ dhall encode --json
"${" "}"
[
    18,
    "",
    [
        18,
        " "
    ],
    ""
]

dhall hash does though (and I think it makes sense):

$ dhall hash
" "
sha256:27c45a3dfdd033111d28e8907ff3fc72e04d38d75c27518a9b50520b895459cf
$ dhall hash
"${" "}"
sha256:27c45a3dfdd033111d28e8907ff3fc72e04d38d75c27518a9b50520b895459cf

I don't see many options though:

  1. We could change the pretty-printing of text literals, so if they have whitespace before a line break, they are formatted as single-line strings.
  2. We could put the trailing whitespace into interpolations and output a warning when we do that.
  3. We could put the trailing whitespace into interpolations and not output a warning.

My preference is (3) – a semantic change would be bad, but a purely syntactic change in rather rare circumstances, that doesn't affect the hash, should be fine.

@Gabriella439
Copy link
Collaborator

@sjakobi: I still have reservations about putting trailing whitespace in string interpolations. That seems even worse than trailing whitespace outside of them and a regression compared to the status quo. I think the status quo is the least bad option until we can figure out how to fix this properly

@sjakobi
Copy link
Collaborator Author

sjakobi commented Nov 3, 2019

I still have reservations about putting trailing whitespace in string interpolations. That seems even worse than trailing whitespace outside of them and a regression compared to the status quo.

Oh, I'm wondering whether we're talking about the same thing here! I'm definitely not arguing for the whitespace accidentally added by prettyprinter to be preserved! I'm talking about situations like this where the input text literal includes whitespace before line breaks:

$ echo '"                                                  \n                           \n"' | dhall format
''
                                                  
                           
''

I'm arguing that this should be formatted as

''
${"                                                  "}
${"                           "}
''

Do you still see a problem with this?

@Gabriella439
Copy link
Collaborator

@sjakobi: Yeah, I understand. I just think it's going to be very jarring for users to see trailing whitespace preserved in that way

@sjakobi
Copy link
Collaborator Author

sjakobi commented Nov 3, 2019

Yeah, I understand. I just think it's going to be very jarring for users to see trailing whitespace preserved in that way

Ah, ok – I'm a bit surprised though, because I wouldn't expect users to actually want that whitespace in the first place. Once the interpolations make it more visible, I'd expect them to usually just remove it.

What do you think about option 1 then:

We could change the pretty-printing of text literals, so if they have whitespace before a line break, they are formatted as single-line strings.

I also wonder whether we should change the standard to disallow trailing whitespace in multi-line text literals.

@Gabriella439
Copy link
Collaborator

@sjakobi: I don't think we should modify the standard just to work around a limitation specification the Haskell implementation. This seems like something that we should be able to fix with a bit more work on our end.

For example, the prettyprinter package supports annotations. Would it be possible to annotate multi-line literals in such a way that the whitespace stripping logic skips them?

@sjakobi
Copy link
Collaborator Author

sjakobi commented Nov 4, 2019

For example, the prettyprinter package supports annotations. Would it be possible to annotate multi-line literals in such a way that the whitespace stripping logic skips them?

Well, that sounds like a very good idea! 👍 removeTrailingWhitespace does indeed skip annotations. I'll give this a try!

@sjakobi
Copy link
Collaborator Author

sjakobi commented Nov 4, 2019

Oh, boy! I had completely forgotten that text literals are already annotated with literal! Their whitespace is already preserved! I'm so sorry to have wasted your time with this discussion, @Gabriel439! :/

I'll just try to tweak this so if we have an empty line in a multi-line text literal, the whitespace added by prettyprinter up to the indentation level is removed. I'll also add a test.

@Gabriella439
Copy link
Collaborator

@sjakobi: No worries! 🙂

dhall/tests/diff/function.txt Outdated Show resolved Hide resolved
@@ -0,0 +1,7 @@
{ x =
''

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There's trailing whitespace on this line, but GitHub doesn't seem to display it unlike the trailing whitespace after foo on line 5.

This was referenced Nov 10, 2019
nix/shared.nix Outdated
@@ -393,6 +393,10 @@ let
haskellPackagesNew.semigroups
];

hnix =
pkgsNew.haskell.lib.doJailbreak
haskellPackagesOld.hnix;
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So I thought that this would allow hnix to be built with prettyprinter-1.5.1, but Hydra seems to disagree:

Configuring hnix-0.6.1...
CallStack (from HasCallStack):
  die', called at libraries/Cabal/Cabal/Distribution/Simple/Configure.hs:958:20 in Cabal-2.2.0.1:Distribution.Simple.Configure
  configureFinalizedPackage, called at libraries/Cabal/Cabal/Distribution/Simple/Configure.hs:462:12 in Cabal-2.2.0.1:Distribution.Simple.Configure
  configure, called at libraries/Cabal/Cabal/Distribution/Simple.hs:596:20 in Cabal-2.2.0.1:Distribution.Simple
  confHook, called at libraries/Cabal/Cabal/Distribution/Simple/UserHooks.hs:67:5 in Cabal-2.2.0.1:Distribution.Simple.UserHooks
  configureAction, called at libraries/Cabal/Cabal/Distribution/Simple.hs:178:19 in Cabal-2.2.0.1:Distribution.Simple
  defaultMainHelper, called at libraries/Cabal/Cabal/Distribution/Simple.hs:115:27 in Cabal-2.2.0.1:Distribution.Simple
  defaultMain, called at Setup.hs:2:8 in main:Main
Setup: Encountered missing dependencies:
prettyprinter >=1.2.1 && <1.3

Could you take a look @Gabriel439?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sjakobi: I pushed a change to fix the Nix build

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @Gabriel439! :)

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@sjakobi: You're welcome! 🙂

@sjakobi sjakobi merged commit dedd5e0 into master Nov 14, 2019
@sjakobi sjakobi deleted the sjakobi/183-trailing-whitespace branch November 14, 2019 13:43
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

dhall-format inserts trailing whitespace
3 participants