Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Display violations detected (fix #7) #44

Merged
merged 3 commits into from
Feb 10, 2023

Conversation

ulysses4ever
Copy link
Contributor

This looks like this:

❯ cat test-show-violation/bug.hs 
import Data.List

main = 
	print "Hi!" -- the first character is tab;



❯ $fix-whitespace --check test-show-violation/bug.hs
[ Violation detected ] test-show-violation/bug.hs:
line 3] main·=·
line 4] →print·"Hi!"
line 6] <empty line>
line 7] <empty line>



So, I display line numbers and also make visible all spaces and tabs in the violating lines, so that it's easier to see what's happening, in particular, I show:

  • spaces as middots
  • tabs as right arrows
  • trailing empty lines as <empty line>

@ulysses4ever ulysses4ever force-pushed the show-violation branch 3 times, most recently from 9713c27 to 9e770be Compare February 9, 2023 03:55
@andreasabel
Copy link
Member

@ulysses4ever : Thanks for taking this on!

While you are redoing the check output, I wonder whether we should switch to a standard error message format. I only found the GNU standard when googling for such a thing: https://www.gnu.org/prep/standards/html_node/Errors.html
It starts like this:

Error messages from compilers should look like this:

sourcefile:lineno: message

This could lead to an awful repetition of sourcefile, but the advantage is that there is some tooling for this kind of error format. E.g., emacs can pick up the locations from such error and get you to the relevant file on a click.
Do you know of any other standards (is there one for VS Code)?

So, I display line numbers and also make visible all spaces and tabs in the violating lines, so that it's easier to see what's happening, in particular, I show:

  • spaces as middots

Approved!

  • tabs as right arrows

At least in the context of Agda, this could be confusing. Could they displayed as ^I instead or <TAB>, maybe in some other color? Or maybe as the respective number of spaces but with background color red? (This is how emacs does it for me.)

  • trailing empty lines as <empty line>

In general, any violation could be rendered with some error background color, like red.
The terminal package has a function to test whether the terminal supports colors and formatting: https://hackage.haskell.org/package/pretty-terminal-0.1.0.0/docs/System-Console-Pretty.html#v:supportsPretty

@ulysses4ever
Copy link
Contributor Author

There are several things that I address in order below, but tldr; is: I can add file:line (but not file:line:col-range) and change representation of tabs as a part of this PR, and all the rest I suggest we take as a future work.

General comment. I'm generally open to anything that (at least, barely) works for my use case, which is: looking at the output in CI and figuring out what's wrong by inspecting the output. Note that this is quite different from your (implied) use-case: Emacs integration, but perhaps one is not mutually exclusive with another. Another thing: I'd prefer to merge any minimal-working-prototype and iterate in another PRs, if possible. I'm happy to apply quick fixes as a part of this PR. But I will probably not have enough time to implement sufficient improvements here.

Format of errors. I was rather looking into something like errata, so something prettier rather than something that would please Emacs. I'm not opposed to file:line:column-range. The bigger issue here is that the current patch doesn't track columns. That would require quite some surgery on the lowest-level functions that do the actual fixing. I am not willing to implement it as a part of this PR. I can switch to file:line if that would be a tradeoff that works for you. Also, I use Emacs myself and have no idea about VS Code standard (but my Emacs applies all these fixes automagically already, without fix-whitespaces, and I'm not going to integrate it into my Emacs workflow).

Colors. That would be nice but I've never used terminal libraries, so the amount of my work needed here exceeds what I can contribute as a part of this PR.

Tabs. I don't care much how exactly tab is rendered. I find ^I not very user friendly (I personally had no idea that it has to do with tabs), but I could live with it. Just to make sure, when you say:

Could they displayed as ^I instead or

you mean "instead of →"? Because I don't use literal <TAB> anywhere. By the way, if we go for more than character (which I initially wanted to avoid but that's no big deal), I think literal <TAB> would be more user friendly than ^I. What do you think? Again, I'm generally open to anything that doesn't require big changes.

@andreasabel andreasabel added this to the 0.1 milestone Feb 9, 2023
@andreasabel
Copy link
Member

Thanks again, @ulysses4ever:

  • Concerning errors, I think we should settle to the file:line: text format then (as we don't know any other standard).
  • Concerning tabs, ^I is the general control character rendering, e.g. when you cat a binary file. See also https://unix.stackexchange.com/questions/434193/display-tab-characters-as-i
  • There is also \t which is familiar to many. How do you like this? There could be \n for trailing newlines to pair up with this.
  • I am also fine with <TAB> and <NEWLINE> or something similar.
  • Concerning colors, sure I can look into this myself later.

Details like renderings of tabs and trailing newlines can also be changed easily at a later stage.

Btw, if you rebase onto master, you get a working CI again (it was broken upstream).

  • Please allow transformers < 0.7 since GHC 9.6.0 is now part of the CI.
  • I don't understand why constraint solving for GHC <= 8.4 fails. Somehow a too new yaml and resourcet are picked with only support GHC >= 8.6.

@ulysses4ever
Copy link
Contributor Author

Yeah, I know the control sequence business in general, and it was fun to read the SE thread you referenced (thanks for the reference!). cat's example is solid, although I'm not sure that many people know/use -t. In general, I don't believe many people (especially younger ones) recognize ^I well. Otoh, \t is an interesting idea. For me, it looks a bit weird outside source code string literals, though. So, my current plan is to use <TAB>, if you don't mind.

I'll bump the bound, thanks for noticing.

Question: do you want this output in both --check and fix mode? Currently, it works that way. I can do --check only, if you like. I don't have a preference.

@andreasabel
Copy link
Member

So, my current plan is to use <TAB>, if you don't mind.

Sure, works for me!

Question: do you want this output in both --check and fix mode? Currently, it works that way. I can do --check only, if you like. I don't have a preference.

Ideally, in --check mode these lines would be prefixed with a cross mark to indicate and error, and in fix mode with checkmark to indicate the fix.

@ulysses4ever
Copy link
Contributor Author

Ideally, in --check mode these lines would be prefixed with a cross mark to indicate and error, and in fix mode with checkmark to indicate the fix.

By "prefixed", do you mean:

x Main.hs:4: main·=·

or

Main.hs:4: x main·=·

I'm no big fun of it because it doesn't seem to improve either substance or presentation. Maybe I just don't know how to do it pretty. But I can add it. Just tell me which one.

Currently, I don't store the fixed lines, only the original ones. How about I only print problematic lines in the --check mode and leave printing fixed lines in the normal mode for future work?


I don't understand why constraint solving for GHC <= 8.4 fails.

My current guess is that I have transformers >= 0.5.6.2 which appeared in GHC 8.6.4. If you want 8.4, you need 0.5.5. Unfortunately, that version doesn't have Control.Monad.Trans.Writer.CPS. I can switch to Control.Monad.Trans.Writer.Strict which is less efficient (if you believe documentation).

@andreasabel
Copy link
Member

By prefixed I meant beginning of the line, so it looks like a checklist. But ok to skip this if you don't think it helps.

How about I only print problematic lines in the --check mode and leave printing fixed lines in the normal mode for future work?

Yes, that's fine.

My current guess is that I have transformers >= 0.5.6.2 which appeared in GHC 8.6.4. If you want 8.4, you need 0.5.5. Unfortunately, that version doesn't have Control.Monad.Trans.Writer.CPS

Ok, I see. Generally, it is fine to drop such old GHCs if they cause complications. After all, fix-whitespace is an executable, and it even has a binary distribution.

At the moment, fix-whitespace isn't optimized. E.g. it has to keep the whole file in memory (due to the reverse . dropWhile . reverse on the lines of the file). If efficiency was a concern, not needing to store everything in memory would maybe be the first thing to work on.

@andreasabel andreasabel self-assigned this Feb 10, 2023
@andreasabel andreasabel added enhancement New feature or request UX User experience labels Feb 10, 2023
@andreasabel andreasabel merged commit eb7062c into agda:master Feb 10, 2023
@ulysses4ever ulysses4ever deleted the show-violation branch February 10, 2023 18:02
@ulysses4ever
Copy link
Contributor Author

Thanks a lot, Andreas, for engaging and helping to nail it! A hackage release would be appreciated, so that we start ripping the benefits :-)

@ulysses4ever
Copy link
Contributor Author

@andreasabel would it be possible to have a Hackage release with this patch?

@andreasabel
Copy link
Member

I wanted to add a testsuite before that (#45), with goldplate, but disappeared into a rabbit hole last Saturday when trying to upgrade goldplate to also support Windows. Hence the delay, my apologies.
I might just settle for tasty-golden instead, which anyway has a lighter dependency footprint.

@ulysses4ever
Copy link
Contributor Author

Oh, so you're on it, good. No rush, of course.

Just one thought: if tests take more time than expected and you want to do no more than one release in the near future, then, perhaps, releasing without tests and then adding them only on Github is no big deal: I think tests are less important on Hackage...

@andreasabel
Copy link
Member

andreasabel commented Feb 17, 2023

@ulysses4ever : While testing, I observed that the following violation does not produce an error:

* Ensure that the file ends in a newline character.

I have to investigate.
Atm, you cannot push a fix because I am in a big restructuring process in order to accommodate the tasty suite.

Ah, I understand why. After breaking the file into lines, we cannot observe anymore whether the last line ended in a newline character. So maybe we have to implement our own line-breaking function which gives an error if the last line does not end with newline.

@andreasabel
Copy link
Member

@ulysses4ever : I also realize that performance degrades for large files. E.g. with a file that has 200,000 lines of just a space character, we get the following runtimes:

Before:

$ time fix-whitespace-0.0.10 --check 200000-violations.txt 2> /dev/null

real	0m0.134s
user	0m0.092s
sys	0m0.038s

With the PR:

$ time fix-whitespace-0.1.0 --check 200000-violations.txt 2> /dev/null

real	0m30.314s
user	0m12.238s
sys	0m17.980s

So we regress from 0.1s to 30s for this example.

I suspect there is a quadratic factor in using the Writer monad with a list [a]. I think I try a diff-list instead: https://hackage.haskell.org/package/base-4.17.0.0/docs/Data-Monoid.html#t:Endo

@ulysses4ever
Copy link
Contributor Author

Do you mean it's a regression after this patch?

@andreasabel
Copy link
Member

Yes.

@ulysses4ever
Copy link
Contributor Author

Wow, a lot of exciting data! Could you share the file you used for benchmarking. There's a trick with Writer [a] that usually helps (using censor instead of tell)...

@andreasabel
Copy link
Member

I think I try a diff-list instead: hackage.haskell.org/package/base-4.17.0.0/docs/Data-Monoid.html#t:Endo

This does not help... (no observable timing difference to just lists).

@andreasabel
Copy link
Member

@ulysses4ever: I pushed my work on testsuites and reported the two issues discussed here:

I think these should be fixed before releasing. Would you have time to work on them? If yes, please drop a comment at the respective issue(s). (I am not sure how much time for it I have from tomorrow on.)

@ulysses4ever
Copy link
Contributor Author

Of course, this should not be released as is. I'll try to work on fixing it. I can't promise any timeline though.

@ulysses4ever
Copy link
Contributor Author

Just to make sure: performance regression does come from this patch, but the issue with the trailing newline in the last line has nothing to do with it?

@andreasabel
Copy link
Member

Since this patch, removeFinalEmptyLinesExceptOne uses the length function, which is a strictification, so it could well be the culprit.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request UX User experience
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants