Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Code formatter for Bicep #823

Merged
merged 16 commits into from
Nov 5, 2020
Merged

Code formatter for Bicep #823

merged 16 commits into from
Nov 5, 2020

Conversation

shenglol
Copy link
Contributor

@shenglol shenglol commented Nov 4, 2020

The implementation is largely based on the first section of A prettier printer by Waldler. The algorithm described in the paper is considered a state-of-art in Prettyprint. It is also used by the one of the most popular code formatters, Prettier.

For now all formatting rules are hard coded, and it won't split expression based on some line width limit, so it's not "pretty". This is fine since #146 is still open. But once we decide to make the Bicep syntax less newline sensitive we should implement lint splitting algorithm based on section 2~5 of Waldler's paper.

This closes #220.

Demo:
code-formatting


long indentSize = request.Options.TabSize;
IndentKindOption indentKindOption = request.Options.InsertSpaces ? IndentKindOption.Space : IndentKindOption.Tab;
bool insertFinalNewline = request.Options.ContainsKey("insertFinalNewline") && request.Options.InsertFinalNewline;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

InsertFinalNewline won't work until microsoft/vscode-languageserver-node@4d07d47 is released.

We should also remove the ContainsKey guard once OmniSharp/csharp-language-server-protocol#411 is fixed.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll have a hotfix ready for this as 0.18.3, as that's a pretty big break.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@david-driscoll Thanks for the quick turn around!

return Task.FromResult<TextEditContainer?>(null);
}

long indentSize = request.Options.TabSize;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any ability in VSCode to set a default spacing for the language? If possible, I'd like the majority of our samples to be formatted with indent = 2, indentkind = space.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, the following settings should do the work:

    "[bicep]": {
        "editor.tabSize": 2,
        "editor.insertSpaces": true
    }

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice!

Copy link
Member

@anthony-c-martin anthony-c-martin Nov 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is that specified in an individual user's settings or picked up as a default when the extension is in use? I'm wondering whether it's possible to make the extension opinionated about formatting for .bicep files, and give users the usual ability to override it.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree with what anthony is proposing - it would be good if this can be the default setting for the extension

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on this as well.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It would be up to the users to provide the settings. I also agree that if we can provide some default formatting settings so users can get a consistent experience. My question is should we do that via some custom VSCode bicep settings, or we read it from some external config file, say a .editorconfig file (or a bicep project file if we are going to support that)?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sorry I was wrong. It turns out we can add the following section to the extension manifest in package.json - no need to invent custom VSCode settings:

    "configurationDefaults": {
      "[bicep]": {
        "editor.tabSize": 2,
        "editor.insertSpaces": true
      }
    },

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a great find!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've just done tests with editorconfig as well. As long as the editorconfig VSCode extension is installed, it will override the tabSize and insertSpaces settings temporarily for the current opened workspace, which means we don't even need to read editorconfig files explicitly by ourself, so we can close this one :)

@@ -42,5 +45,7 @@ public static string EscapeBicepString(string value)

return buffer.ToString();
}

public static bool IsMultilineString(string value) => NewLineRegex.Matches(value).Count >= 2;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IsMultilineString [](start = 27, length = 17)

So something like "\nstuff" or "stuff\n" is considered a single line string?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm...good question. I was bad at naming stuff. Will change it to something like HasAtLeastTwoNewlines.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No worries - new name works.

{
private const int MinIndentSize = 1;

private const int MaxIndentSize = 8;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MaxIndentSize [](start = 26, length = 13)

Can LSP give us higher indent sizes? If yes, why would we limit it in the formatter?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It can. On second thought I was overthinking. I thought we should add some constrains in case users set a really big tab size like 100000, but apparently VSCode accepts that and outputs it upon pressing Tab...although it takes a fews seconds to complete the insertion.
image

If users decide to be creative, we should support them 💪. I'll remove the limits.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Lol, I'm glad you tested that :)

Copy link
Contributor Author

@shenglol shenglol Nov 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fun fact - after digging deeper I realized that you can overflow the tab size with a huge number:
image
...which seems to be a bug of VSCode?

Also VSCode does have a minimum tab size of 1. Now I start to think maybe it's a good idea to keeping both limits again, but maybe we can increase the upper boundary a bit to, say, 1000?

@majastrz @anthony-c-martin what do you think?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense to me.


In reply to: 517730586 [](ancestors = 517730586)

private static readonly IReadOnlyDictionary<int, string> IndentsBySize = Enumerable.Range(1, 8)
.ToDictionary(size => size, size => new string(' ', size));

public static string? PrintProgram(ProgramSyntax programSyntax, PrettyPrintOptions options)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PrintProgram [](start = 30, length = 12)

I haven't read the rest of the code yet, but I'm wondering something... We have some cases where we need to produce dynamically generated multi-line snippet completions. I want to make sure we adhere to the user's formatting settings (spaces, indent size, etc.) when we do so the completion doesn't produce code that looks out of place. How hard would be to modify this in the future to support formatting a subtree of the AST? (Out of scope of this PR, of course.)

Copy link
Contributor Author

@shenglol shenglol Nov 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can change the method to take a SyntaxBase, but the tricky thing is how we can fix the indent. There are other many other edge cases to handle. Waldler's paper doesn't mention anything about range formatting, but Prettier implements it. My idea is to find the top level statements overlapping the range in a formatting request and then format the statements, which should work but it's not optimal. I'm sure prettier does something more advanced, and I'll take a look at their source code to figure it out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah for that specific scenario, committing the completion would reformat stuff around it, which would be kind of weird. I haven't seen how auto formatting normally works in VS Code. In VS in C# auto formatting only triggers when you type ; or close a block - they don't seem to do it when you commit a completion. I'm mostly worried about user experience and less worried about efficiency. Although if we got both, that works too.

{
if (programSyntax.GetParseDiagnostics().Count > 0)
{
return null;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

null [](start = 23, length = 4)

What would it take to support formatting when there are syntax errors?

Copy link
Contributor Author

@shenglol shenglol Nov 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Technically the formatter will just work when there are syntax errors. However, the output is not predictable at all. Even a missing } could mess up a big chunk:

param p0 object = { a: true
param p1 string
param p2 string
param p3 string
param p4 string

var foo = {
  b: false
}

will be formatted as

param p0 object = {a: true
  paramp1string 
  paramp2string 
  paramp3string 
  paramp4string 

  varfoo={ 
  b: false
}

I think this one of the reason why formatters like Prettier and Black don't run formatting when there are lexical and syntactic errors. Because there are infinite number of ways to break a syntax tree, finetuning the code formatter to make sure it doesn't produce something that will surprise users requires a lot of efforts.

That being said, since Bicep is not a generic programing language and is declarative, I do think there are certain things we can do to make it work. For example, if a declaration has syntax errors, skip formatting that declaration but keep formatting the other nodes. I'll do some experiments and send a separate PR if it goes well.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sounds good.

Copy link
Member

@anthony-c-martin anthony-c-martin Nov 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced this is a problem we need to solve, especially if the perceived user experience is unpredictable (and I don't see how it wouldn't be - your bracket example demonstrates that perfectly). If there's a way to simply fail with an error if there are parse errors, that would be my preferred option.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I disagree. I'd say sometimes doing nothing in the format command would be even worse than partially formatting a file. (I don't want the user to ever wonder why nothing happened when they clicked the format option.)

I've also never had a format command fail in an IDE with an error. Just tried in JSON, ts in VS code and they do format even with syntax errors. Same for c# in VS.

That said, they do produce weird formatting in some cases when there are syntax errors, so we don't need to be perfect here. If the user hates the result they can always undo.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll have to agree to disagree then :)

I disagree. I'd say sometimes doing nothing in the format command would be even worse than partially formatting a file. (I don't want the user to ever wonder why nothing happened when they clicked the format option.)

100% agree we should not fail without informing the user - I'm making the assumption that we have the ability to surface an error.

I've also never had a format command fail in an IDE with an error. Just tried in JSON, ts in VS code and they do format even with syntax errors. Same for c# in VS.

Maybe it's just me then, but that behavior seems totally undesirable...

That said, they do produce weird formatting in some cases when there are syntax errors, so we don't need to be perfect here. If the user hates the result they can always undo.

Technically true, but there are going to be cases where it's not immediately obvious what has been changed - if you're working with larger files, without source control etc. Reformatting when we don't have matching brackets has the potential to introduce some very subtle changes that could go unnoticed for a while.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We'll have to agree to disagree then :)

I disagree. I'd say sometimes doing nothing in the format command would be even worse than partially formatting a file. (I don't want the user to ever wonder why nothing happened when they clicked the format option.)

100% agree we should not fail without informing the user - I'm making the assumption that we have the ability to surface an error.

👍 LSP protocol allows error response theoretically, but I don't know if OmniSharp can handle it. Even if we surfaced an error, VS code extensions seem to have trained users not to expect it at all, so they might miss it regardless.

I've also never had a format command fail in an IDE with an error. Just tried in JSON, ts in VS code and they do format even with syntax errors. Same for c# in VS.

Maybe it's just me then, but that behavior seems totally undesirable...

I agree with you that it's not ideal, but it seems like we'd be the only ones doing something about it. Being all exceptional and such is very tempting of course, but we had a similar argument when we discussed hovers (making them kind of like peek definition) and chose consistency with other VS code extensions there. Why do that there and this here?

That said, they do produce weird formatting in some cases when there are syntax errors, so we don't need to be perfect here. If the user hates the result they can always undo.

Technically true, but there are going to be cases where it's not immediately obvious what has been changed - if you're working with larger files, without source control etc. Reformatting when we don't have matching brackets has the potential to introduce some very subtle changes that could go unnoticed for a while.

This is on the user to verify the changes made by tools. If they are using source control, it's easy to do a diff.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@shenglol I don't think this discussion should block the PR btw. We can do another PR once we close on it.

Copy link
Contributor Author

@shenglol shenglol Nov 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@majastrz Gotcha. I can merge the PR then once you and @anthony-c-martin approve.

Regarding formatting with syntax errors, I think I agree with @anthony-c-martin on the point that it is undesirable. Although there are cases it may work (like missing a comma in a JSON file), I find it confusing and even misleading a lot of times. Taking the VSCode JSON formatter as an example:

code-formatting

The last formatting request with the missing [ even changed the syntax error.

I'm trying to understand the use case where users would trigger formatting when there are syntax errors. One scenario might be that, say, I want to format a file, but I don't notice that I mistyped or deleted something. I cannot think of any good reasons why a user would format a file by intention when there are syntax errors. Usually when people notice syntax errors they will try to fix them immediately.

I think I'm still open to enable formatting broken syntax trees, but IMHO, we should do that only if it doesn't change the original syntax errors.

Copy link
Member

@majastrz majastrz Nov 5, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Btw, I'm with both of you that it's kind of weird and potentially problematic to do it, but it seems like everyone else is doing it anyway. So if we do something different, we become the weird one.

One case I can think of is formatting a file that's a work in progress. Your idea of skipping declarations containing syntax errors when formatting would take a care of it without causing weirdness in the unfinished parts of the file.

@anthony-c-martin
Copy link
Member

anthony-c-martin commented Nov 4, 2020

This is a bit of a vague question, but do you have any ideas for how we could add this in to our standard set of tests we run on sample templates? It would be nice to have (for example) a test breaking if we add new syntax and forget to make a change to the formatter for it.

Perhaps something similar to the syntax round trip tests, where we could enforce that all samples are 'correctly' formatted?

@anthony-c-martin
Copy link
Member

Should we have a CLI command for this? Can be a separate PR of course, and the answer might be "no" :)

@majastrz
Copy link
Member

majastrz commented Nov 4, 2020

Should we have a CLI command for this? Can be a separate PR of course, and the answer might be "no" :)

I think we should. bicep format <file> would format in-place. Makes sense to have a separate PR. Ideally, we'd able to format bicep files with lexical and syntax errors then. Otherwise, it would be possible for the command to fail/do nothing.

@shenglol
Copy link
Contributor Author

shenglol commented Nov 5, 2020

This is a bit of a vague question, but do you have any ideas for how we could add this in to our standard set of tests we run on sample templates? It would be nice to have (for example) a test breaking if we add new syntax and forget to make a change to the formatter for it.

Perhaps something similar to the syntax round trip tests, where we could enforce that all samples are 'correctly' formatted?

Yeah, this is definitely something I missed but should get done. I like the idea of syntax round trip validation, and I've created a integration test to implement it. I've also added a test that emits main.formatted.bicep files like what other integration tests do.

@shenglol
Copy link
Contributor Author

shenglol commented Nov 5, 2020

Should we have a CLI command for this? Can be a separate PR of course, and the answer might be "no" :)

I think we should. bicep format <file> would format in-place. Makes sense to have a separate PR. Ideally, we'd able to format bicep files with lexical and syntax errors then. Otherwise, it would be possible for the command to fail/do nothing.

Agreed we should have a CLI command. Although I wonder if we should let the command fail / do nothing instead of formatting files with errors, because users cannot undo with CLI.

@majastrz
Copy link
Member

majastrz commented Nov 5, 2020

Should we have a CLI command for this? Can be a separate PR of course, and the answer might be "no" :)

I think we should. bicep format <file> would format in-place. Makes sense to have a separate PR. Ideally, we'd able to format bicep files with lexical and syntax errors then. Otherwise, it would be possible for the command to fail/do nothing.

Agreed we should have a CLI command. Although I wonder if we should let the command fail / do nothing instead of formatting files with errors, because users cannot undo with CLI.

Yeah the lack of undo is a good pt. We could rely on source control although that's not guaranteed.

In any case format CLI command can probably wait for later milestones. For now, the vs code command should be sufficiently awesome.

@codecov-io
Copy link

codecov-io commented Nov 5, 2020

Codecov Report

Merging #823 into main will increase coverage by 1.09%.
The diff coverage is 99.50%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #823      +/-   ##
==========================================
+ Coverage   94.19%   95.28%   +1.09%     
==========================================
  Files         285      294       +9     
  Lines       12076    12949     +873     
  Branches       12        0      -12     
==========================================
+ Hits        11375    12339     +964     
+ Misses        701      610      -91     
Flag Coverage Δ
dotnet 95.28% <99.50%> (+0.36%) ⬆️
typescript ?

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
src/Bicep.Core.Samples/DataSet.cs 97.61% <50.00%> (-2.39%) ⬇️
...gServer/Handlers/BicepDocumentFormattingHandler.cs 88.00% <88.00%> (ø)
src/Bicep.Core/PrettyPrint/DocumentBuildVisitor.cs 99.57% <99.57%> (ø)
...IntegrationTests/PrettyPrint/PrettyPrinterTests.cs 100.00% <100.00%> (ø)
src/Bicep.Core.IntegrationTests/PrintVisitor.cs 100.00% <100.00%> (ø)
src/Bicep.Core.Samples/DataSets.cs 100.00% <100.00%> (ø)
...e.UnitTests/PrettyPrint/PrettyPrintOptionsTests.cs 100.00% <100.00%> (ø)
...p.Core.UnitTests/PrettyPrint/PrettyPrinterTests.cs 100.00% <100.00%> (ø)
...c/Bicep.Core/Extensions/StringBuilderExtensions.cs 100.00% <100.00%> (ø)
src/Bicep.Core/Parser/StringUtils.cs 100.00% <100.00%> (ø)
... and 27 more

Copy link
Member

@anthony-c-martin anthony-c-martin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great!

Copy link
Member

@majastrz majastrz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is amazing!

@shenglol shenglol merged commit a89640a into main Nov 5, 2020
@shenglol shenglol deleted the shenglol/pretty-printer branch November 5, 2020 22:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support code formatting
6 participants