Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix issue with indentation of code/markup following directive blocks ... #6531

Merged
merged 6 commits into from
Jul 13, 2022

Conversation

tuespetre
Copy link
Contributor

@tuespetre tuespetre commented Jun 23, 2022

… related to #4358

Summary of the changes

added a test that demonstrates the problem scenario:

@section Foo {
    <p></p>
}
<p></p>
@section Bar {
    @{
        var test = 1;
    }
}
<p></p>

gets incorrectly formatted like

@section Foo {
    <p></p>
}
    <p></p>
    @section Bar {
    @{
        var test = 1;
    }
    }
    <p></p>
  • added a block of code that fixes the problem
  • removed a previously added block of code that only partially resolved the issue

@davidwengier
Copy link
Contributor

davidwengier commented Jun 23, 2022

Looks like you need to merge from main perhaps, lots of unrelated changes it seems. I think GitHub wasn't showing me the results of your force push. Looks good now.

Also could you update the description, or log a new issue that this PR can "fix", to demonstrate what the formatting engine used to do in this situation? Thanks!

{
var absoluteIndex = sourceMappingIndentationScopes[index];
var scopeOwner = context.CodeDocument.GetSyntaxTree().Root.LocateOwner(new SourceChange(absoluteIndex, 0, string.Empty));
var containingDirective = scopeOwner.FirstAncestorOrSelf<RazorDirectiveSyntax>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this could be narrowed to check for @section specifically, as that is the only direction (I'm pretty sure?) that will exhibit this issue, due to how the code is generated. Which, if I'm right, also means that we can probably get rid of the FirstAncestorOrSelf which would be a big perf win. LocateOwner should be deterministic, so we can check parents etc. to see where we are.

I think that will be especially important because right now, this will look the whole way up the tree, for every iteration.

Copy link
Contributor Author

@tuespetre tuespetre Jun 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

today, the located owner in such a case is an UnclassifiedTextLiteral, with the following lineage:

UnclassifiedTextLiteral > CSharpCodeBlock > RazorDirectiveBody > RazorDirective

so by the time we are able to check if it is from a section, we already have the knowledge that it is from a directive. i could change this from scopeOwner.FirstAncestorOrSelf<RazorDirectiveSyntax>() to scopeOwner.Parent.Parent?.Parent as RazorDirectiveSyntax instead?

if (index < 0)
{
// Couldn't find the exact value. Find the index of the element to the left of the searched value.
index = (~index) - 1;

// If that index comes from within a directive that does not contain the line we are working with,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love this approach! Nice work

@tuespetre
Copy link
Contributor Author

tuespetre commented Jun 23, 2022

changes made since initial review by @davidwengier:

  • moved context.CodeDocument.GetSyntaxTree().Root outside of all loops
  • target the expected structure of the section directive specifically to save overhead of calling FirstAncestorOrSelf
  • added more thorough description to the body of the PR

@tuespetre
Copy link
Contributor Author

i thought of a scenario that breaks the 'just look at parents' approach: c# code within a section creates new indentation scopes, and when they are picked they may not match the simple 3-nodes-up configuration. so something like FirstAncestorOrSelf must be used. updated the new test to cover the scenario.

@davidwengier
Copy link
Contributor

Correct me if I'm wrong, but given a situation like below:

@section Goo {
}

<p></p>
<p></p>
<p></p>
<p></p>
<p></p>

For each line with a P tag, above, we'll look all the way up through the whole document, looking all the way up through the tree every time, right? This is where my perf concerns come in.

So even if we could rely on a predictable tree structure, it's still not a great experience, and means formatting is going to be, for some documents, an exponentially slower process as the document grows.

@tuespetre
Copy link
Contributor Author

tuespetre commented Jun 24, 2022

first, a big LOL at @section Goo

i know, it's not great... but improper formatting is a greater evil, imo. trying to think of a better means of determining the indentation. perhaps the relationship between the indentation scopes and their containing directives should be computed before this whole loop, and then during the loop some quicker checks could be made to determine when we are past a directive and remove its scopes from the list of candidates to look back towards. i will try it out and see if it works.

@davidwengier
Copy link
Contributor

but improper formatting is a greater evil, imo

This is true in an ideal world, but unfortunately falls down in reality. It's one of the most frustrating things about working on formatting, and developer tooling in general, and I'm sorry you're about to discover it :)

Let's pretend a user has a 1000 file razor file, and they currently are happy with what the formatter does, because they don't use @section. If we were to improve formatting for @section (which this PR definitely does), but also make that format operation take 5 seconds (which, to be clear, this PR probably doesn't do), then what now happens in reality is that the user formats their document, waits a second or two, but sees nothing happening, so they think its a bug, and they keep working. The editor meanwhile gets the formatting results back from Razor, but sees that the document state has changed, and so has no choice but to abandon the formatting edits that Razor produced. Formatting is now producing better results, but the users experience is notably worse.

It's a tricky balancing act, and we have to consider all sorts of things, and unfortunately in past decisions, @section blocks have been somewhat disadvantaged, because their usage is perhaps not as high as other features, or because most users put them at the end of the document, therefore avoiding these issues etc. If you're a user who is affected, this PR is great, but if you're the user from my above scenario, this PR is bad.

perhaps the relationship between the indentation scopes and their containing directives should be computed before this whole loop

I think this is definitely on the right track, and I like how you've avoided doing the tree traversal as often. A couple of things come to mind though, that might be worth trying. I've commented in the PR itself.

@@ -118,7 +119,9 @@ protected async Task<List<TextChange>> AdjustIndentationAsync(FormattingContext
continue;
}

sourceMappingIndentations[originalLocation] = indentation;
var scopeOwner = syntaxTreeRoot.LocateOwner(new SourceChange(originalLocation, 0, string.Empty));
var containingDirective = scopeOwner?.FirstAncestorOrSelf<RazorDirectiveSyntax>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps its possible to remember the last seen containingDirective, and only try to locate the owner, and traverse the tree if we've passed it? Or does that break for C# nested inside sections?

@@ -108,7 +108,8 @@ protected async Task<List<TextChange>> AdjustIndentationAsync(FormattingContext
var significantLocationIndentation = await CSharpFormatter.GetCSharpIndentationAsync(context, significantLocations, cancellationToken);

// Build source mapping indentation scopes.
var sourceMappingIndentations = new SortedDictionary<int, int>();
var sourceMappingIndentations = new SortedDictionary<int, (int cSharpDesiredIndentation, RazorDirectiveSyntax? containingDirective)>();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We only use the containing directive to get its end position, and when building this dictionary, we are already aware of positions anyway, so I wonder if its possible to simply artificially add a new entry to the dictionary, when we get to the end of the directive. ie, instead of the dictionary being something like:

[1, (12, null)] // naturally occuring source mapping
[10, (16, directive)] // naturally occuring source mapping, for a directive that has an end location in position 18
[18, (12, null)] // artificially inserted source mapping indentation point, reverting indentation back to what it was before the directive
[24, (12, null)] // next naturally occuring source mapping

Obviously after this is done, we wouldn't need the tuple at all, and in fact the rest of the code in this method could be left as is. All of which is to say, can we build a smarter map of indentation scopes, and process them simply, rather than trying to be smart about how we process a map that doesn't have enough information in it.

@@ -108,7 +108,8 @@ protected async Task<List<TextChange>> AdjustIndentationAsync(FormattingContext
var significantLocationIndentation = await CSharpFormatter.GetCSharpIndentationAsync(context, significantLocations, cancellationToken);

// Build source mapping indentation scopes.
var sourceMappingIndentations = new SortedDictionary<int, int>();
var sourceMappingIndentations = new SortedDictionary<int, (int cSharpDesiredIndentation, RazorDirectiveSyntax? containingDirective)>();
var syntaxTreeRoot = context.CodeDocument.GetSyntaxTree().Root;
foreach (var originalLocation in sourceMappingMap.Keys)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to mention, the really exciting thing to do here would be to artificially insert something in the sourceMappingMap, so that we create a source mapping point at the end of the @section to get things back on track. I think that would require compiler changes though, and so haven't looked too deeply into it, because we're currently working through plans to bring the compiler into this repo, which would make the work easy. If you wanted to try that approach though, I think it would be the ideal solution.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i agree, this would be ideal.

Copy link
Contributor

@davidwengier davidwengier Jul 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@tuespetre I'm not sure what your plans were for this PR, but this idea was nagging at me, so I took the liberty of pushing a commit to your branch that does something very similar. Let me know what you think.

It adds a source mapping record to our map, to represent the end of the section block, so that if that is ever found to be the correct mapping point to use, we end up using the indentation from just before the start of the @section.

I didn't touch the test you added, so assuming it covered the extra cases you were talking about, I think this is a nice balance between your awesome idea, to not get hampered by bad source mappings of section blocks, but with better performance characteristics than always traversing the tree would give.

…blocks

This avoids the need to look all the way up the syntax tree to find directives that we might be
contained within, just in case. The tree navigation is converted to a specific 4-parent lookup
that only matches section directives.
@davidwengier
Copy link
Contributor

Ping @NTaylorMullen @ryanbrandenburg @allisonchou for review.

@@ -207,6 +212,30 @@ protected async Task<List<TextChange>> AdjustIndentationAsync(FormattingContext
var absoluteIndex = sourceMappingIndentationScopes[index];
csharpDesiredIndentation = sourceMappingIndentations[absoluteIndex];

// If the indentation is negative, then its out sign that we are at the end of a @section block
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we only do negative indentation in this one scenario? Seeing negative indentation and then immediately assuming "must be a directive" is a bit scary to me. Only reason is that even if it's true that we only do it in one place today if / when we make changes to this in the future will we get into a false-positive situation?

I don't have enough background here so apologize if this is a silly question but if we're using negative numbers to indicate "need to do more" why are we unable to pre-emptively do the work?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we only do negative indentation in this one scenario?

Yes, this is new. If you think it's risky, we can probably change this to store a record that is a bit more expressive, but I verified under the debugger that none of our existing tests fall into this branch (of having a negative indentation)

if we're using negative numbers to indicate "need to do more" why are we unable to pre-emptively do the work?

We're using negative numbers to avoid pre-emptive work, because pre-emptive work is harder/slower. The issue is that when we are at the end of a directive, we want to find the indentation that was correct at the point before the start of the directive. That is painful for two reasons: Firstly, source mappings could be out of order (which is why we store them in a SortedDictionary when building up the info), so we might not even have the before info yet, and secondly, because we use that SortedDictionary, we can't just arbitrarily ask for "the one before this one", we have to go searching for it. I tried to maintain a running list as we build the map to short cut this, but it was very difficult (eg SortedList is not a list, its more like a dictionary 🤦‍♂️)

We could do this as a post processing step for the map, to fill in these negative values, but I decided to take it a step further and essentially make that process lazy, because if we never actually need the indentation after the directive, then we never do the work to calculate it. That is what this is.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the clarification! Ya unfortunately I think having these negative values is definitely risky. It'll also add a few bits of "tribal knowledge" into the data storage type which scares me from a debugging perspective. I'd much rather store a more expressive record to ensure readability / understandability stay intact if that's possible

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to use an IndentationData class which is capable of lazy-loading its indentation info.

Copy link
Contributor

@NTaylorMullen NTaylorMullen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the follow up! Tribal knowledge reduced ❤️

@davidwengier davidwengier merged commit 6cc43d8 into dotnet:main Jul 13, 2022
@davidwengier
Copy link
Contributor

Thank you @tuespetre for kicking this all off, and the awesome contribution!

@tuespetre
Copy link
Contributor Author

thanks to you, crew 🙌

NTaylorMullen added a commit that referenced this pull request Jul 14, 2022
- Originally found in this Integration test CI failure: https://devdiv.visualstudio.com/DevDiv/_build/results?buildId=6414782&view=results
- This ended up revealing a few issues:
    1. Prior to this changeset formatting tests would wait for folding ranges to be available before attempting to format anything. In the above CI failure those checks timed out and the screenshots included also showed that no folding ranges were available. This indicates there might be a secondary issue where folding ranges ar ea little flakey. I've filed [this](#6592) to track re-enabling code folding tests and moved our existing formatting tests off of folding ranges being available to waiting for semantic classifications
    2. With the introduction of [this](#6531) change we started accurately formatting content after directive bodies. Updated our baselines for formatting tests to capture this new requirement.
- As part of this work I also found that the existing `WaitForClassificationAsync` was a little misleading because it defaulted to waiting for Razor component classifications. I've gone ahead and renamed the existing method to `WaitForComponentClassificationAsync` to be extra clear what's being waited for. As an extra reaction I added a general `WaitForSemanticClassificationAsync` that's now used in Razor formatting tests so we can wait for semantic Razor transitions.
NTaylorMullen added a commit that referenced this pull request Jul 14, 2022
- Originally found in this Integration test CI failure: https://devdiv.visualstudio.com/DevDiv/_build/results?buildId=6414782&view=results
- This ended up revealing a few issues:
    1. Prior to this changeset formatting tests would wait for folding ranges to be available before attempting to format anything. In the above CI failure those checks timed out and the screenshots included also showed that no folding ranges were available. This indicates there might be a secondary issue where folding ranges ar ea little flakey. I've filed [this](#6592) to track re-enabling code folding tests and moved our existing formatting tests off of folding ranges being available to waiting for semantic classifications
    2. With the introduction of [this](#6531) change we started accurately formatting content after directive bodies. Updated our baselines for formatting tests to capture this new requirement.
- As part of this work I also found that the existing `WaitForClassificationAsync` was a little misleading because it defaulted to waiting for Razor component classifications. I've gone ahead and renamed the existing method to `WaitForComponentClassificationAsync` to be extra clear what's being waited for. As an extra reaction I added a general `WaitForSemanticClassificationAsync` that's now used in Razor formatting tests so we can wait for semantic Razor transitions.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants