Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug in TryApplyChanges causes it to rewrite all line endings #934

Closed
jaredpar opened this issue Feb 27, 2015 · 8 comments · Fixed by #1024
Closed

Bug in TryApplyChanges causes it to rewrite all line endings #934

jaredpar opened this issue Feb 27, 2015 · 8 comments · Fixed by #1024

Comments

@jaredpar
Copy link
Member

The bottom of this issue has a complete code snippet that demonstrates the problem. It's as minimal as I could make it but in summary it does the following:

  • Opens up the CodeAnalysis project.
  • Defines a symbol that isn't used anywhere in the code in the parse options.
  • Runs the formatter on every file.
  • Undefines the symbol.
  • Applies the changes.

The apply changes portion will cause every document in the project to be rewritten with different line endings. This happens even if the formatter operation is a complete no-op.

The problem appears to be the changing and resetting of the parse options. If the same sample is run without changing the preprocessor symbols (comment out the line labeled CRITICAL LINE) then the sample behaves as expected.

This bug came up as a result of running the code formatter on Roslyn. It caused Git to believe every file had a binary difference.

var projectPath = @"e:\dd\ros\Open\src\Compilers\Core\Portable\CodeAnalysis.csproj";
var workspace = MSBuildWorkspace.Create();
workspace.LoadMetadataForReferencedProjects = true;
var project = await workspace.OpenProjectAsync(projectPath, CancellationToken.None);
var originalOptions = (CSharpParseOptions)project.ParseOptions;
var originalSymbols = originalOptions.PreprocessorSymbolNames;
var newSymbols = originalSymbols.Concat(new[] { "TEST_SYMBOL" });

// CRITICAL LINE 
project = project.WithParseOptions(originalOptions.WithPreprocessorSymbols(newSymbols));

foreach (var doc in project.Documents)
{
    var syntaxNode = await doc.GetSyntaxRootAsync(CancellationToken.None);
    if (syntaxNode == null)
    {
        continue;
    }

    var newNode = Formatter.Format(syntaxNode, workspace);
    project = doc.WithSyntaxRoot(newNode).Project;
}

project = project.WithParseOptions(originalOptions);
if (workspace.TryApplyChanges(project.Solution))
{
    Console.WriteLine("Succeeded");
}
@petr-k
Copy link
Contributor

petr-k commented Mar 3, 2015

Could not reproduce on the latest version, is there anything special about your setup? How exactly do the line endings change in your case in terms of CRLF/CR/LF?

@jaredpar
Copy link
Member Author

jaredpar commented Mar 3, 2015

@petr-k

The "line endings" portion is just the best message I could get out of git. The most obvious symptom is a good portion of files, in particular ".designer.cs", started showing up as binary diffs.

This is reproducible across a couple of machines using the latest NuGet drop. What version did you use?

@mattwar
Copy link
Contributor

mattwar commented Mar 3, 2015

Its probably the encoding that changed.

@petr-k
Copy link
Contributor

petr-k commented Mar 3, 2015

Yes, it changes the encoding from UTF-8 to UTF-16 (little endian). I will look into it, if that's okay?

@Pilchie
Copy link
Member

Pilchie commented Mar 3, 2015

Sure! I've looked at some encoding issues before, but there must still be some missed cases.

@mattwar
Copy link
Contributor

mattwar commented Mar 3, 2015

Changing parse options should trigger all documents to lazy reparse, which will make it look to TryApplyChanges that all documents changed, which they probably did anyway. The problem probably lies in the logic that queues up the lazy reparse, not carrying forward the correct encoding.

@Pilchie Pilchie added this to the 1.0 (stable) milestone Mar 3, 2015
@petr-k
Copy link
Contributor

petr-k commented Mar 3, 2015

It does occur in the latest NuGet drop, but when I try it against the sources in the current master (at bbdf786) and step through, the issue is gone.

@mattwar
Copy link
Contributor

mattwar commented Mar 4, 2015

This seems to be fixed a while ago by @Pilchie, but we should still be better at remembering the correct encoding.

The general problem is that you can possibly call Document.WithSyntaxRoot(newRoot) before ever accessing the text or tree (thus never loading the text from disk and discovering the encoding). The solution was to use an encoding of null whenever Document.WithSyntaxRoot is called, and push the problem to the workspace when the text is written out. The workspace should then notice the null encoding and attempt to determine it from an existing file at that time or default it to UTF8.

That works fine, but it is a rare edge case to call Document.WithSyntaxRoot without first accessing the text or tree. It is possible to keep track of the correct encoding during normal usage of Document.WithSyntaxRoot() which is the common pattern for updating the solution & workspace.

I'll submit a pull request to improve this soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants