Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve incremental build in presence of globs #1328

Merged
merged 3 commits into from Nov 16, 2016

Conversation

AndyGerlicher
Copy link
Contributor

First iteration, looking for feedback. Filename, where I ended up putting it (Common targets vs Roslyn Core targets, etc.).

  • Fix issue with incremental build with globs.
    Add a target that runs before CoreCompile that will record and store the state of the Compile ItemGroup. This file is added to the CustomAdditionalCompileInputs which is an input to CoreCompile. When this file changes, CoreCompile will run again. This fixes the issue of deleting a file or adding a file with an old timestamp to a glob (e.g. *.cs).

  • Add WriteOnlyWhenDifferent to WriteLinesToFile
    Add option to not write the file when the file would not have changed (preserves the timestamp).

The original prototype wrote the file after compile and then subsequent runs could read that before compile and compare the file to the Compile ItemGroup. I went with this approach as it was far simpler and hopefully more elegant, but required a change to WriteLinesToFile to preserve the time stamp when the file doesn't change.

============================================================
-->
<Target Name="_GenerateCompilerDependencyFile" BeforeTargets="CoreCompile">
<WriteLinesToFile Lines="@(Compile->'%(FullPath)')" File="$(IntermediateOutputPath)Compile.deps" Overwrite="True" WriteOnlyWhenDifferent="True" />
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we're at this, in addition to Compile I wonder if it would be worth writing out all the properties / items that are not part of the CoreCompile Inputs but sent to CSC as args.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely also references (via ReferencePath and DependsOnTargets="ResolveAssemblyReferences", probably?). Other properties and items should be caught by the Inputs="$(MSBuildAllProjects)" in Microsoft.CSharp.Core.targets. That would be more reliable with #1299 (as is, it might not catch some user-authored files) but is . . . ok now.

Copy link
Member

@rainersigwald rainersigwald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was preparing to be high and mighty about how this belongs in the core targets in the Roslyn repo, but this looks like a pretty good approach, and fairly general (it doesn't apply to C++ which uses a different item name, but should work for C# and VB both, with no changes on their end).

Would definitely like to run this by @jaredpar and/or @agocke though.

WriteOnlyWhenDifferent = true,
Lines = new ITaskItem[] {new TaskItem("File contents1")}
};
Assert.True(a2.Execute());
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I hate to say this but you need to sleep before running the second task. On HFS+ (and evidently some other *nix systems), timestamps have only a 1-second granularity, so this could rewrite the file but the checks below wouldn't detect it.

============================================================
-->
<Target Name="_GenerateCompilerDependencyFile" BeforeTargets="CoreCompile">
<WriteLinesToFile Lines="@(Compile->'%(FullPath)')" File="$(IntermediateOutputPath)Compile.deps" Overwrite="True" WriteOnlyWhenDifferent="True" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Definitely also references (via ReferencePath and DependsOnTargets="ResolveAssemblyReferences", probably?). Other properties and items should be caught by the Inputs="$(MSBuildAllProjects)" in Microsoft.CSharp.Core.targets. That would be more reliable with #1299 (as is, it might not catch some user-authored files) but is . . . ok now.

============================================================
-->
<Target Name="_GenerateCompilerDependencyFile" BeforeTargets="CoreCompile">
<WriteLinesToFile Lines="@(Compile->'%(FullPath)')" File="$(IntermediateOutputPath)Compile.deps" Overwrite="True" WriteOnlyWhenDifferent="True" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like .deps, especially since there's already the .deps.json files for .NET Core. Maybe .CompilerInputs?

Copy link
Contributor

@cdmihai cdmihai Nov 9, 2016

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

corecompile.cache? makes it clear what build entity it's for

@agocke
Copy link
Member

agocke commented Nov 9, 2016

Right now we don't have any logic around incremental build in the compiler and that feels appropriate.

Basically, the compiler itself just executes build commands from MSBuild -- so it feels like whether or not to issue the command is the responsibility of the MSBuild core tasks + targets, while figuring out how to execute the build is the responsibility of the Roslyn tasks + targets.

@rainersigwald
Copy link
Member

@agocke Agreed, that totally meshes with the MSBuild model. But do you have any objection to this approach of writing references + source files to a file and using that as an additional input, so that we can catch the "compile *.cs" + "delete something.cs" + build (and actually re-invoke the compiler)?

@agocke
Copy link
Member

agocke commented Nov 9, 2016

@rainersigwald I think the general strategy is OK, but you'll need much more than just the references + source files to guarantee deterministic recompilation. @jaredpar Has the list of inputs for the compiler that need to be considered.

Regarding specifically adding files, have you considered looking at the last-modified timestamp of the directories in the glob? That may be simpler if you only care about adding/removing source files.

@rainersigwald
Copy link
Member

Yeah, this is not a complete did-anything-change implementation, but it addresses a problem that will be more prevalent with the new dotnet SDK templates' use of wildcarding (because you can delete a file without changing the project file if it's not individually listed). The existing did-a-project-file-change check is the main "miscellaneous changes" catch.

@@ -1177,6 +1177,7 @@ public partial class WriteLinesToFile : Microsoft.Build.Tasks.TaskExtension
public Microsoft.Build.Framework.ITaskItem File { get { throw null; } set { } }
public Microsoft.Build.Framework.ITaskItem[] Lines { get { throw null; } set { } }
public bool Overwrite { get { throw null; } set { } }
public bool WriteOnlyWhenDifferent { [System.Runtime.CompilerServices.CompilerGeneratedAttribute]get { throw null; } [System.Runtime.CompilerServices.CompilerGeneratedAttribute]set { } }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why have the CompilerGeneratedAttribute here? The use doesn't seem to fit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is our ref assembly, auto-generated.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If it's auto-generated then why do only half have the attribute?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The property I added was an auto-property, and that crazy compiler we use added the CompilerGeneratedAttribute to the IL. The ref assembly code is just a dump of that, and the other properties all have backing fields so they don't have it.

@@ -23,6 +17,7 @@ public class WriteLinesToFile : TaskExtension
private ITaskItem[] _lines = null;
private bool _overwrite = false;
private string _encoding = null;
private static Encoding s_defaultEncoding = new UTF8Encoding(false, true);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommend using named arguments here. On a casual read no idea what false and true mean here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommend using `readonly here as well.

@@ -83,12 +85,12 @@ public override bool Execute()
}
}

Encoding encode = null;
Encoding encoding = s_defaultEncoding;
if (_encoding != null)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Making the s_defaultEncoding field with readonly allows you to avoid this logic.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is if the user specified encoding (_encoding string set). I set it to the default so we could always safely use it rather than checking the parsed value for null because it failed to convert the string to an Encoding object.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm ... then my follow up suggestion would be to rename _encoding or encoding. Makes it easy to mis-read this code 😄

{
System.IO.File.WriteAllText(File.ItemSpec, buffer.ToString());
var existingContents = System.IO.File.ReadAllText(File.ItemSpec);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens when ReadAllText throws? The use of File.Exists above doesn't give an guarantees this will succeed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a low priority message for this case. No reason to fail the whole task if this happens.

string contentsAsString = null;

// When WriteOnlyWhenDifferent is set, read the file and if they're the same return.
if (WriteOnlyWhenDifferent && System.IO.File.Exists(File.ItemSpec))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Recommend avoiding Exists here. Just try and read the file and deal with errors if they happen.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ReadAllText throws when the file doesn't exist. Since this is a pretty normal case (in the context I'm using it every clean build the file won't be there) I'd rather check first to avoid the allocation/read entirely.

var existingContents = System.IO.File.ReadAllText(File.ItemSpec);
if (existingContents.Length == buffer.Length)
{
contentsAsString = buffer.ToString();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Isn't this going to be causing a lot more allocations to occur in MSBuild as a result? You're basically bringing all these files into memory as a contiguous string object.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is true and it could be optimized. We ended up going with just the hash so the effect here is very minimal and not worth optimizing. I will file an issue to improve it.

System.Diagnostics.Debugger.Break();

// Hash should be valid for empty item
var emptyItemHash = ExecuteHashTask(new ITaskItem[] {new TaskItem("")});
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There could be another test for an empty array: ExecuteHashTask(new ITaskItem[0]);

@AndyGerlicher AndyGerlicher force-pushed the cscincrementalglob branch 4 times, most recently from c439ebe to fa39f89 Compare November 14, 2016 20:16
@AndyGerlicher
Copy link
Contributor Author

@dotnet-bot test Windows_NT Build for Desktop please

Copy link
Member

@rainersigwald rainersigwald left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good overall to me.

contentsAsString = buffer.ToString();
if (existingContents.Equals(contentsAsString))
{
return true;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is probably worth an info message.

{
/// <summary>
/// Generates a hash of a given ItemGroup items. Metadata is not considered in the hash.
/// <remarks>Currently uses SHA1. Implementation subject to change between MSBuild versions.</remarks>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe some sort of "Not intended as a cryptographic security measure" type caveat?

/// <summary>
/// Execute the task.
/// </summary>
/// <returns></returns>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: remove empty <returns>


using (var sha1 = SHA1.Create())
{
var hash = sha1.ComputeHash(Encoding.UTF8.GetBytes(hashInput.ToString()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish we didn't have to materialize the full string but I don't see a better way.

@@ -400,6 +400,7 @@
<Compile Include="FindAppConfigFile.cs">
<ExcludeFromStyleCop>true</ExcludeFromStyleCop>
</Compile>
<Compile Include="Hash.cs" />
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

big nit: alphabetize?

@@ -3166,6 +3167,31 @@ Copyright (C) Microsoft Corporation. All rights reserved.

<!--
============================================================
_GenerateCompilerDependencyCache
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd name this GenerateCompileDependencyCache (no r), because it's around the Compile target. But I could go a different way.

@@ -3010,6 +3010,7 @@ Copyright (C) Microsoft Corporation. All rights reserved.
_GenerateCompileInputs;
BeforeCompile;
_TimeStampBeforeCompile;
_GenerateCompilerDependencyCache;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why add this here instead of doing a BeforeTargets="CoreCompile"? Just for relative ordering?

@@ -2085,6 +2085,10 @@
<value>MSB3491: Could not write lines to file "{0}". {1}</value>
<comment>{StrBegin="MSB3491: "}</comment>
</data>
<data name="WriteLinesToFile.ErrorReadingFile">
<value>MSB3492: Could not read existing file "{0}" to compare contents against. Will attempt to overwrite only.</value>
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could not read existing file "{0}" to determine whether its contents are up to date. Overwriting it. ?

Add option to not write the file when the file would not have changed
(preserves the timestamp).
Add a task to perform a quick, in-memory, hash of a set of items..
Add a target that runs before CoreCompile that will record and store the
state of the Compile ItemGroup. This file is added to the
CustomAdditionalCompileInputs which is an input to CoreCompile. When this
file changes, CoreCompile will run again. This fixes the issue of deleting
a file or adding a file with an old timestamp to a glob (e.g. *.cs).

Fixes dotnet#1327
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants