Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added Include/Exclude filtering capability to Unzip Task (#5169) #6018

Merged
merged 19 commits into from Feb 6, 2021
Merged
Show file tree
Hide file tree
Changes from 7 commits
Commits
Show all changes
19 commits
Select commit Hold shift + click to select a range
9987940
Added Include/Exclude RegEx filtering capability to Unzip Task
sc-ivanlieckens Jan 11, 2021
08cb0d1
Added Exclude Filter Unit Test for Unzip Task
sc-ivanlieckens Jan 11, 2021
86c25c2
Adjusted Unit Test according to feedback
sc-ivanlieckens Jan 14, 2021
fbc0a4f
Applied resource string suggestion
sc-ivanlieckens Jan 14, 2021
e71e37f
Adjusted Include and Exclude to IncludePattern and ExcludePattern as …
sc-ivanlieckens Jan 20, 2021
152aadc
Adjusted property names to "Include" and "Exclude" and altered them t…
sc-ivanlieckens Jan 26, 2021
941d7eb
Unzip task unit tests for pattern parsing
sc-ivanlieckens Jan 27, 2021
658c23a
Name adjustment to better match actual verification
sc-ivanlieckens Jan 28, 2021
8164bfd
Attempt to fail valid character path tests on MacOS and Linux
sc-ivanlieckens Jan 28, 2021
1f1aa6d
Made InvalidPath Unzip unit tests platform specific
sc-ivanlieckens Jan 28, 2021
5e489e2
Improved property or item reference detection
IvanLieckens Jan 29, 2021
9fedb34
Invalid path detection before splitting improvement
sc-ivanlieckens Jan 29, 2021
62d56a9
Added | to invalid path testing
sc-ivanlieckens Jan 29, 2021
76a94f1
Removed platform specific attribute on Invalid Path Character tests i…
sc-ivanlieckens Jan 29, 2021
e9e0f5f
Perf improvement detecting include pattern match
IvanLieckens Jan 29, 2021
8608320
Perf improvement detecting exclude pattern match
IvanLieckens Jan 29, 2021
e4d0286
Small FileMatcher correction, Removed return on parsing for patterns …
sc-ivanlieckens Jan 29, 2021
164e8e7
Updated Include description
IvanLieckens Jan 29, 2021
081c179
Updated Exclude description
IvanLieckens Jan 29, 2021
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Jump to
Jump to file
Failed to load files.
Diff view
Diff view
Expand Up @@ -1217,6 +1217,8 @@ public sealed partial class Unzip : Microsoft.Build.Tasks.TaskExtension, Microso
public Unzip() { }
[Microsoft.Build.Framework.RequiredAttribute]
public Microsoft.Build.Framework.ITaskItem DestinationFolder { get { throw null; } set { } }
public string Exclude { get { throw null; } set { } }
public string Include { get { throw null; } set { } }
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we stick with regular expressions for this, we should change the names for Exclude and Include as they are "default" msbuild names. ie. items are added via Include and the patterns differ between them and regular expressions. A suggested name during PR review was IncludePattern & ExcludePattern

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This makes sense, the functionality differs indeed. I have pushed the change.

Copy link
Member

@Forgind Forgind Jan 20, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benvillalobos, I thought we'd agreed globs were more MSBuild-y than regex? Confused by your comment.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We did, I posted this while we were talking about it, hence the "If we stick with regular expressions." The current implementation is still regex, is this how we process includes and excludes normally?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No; we use globs.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@benvillalobos ok, thank you for the clearing up. Is there a good example in MSBuild of how you'd like this to function I can use as a lead for the implementation? Just want to make sure that it feels native without any quirks.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Check out Expander.cs, ExpandIntoItemsLeaveEscaped may have what you're looking for.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ItemSpec.cs is also relevant here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you @benvillalobos , I studied Expander and ItemSpec but they were outside of reach for the Tasks assembly and I did not want to introduce dependencies so I leveraged FileMatcher to handle the normalization and verification of the globs and paths. It does not support property references due to this. I hope this matches with your expected behavior.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

MSBuildGlob would have been great here, but unfortunately it's not visible by tasks. :(

public bool OverwriteReadOnlyFiles { get { throw null; } set { } }
public bool SkipUnchangedFiles { get { throw null; } set { } }
[Microsoft.Build.Framework.RequiredAttribute]
Expand Down
Expand Up @@ -894,6 +894,8 @@ public sealed partial class Unzip : Microsoft.Build.Tasks.TaskExtension, Microso
public Unzip() { }
[Microsoft.Build.Framework.RequiredAttribute]
public Microsoft.Build.Framework.ITaskItem DestinationFolder { get { throw null; } set { } }
public string Exclude { get { throw null; } set { } }
public string Include { get { throw null; } set { } }
public bool OverwriteReadOnlyFiles { get { throw null; } set { } }
public bool SkipUnchangedFiles { get { throw null; } set { } }
[Microsoft.Build.Framework.RequiredAttribute]
Expand Down
25 changes: 20 additions & 5 deletions src/Shared/FileMatcher.cs
Expand Up @@ -31,6 +31,8 @@ internal class FileMatcher
private static readonly char[] s_wildcardCharacters = { '*', '?' };
private static readonly char[] s_wildcardAndSemicolonCharacters = { '*', '?', ';' };

private static readonly string[] s_propertyReferences = { "$(", "@(" };
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private static readonly string[] s_propertyReferences = { "$(", "@(" };
private static readonly string[] s_propertyAndItemReferences = { "$(", "@(" };

Items are referred to with an @ and properties with $.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I just copied from the old name of the method. I adjusted the naming now to these.


// on OSX both System.IO.Path separators are '/', so we have to use the literals
internal static readonly char[] directorySeparatorCharacters = { '/', '\\' };
internal static readonly string[] directorySeparatorStrings = directorySeparatorCharacters.Select(c => c.ToString()).ToArray();
Expand Down Expand Up @@ -166,8 +168,6 @@ internal static void ClearFileEnumerationsCache()
/// <summary>
/// Determines whether the given path has any wild card characters.
/// </summary>
/// <param name="filespec"></param>
/// <returns></returns>
internal static bool HasWildcards(string filespec)
{
// Perf Note: Doing a [Last]IndexOfAny(...) is much faster than compiling a
Expand All @@ -180,18 +180,33 @@ internal static bool HasWildcards(string filespec)
}

/// <summary>
/// Determines whether the given path has any wild card characters or any semicolons.
/// Determines whether the given path has any wild card characters or semicolons.
/// </summary>
internal static bool HasWildcardsOrSemicolon(string filespec)
{
return -1 != filespec.LastIndexOfAny(s_wildcardAndSemicolonCharacters);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any significant perf difference between using Aggregate and -1 != filespec.LastIndexOfAny(s_wildcardAndSemicolonCharacters)?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Were you thinking Aggregate with a function like Aggregate(false, (acc, ch) => acc || s_wildcardAndSemicolonCharacters.Contains(ch))? That would almost certainly be slower than this.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was thinking the difference between s_propertyReferences.Aggregate(false, (current, propertyReference) => current | filespec.Contains(propertyReference)); and -1 != filespec.LastIndexOfAny(s_propertyReferences);

If the former is more efficient, we can change HasWildcardsOrSemicolon to do the same.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe LastIndexOf is going to be faster but I had to use the aggregate for the other because the s_wildcardAndSemicolonCharacters are char[] and the s_propertyReferences are string[]. Would need to setup a microbenchmark to validate.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah! I didn't realize that we couldn't replicate what was already there. This looks fine to me 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Forgind improved it further now to use Any() :)

}

/// <summary>
/// Determines whether the given path has any wild card characters, any semicolons or any property references.
/// </summary>
internal static bool HasWildcardsSemicolonItemOrPropertyReferences(string filespec)
{
return

(-1 != filespec.IndexOfAny(s_wildcardAndSemicolonCharacters)) ||
filespec.Contains("$(") ||
filespec.Contains("@(")
HasPropertyReferences(filespec)
;
}

/// <summary>
/// Determines whether the given path has any property references.
/// </summary>
internal static bool HasPropertyReferences(string filespec)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
internal static bool HasPropertyReferences(string filespec)
internal static bool HasPropertyOrItemReferences(string filespec)

Another option would be to create a HasPropertyReferences that checks for $( and a separate HasItemReferences that checks for @(. Though I don't feel too strongly about that extra suggestion since filematcher already has something like s_wildcardAndSemicolonCharacters and HasWildcardsOrSemicolon.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I split up the existing method that was already there to allow more granular calling and identification. But I didn't want to introduce too many new methods if they weren't needed.

{
return s_propertyReferences.Aggregate(false, (current, propertyReference) => current | filespec.Contains(propertyReference));
}

/// <summary>
/// Get the files and\or folders specified by the given path and pattern.
/// </summary>
Expand Down
207 changes: 207 additions & 0 deletions src/Tasks.UnitTests/Unzip_Tests.cs
Expand Up @@ -214,5 +214,212 @@ public void LogsErrorIfSourceFileDoesNotExist()
_mockEngine.Log.ShouldContain("MSB3932", () => _mockEngine.Log);
}
}

[Fact]
public void CanUnzip_WithIncludeFilter()
{
using (TestEnvironment testEnvironment = TestEnvironment.Create())
{
TransientTestFolder source = testEnvironment.CreateFolder(createFolder: true);
TransientTestFolder destination = testEnvironment.CreateFolder(createFolder: false);
testEnvironment.CreateFile(source, "BE78A17D30144B549D21F71D5C633F7D.txt", "file1");
testEnvironment.CreateFile(source, "A04FF4B88DF14860B7C73A8E75A4FB76.txt", "file2");

TransientZipArchive zipArchive = TransientZipArchive.Create(source, testEnvironment.CreateFolder(createFolder: true));

Unzip unzip = new Unzip
{
BuildEngine = _mockEngine,
DestinationFolder = new TaskItem(destination.Path),
OverwriteReadOnlyFiles = true,
SkipUnchangedFiles = false,
SourceFiles = new ITaskItem[] { new TaskItem(zipArchive.Path) },
Include = "BE78A17D30144B549D21F71D5C633F7D.txt"
};

unzip.Execute().ShouldBeTrue(() => _mockEngine.Log);

_mockEngine.Log.ShouldContain(Path.Combine(destination.Path, "BE78A17D30144B549D21F71D5C633F7D.txt"), () => _mockEngine.Log);
_mockEngine.Log.ShouldNotContain(Path.Combine(destination.Path, "A04FF4B88DF14860B7C73A8E75A4FB76.txt"), () => _mockEngine.Log);
}
}

[Fact]
public void CanUnzip_WithExcludeFilter()
{
using (TestEnvironment testEnvironment = TestEnvironment.Create())
{
TransientTestFolder source = testEnvironment.CreateFolder(createFolder: true);
TransientTestFolder destination = testEnvironment.CreateFolder(createFolder: false);
testEnvironment.CreateFile(source, "BE78A17D30144B549D21F71D5C633F7D.txt", "file1");
testEnvironment.CreateFile(source, "A04FF4B88DF14860B7C73A8E75A4FB76.txt", "file2");

TransientZipArchive zipArchive = TransientZipArchive.Create(source, testEnvironment.CreateFolder(createFolder: true));

Unzip unzip = new Unzip
{
BuildEngine = _mockEngine,
DestinationFolder = new TaskItem(destination.Path),
OverwriteReadOnlyFiles = true,
SkipUnchangedFiles = false,
SourceFiles = new ITaskItem[] { new TaskItem(zipArchive.Path) },
Exclude = "BE78A17D30144B549D21F71D5C633F7D.txt"
};

unzip.Execute().ShouldBeTrue(() => _mockEngine.Log);

_mockEngine.Log.ShouldNotContain(Path.Combine(destination.Path, "BE78A17D30144B549D21F71D5C633F7D.txt"), () => _mockEngine.Log);
_mockEngine.Log.ShouldContain(Path.Combine(destination.Path, "A04FF4B88DF14860B7C73A8E75A4FB76.txt"), () => _mockEngine.Log);
}
}

[Fact]
public void CanUnzip_WithIncludeAndExcludeFilter()
{
using (TestEnvironment testEnvironment = TestEnvironment.Create())
{
TransientTestFolder source = testEnvironment.CreateFolder(createFolder: true);
TransientTestFolder destination = testEnvironment.CreateFolder(createFolder: false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind modifying this test to use wildcards such that two are included, of which one is also excluded; a third is also excluded; and a fourth isn't excluded or included?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not at all, I hope this new version is more what you were looking for?

TransientTestFolder sub = source.CreateDirectory("sub");
testEnvironment.CreateFile(source, "file1.js", "file1");
testEnvironment.CreateFile(source, "file1.js.map", "file2");
testEnvironment.CreateFile(source, "file2.js", "file3");
testEnvironment.CreateFile(source, "readme.txt", "file4");
testEnvironment.CreateFile(sub, "subfile.js", "File5");

TransientZipArchive zipArchive = TransientZipArchive.Create(source, testEnvironment.CreateFolder(createFolder: true));

Unzip unzip = new Unzip
{
BuildEngine = _mockEngine,
DestinationFolder = new TaskItem(destination.Path),
OverwriteReadOnlyFiles = true,
SkipUnchangedFiles = false,
SourceFiles = new ITaskItem[] { new TaskItem(zipArchive.Path) },
Include = "*.js",
Exclude = "*.js.map;sub\\*.js"
};

unzip.Execute().ShouldBeTrue(() => _mockEngine.Log);

_mockEngine.Log.ShouldContain(Path.Combine(destination.Path, "file1.js"), () => _mockEngine.Log);
_mockEngine.Log.ShouldNotContain(Path.Combine(destination.Path, "file1.js.map"), () => _mockEngine.Log);
_mockEngine.Log.ShouldContain(Path.Combine(destination.Path, "file2.js"), () => _mockEngine.Log);
_mockEngine.Log.ShouldNotContain(Path.Combine(destination.Path, "readme.txt"), () => _mockEngine.Log);
_mockEngine.Log.ShouldNotContain(Path.Combine(destination.Path, "sub", "subfile.js"), () => _mockEngine.Log);
}
}

[Fact]
public void LogsErrorIfIncludeContainsInvalidPathCharacters()
{
using (TestEnvironment testEnvironment = TestEnvironment.Create())
{
TransientTestFolder source = testEnvironment.CreateFolder(createFolder: true);
TransientTestFolder destination = testEnvironment.CreateFolder(createFolder: false);
testEnvironment.CreateFile(source, "BE78A17D30144B549D21F71D5C633F7D.txt", "file1");
testEnvironment.CreateFile(source, "A04FF4B88DF14860B7C73A8E75A4FB76.txt", "file2");

TransientZipArchive zipArchive = TransientZipArchive.Create(source, testEnvironment.CreateFolder(createFolder: true));

Unzip unzip = new Unzip
{
BuildEngine = _mockEngine,
DestinationFolder = new TaskItem(destination.Path),
OverwriteReadOnlyFiles = true,
SkipUnchangedFiles = false,
SourceFiles = new ITaskItem[] { new TaskItem(zipArchive.Path) },
Include = "<BE78A17D30144B549D21F71D5C633F7D>.txt"
};

unzip.Execute().ShouldBeFalse(() => _mockEngine.Log);

_mockEngine.Log.ShouldContain("MSB3937", () => _mockEngine.Log);
}
}

[Fact]
public void LogsErrorIfIncludeContainsPropertyReferences()
{
using (TestEnvironment testEnvironment = TestEnvironment.Create())
{
TransientTestFolder source = testEnvironment.CreateFolder(createFolder: true);
TransientTestFolder destination = testEnvironment.CreateFolder(createFolder: false);
testEnvironment.CreateFile(source, "BE78A17D30144B549D21F71D5C633F7D.txt", "file1");
testEnvironment.CreateFile(source, "A04FF4B88DF14860B7C73A8E75A4FB76.txt", "file2");

TransientZipArchive zipArchive = TransientZipArchive.Create(source, testEnvironment.CreateFolder(createFolder: true));

Unzip unzip = new Unzip
{
BuildEngine = _mockEngine,
DestinationFolder = new TaskItem(destination.Path),
OverwriteReadOnlyFiles = true,
SkipUnchangedFiles = false,
SourceFiles = new ITaskItem[] { new TaskItem(zipArchive.Path) },
Include = "$(Include)"
};

unzip.Execute().ShouldBeFalse(() => _mockEngine.Log);

_mockEngine.Log.ShouldContain("MSB3938", () => _mockEngine.Log);
}
}

[Fact]
public void LogsErrorIfExcludeContainsInvalidPathCharacters()
{
using (TestEnvironment testEnvironment = TestEnvironment.Create())
{
TransientTestFolder source = testEnvironment.CreateFolder(createFolder: true);
TransientTestFolder destination = testEnvironment.CreateFolder(createFolder: false);
testEnvironment.CreateFile(source, "BE78A17D30144B549D21F71D5C633F7D.txt", "file1");
testEnvironment.CreateFile(source, "A04FF4B88DF14860B7C73A8E75A4FB76.txt", "file2");

TransientZipArchive zipArchive = TransientZipArchive.Create(source, testEnvironment.CreateFolder(createFolder: true));

Unzip unzip = new Unzip
{
BuildEngine = _mockEngine,
DestinationFolder = new TaskItem(destination.Path),
OverwriteReadOnlyFiles = true,
SkipUnchangedFiles = false,
SourceFiles = new ITaskItem[] { new TaskItem(zipArchive.Path) },
Exclude = "<BE78A17D30144B549D21F71D5C633F7D>.txt"
};

unzip.Execute().ShouldBeFalse(() => _mockEngine.Log);

_mockEngine.Log.ShouldContain("MSB3937", () => _mockEngine.Log);
}
}

[Fact]
public void LogsErrorIfExcludeContainsPropertyReferences()
{
using (TestEnvironment testEnvironment = TestEnvironment.Create())
{
TransientTestFolder source = testEnvironment.CreateFolder(createFolder: true);
TransientTestFolder destination = testEnvironment.CreateFolder(createFolder: false);
testEnvironment.CreateFile(source, "BE78A17D30144B549D21F71D5C633F7D.txt", "file1");
testEnvironment.CreateFile(source, "A04FF4B88DF14860B7C73A8E75A4FB76.txt", "file2");

TransientZipArchive zipArchive = TransientZipArchive.Create(source, testEnvironment.CreateFolder(createFolder: true));

Unzip unzip = new Unzip
{
BuildEngine = _mockEngine,
DestinationFolder = new TaskItem(destination.Path),
OverwriteReadOnlyFiles = true,
SkipUnchangedFiles = false,
SourceFiles = new ITaskItem[] { new TaskItem(zipArchive.Path) },
Exclude = "$(Include)"
};

unzip.Execute().ShouldBeFalse(() => _mockEngine.Log);

_mockEngine.Log.ShouldContain("MSB3938", () => _mockEngine.Log);
}
}
}
}
11 changes: 11 additions & 0 deletions src/Tasks/Resources/Strings.resx
Expand Up @@ -2789,9 +2789,20 @@
<value>MSB3936: Failed to open unzip file "{0}" to "{1}". {2}</value>
<comment>{StrBegin="MSB3936: "}</comment>
</data>
<data name="Unzip.ErrorParsingPatternInvalidPath">
<value>MSB3937: Failed to parse pattern "{0}" because it contains an invalid path character.</value>
<comment>{StrBegin="MSB3937: "}</comment>
</data>
<data name="Unzip.ErrorParsingPatternPropertyReferences">
<value>MSB3938: Failed to parse pattern "{0}" because it contains a property reference which isn't supported.</value>
<comment>{StrBegin="MSB3938: "}</comment>
</data>
<data name="Unzip.DidNotUnzipBecauseOfFileMatch">
<value>Did not unzip from file "{0}" to file "{1}" because the "{2}" parameter was set to "{3}" in the project and the files' sizes and timestamps match.</value>
</data>
<data name="Unzip.DidNotUnzipBecauseOfFilter">
<value>Did not unzip file "{0}" because it didn't match the include filter or because it matched the exclude filter.</value>
</data>
<data name="Unzip.FileComment">
<value>Unzipping file "{0}" to "{1}".</value>
</data>
Expand Down
15 changes: 15 additions & 0 deletions src/Tasks/Resources/xlf/Strings.cs.xlf

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.