Unzip Task: Enable filtering #5169
Comments
Team triage: this is an interesting idea. We would potentially accept a PR that did this, but we'd like to first see a rough design about the filter mechanism, including whether it's easy to implement with the zip APIs we use, or if there's an easier one to implement.
@rainersigwald I'm sorry it's not a PR, but to quickly get this working for my own build I wrote the following custom task. It makes only a few small modifications to the existing one (some of them only because I couldn't access the internal classes that the existing task uses in places):
using System;
using System.Diagnostics;
using System.IO;
using System.IO.Compression;
using System.Linq;
using System.Resources;
using System.Text.RegularExpressions;
using System.Threading;
using Microsoft.Build.Framework;
using Microsoft.Build.Utilities;
using Tasks.Properties;
namespace Tasks
{
public class FilteredUnzip : Task, ICancelableTask
{
// We pick a value that is the largest multiple of 4096 that is still smaller than the large object heap threshold (85K).
// The CopyTo/CopyToAsync buffer is short-lived and is likely to be collected at Gen0, and it offers a significant
// improvement in Copy performance.
private const int _DefaultCopyBufferSize = 81920;
/// <summary>
/// Stores a <see cref="CancellationTokenSource"/> used for cancellation.
/// </summary>
private readonly CancellationTokenSource _cancellationToken = new CancellationTokenSource();
public FilteredUnzip()
{
Log.TaskResources = Resources.ResourceManager;
}
/// <summary>
/// Gets or sets a <see cref="ITaskItem"/> with a destination folder path to unzip the files to.
/// </summary>
[Required]
public ITaskItem DestinationFolder { get; set; }
/// <summary>
/// Gets or sets a value indicating whether read-only files should be overwritten.
/// </summary>
public bool OverwriteReadOnlyFiles { get; set; }
/// <summary>
/// Gets or sets a value indicating whether files should be skipped if the destination is unchanged.
/// </summary>
public bool SkipUnchangedFiles { get; set; } = true;
/// <summary>
/// Gets or sets an array of <see cref="ITaskItem"/> objects containing the paths to .zip archive files to unzip.
/// </summary>
[Required]
public ITaskItem[] SourceFiles { get; set; }
/// <summary>
/// Gets or sets a regular expression matched against each entry's full name; only matching entries are unzipped. When not set, all entries are included.
/// </summary>
public string Include { get; set; }
/// <summary>
/// Gets or sets a regular expression matched against each entry's full name; matching entries are not unzipped. When not set, no entries are excluded.
/// </summary>
public string Exclude { get; set; }
/// <inheritdoc cref="ICancelableTask.Cancel"/>
public void Cancel()
{
_cancellationToken.Cancel();
}
/// <inheritdoc cref="Task.Execute"/>
public override bool Execute()
{
DirectoryInfo destinationDirectory;
try
{
destinationDirectory = Directory.CreateDirectory(DestinationFolder.ItemSpec);
}
catch (Exception e)
{
Log.LogErrorWithCodeFromResources("Unzip.ErrorCouldNotCreateDestinationDirectory", DestinationFolder.ItemSpec, e.Message);
return false;
}
BuildEngine3.Yield();
try
{
foreach (ITaskItem sourceFile in SourceFiles.TakeWhile(i => !_cancellationToken.IsCancellationRequested))
{
if (!File.Exists(sourceFile.ItemSpec))
{
Log.LogErrorWithCodeFromResources("Unzip.ErrorFileDoesNotExist", sourceFile.ItemSpec);
continue;
}
try
{
using (FileStream stream = new FileStream(sourceFile.ItemSpec, FileMode.Open, FileAccess.Read, FileShare.Read, 0x1000, false))
{
using (ZipArchive zipArchive = new ZipArchive(stream, ZipArchiveMode.Read, false))
{
try
{
Extract(zipArchive, destinationDirectory);
}
catch (Exception e)
{
// Unhandled exception in Extract() is a bug!
Log.LogErrorFromException(e, true);
return false;
}
}
}
}
catch (OperationCanceledException)
{
break;
}
catch (Exception e)
{
// Should only be thrown if the archive could not be opened (Access denied, corrupt file, etc)
Log.LogErrorWithCodeFromResources("Unzip.ErrorCouldNotOpenFile", sourceFile.ItemSpec, e.Message);
}
}
}
finally
{
BuildEngine3.Reacquire();
}
return !_cancellationToken.IsCancellationRequested && !Log.HasLoggedErrors;
}
/// <summary>
/// Extracts the files that pass the <see cref="Include"/> and <see cref="Exclude"/> filters to the specified directory.
/// </summary>
/// <param name="sourceArchive">The <see cref="ZipArchive"/> containing the files to extract.</param>
/// <param name="destinationDirectory">The <see cref="DirectoryInfo"/> to extract files to.</param>
private void Extract(ZipArchive sourceArchive, DirectoryInfo destinationDirectory)
{
foreach (ZipArchiveEntry zipArchiveEntry in sourceArchive.Entries.TakeWhile(i => !_cancellationToken.IsCancellationRequested))
{
FileInfo destinationPath = new FileInfo(Path.Combine(destinationDirectory.FullName, zipArchiveEntry.FullName));
// Reject entries that would land outside the destination folder before touching the file system.
// The root is compared with a trailing separator so that a sibling folder such as "out-evil" does not pass for "out".
// ExtractToDirectory() throws an IOException for this but since we're extracting one file at a time
// for logging and cancellation, we need to check for it ourselves.
string destinationRoot = destinationDirectory.FullName.TrimEnd(Path.DirectorySeparatorChar, Path.AltDirectorySeparatorChar) + Path.DirectorySeparatorChar;
if (!destinationPath.FullName.StartsWith(destinationRoot, StringComparison.OrdinalIgnoreCase))
{
Log.LogErrorFromResources("Unzip.ErrorExtractingResultsInFilesOutsideDestination", destinationPath.FullName, destinationDirectory.FullName);
continue;
}
// Zip archives can have directory entries listed explicitly.
// If this entry is a directory we should create it and move to the next entry.
if (Path.GetFileName(destinationPath.FullName).Length == 0)
{
// The entry is a directory
Directory.CreateDirectory(destinationPath.FullName);
continue;
}
if (ShouldSkipEntry(zipArchiveEntry, destinationPath))
{
Log.LogMessageFromResources(MessageImportance.Low, "Unzip.DidNotUnzipBecauseOfFileMatch", zipArchiveEntry.FullName, destinationPath.FullName, nameof(SkipUnchangedFiles), "true");
continue;
}
try
{
destinationPath.Directory?.Create();
}
catch (Exception e)
{
Log.LogErrorWithCodeFromResources("Unzip.ErrorCouldNotCreateDestinationDirectory", destinationPath.DirectoryName, e.Message);
continue;
}
if (OverwriteReadOnlyFiles && destinationPath.Exists && destinationPath.IsReadOnly)
{
try
{
destinationPath.IsReadOnly = false;
}
catch (Exception e)
{
Log.LogErrorWithCodeFromResources("Unzip.ErrorCouldNotMakeFileWriteable", zipArchiveEntry.FullName, destinationPath.FullName, e.Message);
continue;
}
}
try
{
Log.LogMessageFromResources(MessageImportance.Normal, "Unzip.FileComment", zipArchiveEntry.FullName, destinationPath.FullName);
using (Stream destination = File.Open(destinationPath.FullName, FileMode.Create, FileAccess.Write, FileShare.None))
using (Stream stream = zipArchiveEntry.Open())
{
stream.CopyToAsync(destination, _DefaultCopyBufferSize, _cancellationToken.Token)
.ConfigureAwait(false)
.GetAwaiter()
.GetResult();
}
destinationPath.LastWriteTimeUtc = zipArchiveEntry.LastWriteTime.UtcDateTime;
}
catch (IOException e)
{
Log.LogErrorWithCodeFromResources("Unzip.ErrorCouldNotExtractFile", zipArchiveEntry.FullName, destinationPath.FullName, e.Message);
}
}
}
/// <summary>
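/// Determines whether an entry should be skipped when unzipping, either because the destination file is unchanged or because the entry is filtered out by <see cref="Include"/> or <see cref="Exclude"/>.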
/// </summary>
/// <param name="zipArchiveEntry">The <see cref="ZipArchiveEntry"/> object containing information about the file in the zip archive.</param>
/// <param name="fileInfo">A <see cref="FileInfo"/> object containing information about the destination file.</param>
/// <returns><c>true</c> if the file should be skipped, otherwise <c>false</c>.</returns>
private bool ShouldSkipEntry(ZipArchiveEntry zipArchiveEntry, FileInfo fileInfo)
{
bool result = SkipUnchangedFiles && fileInfo.Exists
&& zipArchiveEntry.LastWriteTime == fileInfo.LastWriteTimeUtc
&& zipArchiveEntry.Length == fileInfo.Length;
if (!string.IsNullOrWhiteSpace(Include))
{
result |= !Regex.IsMatch(zipArchiveEntry.FullName, Include);
}
if (!string.IsNullOrWhiteSpace(Exclude))
{
result |= Regex.IsMatch(zipArchiveEntry.FullName, Exclude);
}
return result;
}
}
}
There are some differences between the PR code and the code I originally posted here. While using that custom task I found some flaws in the code posted here, which have been resolved in the PR. First, the filter check now runs before anything else, so extraction no longer fails with "PathTooLong" when the archive entry that would cause it is excluded. Second, the PR logs its own message, which makes it clearer in the logs why a certain file wasn't unzipped. Any and all feedback is very welcome.
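To illustrate the ordering, here is a minimal sketch and not the PR's actual code. It assumes the filter is evaluated on the entry name alone, before any destination path is built, so an excluded entry never reaches `Path.Combine` or `FileInfo`. `FilterFirstSketch`, `SelectEntries`, and `BuildSampleZip` are hypothetical names introduced only for this example, and compiling each pattern once per call is a design choice of the sketch rather than something taken from the PR.

```csharp
using System.Collections.Generic;
using System.IO;
using System.IO.Compression;
using System.Text.RegularExpressions;

public static class FilterFirstSketch
{
    // Hypothetical helper: returns the full names of the entries that pass the filters.
    // No destination path is computed here, so an entry that is filtered out
    // can never trigger a path-related exception.
    public static List<string> SelectEntries(ZipArchive archive, string include, string exclude)
    {
        // Build each regex once instead of once per entry; null means "no filter".
        Regex includeRegex = string.IsNullOrWhiteSpace(include) ? null : new Regex(include);
        Regex excludeRegex = string.IsNullOrWhiteSpace(exclude) ? null : new Regex(exclude);

        var selected = new List<string>();
        foreach (ZipArchiveEntry entry in archive.Entries)
        {
            if (includeRegex != null && !includeRegex.IsMatch(entry.FullName))
            {
                continue; // not included: skipped before any path handling
            }
            if (excludeRegex != null && excludeRegex.IsMatch(entry.FullName))
            {
                continue; // explicitly excluded
            }
            selected.Add(entry.FullName);
        }
        return selected;
    }

    // Builds a small in-memory archive of empty entries so the sketch runs without files on disk.
    public static MemoryStream BuildSampleZip(params string[] names)
    {
        var buffer = new MemoryStream();
        using (var zip = new ZipArchive(buffer, ZipArchiveMode.Create, leaveOpen: true))
        {
            foreach (string name in names)
            {
                zip.CreateEntry(name);
            }
        }
        buffer.Position = 0;
        return buffer;
    }
}
```

In a real task the destination path would be computed only for the names this returns, and a message would be logged for each skipped entry.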
Desired functionality
Project file
Directory contents:
MyZipFile.zip contents:
Expected behavior
MyZipFile.zip is unzipped to the desired location, extracting only the entries that match the inclusion filter (if one is present) and that are not excluded.
In the example root.txt is not unzipped because it's not included and excluded.txt is not unzipped because it's excluded.
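The rule above can be written as a single predicate. This is only a sketch of the expected semantics: `ExpectedBehaviorSketch` and `IsUnzipped` are hypothetical names, and the patterns and the `subfolder/` layout in the usage note are assumptions made for illustration, not the values from the original example.

```csharp
using System.Text.RegularExpressions;

public static class ExpectedBehaviorSketch
{
    // An entry is unzipped when it matches the include pattern (or no include
    // pattern is given) and does not match the exclude pattern (if one is given).
    public static bool IsUnzipped(string entryName, string include, string exclude)
    {
        bool included = string.IsNullOrWhiteSpace(include) || Regex.IsMatch(entryName, include);
        bool excluded = !string.IsNullOrWhiteSpace(exclude) && Regex.IsMatch(entryName, exclude);
        return included && !excluded;
    }
}
```

With an assumed include pattern of `^subfolder/` and an assumed exclude pattern of `excluded\.txt$`, `IsUnzipped("root.txt", ...)` is false because the entry is not included, `IsUnzipped("subfolder/excluded.txt", ...)` is false because the entry is excluded, and `IsUnzipped("subfolder/file.txt", ...)` is true.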
Resulting Directory contents:
Actual behavior
The Unzip task does not currently support any filtering.
Environment data
msbuild /version output: 16.4.0.56107
OS info:
Windows 10 Enterprise
If applicable, version of the tool that invokes MSBuild (Visual Studio, dotnet CLI, etc):
/