-
Notifications
You must be signed in to change notification settings - Fork 4.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[API Proposal]: Add property to ZipArchiveEntry to indicate if the entry is encrypted #68897
Comments
Tagging subscribers to this area: @dotnet/area-system-io-compression Issue DetailsBackground and motivationWe have received requests to be able to detect if a ZipArchive contains password-protected entries. Without this information, the consumer of a zip file cannot determine if the contents can be extracted ahead of time. During extraction of the entire archive or a single entry, an By adding a property to indicate if an entry is encrypted, the entries could be inspected ahead of time and/or skipped during extraction. See #57197 for more background on the requests. API Proposalnamespace System.IO.Compression;
public class ZipArchiveEntry
{
public bool IsEncrypted { get; }
} API Usageprivate bool ArchiveContainsEncryptedEntries(ZipArchive archice)
{
foreach (var entry in archive.Entries)
{
if (entry.IsEncrypted)
{
return true;
}
}
return false;
} Alternative DesignsExposing an Archive-level PropertyWe could consider additionally including an archive-level property to indicate if any of the entries are encrypted. However, populating this property would require iterating over the entries as is shown in the usage example above. Until such an API request is received, I recommend we limit the new API surface to the entry-level property. Exposing the Encryption LevelZip archives contain "weak" and "strong" encryption. Weak encryption (via the ZipCrypto algorithm) is indicated on a file with a single bit set within the General Purpose Bit Flags (bit 0). Strong encryption (e.g. AES-*) is indicated with both bits 0 and 6 being set in the General Purpose Bit Flags. An alternative design would use an enum to indicate the encryption level of an entry:
Unless we intend to offer APIs in the future that would enable the extraction of encrypted entries, the weak vs. strong indicator is unnecessary. With that consideration, I recommend we use a simple boolean to indicate if the entry is encrypted, regardless of the encryption level or algorithm. And with the boolean approach, we only need to check General Purpose Bit 0, since that bit must be set for both weak and strong encryption. netstandard support via Exception.DataSome of the requests for this functionality came with the desire for netstandard support. Without adding new API surface, a failed extraction could include data on the exception instance that indicates why the RisksThe Zip file format also includes the ability to encrypt some of the central directory metadata, including file names. This API design does not take that concept into account. Popular tools such as WinZip, 7-Zip, and WinRAR do not support this capability either though. More typically, a password is set for the zip such that it is applied to each of the entries individually, and the central directory metadata remains unencrypted. Based on the specification for how central directory metadata encryption is implemented, the entry count values are to be obfuscated if the central directory is encrypted (reference). That expectation would likely introduce other issues with the current ZipArchive implementation if we sought to support encryption of the central directory. However, the file entry header General Purpose Bit Flags would remain the same, with bit 0 still indicating that the entry is also encrypted; thus the proposed design would still apply.
|
For the record, PKWare strong encryption requires a license, see #1545 (comment)
WinZIP encryption doesn't use the bit 6, it uses an extra data field with header 0x9901. Now, API-wise, I think My point being that once we extend the encryption support, Yet another alternative, similar to your "Exposing the Entry Encryption Level" design, could be: namespace System.IO.Compression
{
public class ZipArchiveEntry
{
+ public ZipEntryEncryption Encryption { get; }
}
+ public enum ZipEntryEncryption
+ {
+ None,
+ Weak, // ZipCrypto
+ Aes128,
+ Aes192,
+ Aes256,
+ Unknown = 0xFFFF,
+ }
}
|
Thanks, @jozkee; I had not read the WinZip documentation you linked to. With the enum approach, callers that needs to determine if an entry is encrypted would use |
Looks good as proposed. It seems like there's some missing user story around what to do when it's namespace System.IO.Compression
{
public class ZipArchiveEntry
{
public bool IsEncrypted { get; }
}
} |
Should we mark this as a |
Sorry; that code comment was slightly misleading. This isn't a breaking change; Thanks for noticing that detail though and calling it out, @deeprobin. |
Background and motivation
By specification, Zip archives support "weak" and "strong" entry-level encryption. Weak encryption of an entry (via the ZipCrypto algorithm) is indicated in the entry-level General Purpose Bit Flag value by setting bit 0 to
1
. Strong encryption (e.g. AES-*) is indicated with both bits 0 and 6 (PKZIP) or by setting bit 0 and using an extra data field (WinZip). With any encryption scheme, General Purpose Bit Flag bit 0 will be set if an entry is encrypted.Our APIs do not currently expose any indicators for whether an archive contains any weak- or strong-encrypted entries, nor do we expose the General Purpose Bit Flag for advanced scenarios that require checking bits within that value. The end result is that calling code cannot determine if an entry is encrypted. Furthermore, when attempting to extract an encrypted entry from a
ZipArchive
, two different behaviors are observed:InvalidDataException
is thrown when attempting to extract the entry. This exception type is also thrown in other circumstances, and there is no discriminating information to indicate the reason for the exception; thus, there is no means for identifying that the entry was encrypted.Issue #57197 requests a means for detecting if a
ZipArchiveEntry
is encrypted while iterating over the entries in the archive.API Proposal
API Usage
Additional APIs to Consider
Expose an Archive-Level Property
We could additionally include an archive-level property to indicate if any of the entries are encrypted. However, since the encryption information is stored at the entry-level "General Purpose Bit Flag" value, populating this property would require iterating over the entries as is shown in the usage example above. If demand for this archive-level property exists, it could be added to
ZipArchive
alongside the existingEntries
andMode
properties. BecauseZipArchive
currently contains no other aggregate information about the entries though, I recommend we do not add this property unless sufficient requests are received after adding the entry-level property.Specify Encryption Handling Behavior to ZipFile Accelerators
The
ZipFile
class provides static accelerator methods, includingOpen
,OpenRead
, andExtractToDirectory
. TheOpen
andOpenRead
methods simply return aZipArchive
without taking any other action and they succeed with weak- or strong-encrypted entries present. TheExtractToDirectory
method exhibits the same behavior as described above, with anInvalidDataException
being thrown for strong-encrypted entries and silently extracting encrypted entries for weak encryption.We could add overloads to
ZipFile.ExtractToDirectory
to specify how encrypted entries should be handled. I recommend we do not pursue this unless sufficient demand emerges. At that time, we could learn more about the scenarios to determine if we need an enum to define different behaviors or if we merely need to allow a boolean to indicate encrypted entries should be skipped. Without these overloads, a caller can instead useZipFile.Open[Read]
and iterate over the entries.Alternative Designs
Exposing the Entry Encryption Algorithm
An alternative design would use an enum to indicate the encryption algorithm used for an entry:
For Add password to ZipArchive · Issue #1545 · dotnet/runtime, an enum along these lines is very likely to be needed. Without fully designing the APIs for those features though, it's possible that the enum illustrated above might not be the right design for the full feature set. Since the enum is not strictly necessary for detecting if there are password-protected/encrypted entries, I recommend we stick with the simple boolean for now. If #1545 is implemented and we end up adding an
Encryption
property that uses an enum similar to what's shown here, then theIsEncrypted
property would become a convenience method over checking ifEncryption
isZipEntryEncryption.None
or another value. For the initial implementation though,IsEncrypted
would only need to be based on General Purpose Bit 0 being set.Exposing the General Purpose Bit Flag Value
We could optionally expose the General Purpose Bit Flag value on
ZipArchiveEntry
, either in addition to or instead ofIsEncrypted
. Exposing this value would allow callers to check other bits to glean more about the archive entry than our APIs support.From there, we could also ostensibly introduce an enum that names each of the bits per the specification. There has not been demand for this low-level data to be exposed however, so I recommend we do not pursue this approach unless demand emerges.
netstandard support via Exception.Data
Some of the requests for this functionality came with the desire for netstandard support. Without adding new API surface, a failed extraction could include data on the exception instance that indicates why the
InvalidDataException
was thrown. For instance,Exception.Data
could include a key/value pair of "IsEncrypted" and the boolean indicating if the entry was encrypted. However, this would rely on exception handling to control flow, which is an anti-pattern.Risks
The Zip file format also includes the ability to encrypt some of the central directory metadata, including file names. This API design does not take that concept into account. Popular tools such as WinZip, 7-Zip, and WinRAR do not support this capability either though. More typically, a password is set for the zip such that it is applied to each of the entries individually, and the central directory metadata remains unencrypted.
Based on the specification for how central directory metadata encryption is implemented, the entry count values are to be obfuscated if the central directory is encrypted (reference). That expectation would likely introduce other issues with the current ZipArchive implementation if we sought to support encryption of the central directory. However, the file entry header General Purpose Bit Flag would remain the same, with bit 0 still indicating that the entry is also encrypted; thus the proposed design would still apply.
The text was updated successfully, but these errors were encountered: