Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add support for ZipCrypto decryption #129

Open
wants to merge 4 commits into
base: main
Choose a base branch
from

Conversation

aurimasv
Copy link

@aurimasv aurimasv commented May 3, 2014

See the second commit message for explanation.

I think there was some discussion that this would have to be optional. Not sure how you want to control this option exactly, but it would be fairly trivial to add, I think.

Partially addresses #115

Remove unused variable from generateZipParts
For ZipCrypto encrypted files, one of the following MUST be passed in the loadOptions object:
* retrievePasswordCallback: a function that returns a password as a string, or
* retrieveEncryptionKeysCallback: a function that returns an array of 3 32-bit integers to be used as pre-primed encryption keys
If both are set, ZipCrypto will call only retrieveEncryptionKeysCallback.

Additionally, invalidPasswordCallback MAY be set and will be called if the supplied password does not validate.
invalidPasswordCallback should return `true` if password validation should be ignored.

This patch does _not_ add support for PKWARE's Strong Encryption.
* @return {Unit8Array} Decrypted data in the same format as was passed in
*/
ZipCrypto.prototype.decryptData = function(data, crc32) {
// This doesn't work, how can we check?
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it reasonable to believe that the data being decompressed doesn't have to be Unit8Array? I see that there are readers for string and NodeBuffer (not quite sure what that is). If I need to worry about these getting here, then I need to figure out a way to test for Unit8Array. Otherwise, I could just remove this.

Edit: Wow... dyslexia. This whole time I've been reading it as Unit instead of Uint (that makes a lot more sense)

@aurimasv
Copy link
Author

aurimasv commented May 3, 2014

I'll back out and redo the dist update, so please ignore that for now. Also need to update tests.

Before I do all of that though, is jszip interested in this patch at all? I'm going to use this for a small project, so I would rather not have to maintain a separate fork.

if (i == 11 && crc32 !== undefined) {
// Validate checksum
// 12th byte is the high order byte of crc32
if ( (data[i] & 0xFF) != (crc32 >>> 24) &&
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this is not always set correctly (or maybe I'm doing something wrong here). E.g. encrypted.zip I generated under cygwin (included in the tests) does not validate here, but the contents do get decrypted correctly. Creating a zip with 7zip does validate though.

@dduponchel
Copy link
Collaborator

Reading encrypted zip files sounds good !
I'm ok with not supporting PKWARE strong encryption since the specification says

The Strong Encryption technology defined in this specification is                  
covered under a pending patent application. The use or implementation              
in a product of certain technological aspects set forth in the current             
APPNOTE, including those with regard to strong encryption, patching,               
or extended tape operations requires a license from PKWARE.

Wikipedia says that this ZipCrypto encryption has known weaknesses so I think that a lot of softwares use AES encryption : if we add one, the other will need to follow (or we will get a lot of bug reports).

Regarding the API, I wonder if checking the password/decrypting the content when decompressing is a good thing. JSZip doesn't decompress files until necessary so a developer can have a hard time predicting where this will be done. For example, if a user read a zip file (with encrypted content but without password callback), add an image and generate the result, the exception will be thrown in the generate() method.
A (maybe naive and oversimplistic) way to avoid that could be to decrypt content when parsing the zip file. What do you think of it ?

Regarding the type of the compressed data, it depends of the reader :

  • StringReader on IE 6-9 (the data will be a binary string)
  • NodeBufferReader on nodejs (the data will be a nodejs' Buffer)
  • Uint8ArrayReader on modern browsers

You can do the following to get the right type :

var dataType = support.uint8array ? "uint8array" : "array";
var data = utils.transformTo(dataType, encryptedData);

@dduponchel
Copy link
Collaborator

And you don't have to update the dist/* files, they are updated before each release :-)

@aurimasv
Copy link
Author

aurimasv commented May 4, 2014

I'll look into adding AES support.

I'm looking into our options for delaying decryption as long as possible (until it's actually needed). If I'm reading the code right, the checkCRC32 load option always causes the decompression to happen immediately when the local header is read. Is this intentional? I'm not sure what the purpose of this check is. Would it not make sense to perform the check when a file contents are actually requested?

@dduponchel
Copy link
Collaborator

Yes, with the checkCRC32 option, everything is decompressed. The idea was to be able to check the validity of a zip file (and the crc32 values with this option) with only one try/catch (on load()). With a lazy crc32 check, the load is faster but the user will need to guard each and every call to JSZip methods with try/catch.
The decompression also has the side effect of checking the decompressed file length. This check is usually lazy (because it depends of the decompression) and can throw errors in the getters or in the generate() method (but only with corrupted files).

I clearly see the advantages of delaying decryption : speed, the ability to ask the user its password after showing the files list, etc.
What I fear is that it may become harder to use JSZip if we put too many things as lazy. I'm not against documenting load() / getters / generate() as methods throwing exceptions in more situations, I just want to be sure that's the right thing to do :)
@Stuk what's your opinion ?

@aurimasv
Copy link
Author

aurimasv commented May 5, 2014

I guess I understand the point here. I do see that in prepareContent there is another throw (throw new Error("Bug : uncompressed data size mismatch");), so basically, any call that would trigger encryption errors, could (in theory) also trigger this error and would have to be handled in a try-catch. Being unable to decrypt is just as severe of a failure as data size mismatch.

I'm not sure what a better alternative to the current implementation would be though. Asking for a password on load (and not validating it?) would not make much sense, since you only need it to decrypt files right before decompression. There are a number of operations (adding, deleting files) that you could do without needing a password. Additionally, not all file sin the archive have to be encrypted, nor do they have to be encrypted using the same algorithm. (actually, they don't even have to use the same password, but I haven't coded that in, since I don't think that's common)

Having thought about this a bit, I would suggest introducing an onError handler (or something similar). This handler would be passed any errors that are thrown and could either re-throw them (or perhaps just return and let JSZip re-throw them), or handle them in whatever way (and return true; so JSZip does not re-throw). All JSZip methods would then need to be wrapped (internally) in try-catch and call the onError handler in the catch. This would probably offer the most flexibility and would not require any try-catching.

I'm working on AES implementation in the mean time.

@Mithgol
Copy link
Contributor

Mithgol commented May 7, 2014

I have yet another idea, though it feels crude. Could crc32 check just be prevented for the files that are known to be encrypted, but still remain for the other files to check their validity on load()?

@aurimasv
Copy link
Author

aurimasv commented May 9, 2014

I have yet another idea, though it feels crude. Could crc32 check just be prevented for the files that are known to be encrypted, but still remain for the other files to check their validity on load()?

I think that's reasonable. Another reason to ignore encrypted files is that with WinZip's AE-2 encryption, you don't even have a crc32, but instead you are given an HMAC-SHA1-80 hash of the encrypted data.

I'm trying to test AES decryption for which I was going to use node-forge package, but the way it requires submodules is not really compatible with browserify (I think). I haven't worked with node (edit: and by extensions, browserify) before, so I would appreciate if someone could let me know whether there's a way to make forge work, or if I have to look for a different crypto library (maybe crypto.js).

@aurimasv
Copy link
Author

aurimasv commented May 9, 2014

I made some progress (using crypto-js), but I'm still having issues. I'd like to split this up into two pull requests: one for ZipCrypto now and the other one for AES a little bit later. ZipCrypto is working perfectly now with files generated by 7zip, WinZIP, WinRAR, and ZipInfo's zip (the one in unix/linux).

Any further thought on how to handle interaction with the user in regards to password retrieval and validation? Two things are necessary:

  • we need to be able to retrieve the password (I think delaying until required and not crc32-checking encrypted files)
  • we need to be able to notify the user that the supplied password is wrong (possibly retrying? I would leave the retrying up to the user though)

@Mithgol
Copy link
Contributor

Mithgol commented May 9, 2014

I'd like to split this up into two pull requests: one for ZipCrypto now and the other one for AES a little bit later.

Wise decision, especially for future references. Smaller pull requests are easier to understand. Always.

@Mithgol
Copy link
Contributor

Mithgol commented May 9, 2014

API suggestions:

  • Add .encrypted property to ZipObject; if not undefined, that property should mean that the file is encrypted; that property should actually contain encyption's name (string) such as 'zipcrypto' or 'aes'.
  • Add optional options parameter (object) to ZipObject's getters .asText(), .asBinary(), .asArrayBuffer(), .asUint8Array(), .asNodeBuffer() so that providing a password becomes possible: .asText({password: 'examplePassword'}), .asBinary({password: 'examplePassword'}), .asArrayBuffer({password: 'examplePassword'}), .asUint8Array({password: 'examplePassword'}), .asNodeBuffer({password: 'examplePassword'})
  • Also make this methods throw on CRC errors or password mismatches. Define a couple of constants (such as CRC_ERROR and PASSWORD_ERROR for example), and export them, and only throw errors based on them (such as throw new Error(CRC_ERROR) for example) because it should be possible to catch the exception later and compare with these constants to determine what happened.

This way “different passwords for different files” are possible. Interaction with the user (password input, retries, etc.) is left to the application over JSZip.

@aurimasv
Copy link
Author

aurimasv commented May 9, 2014

Add .encrypted property to ZipObject; if not undefined, that property should mean that the file is encrypted; that property should actually contain encyption's name (string) such as 'zipcrypto' or 'aes'

Agreed. (That's my current approach actually)

Add optional options parameter to ZipObject's getters .asText(), .asBinary(), .asArrayBuffer(), .asUint8Array(), .asNodeBuffer() so that providing a password becomes possible: .asText({password: 'examplePassword'}), .asBinary({password: 'examplePassword'}), .asArrayBuffer({password: 'examplePassword'}), .asUint8Array({password: 'examplePassword'}), .asNodeBuffer({password: 'examplePassword'})

This... idk. Seems like a huge hassle for the user to do if(zipObj.encrypted) // ask for password first all the time. I rather like the callback idea (it's only called once per zip file, unless the password is wrong). Do you have something against this that I am overlooking?

Also make this methods throw on CRC errors or password mismatches. Define a couple of constants (such as CRC_ERROR and PASSWORD_ERROR for example), and export them, and only throw errors based on them (such as throw new Error(CRC_ERROR) for example) because it should be possible to catch the exception later and compare with these constants to determine what happened.

I actually added some new error classes that extend Error and am throwing those, so you can do checks with instanceof. But we can go the constant string route as well.

@Mithgol
Copy link
Contributor

Mithgol commented May 10, 2014

Classes extending Error are even better than string constants, thanks.

@Mithgol
Copy link
Contributor

Mithgol commented May 10, 2014

Seems like a huge hassle for the user to do if(zipObj.encrypted) // ask for password first all the time. I rather like the callback idea (it's only called once per zip file, unless the password is wrong). Do you have something against this that I am overlooking?

A callback is a better idea than if(zipObj.encrypted) but only if you have good answers for at least the following couple of questions:

  • When is that callback called? (When an encrypted file is detected? When a ZipObject of that file appears? When a getter, such as .asText() or .asBinary(), is called? Or maybe we'd have an API method to choose between these three?)
  • What is that's callback's signature (i.e. its set of parameters)? Can it be asynchronous (e.g. when such callback renders a GUI dialog prompt asking for password and then calls some further callback — any unzipping has to be postponed until that)?

@dduponchel
Copy link
Collaborator

A zipObj.encrypted and extended errors sound good !
For the onError (5 days ago, I'm a bit late) : it may be too weird to have a synchonous method returns a null of undefined while giving the error to an other function.

I'm working on asynchonous methods (asTextAsync(callback), generateAsync(options, callback), etc. see #121) so if we use a sync callback to get the password, we could also use async callbacks with these methods (it makes sense when asking for a password). If having sync and async callbacks is too awkward, I'm not against restricting this feature (ZipCrypto) to only one type of functions and throwing an error in the others.

If we don't use a callback we should add a flag allowEncrypted (default = false) to keep backward compatibility (we can use the callback itself if we use it) and throw an exception if the user's code isn't ready for encrypted zip files.

.asText({password: 'examplePassword'})

There is one use case covered by a callback and not by this method :

  • load a encrypted zip file
  • add files
  • generate a zip file

The last part will try to decrypt the files and fail.

@hsdk123
Copy link

hsdk123 commented Jul 18, 2014

Hi, any updates to this? Password encryption for zip files would really be a great feature/option.

@SunboX
Copy link

SunboX commented Aug 13, 2014

I second this, would be great if we can protect zip files.

@jprichardson
Copy link

Ya, this would be awesome. If it helps, I've started https://github.com/cryptocoinjs/aes The problem is, that I need to add aes-cbc stream ciphering though, which is presumably what zip files use for encryption?

@Manouchehri
Copy link

@jprichardson Zip uses AES-CTR (128, 192 or 256 bit).

http://www.winzip.com/aes_info.htm

@RouR
Copy link

RouR commented Jan 21, 2019

any progress?

@xqdoo00o
Copy link
Contributor

xqdoo00o commented Mar 11, 2020

I add some Aes decryption code, please see at #696

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants