Add support for 0x50 firstByte #99

Closed
mgcrea opened this issue Aug 25, 2014 · 3 comments

Comments

@mgcrea

mgcrea commented Aug 25, 2014

I've got .xlsx files that the lib rejects with an "Unsupported file" error. Any idea where I should start looking? I tried adding:

case 0x50: return parse_xlml(d, (o.type="buffer",o));

But it breaks at:

node_modules/xlsjs/xls.js:6366
            switch(state[state.length-1][0]) {
TypeError: Cannot read property '0' of undefined
    at parse_xlml_xml (node_modules/xlsjs/xls.js:6366:32)

Great job by the way. Would love to switch all future developments to a lib with a "clean" license.


Found this Stack Overflow answer:

If the files are in Office 2007 format (e.g. .docx), then their internal storage is either:

1) A zip file of xml docs (if it's not password protected)

2) The old style compound file format (if it IS password protected).

Therefore you could probably do something like this:

1) Check the first few bytes of the file

2) If it's a zip file (non password protected), it'll start with 0x50 0x4b 0x03 0x04.

3) If it's not a zip file, then it's probably password protected. It will start with a different binary signature (e.g. Word 2007 docs start with 0xd0 0xcf 0x11 0xe0 in this case)

Basically, if it's a new .docx or .xlsx, and it DOESN'T start with the zip signature of 0x50 0x4b 0x03 0x04, it's probably password protected.

For other versions of MS Office, it's a bit trickier...

So it looks like it's a standard zip file.
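
The byte check described in the quoted answer can be sketched in Node roughly as follows. The helper names (`startsWith`, `sniffContainer`) are made up for illustration and are not part of xlsjs or any other library; only the two signatures come from the answer above.

```javascript
// Leading "magic" bytes quoted in the answer above.
const ZIP_MAGIC = [0x50, 0x4b, 0x03, 0x04]; // "PK\x03\x04": zip container (.xlsx, .docx)
const CFB_MAGIC = [0xd0, 0xcf, 0x11, 0xe0]; // compound file: old-style or password-protected

// Returns true when `buf` begins with every byte listed in `magic`.
function startsWith(buf, magic) {
  if (buf.length < magic.length) return false;
  return magic.every((byte, i) => buf[i] === byte);
}

// Hypothetical helper: classify a Buffer (or Uint8Array) by its signature.
function sniffContainer(buf) {
  if (startsWith(buf, ZIP_MAGIC)) return "zip";
  if (startsWith(buf, CFB_MAGIC)) return "cfb";
  return "unknown";
}
```

Something like `sniffContainer(require("fs").readFileSync("book.xlsx"))` would then tell you which parser family to try before handing the data to a library.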

@mgcrea
Author

mgcrea commented Aug 25, 2014

I was using the wrong npm package (xlsjs). Closing while I try to reproduce against the correct HEAD.

mgcrea closed this as completed Aug 25, 2014
@SheetJSDev
Contributor

@mgcrea If you are trying to handle both pre-2007 and 2007+ Excel files in Node, I'd recommend using j: https://www.npmjs.org/package/j . Its read functions embed the file type check: https://github.com/SheetJS/j/blob/master/j.js#L7-L32
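
For readers who want the general shape of such a check without pulling in a wrapper, here is a minimal sketch of first-byte dispatch. It is not j's actual code: `readWorkbook` and the `parsers` object are hypothetical, and the caller is assumed to supply its own parse functions (for example, thin wrappers around whichever libraries handle each format).

```javascript
// Hypothetical dispatcher: `parsers.cfb` and `parsers.zip` are caller-supplied
// functions that take the raw bytes and return a parsed workbook.
function readWorkbook(buf, parsers) {
  switch (buf[0]) {
    case 0xd0: // compound file container: pre-2007 .xls
      return parsers.cfb(buf);
    case 0x50: // zip container: 2007+ .xlsx
      return parsers.zip(buf);
    default:   // anything else, including an empty buffer
      throw new Error("Unsupported file");
  }
}
```

Passing the parsers in keeps the sketch independent of any particular library; checking only the first byte is a shortcut, and comparing the full four-byte signature is stricter.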

@mgcrea
Author

mgcrea commented Aug 25, 2014

Thanks! Will do that.

I've deprecated my node-xlsx module by the way. Once again, awesome work. Thanks!
