can’t read sheetname of some xlsx file #276

Closed
lab37 opened this issue Oct 16, 2018 · 6 comments

Comments

@lab37

lab37 commented Oct 16, 2018

The sheet names inside this file can't be read. Strangely, this is because XML parsing fails.
临清.xlsx

No version of Go can read the sheet names of this file.

@xuri xuri added the confirmed This issue can be reproduced label Oct 16, 2018
@xuri xuri closed this as completed in 1c45425 Oct 16, 2018
@xuri
Member

xuri commented Oct 16, 2018

@lab37 Thanks for your issue. I have fixed it.

@xuri xuri removed the confirmed This issue can be reproduced label Oct 16, 2018
@lab37
Author

lab37 commented Oct 17, 2018

Good work! Processing is very fast. But when I tried to read this file, another problem arose: GetRows() returns wrong output. I looked into it, and it seems to be caused by sharedStringsReader() at line 252 of row.go; byte[ss] may be the culprit.
As the saying goes, push down one gourd and another bobs up: there is still a small problem.

@lab37
Author

lab37 commented Oct 17, 2018

PS: I have another file that may cause a memory leak. My computer froze when excelize read it. I know the file contains some useless forms that make it so big, but Office opens it correctly. So I think it should be treated as a normal document, and the problem lies with excelize.
file:

华润菏泽.zip

@lab37
Author

lab37 commented Oct 18, 2018

@xuri

xuri added a commit that referenced this issue Oct 18, 2018
@xuri
Member

xuri commented Oct 18, 2018

@lab37 The issue, caused by a missing conversion between the Transitional and Strict namespaces in sharedStringsReader(), has been fixed. I also tested opening the attached xlsx file in Excel: Excel for Mac version 16.17 (180909) opens it correctly, but Office 2007 Excel 12.0.4518.1014 on Windows cannot (error: file format is not valid). Analyzing the internal structure of the attached file shows a huge number of redundant XML tags in Sheet1. They greatly increase the memory needed to parse the sheet, and they make a worksheet with only 59 rows take up about 72.4MB of disk space. Which version of Excel generated this file? excelize doesn't currently implement streaming reads, so the memory used while parsing XML depends on the file content.

@lab37
Author

lab37 commented Oct 18, 2018

@xuri That explains a lot. Thank you very much.

2 participants