Invalid sheet xml (no <worksheet>) #202
Comments
Sorry, I messed it up ... Please have a look at these files: |
OK yes these are different now. Do you know or can you find out the name of the software that authors these? Have you tried opening them with other Excel-reading R packages? Because it would be informative to know which ones can/cannot open the original xlsx files. For the moment, I suspect you might have more immediate success with xlsx or XLConnect, which both wrap the Apache POI. Or keep opening and re-saving to use with readxl. |
I'm afraid I cannot share the software's name... But I think it would not help you because it comes with a certain equipment we use. Commercial and very specific. I've tried to open the files in "openxlsx", which ends up with a similar failure mode. |
Yeah, that's what I expected. There's obviously a set of atypical xlsx files that Apache POI (and Excel itself) can open that readxl (and openxlsx) currently cannot. I think it can be fixed but I doubt the fix is imminent. |
Ok. Thanks for the quick feedback! |
I have an identical issue, also with files produced by extremely niche commercial software. Since a fix is not on the horizon, and my project cannot rely on the Java dependencies of the packages that do work, I thought it might be useful to share my workaround. Below is a function that uses a VBScript to silently convert the troublesome XLSX files into CSV. It is not perfect: it depends on having Excel installed and is Windows-only. It will also leave a .vbs file in the directory of the original xlsx file. It works quickly enough for my purposes (1-2 seconds per CSV), but if you need it to be faster you can change to system(wait = F) and loop it. Hope this is helpful to someone.
|
@PPhilipp85 Different xml files would be more helpful to me. Can I see these files instead, for not-working and working?
|
Note to self: snippet of the <x:worksheet xmlns:r="http://schemas.openxmlformats.org/officeDocument/2006/relationships" xmlns:mc="http://schemas.openxmlformats.org/markup-compatibility/2006" xmlns:x14ac="http://schemas.microsoft.com/office/spreadsheetml/2009/9/ac" xmlns:x="http://schemas.openxmlformats.org/spreadsheetml/2006/main" mc:Ignorable="x14ac">
<x:dimension ref="A1:N6"/>
<x:sheetViews>
<x:sheetView workbookViewId="0"/>
</x:sheetViews>
<x:sheetFormatPr baseColWidth="10" defaultRowHeight="15" x14ac:dyDescent="0.25"/>
<x:sheetData>...</x:sheetData>
<x:autoFilter ref="A1:N1"/>
<x:pageMargins left="0.7" right="0.7" top="0.48" bottom="0.48" header="0.3" footer="0.3"/>
</x:worksheet> |
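A minimal sketch of why the `x:` prefix in the snippet above matters, written in Python purely for illustration. It is not readxl's code, and the thread does not show how readxl locates the element; `naive_has_worksheet` is a hypothetical stand-in for any lookup that matches the literal tag name, and the two sample documents are cut-down versions of the snippet. A namespace-aware parser sees the same root element in both spellings, while a literal match only finds the unprefixed one.

```python
import xml.etree.ElementTree as ET

MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"

# Cut-down worksheet with the prefixed spelling from the snippet above.
PREFIXED = (
    f'<x:worksheet xmlns:x="{MAIN_NS}">'
    '<x:dimension ref="A1:N6"/><x:sheetData/></x:worksheet>'
)

# The same document using a default namespace (illustrative only).
DEFAULT = (
    f'<worksheet xmlns="{MAIN_NS}">'
    '<dimension ref="A1:N6"/><sheetData/></worksheet>'
)


def naive_has_worksheet(xml_text):
    """Hypothetical prefix-sensitive check: looks for the literal tag text."""
    return "<worksheet" in xml_text


def root_is_worksheet(xml_text):
    """Namespace-aware check: compares the expanded {namespace}localname."""
    root = ET.fromstring(xml_text)
    return root.tag == f"{{{MAIN_NS}}}worksheet"


def dimension_ref(xml_text):
    """Read the ref attribute of <dimension>, whatever prefix the file uses."""
    root = ET.fromstring(xml_text)
    return root.find(f"{{{MAIN_NS}}}dimension").get("ref")


if __name__ == "__main__":
    for label, doc in (("prefixed", PREFIXED), ("default", DEFAULT)):
        print(label, naive_has_worksheet(doc), root_is_worksheet(doc), dimension_ref(doc))
```

Both spellings are equivalent XML, so a reader that resolves namespaces instead of matching prefixes should treat them identically.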
@jennybc Please see below the requested files. Please note that reading sheet1 also fails because it is not named sheet1.xml in the original not-working file. It is named sheet.xml instead. Strange ... but that's how it is. The following sheets are then sheet2.xml, sheet3.xml and so on. I also tried taking sheet1.xml from the re-saved working xlsx and inserting it into the original not-working xlsx. This also brings errors: |
Did you mean to close this? No, there are namespace problems. You've never provided a full workbook, but these namespace issues run throughout several xml files and affect. It's pretty difficult to do the necessary surgery, as you're attempting above, by hand. |
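On the `sheet.xml` versus `sheet1.xml` observation above: in the xlsx package format, a sheet's file name is not fixed. The workbook part lists each sheet with a relationship id, and the workbook's `.rels` part maps that id to a target path. The Python sketch below illustrates that lookup under stated assumptions: the workbook is assumed to live at `xl/workbook.xml` (the usual location, though strictly it is itself found through `_rels/.rels`), and `sheet_parts` and `make_demo_xlsx` are hypothetical helper names. The demo package is synthetic, not one of the files from this thread, and this is not how readxl is implemented.

```python
import io
import posixpath
import zipfile
import xml.etree.ElementTree as ET

MAIN_NS = "http://schemas.openxmlformats.org/spreadsheetml/2006/main"
DOC_REL_NS = "http://schemas.openxmlformats.org/officeDocument/2006/relationships"
PKG_REL_NS = "http://schemas.openxmlformats.org/package/2006/relationships"


def sheet_parts(xlsx):
    """Map each sheet name to its part path inside an open zipfile.ZipFile."""
    rels = ET.fromstring(xlsx.read("xl/_rels/workbook.xml.rels"))
    targets = {
        rel.get("Id"): rel.get("Target")
        for rel in rels.iter(f"{{{PKG_REL_NS}}}Relationship")
    }
    workbook = ET.fromstring(xlsx.read("xl/workbook.xml"))
    parts = {}
    for sheet in workbook.iter(f"{{{MAIN_NS}}}sheet"):
        target = targets[sheet.get(f"{{{DOC_REL_NS}}}id")]
        if target.startswith("/"):
            # Absolute targets are relative to the package root.
            path = target.lstrip("/")
        else:
            # Relative targets are resolved against the workbook's folder.
            path = posixpath.normpath(posixpath.join("xl", target))
        parts[sheet.get("name")] = path
    return parts


def make_demo_xlsx():
    """Build a tiny synthetic package whose first sheet is sheet.xml."""
    workbook = (
        f'<x:workbook xmlns:x="{MAIN_NS}" xmlns:r="{DOC_REL_NS}"><x:sheets>'
        '<x:sheet name="Data" sheetId="1" r:id="rId1"/>'
        '<x:sheet name="More" sheetId="2" r:id="rId2"/>'
        '</x:sheets></x:workbook>'
    )
    rel_type = f"{DOC_REL_NS}/worksheet"
    rels = (
        f'<Relationships xmlns="{PKG_REL_NS}">'
        f'<Relationship Id="rId1" Type="{rel_type}" Target="worksheets/sheet.xml"/>'
        f'<Relationship Id="rId2" Type="{rel_type}" Target="worksheets/sheet2.xml"/>'
        '</Relationships>'
    )
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("xl/workbook.xml", workbook)
        archive.writestr("xl/_rels/workbook.xml.rels", rels)
    return buffer.getvalue()


if __name__ == "__main__":
    with zipfile.ZipFile(io.BytesIO(make_demo_xlsx())) as demo:
        print(sheet_parts(demo))
```

A reader that follows the relationships this way does not care whether the first sheet is called `sheet.xml` or `sheet1.xml`. It also suggests why swapping a single sheet file by hand can still fail: the other parts of the package keep their original namespace spelling.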
Sorry... Closed it by mistake. |
I will see if I can provide a full workbook. |
@etrippler @PPhilipp85 I believe I have a fix for this in an unpushed branch. But some live, problematic workbooks from one or both of you would be very helpful to me. I'd be happy to just open them once, to verify my fix works, then delete. You could provide them to me privately, i.e. not via GitHub. |
Yes, I can provide it privately to you but need your contact details.
(Sent by email in reply to Jennifer (Jenny) Bryan's comment above, 2017-03-20 23:59 GMT+01:00.)
|
@PPhilipp85 You can email me at jenny@stat.ubc.ca. And/or you can try the current dev version out on some of your problematic sheets. I just merged the putative fix! I suppose if all is well, we are OK. But if not, I would appreciate a real example for more troubleshooting. |
Hi @jennybc, sorry for the late reply to your request. I tried out the dev version and it still doesn't appear to work for me.
I can still send you a problematic workbook via email if you are curious to check it out. |
@etrippler Yes please do send me a problematic workbook. |
@etrippler is sorted out now -- needed to install latest dev version. |
Hello,
software from one of my company's suppliers creates .xlsx files that readxl cannot read.
I get: Error: Invalid sheet xml (no <worksheet>)
If I open those files and re-save them with my Excel2007, read_excel works.
I can see that the xml structure differs between the original and the re-saved files. For confidentiality reasons, I cannot share the full file (I would need to erase some information and so on...), but I have attached one xml sheet file each for the original not-working version and the re-saved working version.
Xml_Sheet_Examples.zip
Hope you can find something out.
Many Thanks
Peter