Unable to open xls files: "libxls error: Unable to open file" #598
Comments
I get the same result as you with readxl. I wonder if the people who succeed with readxl have opened the file once with Excel and re-saved it 🤔 I need to come back here and see if I can read it with the stand-alone tool using libxls to go from xls to csv. That would rule out readxl as the problem. I suspect this is what I will find. |
Any chance that |
@jimhester |
@jennybc The user (@red_quark) on StackOverflow said the following when I asked whether he opened the file in Excel first, or changed it in any way:
|
I just tried opening the same file on Windows 7 and got the same Specs: |
The person who is able to open the file is running the following specs:
> sessionInfo()
R version 3.6.1 (2019-07-05)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)
Windows: Windows 10 Enterprise, vers. 1803, OS build 17134.1006
The only thing that catches my eye is that they are using an old version of |
I was seeing a similar error and wanted to share a short-term solution.
Yields (on some systems in my classroom but not others):
The temporary solution was to paste the URL of the xls file into Firefox and download it via the browser. Once this was done we could run the read_excel line without error. This was happening today on Windows 10, with R 3.6.2 and R Studio 1.2.5033. |
Thanks, @rtburg! |
On my machine (see below) I have to use
Linked to #583 ? Hope it helps.
|
Thanks for the suggestion, @gregleleu! Unfortunately, this makes no difference: using the full path, abbreviated path (being in the wd with the file), or using |
readxl reads the file in the link above in my machine, even without using path.expand... |
I really think this thread now has folks in it who are seeing at least 2 different phenomena. |
I am facing the same issue with read_xls. I am running R version 3.5.2 (2018-12-20) |
I downloaded the file and got the error |
Thank you for your reply. I think this solution works for me as well - opening the file in MS Excel and re-saving it (either as *.xls or *.xlsx) does solve the problem. However, the original file is one of hundreds that are automatically generated during a research data collection, and so not having to run this file through Excel would be highly desirable. |
@Brunox13 for sure, opening and resaving the file is not the ideal solution. Rather I wanted to test if resaving resolved the issue. As it stands, I suspect this is a problem with the specific file, rather than a general bug in Since you say you have hundreds of files---were these files generated from some other program? If so, I'm wondering if the program that generates these files has some bug or quirk in the way it builds the excel files that doesn't play nice with |
@Brunox13 Also maybe not an ideal solution, but you could write a simple C# program to open/save/close Excel files. Here's an example for a single file; it could easily be modified to, e.g., loop through files in a directory.
using System.IO;
using Excel = Microsoft.Office.Interop.Excel;
namespace resaver
{
    class Program
    {
        static void Main(string[] args)
        {
            // Resolve the path given on the command line
            string srcFile = Path.GetFullPath(args[0]);
            Excel.Application excelApplication = new Excel.Application();
            // Suppress Excel's confirmation dialogs
            excelApplication.Application.DisplayAlerts = false;
            // Open the workbook and re-save it in place
            Excel.Workbook srcworkBook = excelApplication.Workbooks.Open(srcFile);
            srcworkBook.Save();
            srcworkBook.Close();
            excelApplication.Quit();
        }
    }
}
I redownloaded the file and confirmed it would not open with |
This can also be achieved using Libreoffice CLI: loffice --convert-to xls --outdir some_folder some_file.xls |
This worked for me as well; I am able to read a local .xls file directly with |
I just opened the file once and saved it. As I'm using Linux, the problem was about the file extension: LibreOffice Calc was trying to convert it into .odt. Once I specified that I wanted to keep the file in the .xls extension, it stopped showing the error. In my opinion, you should try to open the file too. |
This approach has been suggested before - opening in MS Excel first also solves the problem. This solution, however, is not really feasible with the number of files that I work with (and I suppose others do as well). |
Is there a solution to this problem?
Attaching the rejected file. Note: the file if uploaded to google drive opens easily as a googlesheet. So there is no corruption in the file. |
I had the same exact problem with a bunch of .xls files and it turns out there was something wrong in all of them. I "solved" the problem by converting all files back to xls in batch mode using this command from LibreOffice:
/Applications/LibreOffice.app/Contents/MacOS/soffice --headless --convert-to xls --outdir your_favorite_outdir *.xls
For this to work on macOS you need to install LibreOffice: https://www.libreoffice.org/download/download/ |
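For anyone driving the batch conversion from R rather than a shell, here is a minimal sketch. It only builds the argument vector from the flags quoted above; `soffice_convert_args()` is a hypothetical helper, and the directory names in the usage comment are placeholders.

```r
# Sketch: build the argument vector for a headless LibreOffice conversion.
# soffice_convert_args() is a hypothetical helper; the flags are the ones
# quoted in the comment above. Paths are shell-quoted because system2()
# pastes its args into a single command line.
soffice_convert_args <- function(files, outdir, to = "xls") {
  c("--headless", "--convert-to", to, "--outdir", shQuote(outdir), shQuote(files))
}

# Usage (assumes `soffice` is on the PATH; on macOS use the full path above):
# files <- list.files("raw_data", pattern = "\\.xls$", full.names = TRUE)
# system2("soffice", soffice_convert_args(files, "resaved"))
```

Writing to a separate output directory keeps the original files untouched, which is probably what you want when the files come straight from a data-collection run.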
I had the same issue reading an "xlsx" file, I simply opened the file and saved it as an xls file and the problem was solved. |
@sanjmeh |
I was able to successfully read the .xls files that were giving me the same error using the read.xlsx function from the package xlsx |
I had the same error when opening .xls files, .xlsx works fine.
|
This is a total shot in the dark. I had the same error. It turns out the .xls files were incorrectly named, and were actually tab-delimited text files (who'd have thunk). Excel was OK opening them, but apparently read_excel not so much. On the rare chance you also have text files that are named .xls, try opening one in a text editor to confirm. |
If you want to test this theory that something is NOT actually Here's what this looks like when a tab-delimited file is "sold" to readxl as
library(readxl)
(tmp <- tempfile("readxl-can-check-magic-nunber-", fileext = ".xls"))
#> [1] "/tmp/RtmpUAwxda/readxl-can-check-magic-nunber-31e2659e970e.xls"
writeLines("X1\tX2\na\tb\n", tmp)
read.delim(tmp)
#> X1 X2
#> 1 a b
read_xls(tmp)
#> Error:
#> filepath: /private/tmp/RtmpUAwxda/readxl-can-check-magic-nunber-31e2659e970e.xls
#> libxls error: Unable to open file
readxl:::format_from_signature(tmp)
#> [1] NA
Created on 2021-04-23 by the reprex package (v2.0.0.9000) |
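The same check can be done in base R without touching readxl internals. This is a rough sketch, assuming the usual magic numbers: the OLE2 compound-file signature used by legacy .xls and the ZIP local-file header used by .xlsx. `sniff_excel_format()` is a hypothetical helper, not a readxl function.

```r
# Sketch: guess the real container format from the first bytes of a file.
# sniff_excel_format() is a hypothetical helper, not part of readxl.
sniff_excel_format <- function(path) {
  magic <- readBin(path, what = "raw", n = 8L)
  ole2 <- as.raw(c(0xD0, 0xCF, 0x11, 0xE0, 0xA1, 0xB1, 0x1A, 0xE1))  # legacy .xls
  zip  <- as.raw(c(0x50, 0x4B, 0x03, 0x04))                          # .xlsx is a ZIP
  if (length(magic) >= 8L && identical(magic, ole2)) {
    "xls"
  } else if (length(magic) >= 4L && identical(magic[1:4], zip)) {
    "xlsx"  # any ZIP archive matches here, not only xlsx
  } else {
    NA_character_
  }
}

# A tab-delimited file with a misleading .xls extension is not recognised
tmp <- tempfile(fileext = ".xls")
writeLines("X1\tX2\na\tb", tmp)
sniff_excel_format(tmp)
```

Running this over a directory of suspect files is a cheap way to separate "not really Excel" from "Excel, but something libxls cannot read".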
This thread is a real mix of different problems because unfortunately many distinct problems all lead to the same |
converting xls file to xlsx format did the job. |
You might be on to something. I had the same error, but it turns out the file was not Excel but rather plain text, despite the misleading "xls" extension. |
Can confirm. In my case, the mime type was txt but the file was saved as xls.
|
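If a batch of files may contain both real spreadsheets and mislabeled text files, one option is to try several readers in turn. The following is only a sketch: `read_with_fallback()` is a hypothetical helper, and the caller decides which readers to pass in.

```r
# Sketch: try several readers in order and return the first result that works.
# read_with_fallback() is a hypothetical helper; readers are supplied by the caller.
read_with_fallback <- function(path, readers) {
  for (name in names(readers)) {
    # Treat any error from a reader as "this reader cannot handle the file"
    result <- tryCatch(readers[[name]](path), error = function(e) NULL)
    if (!is.null(result)) {
      attr(result, "reader") <- name  # record which reader succeeded
      return(result)
    }
  }
  stop("No reader could parse: ", path)
}

# Usage (assumes readxl is installed; the file name is a placeholder):
# readers <- list(excel = readxl::read_excel, tsv = utils::read.delim)
# dat <- read_with_fallback("some_file.xls", readers)
# attr(dat, "reader")
```

Note that a permissive text reader such as `read.delim()` can "succeed" on files it does not really understand, so it is worth checking the `reader` attribute and the resulting columns.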
I just confirmed that the standalone xls2csv tool built from https://github.com/libxls/libxls/releases/tag/v1.6.2 cannot read the file provided by OP.
This thread became something of a mess and is presumably a mix of several different issues. But I'm closing it because the OP's file appears to be unreadable with libxls at this time. (And, yes, often 3rd party Excel-writing software creates weird, legacy |
Apparently I already did report this file a while ago: |
I had to save the file as *.xlsx and that worked for me. |
FYI, I had the same |
|
I think the fix you all are looking for is changing the mode in download.file() which will fix the temp file corruption issue. mode = 'wb' fixed the issue I had.
|
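The `mode = 'wb'` suggestion above can be sketched as a small wrapper. `download_binary()` is a hypothetical name and the URL in the usage comment is a placeholder. The underlying call is the standard `utils::download.file()`, whose default text mode can alter the bytes of a binary file on Windows.

```r
# Sketch: always download spreadsheets in binary mode before reading them.
# download_binary() is a hypothetical wrapper around utils::download.file().
download_binary <- function(url, destfile = tempfile(fileext = ".xls")) {
  # mode = "wb" writes the bytes unchanged instead of using text mode
  status <- utils::download.file(url, destfile, mode = "wb", quiet = TRUE)
  if (status != 0L) stop("Download failed with status ", status)
  destfile
}

# Usage (placeholder URL; assumes readxl is installed):
# path <- download_binary("https://example.com/data.xls")
# dat <- readxl::read_excel(path)
```

This would also be consistent with the earlier report that downloading the file through a browser made the error go away on some Windows machines.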
I am unable to load xls files (an example file can be downloaded here) using readxl and get the following error:
I have made sure that the file exists and is of the correct format
I have already discussed this on StackOverflow and it seems that others are able to read this file - but I cannot figure out how to alleviate the problem with my setup, nor how to better diagnose it. I am able to read the file using gdata::read.xls (though I'm hoping that I'd be able to do so faster with read_xls, as I need to open multiple files); read_excel gives me the same error. I have already tried uninstalling readxl, restarting RStudio, and installing readxl again - to no avail.
I am using the following:
Mac OS 10.15.1
RStudio 1.2.5019
R 3.6.1
readxl 1.3.1