File not found with xls2csv, segfault with readxl #14

Closed
jennybc opened this issue Apr 15, 2018 · 3 comments

Comments

@jennybc
Contributor

jennybc commented Apr 15, 2018

tidyverse/readxl#417

xls from here

https://www.seco.admin.ch/dam/seco/de/dokumente/Wirtschaft/Wirtschaftslage/VIP%20Quartalsschätzungen/qna_p_na.xls.download.xls/qna_p_na.xls

With xls2csv:

jenny@2015-mbp libxls-evanmiller-github $ ./xls2csv ~/Downloads/qna_p_na.xls
FILE: /Users/jenny/Downloads/qna_p_na.xls
File not found

With readxl:

> library(readxl)
> read_excel("~/Downloads/qna_p_na.xls")

 *** caught segfault ***
address 0x0, cause 'memory not mapped'

Traceback:
 1: .Call(`_readxl_read_xls_`, path, sheet_i, limits, shim, col_names,     col_types, na, trim_ws, guess_max)
 2: read_fun(path = path, sheet_i = sheet, limits = limits, shim = shim,     col_names = col_names, col_types = col_types, na = na, trim_ws = trim_ws,     guess_max = guess_max)
 3: tibble::as_tibble(read_fun(path = path, sheet_i = sheet, limits = limits,     shim = shim, col_names = col_names, col_types = col_types,     na = na, trim_ws = trim_ws, guess_max = guess_max), validate = FALSE)
 4: tibble::repair_names(tibble::as_tibble(read_fun(path = path,     sheet_i = sheet, limits = limits, shim = shim, col_names = col_names,     col_types = col_types, na = na, trim_ws = trim_ws, guess_max = guess_max),     validate = FALSE), prefix = "X", sep = "__")
 5: read_excel_(path = path, sheet = sheet, range = range, col_names = col_names,     col_types = col_types, na = na, trim_ws = trim_ws, skip = skip,     n_max = n_max, guess_max = guess_max, format = format)
 6: read_excel("~/Downloads/qna_p_na.xls")

@jennybc
Contributor Author

jennybc commented Apr 15, 2018

Hmmm ... rebuilding xls2csv with 6218a25, I still struggle to read this file:

jenny@2015-mbp libxls-evanmiller-github $ ./xls2csv ~/Downloads/qna_p_na.xls
FILE: /Users/jenny/Downloads/qna_p_na.xls
Error reading XLS file: Unable to allocate memory

I can read it with Excel fwiw. But I note that all the sheets appear to be protected.

@evanmiller
Collaborator

The file is failing because libxls thinks the shared string table (SST) is too large (1.8B entries). There's likely a bug in the code.

Did this file work in older versions of libxls?
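
To illustrate the kind of check that would turn a bogus SST size into a clean error rather than a huge allocation, here is a minimal Python sketch. It is not libxls code (libxls is written in C). It assumes the BIFF8 layout, in which the SST record payload starts with two little-endian 32-bit counts (total references, then unique strings), and that each string entry takes at least 3 bytes. The function names and the `bytes_available` parameter are hypothetical.

```python
import struct
from typing import Tuple

# Assumption: BIFF8 SST payload starts with two little-endian uint32 values:
# total string references, then the number of unique strings.
SST_HEADER = struct.Struct("<II")

# Assumption: even an empty string entry needs a 2-byte character count
# plus a 1-byte flags field.
MIN_STRING_ENTRY_BYTES = 3


def read_sst_counts(payload: bytes) -> Tuple[int, int]:
    """Return (total_refs, unique_strings) as declared in the SST payload."""
    if len(payload) < SST_HEADER.size:
        raise ValueError("SST payload is shorter than its 8-byte header")
    return SST_HEADER.unpack_from(payload, 0)


def sst_count_is_plausible(unique_strings: int, bytes_available: int) -> bool:
    """Check that the declared strings could fit in the bytes that follow."""
    return unique_strings * MIN_STRING_ENTRY_BYTES <= bytes_available


def checked_sst_count(payload: bytes, bytes_available: int) -> int:
    """Return the unique-string count, or raise instead of trusting it blindly."""
    _, unique_strings = read_sst_counts(payload)
    if not sst_count_is_plausible(unique_strings, bytes_available):
        raise ValueError(
            f"SST declares {unique_strings} strings, "
            f"but only {bytes_available} bytes are available"
        )
    return unique_strings
```

With a declared count of 1.8 billion, the strings alone would need more than 5 GB, so any ordinary-sized file fails this check before memory is requested.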

@jennybc
Contributor Author

jennybc commented Apr 15, 2018

> Did this file work in older versions of libxls?

No, it did not. The person who opened the issue suggests it can be read by a different R package, but there's a typo in the package name and AFAIK that package can only read xlsx. So hard to know what that actually means. Lots of xls problems go away if you open in Excel and resave 🙂 esp to xlsx.

We can let this one go.
