xlsx files not loading #51

Open
pstaabp opened this issue Feb 26, 2021 · 3 comments

pstaabp commented Feb 26, 2021

I just tried to load an xlsx file using the load function.

Evidently this is no longer supported, since under the hood it depends on the Python xlrd package:

There's a disclaimer on the website

pstaabp commented Feb 26, 2021

I did notice that #26 will obviously fix this.

@chris-b1

As a workaround, you can downgrade to the last 1.x release of xlrd:

using Conda
Conda.add("xlrd==1.2.0")

This could probably be pinned here:
https://github.com/queryverse/ExcelReaders.jl/blob/master/src/ExcelReaders.jl#L12

hhaensel commented Mar 31, 2021

If speed is a concern for large data files (as it is for me), you can gain roughly a factor of 2 by using pandas via PyCall:

EDIT: There is still an error in this function, sorry

using PyCall, DataFrames
pd = pyimport("pandas")

function read_excel(f; kwargs...)
  pdf = pd.read_excel(f; kwargs...)
  DataFrame(Any[pdf.values[:, i] for i in 1:size(pdf.values, 2)], Symbol.(pdf.columns))
end

Forcing the openpyxl engine, as recommended by xlrd, again gives worse performance than the pandas default:

julia> @time DataFrame(load(f, "Tabelle1"));
  0.149713 seconds (221.37 k allocations: 6.085 MiB, 12.28% gc time)

julia> @time read_excel(f);
  0.077299 seconds (1.12 k allocations: 2.093 MiB)

julia> @time read_excel(f, engine = "openpyxl");
  0.135302 seconds (1.13 k allocations: 2.094 MiB)
