ENH: Allow for passing in engine_kwargs to read_excel #43053

Closed
driscoll42 opened this issue Aug 15, 2021 · 6 comments
Labels
Enhancement, IO Excel (read_excel, to_excel)

Comments

@driscoll42

Is your feature request related to a problem?

While experimenting with openpyxl, I found that my files load faster with read_only=True, and I would like to be able to pass that argument through read_excel.

Describe the solution you'd like

Similar to ExcelWriter, I would like engine_kwargs added to read_excel so that engine-specific kwargs can be passed in.

API breaking implications

This shouldn't break the existing API.

Describe alternatives you've considered

Just use read_excel as is. That works fine, but I can load large Excel files more quickly by using engine-specific kwargs.

Additional context

I imagine it would look like pd.read_excel('example.xlsx', engine='openpyxl', engine_kwargs={'read_only': True}), for example.

@driscoll42 added the Enhancement and Needs Triage labels Aug 15, 2021
@simonjayhawkins added the IO Excel label and removed the Needs Triage label Aug 16, 2021
@simonjayhawkins added this to the Contributions Welcome milestone Aug 16, 2021
@rhshadrach
Member

Is there a case where we shouldn't be passing read_only=True?

@rhshadrach
Member

But it makes sense to allow passing kwargs through to the engine regardless.

@driscoll42
Author

I don't believe so. When I looked at the source code further after opening this ticket, I realized that read_only is already being passed along, together with data_only and keep_links, which probably gives all the speedup possible. I was looking at whether the convert_cell loop could be sped up at all, since in my test cases it accounted for 93% of the load time, but I couldn't see anything.
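
To illustrate how engine-specific kwargs could be layered on top of defaults the reader already forwards, here is a minimal sketch. It is not the pandas implementation: open_workbook and fake_loader are hypothetical names, and the default values for data_only and keep_links are assumptions made for the example.

```python
# Defaults of the kind the reader is described as already forwarding.
# The concrete values for data_only and keep_links are assumptions.
DEFAULT_ENGINE_KWARGS = {"read_only": True, "data_only": True, "keep_links": False}


def open_workbook(path, loader, engine_kwargs=None):
    """Hypothetical helper: call loader with defaults, letting the caller's
    engine_kwargs override any default with the same key."""
    merged = {**DEFAULT_ENGINE_KWARGS, **(engine_kwargs or {})}
    return loader(path, **merged)


def fake_loader(path, **kwargs):
    """Stand-in for an engine's load function; echoes back what it received."""
    return {"path": path, **kwargs}


# The caller overrides read_only; the other defaults are kept.
result = open_workbook("example.xlsx", fake_loader, {"read_only": False})
print(result)
```

Merging in this order keeps the existing behavior when no engine_kwargs are given, so a change of this shape would not alter results for current callers.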

@scotscotmcc

I'm happy to take this (or try to, at least).

@rmhowe425
Contributor

@mroeschke @rhshadrach Should this issue be closed now that #52214 has been merged?

@rhshadrach
Member

Thanks - agreed.

6 participants