-
-
Notifications
You must be signed in to change notification settings - Fork 18k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ENH: Adding table name parameter to pandas.read_excel #58464
Conversation
As far as I can tell this should be passing all of the tests. Some pre-commit hooks seem to not be passing in this PR but some make no sense, saying there is no attribute, yet there is. Tons of "not indexable errors", that are indexable, and were existing functionality prior to opening this so I'm not certain how it's failing. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the PR. This issue is tagged as Needs Discussion
, and I am hesitant to add this functionality to pandas' already complicated Excel reading code. This is especially true because, as the implementation currently stands, it's only supported by openpyxl.
That said, it seems to me the move of code in parse
to it's own method (currently - your parse_mutliindex
) is a good cleanup anyways (without the reading tables feature), and it would simplify the diff here. Would you consider doing that refactor as a pre-cursor?
I'm also curious as to why parse_multiindex
since (I think) there isn't necessarily a MultiIndex. I would suggest parse_sheet
instead.
Agreed with the hesitancy to implement this feature as of now. Going to close since I think more buy-in needs to happen in the issue first, but happy to have a separate PR with the cleanups |
I guess I'm a bit confused, as I had tried to get more of a discussion on the issue but could not get any response from you (the development team), only from other contributors. The functionality only adds complexity when the user wants to read in tables, and there are plenty of other functionality in the excel io area of pandas that only works with one of the engines. How does getting more buy-in happen? @mroeschke @rhshadrach |
Each feature adds a certain amount of technical debt - it increases the surface area of pandas requiring tests, documentation, dealing with new bug reports. It would be unmaintainable to accept all feature requests, therefore we must carefully evaluate what features we do accept.
This is something we would like to reduce. That said - I would take a look at the diff here after #58497 is merged. |
Thank you for the explanations, those certainly make sense. Please let me know any thoughts, issues, or improvements. I would like to understand the discussion as to if or how this functionality should look in pandas. |
For the future - we can reopen the PR. But let's continue this in #58500. |
closes ENH: pd.read_excel with table parameter #38937 (Replace xxxx with the GitHub issue number)
Tests added and passed if fixing a bug or adding a new feature
All code checks passed.
Added type annotations to new arguments/methods/functions.
Added an entry in the latest
doc/source/whatsnew/vX.X.X.rst
file if fixing a bug or adding a new feature.