ENH: per axis and per level indexing (orig GH6134) #6301
Would it be possible not to take code I (anyone for that matter) spent significant effort writing, Granted, OSS contributors come from diverse backgrounds and a varied, richly-textured
@y-p that was YOUR comment https://github.com/y-p/pandas/commit/120c4c513feb5318eacbcf1133c8cdadf4dd4bac
Thanks, much better.
cc @dragoljub if you could review this PR, would be gr8
This is a fantastic feature to add and has been long overdue. Thanks to y-p for the coding effort and Jeff for docs, discussion, etc. 👍 All the features look good. My major feedback would be to add an option to allow multilevel indexing to return the complete index depth (all levels) even if you select one specific level with only one value like this: Many times I find myself relying on a global indexing scheme that I would like to preserve regardless of the selection I make. This is especially true when I apply multivariate functions on groupby's, since I'm used to having the full index depth. Quick comments on the first warning: df.loc[(slice('A1','A3'),.....),:]
I think the examples I posted are slightly old. We could support something like df.loc(drop_level=True)[......], where drop_level will normally be False.
any further comments on the API?....I think this is mergeable cc @dragoljub, cc @nehalecky, cc @immerrr
cc @timcera
cc @aharoon123
@jreback, ha, I now get what you meant when you said [0] of the tuple. That's my mistake:
@immerrr fixed up....
Some questions: If you have a Series with a multi-index, would something like the following (without using the
I think this could be possible? Or does this make it too complex for users to know when they can and cannot use And you could also have something where you can specify on which axis you want to slice:
where this would be the same as
or as (with the IndexSlicer idea):
Although I think we should go with one 'preferred' way of doing this (not saying that only one could work, but just choose one to use consistently in the docs).
Your Series example will work (but will add as a test). This is the sort of ambiguity that yep...thinking about adding arguments to
I think those are the right way to do it, but more 'conveniences' than anything I defined only
or
(this is how it is in the doc example). I think
The following are now possible
side issue.. @jorisvandenbossche was thinking that the 'indexing and selecting data' section is getting too long....split off multiindex into another section?

.. code-block:: python

   df.loc[(slice('A1','A3'),.....,:]
a missing )? (also in the example below)
CLN: add comments in indexing code
CLN: comment out possibly stale kludge fix and wait for explosion
CLN: Mark if clause for handling of per-axis tuple indexing with loc
PERF: vectorize _spec_to_array_indices, for 3-4x speedup
PERF: remove no longer needed list conversion. 1.4x speedup
ENH: add core/indexing.py/_getitem_nested_tuple to handle the nested_tuple cases for partial multi-indexing
… a particular level
ENH: remove get_specs/specs_to_index -> replace with get_locs, to directly compute an indexer for a multi-level specification
TST: better error messages when levels are not sorted with core/index/get_locs
ENH: add boolean indexer support on per_axis/per_level
BUG: handle a multi-level indexed series passed like with a nested tuple of selectors e.g. something like: s.loc['A1':'A3',:,['C1','C3']]
DOC: release notes and issues for mi_slicing
…dex of differing levels (GH3738)
ENH: allow the axis keyword to short-circuit indexing
ENH: per axis and per level indexing (orig GH6134)
This is a reprise of #6134, with tests, and multi-axis support; it is dependent on #6299
closes #4036
closes #4116
closes #3057
closes #2598
closes #5641
closes #3738
This is the whatsnew/docs
MultiIndexing Using Slicers
In 0.14.0 we added a new way to slice multi-indexed objects. You can slice a multi-index by providing multiple indexers, one per level. Use slice(None) to select all the contents of a level. You do not need to specify all the deeper levels; they will be implied as slice(None). See the docs.
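A minimal sketch of the per-level selection described above. The frame, its level names, and its labels are made up for illustration; the index is sorted first because slicing by level expects a lexsorted index.

```python
import numpy as np
import pandas as pd

# Hypothetical four-level row index; names and labels are illustrative only.
index = pd.MultiIndex.from_product(
    [["A0", "A1", "A2", "A3"], ["B0", "B1"], ["C0", "C1", "C2", "C3"], ["D0", "D1"]],
    names=["A", "B", "C", "D"],
)
df = pd.DataFrame({"x": np.arange(len(index))}, index=index).sort_index()

# One indexer per level: a label slice on A, everything on B,
# a list of labels on C. Level D is omitted, so it is implied as slice(None).
subset = df.loc[(slice("A1", "A3"), slice(None), ["C1", "C3"]), :]
print(subset.shape)  # (24, 1): 3 A labels x 2 B labels x 2 C labels x 2 D labels
```

The trailing `:` selects all columns and makes it explicit that the tuple addresses the row axis.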
It is possible to perform quite complicated selections using this method on multiple axes at the same time.
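As a rough illustration of selecting on both axes at once, the sketch below builds a small frame with a multi-index on rows and columns. All names (`rows`, `cols`, `both`, the level labels) are assumptions for the example, not taken from the PR.

```python
import numpy as np
import pandas as pd

# Hypothetical two-level index on each axis.
rows = pd.MultiIndex.from_product([["A0", "A1", "A2"], ["B0", "B1"]], names=["A", "B"])
cols = pd.MultiIndex.from_product([["a", "b"], ["foo", "bar"]], names=["lvl0", "lvl1"])
data = np.arange(len(rows) * len(cols)).reshape(len(rows), len(cols))

# Sort both axes so that per-level slicing is allowed.
df = pd.DataFrame(data, index=rows, columns=cols).sort_index().sort_index(axis=1)

# First tuple addresses the row levels, second tuple the column levels.
both = df.loc[(slice("A1", "A2"), ["B1"]), (slice(None), ["foo"])]
print(both.shape)  # (2, 2)
```

Each axis gets its own tuple of per-level indexers, so the row and column selections stay independent of each other.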
Furthermore, you can set the values using these methods.
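A small sketch of assignment through a per-level selection, using `pd.IndexSlice` as shorthand for the nested tuple of slices. The frame and its labels are hypothetical.

```python
import numpy as np
import pandas as pd

# Hypothetical two-level row index with two plain columns.
rows = pd.MultiIndex.from_product([["A0", "A1", "A2"], ["B0", "B1"]], names=["A", "B"])
df = pd.DataFrame(np.arange(12).reshape(6, 2), index=rows, columns=["x", "y"]).sort_index()

idx = pd.IndexSlice
# Every row whose level B label is "B1" is overwritten; other rows are untouched.
df.loc[idx[:, ["B1"]], :] = -10
print(df)
```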
You can use a right-hand-side of an alignable object as well.
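For instance, the right-hand side can be a whole DataFrame, which is aligned on its labels so that only the selected rows receive new values. This is a sketch on a made-up frame.

```python
import numpy as np
import pandas as pd

# Hypothetical two-level row index with two plain columns.
rows = pd.MultiIndex.from_product([["A0", "A1", "A2"], ["B0", "B1"]], names=["A", "B"])
df = pd.DataFrame(np.arange(12).reshape(6, 2), index=rows, columns=["x", "y"]).sort_index()

idx = pd.IndexSlice
# The right-hand side covers the full frame; alignment picks out the
# rows that match the selection (level B == "B1").
df.loc[idx[:, ["B1"]], :] = df * 1000
print(df)
```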