Feature request: purely int-index based version of .ix #1052

Closed
tkf opened this Issue Apr 14, 2012 · 5 comments

6 participants
Contributor

tkf commented Apr 14, 2012

First of all, thank you very much for the great library! I really like the scatter plot in the latest release. I am very new to the library, but I am enjoying it very much.

However, I am really uneasy about the fallback behavior of the .ix attribute. Why can't we have a purely label-based version of .ix and a purely integer-index version of .ix? I want pandas to raise an error when a data frame does not have the integer label I pass, rather than fall back to integer-position indexing, because "explicit is better than implicit".

I can write a pull request if it is likely to be pulled. Or maybe you should write it because it is a big API change.

Owner

wesm commented Apr 14, 2012

hi-- I think http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#api-changes-to-integer-indexing has the answer to your question. If you have an integer index and pass an integer that is not contained in it, a KeyError will be raised. Or is that not what you're saying?

Contributor

tkf commented Apr 14, 2012

What I was saying about the KeyError was just an example... which was not correct! Your answer makes me a little bit relieved. Thanks for that.

The point (from another direction) is that I cannot say what df.ix[1:2, 3:4] does until I find out the types of the index and columns of the data frame df. I believe .ix is carefully designed so that it won't mix up labels and positions (like correctly raising KeyError in the case above). But I still think having different attributes for doing different things (specifying a position or specifying a label) is better. I mean, why not? It just removes the mental load of having to care about the type of the index.

Regarding implementation, you can just add two attributes, move the functionality of .ix into them, and call them internally from .ix. Or maybe change what .ix does completely. It is a design decision, so I cannot say much; all I want to say is that I think it won't be too hard because you just need to split up one function into two.
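
To make the "two attributes" idea concrete, here is a minimal sketch in plain Python. It is not pandas code: `ToySeries`, `by_label`, and `by_pos` are hypothetical names invented for illustration, and the toy container only handles scalar keys. It shows the requested behavior, where the label accessor raises KeyError instead of falling back to positions.

```python
class _LabelIndexer:
    """Label-only lookup: never falls back to positions."""

    def __init__(self, owner):
        self._owner = owner

    def __getitem__(self, label):
        try:
            pos = self._owner.labels.index(label)
        except ValueError:
            # Unknown label: fail loudly instead of guessing a position.
            raise KeyError(label) from None
        return self._owner.values[pos]


class _PositionIndexer:
    """Position-only lookup: labels are ignored entirely."""

    def __init__(self, owner):
        self._owner = owner

    def __getitem__(self, pos):
        if not isinstance(pos, int):
            raise TypeError("position must be an int")
        return self._owner.values[pos]


class ToySeries:
    """Hypothetical 1-D container with two explicit accessors."""

    def __init__(self, values, labels):
        if len(values) != len(labels):
            raise ValueError("values and labels must have the same length")
        self.values = list(values)
        self.labels = list(labels)

    @property
    def by_label(self):
        return _LabelIndexer(self)

    @property
    def by_pos(self):
        return _PositionIndexer(self)


s = ToySeries(["a", "b", "c"], labels=[2, 0, 1])
first_by_label = s.by_label[0]  # value whose label is 0
first_by_pos = s.by_pos[0]      # value stored at position 0
```

With integer labels in a shuffled order, `s.by_label[0]` and `s.by_pos[0]` return different values, and the reader can tell which one is meant without knowing the index type.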

njsmith commented Jan 3, 2013

iget/irow/icol are nice, but it would be even better to have a .ix-like interface. Advantages:

  • Uniform interface across all pandas objects (and generalizable to NDFrame)
  • Allows slice syntax. The current functions accept slices, but you have to construct the objects by hand (!)
  • Allows getting and setting via the same interface (right now there is no interface for assignment!)
  • Allows simultaneous multidimensional selection (df.by_loc[i, j] versus df.icol(j).iget(i))
  • More consistent with numpy...
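
The points about slice syntax, assignment, and multidimensional selection can be sketched with a toy positional accessor. This is plain Python, not pandas: `ToyFrame` and its `by_loc` attribute are hypothetical (the name is borrowed from the suggestion in this comment), and assignment only broadcasts a single scalar.

```python
def _positions(key, n):
    """Turn an int or slice into a list of valid positions along one axis."""
    if isinstance(key, slice):
        return list(range(*key.indices(n)))
    if isinstance(key, int):
        if key < 0:
            key += n
        if not 0 <= key < n:
            raise IndexError("position out of range")
        return [key]
    raise TypeError("positions must be ints or slices")


class _ByLoc:
    """Purely positional 2-D accessor supporting get and set."""

    def __init__(self, data):
        self._data = data

    def _resolve(self, key):
        i, j = key  # df.by_loc[i, j] arrives here as the tuple (i, j)
        n_rows = len(self._data)
        n_cols = len(self._data[0]) if self._data else 0
        return _positions(i, n_rows), _positions(j, n_cols)

    def __getitem__(self, key):
        rows, cols = self._resolve(key)
        block = [[self._data[r][c] for c in cols] for r in rows]
        i, j = key
        if isinstance(i, int) and isinstance(j, int):
            return block[0][0]  # scalar for a pair of ints
        return block            # nested list otherwise

    def __setitem__(self, key, value):
        rows, cols = self._resolve(key)
        for r in rows:
            for c in cols:
                self._data[r][c] = value  # scalar broadcast only


class ToyFrame:
    """Hypothetical 2-D container; rows are lists of equal length."""

    def __init__(self, rows):
        self.data = [list(row) for row in rows]

    @property
    def by_loc(self):
        return _ByLoc(self.data)


f = ToyFrame([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
cell = f.by_loc[1, 2]                         # one call, both axes
block = f.by_loc[0:2, 1:3]                    # slice syntax inside brackets
by_hand = f.by_loc[slice(0, 2), slice(1, 3)]  # same selection, built by hand
f.by_loc[0:2, 0] = 0                          # assignment through the same accessor
```

Inside square brackets Python builds the `slice` objects itself, which is why a bracket-based accessor avoids the hand-constructed `slice(0, 2)` form that function-style selectors require.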

What to call it? There are too many names with i in them already, and anyway the word 'index' isn't actually useful here, because the key thing about this interface is that it uses a different kind of indexing than the ordinary indexing, so the name should reflect that. by_loc? by_offset?

Contributor

changhiskhan commented Jan 3, 2013

Agreed. Definitely pretty high up on the todo list, just a matter of finding time to build it.
As for naming, we were throwing around iix/pix/oix (purely integer/positional/ordinal indexing) and lix (purely label-based indexing) before. iix and lix don't pair well (they look too much alike in smaller fonts), but pix and oix would be fine. by_loc and by_label would be fine too, though a bit more typing is involved.

Contributor

janschulz commented Mar 9, 2013

This can probably be closed (by #2922)?

jreback closed this Mar 9, 2013
