API: add 'level' kwarg to 'Index.isin' method #7892
Also, probably related to #3268
jreback added the API Design, Enhancement, Indexing, MultiIndex labels on Jul 31, 2014
jreback added this to the 0.15.0 milestone on Jul 31, 2014
jreback commented on the diff on Jul 31, 2014
@@ -687,13 +687,29 @@ def _engine(self):
         # property, for now, slow to look up
         return self._engine_type(lambda: self.values, len(self))
 
+    def _validate_index_level(self, level):
+        """
+        Validate index level.
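As a rough illustration of what validating a level on a single-level index could look like, here is a minimal standalone sketch. It is not the code from this PR: the function name `validate_flat_index_level` and its `name` argument are hypothetical, and raising `IndexError` for out-of-range integers is an assumption of the sketch. The `KeyError` for a non-matching name follows the PR description.

```python
def validate_flat_index_level(level, name=None):
    """Hypothetical stand-in for level validation on a single-level index.

    Not the pandas implementation; `name` plays the role of `self.name`.
    """
    if level is None:
        return  # no level requested, nothing to validate
    if isinstance(level, int):
        # A flat index has exactly one level, addressable as 0 or -1.
        # Assumption of this sketch: other integers raise IndexError.
        if level not in (0, -1):
            raise IndexError("index has only 1 level, got level=%d" % level)
        return
    if level != name:
        # Per the PR description: a non-matching name is a KeyError.
        raise KeyError("level %r does not match index name %r" % (level, name))


# Accepted spellings for a flat index named "letters".
validate_flat_index_level(None)
validate_flat_index_level(0)
validate_flat_index_level(-1)
validate_flat_index_level("letters", name="letters")

# A name that does not match is rejected.
try:
    validate_flat_index_level("digits", name="letters")
except KeyError as exc:
    print("rejected:", exc)
```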
jreback
Contributor
Performance patch is in, here's a synthetic benchmark:
In [7]: idx = pd.MultiIndex.from_product([np.arange(10000), ['a', 'b', 'c']])
In [8]: timeit idx.isin(['a'], level=1)
10000 loops, best of 3: 104 µs per loop
In [9]: timeit idx.get_level_values(1).isin(['a'])
1000 loops, best of 3: 1.37 ms per loop
In [10]: timeit idx.isin([1], level=0)
1000 loops, best of 3: 1.02 ms per loop
In [11]: timeit idx.get_level_values(0).isin([1])
100 loops, best of 3: 2.84 ms per loop
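The two spellings timed above are expected to produce the same boolean mask, with only the speed differing. A small check along these lines, using a smaller index than the benchmark and assuming a pandas version that supports the `level` keyword, might look like:

```python
import numpy as np
import pandas as pd

# Smaller than the benchmark index, same shape of data.
idx = pd.MultiIndex.from_product([np.arange(1000), ["a", "b", "c"]])

# New spelling: test membership directly against one level.
fast = idx.isin(["a"], level=1)

# Old spelling: materialize the level values first, then test membership.
slow = idx.get_level_values(1).isin(["a"])

# Both should select exactly the rows whose second level is "a".
assert np.array_equal(fast, slow)
print(int(fast.sum()), "of", len(idx), "entries match")
```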
jreback and 1 other commented on an outdated diff on Jul 31, 2014
+        passed set of values
+
+        Parameters
+        ----------
+        values : set or sequence of values
+        level : int or level name
+
+        Returns
+        -------
+        is_contained : ndarray (boolean dtype)
+        """
+
+        if level is None:
+            return lib.ismember(self._array_values(), set(values))
+        else:
+            num = self._get_level_number(level)
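To make the two branches in this excerpt concrete, the following sketch shows how the public `MultiIndex.isin` method behaves with and without `level`. The index contents and the level names `num` and `letter` are made up for illustration, and the snippet goes through the public API only, not the internal helpers in the diff.

```python
import pandas as pd

# Illustrative two-level index; the names are assumptions of this example.
mi = pd.MultiIndex.from_arrays(
    [[0, 1, 2], ["x", "y", "z"]], names=["num", "letter"]
)

# level=None (the default): values are compared against whole tuples.
whole = mi.isin([(0, "x"), (2, "z")])

# level given: values are compared against that level only.
by_position = mi.isin(["y"], level=1)
by_negative = mi.isin(["y"], level=-1)    # negative positions count from the end
by_name = mi.isin(["y"], level="letter")  # level names work as well

print(whole.tolist())
print(by_position.tolist())
```

All three level spellings refer to the same level here, so they should give the same mask.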
jreback
Contributor
Looks good. Add docs and then it can merge. FYI, if you want to revise/update the isin section, great! (I would also make it a top-level section, rather than below boolean indexing.)

Re: docs, anything other than what I've already added?

I think it makes sense to combine those 2 sections (isin of Index objects and isin of PandasObject), as they are really the same (and it is confusing to look in 2 sections). http://pandas-docs.github.io/pandas-docs-travis/indexing.html#indexing-with-isin I would simply change the isin section (the one I am pointing to), make it top-level (e.g. same as Boolean Indexing), then put the examples there.

I'm a bit concerned about introducing functionality of classes that themselves have not yet been described, because Although, I see some of that happening for docs on

I agree with all of that. Index/MultiIndex should have an intro at the top of indexing (side issue), but I think splitting isin up (as it is now) is a bigger problem, so I would consolidate and put it as a big bullet after boolean indexing. Feel free to do a short intro as well if you would like (for Index/MI).
jorisvandenbossche and 2 others commented on an outdated diff on Aug 3, 2014
@@ -2157,12 +2178,17 @@ def isin(self, values):
         Parameters
         ----------
         values : set or sequence of values
+        level : {0, None}
+
+        `level` argument is provided for compatibility with MultiIndex.
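On a flat index the `level` argument does not change the result; it only has to name the single level that exists. A short usage sketch, assuming a pandas version with this keyword and an illustrative index named `letters`:

```python
import pandas as pd

# Illustrative flat index; the name "letters" is an assumption of this example.
idx = pd.Index(["a", "b", "c"], name="letters")

plain = idx.isin(["a", "c"])                      # no level given
by_zero = idx.isin(["a", "c"], level=0)           # the only positional level
by_name = idx.isin(["a", "c"], level="letters")   # the index name

# A level name that does not match the index name is rejected.
try:
    idx.isin(["a"], level="digits")
    rejected = False
except KeyError:
    rejected = True

print(plain.tolist(), rejected)
```

Accepting `level` here lets the same call be written against `Index` and `MultiIndex` alike.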
jorisvandenbossche
Owner
looks good... @jorisvandenbossche anything else?

no, looking good!
jreback added a commit that referenced this pull request on Aug 4, 2014
jreback 0646ad5
jreback merged commit 0646ad5 into pandas-dev:master on Aug 4, 2014
1 check passed
Thanks. Let's check the docs after they're built and review them for correctness.
immerrr commented Jul 31, 2014
closes #7890
This PR adds `level` kwarg for `Index` objects as discussed (briefly in #7890):
Summary
- for non-`MultiIndex` classes:
  - valid `level` values are `None`, `0`, `-1` and `self.name`
  - if `level` doesn't match the name, it's a `KeyError` (was `AssertionError` before)
- for `MultiIndex` classes:
  - with `level=None`, elements of `self.values` (tuples) are used
  - valid `level` values are the same as in the `MultiIndex.get_level_values` method: `-self.nlevels..(self.nlevels - 1)` plus all unique index names
TODO