Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PERF: do not check for label presence preventively #21594

Merged
merged 2 commits into from
Jun 25, 2018

Conversation

toobaz
Copy link
Member

@toobaz toobaz commented Jun 22, 2018

ASV run:

       before           after         ratio
     [f1ffc5fa]       [0af738b0]
-         433±7μs          390±5μs     0.90  indexing.MultiIndexing.time_series_ix
-      79.0±0.9μs       53.1±0.8μs     0.67  indexing.IntervalIndexing.time_loc_scalar
-      66.3±0.8μs         42.7±2μs     0.64  indexing.NumericSeriesIndexing.time_loc_scalar(<class 'pandas.core.indexes.numeric.Int64Index'>)
-      86.9±0.4μs       54.7±0.2μs     0.63  indexing.NumericSeriesIndexing.time_loc_scalar(<class 'pandas.core.indexes.numeric.Float64Index'>)

SOME BENCHMARKS HAVE CHANGED SIGNIFICANTLY.

We don't check anymore if errors raised in indexing have to do with ordering different types (in Python 3)... but I can't think of any possible case (and none is present in the tests) in which looking for a single label could create a sorting problem.

@toobaz toobaz added Refactor Internal refactoring of code Indexing Related to indexing on series/frames, not to indexes themselves Performance Memory or execution speed performance labels Jun 22, 2018
@@ -198,6 +199,7 @@ Indexing
^^^^^^^^

- The traceback from a ``KeyError`` when asking ``.loc`` for a single missing label is now shorter and more clear (:issue:`21557`)
- When ``.ix`` is asked for a missing integer label in a :class:`MultiIndex` with a first level of integer type, it now raises a ``KeyError``, consistently with the case of a flat :class:``Int64Index`` (:issue:`21593`)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Extra backticks in :class:``Int64Index``

@codecov
Copy link

codecov bot commented Jun 22, 2018

Codecov Report

Merging #21594 into master will increase coverage by <.01%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master   #21594      +/-   ##
==========================================
+ Coverage    91.9%    91.9%   +<.01%     
==========================================
  Files         153      153              
  Lines       49547    49530      -17     
==========================================
- Hits        45537    45523      -14     
+ Misses       4010     4007       -3
Flag Coverage Δ
#multiple 90.3% <100%> (ø) ⬆️
#single 41.79% <50%> (+0.01%) ⬆️
Impacted Files Coverage Δ
pandas/core/indexing.py 93.73% <100%> (+0.36%) ⬆️
pandas/core/indexes/category.py 97.01% <0%> (-0.28%) ⬇️
pandas/core/indexes/multi.py 94.88% <0%> (-0.09%) ⬇️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f1aa08c...0b499c5. Read the comment docs.

Copy link
Contributor

@jreback jreback left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sufficient asv's here? small comments, otherwise lgtm.

@@ -130,6 +130,7 @@ Performance Improvements

- Improved performance of :func:`Series.describe` in case of numeric dtpyes (:issue:`21274`)
- Improved performance of :func:`pandas.core.groupby.GroupBy.rank` when dealing with tied rankings (:issue:`21237`)
- Improved performance of label indexing when key is a single label (:issue:`21594`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

mention / reference .loc/.at here

@@ -230,7 +230,7 @@ def test_iloc_getitem_multiindex(self):
# corner column
rs = mi_int.iloc[2, 2]
with catch_warnings(record=True):
xp = mi_int.ix[:, 2].ix[2]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not that we care too much, but is this case looks slightly different from below (where testing for .ix)?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, because as long as first level is non-int, .ix is fine. Adding a comment.

@jreback jreback added this to the 0.24.0 milestone Jun 22, 2018
@@ -198,6 +199,7 @@ Indexing
^^^^^^^^

- The traceback from a ``KeyError`` when asking ``.loc`` for a single missing label is now shorter and more clear (:issue:`21557`)
- When ``.ix`` is asked for a missing integer label in a :class:`MultiIndex` with a first level of integer type, it now raises a ``KeyError``, consistently with the case of a flat :class:``Int64Index`` (:issue:`21593`)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what did this raise before? (or did it work?)

@toobaz
Copy link
Member Author

toobaz commented Jun 24, 2018

@jreback added a couple of comments, ready for me

@jreback jreback merged commit 51d548d into pandas-dev:master Jun 25, 2018
@jreback
Copy link
Contributor

jreback commented Jun 25, 2018

thanks @toobaz nice code removal! keep em coming!

@toobaz toobaz deleted the no_pre_lookup branch July 8, 2018 08:22
Sup3rGeo pushed a commit to Sup3rGeo/pandas that referenced this pull request Oct 1, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Indexing Related to indexing on series/frames, not to indexes themselves Performance Memory or execution speed performance Refactor Internal refactoring of code
Projects
None yet
Development

Successfully merging this pull request may close these issues.

.ix[int_number] falls back to positional indexing with MultiIndex with integers on first level
3 participants