Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PERF: performance improvement in MultiIndex.sortlevel #9445

Merged
merged 1 commit into from
Feb 16, 2015

Conversation

behzadnouri
Copy link
Contributor

-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------
multiindex_sortlevel_int64                   | 664.1977 | 2443.2590 |   0.2718 |
-------------------------------------------------------------------------------
Test name                                    | head[ms] | base[ms] |  ratio   |
-------------------------------------------------------------------------------

Ratio < 1.0 means the target commit is faster then the baseline.
Seed used: 1234

Target [dd31bae] : performance improvement in MultiIndex.sortlevel
Base   [c4a996a] : Merge pull request #9411 from dashesy/add_sql_test

test sql table name

comp_ids, obs_ids = _compress_group_index(group_index)
max_group = len(obs_ids)
if not compress:
ngroups = (ids.size and ids.max()) + 1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Both idx.size and ids.max() are integers, yes?

I find using and for integers rather confusing. Perhaps this could be: (idx.max() if ids.size else 0) + 1

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

boolean-operations:

The expression x and y first evaluates x; if x is false, its value is returned; otherwise, y is evaluated and the resulting value is returned.

note that the documentation follows with an example of s or 'foo', i.e. oring two strings, as a use-case of boolean operations.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@behzadnouri I agree with @shoyer here. We get the idiom, but I don't think a casual glance is intuitve here. Pls change to what @shoyer suggests.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i gave an example from python's documentation that this is an actual use-case for boolean operations:

This is sometimes useful, e.g., if s is a string that should be replaced by a default value if it is empty, the expression s or 'foo' yields the desired value.

this is right off from python's documentation, and i have seen it used a lot.

that said, if you like to change it, please do.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I get that it is from python docs
but not all of that idioms are useful
in the context of assignment with a value that could be None I would agree with you
but these are integers and it's not obvious at all

pls make the change

@jreback jreback added Performance Memory or execution speed performance MultiIndex labels Feb 10, 2015
@jreback jreback added this to the 0.16.0 milestone Feb 10, 2015
@jreback
Copy link
Contributor

jreback commented Feb 10, 2015

looks good otherwise, ping when green after that change.

jreback added a commit that referenced this pull request Feb 16, 2015
PERF: performance improvement in MultiIndex.sortlevel
@jreback jreback merged commit b714262 into pandas-dev:master Feb 16, 2015
@jreback
Copy link
Contributor

jreback commented Feb 16, 2015

@behzadnouri thanks!

@behzadnouri behzadnouri deleted the i8sort branch March 5, 2015 12:48
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
MultiIndex Performance Memory or execution speed performance
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants