PERF: tab completion with a large index #18587

Open
jreback opened this Issue Dec 1, 2017 · 6 comments

@jreback
Contributor

commented Dec 1, 2017

from #16326 (comment)

If you have a very large index, _dir_additions (for tab completion) actually takes quite a bit of time.

So what I would do is: if the index has, say, fewer than 100 entries, use the current _dir_additions; otherwise return an empty list (it's essentially too big to use tab completion for anyhow). Can you make this change and add an asv benchmark for it? (This could be a separate PR as well.)
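
A rough sketch of the size cutoff described above, using a toy class rather than the real pandas internals. The class name `Labeled`, the attribute `_labels`, and the constant `MAX_LABELS_FOR_COMPLETION` are hypothetical and only stand in for a pandas object and its index.

```python
class Labeled:
    """Toy stand-in for a pandas object; not the actual pandas implementation."""

    # Hypothetical cutoff, taken from the "say < 100" suggestion above.
    MAX_LABELS_FOR_COMPLETION = 100

    def __init__(self, labels):
        self._labels = list(labels)

    def _dir_additions(self):
        # Too many labels: offer none of them for tab completion.
        if len(self._labels) >= self.MAX_LABELS_FOR_COMPLETION:
            return set()
        # Otherwise expose only labels that are valid Python identifiers.
        return {c for c in self._labels if isinstance(c, str) and c.isidentifier()}

    def __dir__(self):
        # Regular attributes plus the label-based additions.
        return sorted(set(super().__dir__()) | self._dir_additions())


small = Labeled(["a", "b c", 1])
big = Labeled(f"k{i}" for i in range(1000))

print("a" in dir(small))   # True: short index, valid identifier
print("b c" in dir(small)) # False: not a valid identifier
print("k0" in dir(big))    # False: index is over the cutoff
```

The cutoff trades completeness for responsiveness: objects with many labels simply stop advertising them to the completer.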

@jreback jreback added this to the 0.21.1 milestone Dec 1, 2017

@jreback

Contributor Author

commented Dec 1, 2017

@BibMartin

Contributor

commented Dec 1, 2017

One may have a very large index with few distinct values. I would suggest limiting the number of values returned rather than the size of the index. (It seems that the delay is due to the handling of the results rather than the computation of dir.)
Something like:

additions = set([c for c in self._info_axis.get_level_values(0).unique()[:100]
                 if isinstance(c, string_types) and isidentifier(c)])

Anyway, I think I can address this issue in #16326; the topics are quite related.
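
For illustration only, the same idea can be sketched in plain Python without pandas. The function name `capped_dir_additions` and its `limit` parameter are hypothetical. As in the snippet above, the cap is applied to the distinct values before filtering, so fewer than `limit` names may come back.

```python
from itertools import islice


def capped_dir_additions(labels, limit=100):
    """Return at most `limit` distinct labels that are valid identifiers."""
    # dict.fromkeys keeps the first occurrence of each label, in order.
    distinct = dict.fromkeys(labels)
    # Look only at the first `limit` distinct labels, however long the input is.
    first = islice(distinct, limit)
    return {c for c in first if isinstance(c, str) and c.isidentifier()}


# A long index with only two distinct values keeps both of them.
repeated = ["a", "b"] * 50000
print(capped_dir_additions(repeated) == {"a", "b"})  # True

# A long index with many distinct values is capped at `limit` names.
many = [f"k{i}" for i in range(1000)]
print(len(capped_dir_additions(many)))  # 100
```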

@TomAugspurger

Contributor

commented Dec 1, 2017

Do we know why _dir_additions is slow for large objects?

@jreback

Contributor Author

commented Dec 2, 2017

you can use self._info_axis.unique(level=0) here as a generic way to do this.
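
A small illustration of that call on public objects, assuming a pandas version in which `Index.unique` accepts the `level` argument. The sample labels are made up. The same call works on a flat Index and on a MultiIndex.

```python
import pandas as pd

# Sample data invented for this example.
flat = pd.Index(["a", "b", "a", "c"])
multi = pd.MultiIndex.from_tuples([("x", 1), ("x", 2), ("y", 1)])

# On a flat Index, level 0 is the index itself.
flat_unique = list(flat.unique(level=0))

# On a MultiIndex, level 0 is the outermost level.
multi_unique = list(multi.unique(level=0))

print(flat_unique)
print(multi_unique)
```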

@jreback jreback modified the milestones: 0.21.1, Next Major Release Dec 2, 2017

@BibMartin

Contributor

commented Dec 5, 2017

@TomAugspurger

Do we know why _dir_additions is slow for large objects?

I don't know exactly, but the slowdown seems to come from the user interface: when I create a large Series (s = Series(index=tm.makeStringIndex(10000))) in a notebook or in an IPython console, dir(s) is fast (much less than 1 sec), while tab completion is slow (several seconds).
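
One way to check the dir(s) half of that observation is to time it directly. This is only a sketch: it builds the index from plain generated strings instead of tm.makeStringIndex (which may not be available in every pandas version), and the timing will vary by machine and pandas version. It does not measure the front end's completion machinery.

```python
import time

import pandas as pd

# 10000 string labels, generated here rather than with tm.makeStringIndex.
labels = [f"label_{i}" for i in range(10000)]
s = pd.Series(0.0, index=labels)

start = time.perf_counter()
names = dir(s)
elapsed = time.perf_counter() - start

print(f"dir(s) returned {len(names)} names in {elapsed:.4f} s")
```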

@jreback

you can use self._info_axis.unique(level=0) here as a generic way to do this.

Yes thanks, that's an awesome new feature.

@TomAugspurger

Contributor

commented Apr 7, 2019

Was this fixed by #20834? Tab completion on the following seems quick:

In [21]: s = Series(index=tm.makeStringIndex(10000))

In [22]: s.<tab>