Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

Already on GitHub? Sign in to your account

Key the root modules cache by sys.path entries. #2695

Merged
merged 1 commit into from Apr 17, 2013

Conversation

Projects
None yet
4 participants
Contributor

anntzer commented Dec 17, 2012

Instead of a single root modules cache, cache in a dict the list of modules
associated with each sys.path entries (except $PWD). On future import
completions, create the list of completions by merging the appropriate cache
entries (while updating the cache if sys.path changed) and adding in the
modules in $PWD as well. This allows the completer to keep up with multiple
Python installations, virtualenvs, etc.

Right now the root modules cache grows indefinitely. It may be worth
implementing a LRU cache, a la functools.lru_cache.

cf #2688.

@ghost ghost assigned takluyver Dec 18, 2012

Owner

takluyver commented Dec 18, 2012

Looks good at a glance, assigned myself to look more closely at it later. Thanks!

@bfroehle bfroehle commented on the diff Dec 18, 2012

IPython/core/completerlib.py
- sys.stdout.flush()
- if time() - t > TIMEOUT_GIVEUP:
- print("This is taking too long, we give up.\n")
- ip.db['rootmodules'] = []
- return []
-
- modules = set(modules)
- if '__init__' in modules:
- modules.remove('__init__')
- modules = list(modules)
+ try:
+ modules = rootmodules_cache[path]
+ except KeyError:
+ modules = module_list(path)
+ try:
+ modules.remove('__init__')
@bfroehle

bfroehle Dec 18, 2012

Contributor

You can write this more simply as modues.pop('__init__', None).

@anntzer

anntzer Dec 18, 2012

Contributor

No, because modules is now a list (as returned by the aptly named module_list) so 2-argument form pop is not supported. Perhaps it actually should but well...

@bfroehle

bfroehle Dec 18, 2012

Contributor

Aha, yes right. Thanks.

Contributor

anntzer commented Dec 18, 2012

I started thinking about limiting the size of the cache, for which an a priori simple implementation would be to use a LRU cache (so I'd need to backport OrderedDict to support Python 2.6 -- by the way does anyone know why the implementation of functools.lru_cache in Python 3.3 does not use OrderedDict, but instead seems to reimplement it?). However there is more to it than that, because if we actually use a LRU cache then we need to rewrite the rootmodules cache on every tab completion, which may be too costly (I don't know exactly but the IPython 0.13 version of the code writes the rootmodules cache to disk only if the list takes more than 2s to create, so I guess there was a desire to avoid writing files to the disk if possible). Or another solution would be to cache the last value of sys.path as well, which would allow rewriting the rootmodules cache only when a tab completion detects that sys.path changed (which should be much less often; at least it should generally occur more than once per interactive session).
Any thoughts?

Contributor

bfroehle commented Dec 23, 2012

@anntzer It's hard for me to know what the right thing to do here. In my normal use the function usually completes quickly enough that the cache mechanism does not kick in. I'm not too worried about the size of the cache, especially if %rehashx continues to clear the cache.

Owner

minrk commented Mar 29, 2013

This is broken because OrderedDict isn't imported. What else is there to do here?

Key the root modules cache by sys.path entries.
Instead of a single root modules cache, cache in a dict the list of modules
associated with each sys.path entries (except $PWD).  On future import
completions, create the list of completions by merging the appropriate cache
entries (while updating the cache if sys.path changed) and adding in the
modules in $PWD as well.  This allows the completer to keep up with multiple
Python installations, virtualenvs, etc.

Right now the root modules cache grows indefinitely.  It may be worth
implementing a LRU cache, a la functools.lru_cache.
Contributor

anntzer commented Apr 2, 2013

OrderedDict does not exist in Python 2.6 so I will either have to bundle an implementation of OrderedDict to ipython (I don't think one is included right now?) or just forget about limiting the size of the cache. As @bfroehle mentioned it may just not be worth it. If we decide, instead to bundle an implementation of OrderedDict, then just importing it in completerlib.py is enough.

Owner

minrk commented Apr 16, 2013

Based on @bfroehle's comments, and the fact that it doesn't work right now, let's drop the size-limit on the cache.

Contributor

anntzer commented Apr 16, 2013

OK. Do you want me to reset --hard the branch to the first commit (which is what we want, right?), or do something else?

Owner

minrk commented Apr 16, 2013

That would be fine, or you can revert it with a new commit so it stays in the history - all up to you.

Contributor

anntzer commented Apr 16, 2013

I reseted module-cache to the first commit. The second commit is still available in the new module-cache-sizelimit branch, which can be set aside for now.

Owner

minrk commented Apr 17, 2013

Thanks, merging.

minrk added a commit that referenced this pull request Apr 17, 2013

Merge pull request #2695 from anntzer/module-cache
Key the root modules cache by sys.path entries.

Instead of a single root modules cache, cache in a dict the list of modules
associated with each sys.path entries (except $PWD). On future import
completions, create the list of completions by merging the appropriate cache
entries (while updating the cache if sys.path changed) and adding in the
modules in $PWD as well. This allows the completer to keep up with multiple
Python installations, virtualenvs, etc.

Right now the root modules cache grows indefinitely. It may be worth
implementing an LRU cache, a la functools.lru_cache.

closes #2688

@minrk minrk merged commit 31f91fd into ipython:master Apr 17, 2013

1 check passed

default The Travis build passed
Details

@anntzer anntzer deleted the anntzer:module-cache branch Apr 17, 2013

mattvonrocketstein pushed a commit to mattvonrocketstein/ipython that referenced this pull request Nov 3, 2014

Merge pull request #2695 from anntzer/module-cache
Key the root modules cache by sys.path entries.

Instead of a single root modules cache, cache in a dict the list of modules
associated with each sys.path entries (except $PWD). On future import
completions, create the list of completions by merging the appropriate cache
entries (while updating the cache if sys.path changed) and adding in the
modules in $PWD as well. This allows the completer to keep up with multiple
Python installations, virtualenvs, etc.

Right now the root modules cache grows indefinitely. It may be worth
implementing an LRU cache, a la functools.lru_cache.

closes #2688
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment