Slow completion for existing object with long __repr__ method #919

Closed
ElieGouzien opened this issue Apr 29, 2017 · 9 comments

Comments

@ElieGouzien
Contributor

Hi,

For completion on existing objects, it seems that the __repr__() method is evaluated to build an error message inside the inspect module, and that error is caught afterwards. This makes completion as slow as __repr__(), although the evaluation is of no use to the user. The method can be rather slow with big pandas objects or custom classes.

This has been seen here:
qtconsole issue #90

Minimal code to reproduce it:

import jedi
class Bugger(object):
    def __init__(self, size=10000):
        """Create big data."""
        self.big_data = [list(range(i)) for i in range(size)]

    def easy_method(self):
        """Method that should be really fast."""
        return self.big_data[-1][-1]

    def __repr__(self):
        output = ""
        for nested in self.big_data:
            for elem in nested:
                output += str(elem)
            output += '\n'
        return output

test = Bugger()
jedi.Interpreter('test.ea', [locals()]).completions()

And a version that is more convenient to investigate (smaller data, and __repr__ prints the call stack and drops into pdb):

import jedi, pdb, inspect
class Bugger(object):
    def __init__(self, size=10):
        """Create big data."""
        self.big_data = [list(range(i)) for i in range(size)]

    def easy_method(self):
        """Method that should be really fast."""
        return self.big_data[-1][-1]

    def __repr__(self):
        frames = inspect.getouterframes(inspect.currentframe())
        for frame in frames:
            print(frame)
        pdb.set_trace()
        output = ""
        for nested in self.big_data:
            for elem in nested:
                output += str(elem)
            output += '\n'
        return output

test = Bugger()
jedi.Interpreter('test.ea', [locals()]).completions()

Jedi seems to end up here through jedi/evaluate/compiled/mixed.py, line 121.
Maybe adding a check before calling inspect.getsourcefile() could fix it, but I don't know Jedi well enough to be sure.
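
The suspected mechanism can be checked without Jedi. The sketch below is only an illustration: `CountingRepr` and `repr_calls_during_getsourcefile` are hypothetical names, not part of Jedi or of the inspect module. It counts how often `__repr__` runs while `inspect.getsourcefile()` fails on a plain instance. The result depends on the Python version, since it reflects how that version's inspect module builds its error message.

```python
import inspect


class CountingRepr(object):
    """Hypothetical stand-in for an object with an expensive __repr__."""

    calls = 0  # number of times __repr__ has been invoked

    def __repr__(self):
        CountingRepr.calls += 1
        return "<CountingRepr>"


def repr_calls_during_getsourcefile(obj):
    """Count __repr__ invocations triggered by a failing getsourcefile()."""
    before = CountingRepr.calls
    try:
        inspect.getsourcefile(obj)
    except TypeError:
        # A plain instance is not a module, class, method, function,
        # traceback, frame or code object, so inspect refuses it.
        pass
    return CountingRepr.calls - before


print(repr_calls_during_getsourcefile(CountingRepr()))
```

A non-zero count means the error message was built from `repr(obj)`, which is where the time would go for an object with a slow `__repr__`.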

@mangecoeur

I also experience this issue, through IPython 6.0. Note that the completion appears to be slower than a single repr call, suggesting that it may be called several times.

@davidhalter
Owner

For what it's worth, I'm pretty sure it's not the __repr__ that is slowing Jedi down. Jedi doesn't execute source code. Or let's say it tries to actively avoid it.

@ElieGouzien
Contributor Author

To be precise, this performance issue was found through IPython completion (since it uses Jedi). So it sounds reasonable to me that Jedi works with "executed code", since I don't think IPython gives Jedi the full code history, only the existing objects (but I know the internals of neither IPython nor Jedi, so don't give too much credit to my guesses).

A typical example (from which I actually determined that Jedi is involved) is available here: qtconsole issue #90

@davidhalter
Owner

@ElieGouzien It does obviously work with executed code. It just tries to not execute it. One of the issues you might be having is that Jedi tries to load the corresponding files of code (to improve autocompletion). This might be a lot of work (depending on the size of the library).

What would you generally consider slow (in seconds)?

@mangecoeur

@davidhalter this seems to be specifically related to data objects rather than code files - particularly things like Pandas tables or large numpy arrays. It seems something related to jedi is doing something with the data object. It might be something to do with the way IPython uses Jedi? Perhaps some serialization going on?

@ElieGouzien
Contributor Author

ElieGouzien commented May 4, 2017

@davidhalter In my case I had something like 10-30 s (with a custom class). Basically it takes as long as computing repr on the object.

What happens (I think) is that when inspect.getsourcefile fails, it computes repr to include in its error message; but since Jedi catches the error and finds another way to make the completion, that work is useless. If it sounds good to you, I can try to write a check function that anticipates the failure of inspect.getsourcefile and simply doesn't call it in that case.
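
A rough sketch of such a check, using only the standard inspect predicates. The helper names `may_have_source_file` and `safe_getsourcefile` are hypothetical, and this is not the code of the actual patch; it only illustrates the idea of testing the object's kind before calling `inspect.getsourcefile()`.

```python
import inspect


def may_have_source_file(obj):
    """Return True only for the kinds of objects inspect.getfile() accepts."""
    return (inspect.ismodule(obj) or inspect.isclass(obj)
            or inspect.ismethod(obj) or inspect.isfunction(obj)
            or inspect.istraceback(obj) or inspect.isframe(obj)
            or inspect.iscode(obj))


def safe_getsourcefile(obj):
    """Hypothetical wrapper: skip the call when it is bound to fail.

    For plain instances the TypeError is anticipated, so inspect never
    gets the chance to format an error message from the object.
    """
    if not may_have_source_file(obj):
        return None
    try:
        return inspect.getsourcefile(obj)
    except (TypeError, OSError):
        # Built-in modules and classes, or source not available.
        return None
```

With the `Bugger` class from the first post, `safe_getsourcefile(test)` would return `None` without touching `test.__repr__`, because an instance passes none of the predicates.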

@ElieGouzien
Contributor Author

ElieGouzien commented May 6, 2017

Ok, I have a patch!

@davidhalter Where should I put a test for that fix?

EDIT: I think I figured out where to put it, but I'm still not 100% sure I'm right. See pull request #922.

davidhalter pushed a commit that referenced this issue May 6, 2017
Anticipate the TypeError raised by inspect.getfile to prevent the computation of repr() for an error message which is not used.
Useful for some big pandas arrays.
Tentative fix for #919.
davidhalter pushed a commit that referenced this issue May 6, 2017
@takluyver
Contributor

I've submitted a bug and a PR to Python to try to improve this in the inspect module:

http://bugs.python.org/issue30639
python/cpython#2132

@davidhalter
Owner

Nice! Thanks!
