function specific tab completion via annotations #2701

rmcgibbo · 2012-12-18T07:20:13Z

This is a new attempt at #2636. The feature enables custom tab completion for different arguments of a function, and also for the return value of a function (without calling it).

It's easier to show in action though. Check it out!

Here's tab completion on the return value of a function, without ever evaling the function

In [1]: from IPython.extensions.completion import tab_complete, globs_to,  literal

In [2]: @tab_complete
  ...: def baz(x) -> str:
  ...:     pass
  ...:

In [3]: baz(notevaluated).<TAB>
.capitalize  .find        .isspace     .partition   .rstrip      .translate
.center      .format      .istitle     .replace     .split       .upper
.count       .index       .isupper     .rfind       .splitlines  .zfill
.decode      .isalnum     .join        .rindex      .startswith
.encode      .isalpha     .ljust       .rjust       .strip
.endswith    .isdigit     .lower       .rpartition  .swapcase
.expandtabs  .islower     .lstrip      .rsplit      .title

Note that sense this never calls the function, all the tab completion is based on methods defined on the str class. If you notate the return value as some class that dynamically defines its methods when it is
instantiated, the tab complete engine is not going to be able to find them. There's no way around this without actually invoking the code on the TAB keypress, which is not desired (and is already possible by configuring your ipython config file).

Here, tab completion on the first argument only shows literal strings and directories that match a glob pattern. For the second argument, it shows some string literals.

In [4]: @tab_complete
   ...: def foo(x : globs_to('*.txt'), mode : literal('r', 'w')):
   ...:     pass
   ...:

In [5]: foo(<TAB>
'COPYING.txt'        'dist/'              'setupext/'
'IPython/'           'docs/'              'tools/'
'__pycache__/'       'ipython.egg-info/'
'build/'             'scripts/'

In[6]: foo('COPYING.txt', <TAB>
'r', 'w'

The default type of tab completion is based on isinstance matching. You could also be explicit about it:

In [7]: @tab_complete
   ...: def bar(x : str, x : instance_of(str)):
   ...:     pass
   ...:

In [8]: string1 = string2 = 'some string'

In [9]: bar(<TAB>
string1  string2

In [9]: bar('a', <TAB>
string1  string2

All of this functionality is possible in python2 syntax (without PEP3107 annotations) by calling the @tab_complete decorator with keyword arguments.

The design here was influenced by Nick Coghlan's emails on python-ideas, and an annotation-based typechecking library on activestate.

Carreau · 2012-12-18T07:51:44Z

That's nice !

Not much time to look at it, but the travis faillure are mostly issues with <tab>in doctest than can be fixed with the @skipdoctestdecorator (or something like that).

rmcgibbo · 2012-12-18T07:56:30Z

@Carreau, unfortunately, there doesn't seem to be a way to skip the doctests that run on the module docstring, since you can't decorate the module docstring.

Carreau · 2012-12-18T08:14:26Z

Oh, it was on the module docstring, I didn't saw that, I only saw doctest errors :-)

bfroehle · 2012-12-18T16:29:04Z

Wow, fantastic! I'll give this a whirl in a few days.

takluyver · 2012-12-18T17:29:13Z

You could try setting skip_doctest=True at the module level - that's all the decorator does to functions. I'm curious about how your 'sleazy' fix works.

asmeurer · 2012-12-18T21:51:48Z

It's unfortunate that the easiest way to add return completions in Python 2 is @tab_complete(**{'return': completer}). Would it make sense to add a convenience decorator @tab_complete_return, which would work like

@tab_complete(a=str, **{'return': str})
def f(a):
    return a + '1'

is the same as

@tab_complete_return(str)
@tab_complete(a=str)
def f(a):
    return a + '1'

?

…syntax

rmcgibbo · 2012-12-19T03:05:48Z

@takluyver, good idea. Setting skip_doctest=True at the module level did the trick.

@asmeurer, I added the python2 convenience decorator.

rmcgibbo · 2012-12-19T04:53:03Z

In the qtconsole, the tooltip and the tab completions come up on top of each other for entries like foo(<TAB>, and it doesn't look great.

rmcgibbo · 2012-12-19T15:26:27Z

A few unordered thoughts:

I'm sure there are Windows-related bugs in the globs_to annotation. There are most likely issues with filenames that have spaces in them as well.
When tab completing for a class constructor, under what conditions should the code check __new__ vs. __init__? What is the correct behavior when both methods exist and take different arguments? Is that even allowed?
There might be a nicer way to, given a class, infer (approximately) what methods are available on the instances of that class. Currently, I'm just using all of the entries of utils.dir2() on the class, minus the mro. But it should instead try to filter out class methods. I think the best way to do this (at least in python2) is with inspect.classify_class_attrs.
Using inspect.getcallargs could make the logic of utils.tokens.cursor_argument simpler and more robust.
A nice extension to the return value tab completion would be some specific support for return values that are a tuple of different types, like

@tab complete 
def f() -> (str, int):
    return 'hello', 5

f()[1].<TAB>
.bit_length   .conjugate    .denominator  .imag         .numerator    .real

takluyver · 2012-12-19T16:13:24Z

I wouldn't filter out class methods - they can still actually be called on the instance, so it's valid for them to show up in tab completions.

The question about tuples nrings us back to the discussion on python-ideas - should that syntax be for "2-tuple of str, int", or for "either str or int"? In any case, I suspect the complexity of tracking those variables isn't worth it.

rmcgibbo · 2012-12-19T16:24:30Z

Maybe you're right. I feel like the most common thing is 2-tuple of str, int. But perhaps it's not worth it.

A new issue with architectural implications is that most of this code fails with functions defined in C (or cython) since they can't be properly introspected. You also can't set random attributes on functions defined in C trying to set f._tab_completions = data is a nonstarter. Since this is where the tab complete data is stored, it's not a good situation.

I was trying to manually "decorate" numpy by parsing the docstrings, but since so many of the functions are defined in C, the possibilities are really limited.

rmcgibbo · 2012-12-19T16:28:49Z

Also, it would be possible to have both semantics with something like

@tab_complete
def f(x) -> (str, int):
    return 'a', 5

@tab_complete
def g(x) -> one_of(str, int):
    if np.random.randn() < 0:
        return 'a'
    return 5

given them the two different semantics

But I am -1 on this idea.

bfroehle · 2012-12-19T16:50:07Z

Or would it be the much less satisfying:

@tab_complete
def f(x) -> tuple:
    return 'a', 5

rmcgibbo · 2012-12-19T16:51:59Z

right. that doesn't provide much functionality with the current code (just .count and .index)

takluyver · 2012-12-19T17:25:56Z

I think that unpacking return values quickly gets complex. E.g. if the return value is guaranteed to be a 2-tuple, I would often unpack it: message, result = f(x). Now you could parse that, figure out that message should be a string, and show me tab completions next time I reference that variable.

But before long, you're almost reimplementing Python to keep track of variable changes. And if that's the route we want to go down, we should look more closely at existing refactoring and completion solutions (like rope and jedi). And in any case, let's not try to do too much in one pull request.

rmcgibbo · 2012-12-19T17:27:36Z

Yeah. I concur. I think we're on the same page.

rmcgibbo · 2012-12-19T17:32:24Z

@takluyver, do you think it's better to store the function tab completion info on the function (e.g. f._tab_completions) or in an IPython.core.completer module level dict/db?

The first seems less spooky, but doesn't let you do builtins like np.ndarray. The code complexity is about equal both ways.

bfroehle · 2012-12-19T17:59:40Z

I think it seems pretty reasonable to only support a single return object (and therefore use tuple for a compound return object) on the first iteration of this.

takluyver · 2012-12-19T18:10:19Z

Hmm, it's annoying that you can't attach arbitrary information to a compiled function. I wonder if it's worth looking for information both on the function, and then in a global registry? E.g. our display machinery looks for methods like _repr_html_() on the object, and then looks in a dictionary of formatters to see if one matches the type at hand.

Alternatively, perhaps we should look at a module-level structure - then packages like numpy could add a dictionary of annotation-like info to the module, rather than having to play the 'are we inside IPython' game to add the information to an IPython registry.

minrk · 2013-02-04T04:37:55Z

What's the decision here on whether this extension belongs in IPython vs separately maintainted? It seems like bundling every new extension in IPython misses much of the point of extensions. Pinging @ellisonbg and @fperez on that one.

takluyver · 2013-02-05T15:44:29Z

At a glance, I don't think that module is really an extension (as in something that you would do %load_ext foo with). It looks more like infrastructure for extensions, in which case I'd say it belongs in IPython.lib.

I also think we might have to rationalise our tab-completion infrastructure a bit to better cope with more complex parsing.

asmeurer · 2013-02-07T02:52:43Z

I would argue for this one to be included.

It is general. It's not like this only applies to numpy functions or something like that. In fact, by itself, it applies to nothing, because it rather sets up an API for others to use. To me, that indicates it should be part of IPython. On the other hand, specific decorations belong in their respective libraries (e.g., decorating numpy functions' return types).
From my understanding, tab completion is a pretty fundamental part of the IPython mission (@fperez can correct me if I'm wrong). I personally would like to see much better tab completion in IPython in the future, especially in non-terminal clients where more is possible.

But it's your decision. I could also see very nice tab completion being an external library that plugs in to IPython.

Carreau · 2013-02-07T09:52:07Z

I think there are 2 things here.

The modification of the completer
The "tab_completion" extension decorator.

So I think the question of having this part with the decorator "separate" make sens, because it can be used without the rest of this PR or IPython.

Carreau · 2013-02-07T09:52:28Z

BTW, it conflicts now.

ellisonbg · 2013-02-20T22:20:00Z

I agree with @minrk that most "extension" like things should not be in IPython itself. My preferences would be:

If it is really important, we include it into the core, not as an extension.
If it is not that important, it is third party extension.

I am -1 on the in between of having it as an extension that is shipped with IPython.

rmcgibbo · 2013-02-20T22:23:37Z

Sure. I realize that this PR is called an "extension" (and there's code in the 'extension' directory), but most of it is additions to the completer module that enable the tab completion mechanism to be extended via annotations. That's the core feature. The question, to me, is, do you want that in IPython?

ellisonbg · 2013-02-20T23:57:07Z

I like the functionality, but this is a lot of added code and complexity to an already messy and complex part of our code base. It adds a lot of potential for future breakage points and we will have to maintain it. All of that points to not including it in IPython. I don't know enough about our completer logic to know if/how this could be simplified. I would like to get a good read on these issues from the other core devs.

takluyver · 2013-02-21T00:18:31Z

On the one hand, it is quite a bit of extra, non-trivial code for something that users aren't especially crying out for. On the other hand, tab completion is often one of those things that you don't know how valuable it is until you're using it heavily. And this wouldn't be much use as an extension - it's only likely to see much use if it's a default component of IPython.

This is made harder by the fact that the completer stuff is some of the older, thornier code, that I don't think any of us are especially familiar with.

I think I'd regretfully defer this for now, but hope that parts of it can be reintroduced when someone gets to spring-cleaning the completer stuff.

rmcgibbo · 2013-02-21T00:28:16Z

Is that spring cleaning on anyone's todo list? I agree that the existing completer code is quite obtuse. I had to learn its ins-and-outs to write this PR. I have some ideas about how it could be improved, that I can share with whoever.

ellisonbg · 2013-02-21T00:30:26Z

Robert,

We are discussing this over here:

www.hipchat.com/ghtNzvmfC
*
*
*
*

On Wed, Feb 20, 2013 at 4:28 PM, Robert McGibbon
notifications@github.comwrote:

Is that spring cleaning on anyone's todo list? I agree that the existing
completer code is quite obtuse. I had to learn its ins-and-outs to write
this PR. I have some ideas about how it could be improved, that I can share
with whoever.

—
Reply to this email directly or view it on GitHubhttps://github.com//pull/2701#issuecomment-13865862.

Brian E. Granger
Cal Poly State University, San Luis Obispo
bgranger@calpoly.edu and ellisonbg@gmail.com

takluyver · 2013-02-21T00:36:35Z

It's on my 'maybe get to this one day' list. ;-)

You might be the world expert on it after doing this pull request, so if you want to have a go at it, be our guest. Off the top of my head, I think the things to look at include:

It was originally tightly coupled to readline, which provides the interface for tab completion in the terminal. Now completions can either go to readline or over the wire to frontends. I think there has been some separation of concerns to allow that, but it can probably go further.
Like other bits of our code, the IPython completion stuff appears to be built over a plain Python version (IPython.core.completer.Completer). We're moving away from the 'pure Python' modes, which tended to be little used and poorly tested, so if that's not architecturally useful, we might get rid of it.

No doubt there'll be plenty of other fun things waiting to be discovered ;-)

fperez · 2013-02-21T01:11:45Z

After reading the whole code, here's my take on the matter: this is simply too complex to put into IPython right now, and part of the complexity stems from the fact that it's hacking around a sub-optimal completion API that doesn't return the right pieces of information in various places.

There are also a number of minor issues with the code itself, but since I don't think it can go in at all, I don't want to get into line-by-line review.

The functionality is potentially quite useful if adopted widely, but because people are very reluctant to put ipython dependencies into their own codes, I'm somewhat skeptical of how well that will go. Obviously we want to break the chicken-and-egg problem, but right now the cost/benefit analysis just isn't there.

Furthermore, the 2.x version of this is hideous (no fault of yours, @rmcgibbo, its' just how the language goes). I don't think many people will adopt it for 2.x, but I can see it working much better in 3.x.

For all these reasons, here's the path forward I see:

A new PR that focuses only on refactoring the completion API to make something like this possible. It cleans up any limitations of the current API, such as returning full line information to the completers, etc. We'll need to see how much backwards compatibility we break by doing so.
This is released as a standalone, small package that contains the api specification for how people should annotate their functions and the ipython extension they can install.

Note that the two parts above should be separate: people should be able to annotate their codes without creating any dependency on IPython. The annotations are just metadata, possibly aided by your tool. And only users of IPython would load the extension and thus take advantage of the metadata.

In doing so, I'd try to see if it's possible for the py3 version to require zero imports and rely strictly on the annotation protocol (maybe that's already the case, I didn't study all your examples in detail). That way we'd have a protocol for this, rather than a dependency.

@rmcgibbo, you've done a terrific job and we really appreciate it. So I hope you can continue working with us, we just need to find the right path forward.

fperez · 2013-02-21T01:35:58Z

For reference, the h5py custom completers do exist and rely on our API. So let's make every reasonable effort to come up with a solution that doesn't break their code, even if it means making a slightly uglier API (such as having the new one use different names and keep deprecated methods with the old functionality).

fperez · 2013-02-21T01:37:19Z

Also @rmcgibbo, have a look at that h5py completer to see if a similar regex-based hook isn't sufficient to trigger your completer machinery without further changes.

ellisonbg · 2013-02-21T05:43:07Z

@rmcgibbo I think the first step will be to remove everything from this PR that is not related to your refactoring of the core completer logic. I am hopeful that we can come up with a good API that doesn't require completely rewriting the inner workings of everything. I would try to think about what the simplest API is that will work for you. I am hopeful that a small amount of refactoring will suffice. I don't want you to feel like you have to completely rewrite the completer logic to implement your stuff.

As Fernando said, we really appreciate your work - please don't hesitate to bug us with questions...

takluyver · 2013-02-21T10:52:56Z

If we're talking about refactoring the core completer logic, I suggest we
do it in a new PR, rather than repurposing this branch.

On 21 February 2013 05:43, Brian E. Granger notifications@github.comwrote:

@rmcgibbo https://github.com/rmcgibbo I think the first step will be to
remove everything from this PR that is not related to your refactoring of
the core completer logic. I am hopeful that we can come up with a good API
that doesn't require completely rewriting the inner workings of everything.
I would try to think about what the simplest API is that will work for you.
I am hopeful that a small amount of refactoring will suffice. I don't want
you to feel like you have to completely rewrite the completer logic to
implement your stuff.

As Fernando said, we really appreciate your work - please don't hesitate
to bug us with questions...

—
Reply to this email directly or view it on GitHubhttps://github.com//pull/2701#issuecomment-13874117.

rmcgibbo · 2013-02-21T16:02:46Z

@takluyver agreed.

fperez · 2013-02-21T22:23:50Z

@rmcgibbo, just read your summary of the issues, this is fantastic work. Thanks so much, we'll continue the discussion off that. I'm very impressed.

rmcgibbo · 2013-02-21T22:28:01Z

@fperez, where is the best forum in which to continue the discussion? Once I've finished that document (I want to add more on the proposal), do you want me to ping this thread, or is another format preferable, such as using that text as the basis of a new "Proposal" GH issue?

takluyver · 2013-02-21T23:40:58Z

To echo @fperez , that looks like an excellent examination of the problems. If we can flesh out the proposed solutions a bit, I think making an IPEP for it would be the way forwards.

rmcgibbo · 2013-02-21T23:47:59Z

Got it.

fperez · 2013-02-22T08:19:57Z

Yup, go ahead and start an IPEP, it's the perfect way to move forward these kinds of more complex discussions. I just noticed typeahead.js from Twitter, I'd suggest thinking about what could be done with this kind of tool on the client side if we had a richer completion API.

We can always flatten the returned object into a simple list for readline, but our internal design shouldn't be constrained by that at all.

rmcgibbo · 2013-02-22T19:18:56Z

Thanks @fperez. That's awesome. So many good ideas.

fperez · 2013-03-06T23:41:01Z

@rmcgibbo, I'm going to be at Stanford giving a talk about IPython on Friday at 10am, in the Stats dept. How about we book some time later that day to pound on this IPEP?

rmcgibbo · 2013-03-07T00:20:28Z

@fperez, yeah, sounds good. I have a meeting with my advisor from 2-3pm, but otherwise I'm free. Your schedule is probably more constrained than mine.

Where is the talk? If it's a public seminar, I think some lab members and I might like to come.

fperez · 2013-03-07T00:22:32Z

SPECIAL TALK
DEPARTMENT OF STATISTICS
10:00AM, Friday, March 8, 2013
Huang Engineering 18

rmcgibbo · 2013-03-07T00:31:55Z

Awesome. I'll come up & find you after the talk and we can figure something out?

fperez · 2013-03-07T01:05:18Z

yes!

rmcgibbo added 7 commits December 17, 2012 18:57

Refactored a little

234adba

fixes

c217a2f

python3 fix

14c5ccf

Some more documentation. Improve error message

7575a3a

make it work for classes too

38db648

travis fix

851eb1e

fix of fix

b374e78

fix travis doctest issue with <TAB> in a sleazy way

22f2606

rmcgibbo added 2 commits December 18, 2012 18:42

Fix the doctest issue per @takluyver's suggestion

dbcb3ac

Add convenience decorator for annotating the return value in python2 …

b92bab4

…syntax

small bugfix and refactored out repetetive code

d3bd4fc

rmcgibbo closed this Feb 21, 2013

function specific tab completion via annotations #2701

function specific tab completion via annotations #2701

Conversation

rmcgibbo commented Dec 18, 2012

Carreau commented Dec 18, 2012

rmcgibbo commented Dec 18, 2012

Carreau commented Dec 18, 2012

bfroehle commented Dec 18, 2012

takluyver commented Dec 18, 2012

asmeurer commented Dec 18, 2012

rmcgibbo commented Dec 19, 2012

rmcgibbo commented Dec 19, 2012

rmcgibbo commented Dec 19, 2012

takluyver commented Dec 19, 2012

rmcgibbo commented Dec 19, 2012

rmcgibbo commented Dec 19, 2012

bfroehle commented Dec 19, 2012

rmcgibbo commented Dec 19, 2012

takluyver commented Dec 19, 2012

rmcgibbo commented Dec 19, 2012

rmcgibbo commented Dec 19, 2012

bfroehle commented Dec 19, 2012

takluyver commented Dec 19, 2012

minrk commented Feb 4, 2013

takluyver commented Feb 5, 2013

asmeurer commented Feb 7, 2013

Carreau commented Feb 7, 2013

Carreau commented Feb 7, 2013

ellisonbg commented Feb 20, 2013

rmcgibbo commented Feb 20, 2013

ellisonbg commented Feb 20, 2013

takluyver commented Feb 21, 2013

rmcgibbo commented Feb 21, 2013

ellisonbg commented Feb 21, 2013

takluyver commented Feb 21, 2013

fperez commented Feb 21, 2013

fperez commented Feb 21, 2013

fperez commented Feb 21, 2013

ellisonbg commented Feb 21, 2013

takluyver commented Feb 21, 2013

rmcgibbo commented Feb 21, 2013

fperez commented Feb 21, 2013

rmcgibbo commented Feb 21, 2013

takluyver commented Feb 21, 2013

rmcgibbo commented Feb 21, 2013

fperez commented Feb 22, 2013

rmcgibbo commented Feb 22, 2013

fperez commented Mar 6, 2013

rmcgibbo commented Mar 7, 2013

fperez commented Mar 7, 2013

rmcgibbo commented Mar 7, 2013

fperez commented Mar 7, 2013