Prompt manager #507

Merged
merged 12 commits into from Nov 30, 2011

Projects

None yet

4 participants

@takluyver
IPython member

I'm not expecting this to land before 0.11, but I thought I'd get it on the radar.

Every time I've wondered how prompts were produced, I've ended up rather confused. So this is an attempt at cleaning it up, with a new PromptManager class responsible for handling everything to do with the prompts. The critical part is its render method, which assembles the necessary information, then uses the string formatting introduced in Python 2.6 to fill in the prompt template.

I've expanded the definition of 'prompts' to include the auto_rewrite prompt ("------> " by default). So there are now four prompts: input, continuation, output, and rewrite.

This definition of prompts does not include input/output separators. For now, I've left those as attributes of the main InteractiveShell object.

@minrk
IPython member

Thanks, this looks good.

This should be part of the general move removing all prompt/separator/ui-specific code from the InteractiveShell object to frontends, which definitely won't make it into 0.11.

@takluyver
IPython member

Rebased to bring it up to date.

@minrk
IPython member

Can you hook up the qtconsole prompts to your new PromptManager as part of this?

I think making that change will help us make the right decisions on the PromptManager, since it will require support for things like HTML prompts without ANSI colors, divorcing the prompt from the InteractiveShell object, and executing things like os.getcwdu() either locally or in the kernel.

@takluyver
IPython member

Does it make sense to simply generate an HTML prompt in the kernel, and send it over? Or, more generally, have the frontend make a prompt_request with a template (which could be in any format), and the kernel replies with the filled in version?

@minrk
IPython member

The kernel should probably know exactly nothing about prompts.

@takluyver
IPython member

Pragmatically, though, I suspect it will, at least to the extent of filling in slots in a template.

@minrk
IPython member

Possibly, but hopefully not. Any such information should be accessible via the user_expressions/user_variables fields of execute requests.

@takluyver
IPython member

OK, that makes more sense, I didn't know about those fields. So the frontend needs to parse the format string, work out which variables it needs, and retrieve them from the backend. I think we still need to think a bit more about how this works, though, because a lot of the details that could go in the prompt aren't accessible from the user namespace without importing extra modules (like os or time).

@fperez
IPython member

Yes, the idea was to have frontends request whatever information/expressions they want along with each new prompt (possibly including calls to things like os.getcwd() or time.time(), etc). Then they can build whichever display they want with this. In principle, the same kernel could be monitored by two clients that offer different levels of customizability for the prompts.

So the only thing the kernel supplies always is the execution counter, plus whatever variables or expressions may be requested in this form.

@fperez
IPython member

Thomas, this overall looks pretty good, and even if it doesn't get us all the way in terms of restructuring the prompt system as we'd like to, it's probably worth merging now. The prompt rework will likely take a few passes until we have a full separation of duties between kernels (who just handle the namespace) and frontends (who present data from that namespace to the users in various ways, including possibly in a prompt).

But there's one thing we should think about: prompts were the one thing for which we kept using Itpl, because it allows the interpolation of arbitrary python expressions, while the new-style string formatting only allows simple variable name access, nothing else. And I know there are projects out there embedding ipython and making weird prompts that update dynamically with function calls, something like: "${some_function()[somearg]}> ".

So it seems your change removes that ability, right?

Now, I'm not totally against that, as long as we plan to complete the transition into a system that allows users to achieve their goal again, even if by different means (i.e. by asking the kernel for these things and using them to build the prompt).

@minrk
IPython member

@fperez the EvalFormatter I used in the parallel templating code is a step towards exactly that. It doesn't support more than super-simple evals, but that shouldn't be hard. It's in utils.text.

For more arbitrary execution, you can do:

class EvalFormatter2(Formatter):
    def get_field(self, name, args, kwargs):
        v = eval(name, kwargs)
        return v, name

fmt = EvalFormatter2()
import os
ns = dict(os=os)
fmt.format(' we are in {os.getcwdu()}...', **ns)
# returns 'we are in /home/you/...'
@takluyver
IPython member

Plain string formatting can handle attribute.access and item[access], but specifically doesn't handle function calls. It should be trivial to use Min's EvalFormatter here instead.

@minrk
IPython member

Note that neither my EvalFormatter in utils.text nor the one I typed up in 5 minutes above are finished implementations (at least something breaks in both), but it shouldn't be hard to get something that works.

@fperez
IPython member

I guess one question is: do we really need to write a new one? I mean, Itpl may be old code, but it works and hasn't needed major updating. So what is the argument for effectively reimplementing itpl?

@takluyver
IPython member

Well, just that we can do essentially the same thing with a fraction of the code. Also, Itpl parses the string character-by-character in Python code, so we should get better performance from string formatting. I had the idea of dropping some of our external dependencies, and there are only a couple of places that we actually use Itpl. But I'm not going to push the issue.

@minrk
IPython member

I don't think Itpl handles unicode at all. So either we need to reimplement Itpl with real string formatting (I doubt such a formatter would be more than ~20 lines), or we need to fix Itpl for unicode support.

@fperez
IPython member

Fair enough, I'm ok with both of these arguments. I just wanted to see them :)

So Thomas, do you want to finish up a full EvalFormatter and we can merge this puppy?

@fperez
IPython member

ps - we're on IRC now if you want to have a look at any of this right away.

@takluyver
IPython member

I could do with hitting the sack now. @minrk, what breaks in the current implementation of EvalFormatter? And is it a condition that we should care about? The only one I'm currently aware of is that you can't readily put a dict literal ({}) inside an expression in a string (it might work if you double both braces, although I've not checked).

@fperez
IPython member

A few quick tests of EvalFormatter2 show that it probably supports everything we need for this PR:

In [2]: from string import Formatter 

class EvalFormatter2(Formatter): 
    def get_field(self, name, args, kwargs): 
        v = eval(name, kwargs) 
        return v, name 

fmt = EvalFormatter2() 
import os 
ns = dict(os=os) 
print fmt.format(' we are in {os.getcwdu()}...', **ns)

we are in /home/fperez/ipython/notebooks...

In [3]: print fmt.format(' we are in {os.getcwdu()+"hi"}...', **ns)

we are in /home/fperez/ipython/notebookshi...

In [5]: print fmt.format(' we are in {os.getcwdu()}, a is {a} and a0 is {a[0]}...', 
                 **ns)

we are in /home/fperez/ipython/notebooks, a is [1, 2] and a0 is 1...

So @takluyver, it seems to me that if you update the code to use this (perhaps providing EvalFormatter2 as utils.text.FulEvalFormatter), we'd be good to go.

@minrk
IPython member

Lots of stuff (function calls) don't work in the EvalFormatter in master. EvalFormatter2 seems to work for just about everything, but slicing definitely doesn't work: fmt.format("we are in ...{os.getcwdu()[-10:]}", os=os). It raises a SyntaxError, because the token is split at the colon for some reason, so os.getcwdu()[-10 is what gets eval'd.

Like I said above, I didn't spend more than 5-10 minutes on either of these, and I had never looked at any Formatter code before. So perhaps a more careful eye (and a test suite for various known Itpl use cases) would be prudent.

@fperez
IPython member

OK, the issue is that ':' is part of the format specification, see http://www.python.org/dev/peps/pep-3101/. Specifically:

Format Specifiers
    Each field can also specify an optional set of 'format
    specifiers' which can be used to adjust the format of that field.
    Format specifiers follow the field name, with a colon (':')
    character separating the two:

        "My name is {0:8}".format('Fred')

so the parser splits on ':' and returns the stuff after the colon as the format specifier.

Which means that this syntax simply is incompatible with arbitrary Python expressions.

That doesn't mean we can't use it, just that we'd be setting some restrictions on what kinds of things can go in there.

So Thomas, if you can do a little audit of our uses of Itpl and see if a more restrictive version based on EvalFormatter2 (hopefully with a nicer name) isn't a problem, then I have no issues with ditching itpl.

And I hope we'll find the time to complete the prompt machinery for frontends before 0.12.

@minrk
IPython member

@fperez, I see, so possibly slicing is out, but it's still possible that even the colon-splitting is implemented in a method that we can easily override. If we can take over right at where the Formatter decides what to do with something inside '{...}', then we should be able to just eval that block, and be done.

I don't see any reason for the new one to not just replace EvalFormatter. I just called the example I tossed up EvalFormatter2 because it was my second 10-minute attempt at writing a formatter that evals code. There should be no need for more than one of these.

@minrk
IPython member

Turns out it's actually pretty easy. I looked at the string.Formatter source, and the string docs, and all I have to do is override the _vformat method instead of get_field/get_value like I was doing. Now I can execute arbitrary code, and with a switch interpret ':' for slicing (disabling the format_spec part of fancy-formatting), or leave it as-is, preventing slicing (obviously you can't have both).

I'll do a PR soon, but I'm writing some tests first. For now a preview:

class EvalFormatter(Formatter):
    """A String Formatter that allows evaluation of simple expressions.

    Any time a format key is not found in the kwargs,
    it will be tried as an expression in the kwargs namespace.

    This is to be used in templating cases, such as the parallel batch
    script templates, where simple arithmetic on arguments is useful.

    Examples
    --------

    In [1]: f = EvalFormatter()
    In [2]: f.format('{n/4}', n=8)
    Out[2]: '2'

    In [3]: f.format('{range(3)}')
    Out[3]: '[0, 1, 2]'

    In [4]: f.format('{3*2}')
    Out[4]: '6'

    """

    # should we allow slicing by disabling the format_spec feature?
    allow_slicing = True

    # copied from Formatter._vformat with minor changes to allow eval
    # and replace the format_spec code with 
    def _vformat(self, format_string, args, kwargs, used_args, recursion_depth):
        if recursion_depth < 0:
            raise ValueError('Max string recursion exceeded')
        result = []
        for literal_text, field_name, format_spec, conversion in \
                self.parse(format_string):

            # output the literal text
            if literal_text:
                result.append(literal_text)

            # if there's a field, output it
            if field_name is not None:
                # this is some markup, find the object and do
                #  the formatting

                if self.allow_slicing:
                    # override format spec, to allow slicing:
                    field_name = ':'.join([field_name, format_spec])
                    format_spec = ''

                # eval the contents of the field for the object
                # to be formatted
                obj = eval(field_name, kwargs)

                # do any conversion on the resulting object
                obj = self.convert_field(obj, conversion)

                # expand the format spec, if needed
                format_spec = self._vformat(format_spec, args, kwargs,
                                            used_args, recursion_depth-1)

                # format the object and append to the result
                result.append(self.format_field(obj, format_spec))

        return ''.join(result)

And the difference between this and the base Formatter is tiny. If we disallow slicing, then the EvalFormatter2 above is cleaner and identical to this when allow_slicing=False.

[edit: obviously without the print statement I had initially]

@minrk
IPython member

EvalFormatter updated: #716

Just pass the namespace in which you want the code to execute as the kwargs to format.

@fperez
IPython member

OK, I've merged #716, so that @takluyver can update this one cleanly, to use the new EvalFormatter that has full slicing support for these situations.

We do need to make one design decision: for these kinds of expressions, do we value more the ability to do slicing or the ability to specify formatting (i.e. using the ':N' syntax for format specification)? I'm leaning towards the latter, both to keep this syntax in sync with normal Python formatting syntax and because fine-tuning formatting seems to be more likely to be useful in producing things like prompts than slicing. And if slicing is needed, people can always hide it behind a little utility function they can write, so it's not like slicing is impossible.

But I'd like to get some consensus before we pull the trigger on the final design for this part.

@takluyver
IPython member

I'd lean towards having format string enabled as well. If people do want slicing, they don't even need a utility function: x[slice(1,10)] is equivalent to x[1:10]. To give a particular use case, we could expose the timestamp as a datetime object, which can be formatted using the format string.

@minrk
IPython member

It is a hard choice. The reason I discovered that slicing didn't work in the EvalFormatter2 above, is the first example I tried was a truncated version of cwd:

f.format("...{os.getcwd()[-10:]}", os=os)

however

you can actually get slicing by using slice objects, and never need a colon:

f.format("...{os.getcwd()[slice(-10,None,None)]}", os=os)

So you can get slice behavior without colon notation, if slightly inconvenient to type, but you can't get formatting-notation back in any reasonably way if you allow slicing. Further, if we don't want to support the slicing, the three-line EvalFormatter2 implementation above is much simpler, and probably preferable (and functionally identical when allow_slicing=False).

@fperez
IPython member

OK, it seems we're in agreement that give how slice(a,b,c) is an option (if slightly less convenient), that's the way to go. In that case, we should perhaps go with the really simple EvalFormatter2 (renamed) and be done with it.

It's still useful to have the full one @minrk wrote, as it gives us a replacement for itpl with the modern python syntax and unicode support, but for this we should probably stick to the easy, comprehensible 3-liner.

@takluyver
IPython member

The only other place we're using Itpl is for expanding variables in %magic and !system calls. My gut feeling is that slicing is more useful than format strings there, but on the other hand we may want to keep closer to the standard Python format as people get more used to that.

@minrk: Should EvalFormatter2 replace the existing EvalFormatter that we shipped with 0.11? I think the main difference is that the existing EvalFormatter will treat {0} as referring to a positional argument, but EvalFormatter2 will treat it as a literal 0.

@minrk
IPython member

I think the EvalFormatter has no need to support positional arguments. I would just replace the 0.11 EvalFormatter with EvalFormatter2 under the EvalFormatter name, and keep the big one that supports splitting around as a separate class (FullEvalFormatter, RichEvalFormatter, or something), just in case there is a desire to split args somewhere down the line.

I think of the EvalFormatter as evaluating code in a namespace. Positional args don't really make sense in that context.

The Itpl syntax is really much nicer for expanding variables (especially in the shell profile when it is restored), but it's not very Python, and as we have discovered doesn't play nice with unicode.

@fperez
IPython member

@takluyver, did you see my message on-list about the idea of a frontend_ns to handle this issue? The more I think about it, the more it seems it's the way to go, as it gives us a clean solution to the prompt namespace question as well as a place for other things frontends may legitimately want to do in isolation from the user's namespace.

Now, we could go ahead and finish merging this before tackling that to ensure this doesn't bitrot away... What's your take?

@takluyver
IPython member

Hi Fernando, I did see your message - sorry, I've been a bit busy for the last week or so.

I think a frontend_ns makes sense, but a few questions about it:
1. Do we add frontend_variables & frontend_expressions fields to the execute request message? Or add another message type for accessing those?
2. What goes in the frontend_ns initially? I guess os and sys are obvious choices, along with the cwd-truncating functions currently defined for prompts. And something for timestamps. Anything else?
3. How can the frontend manipulate what's in the frontend_ns, e.g. for importing extra stuff? Can it send arbitrary Python code to be executed? Or a specific way to request an import? Or a way to "push" variables, perhaps from the user_ns (or even from the frontend, if it's in Python)?

As for merging this now - do you think this is the general structure of what we want to go forward with? It has the advantage that it makes the prompts code much more tractable for other people to work on. But if large parts are going to be redone, it might lead people down the wrong track. First, though, I need to sort out the various EvalFormatters. I'll get onto that now.

@takluyver
IPython member

I've reintroduced the very simple EvalFormatter, and made the slicing-enabled one FullEvalFormatter.

@fperez
IPython member

No worries :)

  1. I'm thinking right now that the cleanest approach would be to use the standard execute request message, but simply add an optional field to it indicating the target namespace. Normal messages would go (as they do now) to the user_ns, but if frontend_ns is specified as the target, then they would get executed there.

  2. I'd say we don't put anything in there other than perhaps our prompt-manipulation machinery (since that's an obvious use case). With 1 above, frontends can simply execute import statements.

  3. Via 1, it's taken care of.

Currently our spec has the notion of user_variables and user_expressions. Those would be interpreted as variables and expressions in the execution namespace (perhaps we should change these names accordingly).

So basically, all we'd need is to add a new optional field, namespace, to the execute_request message. And the behavior would be such that if namespace != user_ns, then silent=True automatically, so that the user counter isn't auto-incremented behind the user's back.

How does this sound?

@takluyver
IPython member

So each namespace would have a standard string name, e.g. "user_local"? That seems a reasonable way to do it. Although I should note that technically you evaluate or execute code in two namespaces - local and global - just to complicate matters a little.

On the other hand, @ellisonbg has a point (from his email) that this should probably be kept as simple as possible, because we don't want the burden of complex code for a fairly minor feature. Is there a simpler way to handle prompts for the frontends? The idea I had previously was that the frontend would send a prompt template, and the kernel would return a formatted prompt (although the frontend still needs some flexibility so it can update prompt numbers).

@fperez
IPython member
@takluyver
IPython member
@fperez
IPython member
@takluyver
IPython member
@ellisonbg
IPython member

I think there are some good points here. If multiple frontends want to watch the value of a variable, then this data needs to be published on the iopub channel. There would need to be a way of registering variables that get published in this way after each execution. Also, there would need to be a way for those variables to become modified. For that the simplest thing is to use the user_ns, but you might want to compute some expression each time a variable is published (like os.getcwd()).

@minrk
IPython member

Brian,

What you are describing does indeed sound quite useful, and would suggest something along the lines of a 'state change' message. frontends could register names or expressions to check, and a check_state() call is made at the end of each execute request, publishing its results. Of course, this could add arbitrarily expensive computations to every execute request.

@ellisonbg
IPython member

I am also thinking about how we can design things to enable "Manipulate" style rich widgets to exist in the browser and receive updates as a computation progresses. This type of design might help in that context as well.

@minrk
IPython member

Do we want to merge this PromptManager work into 0.12, since it seems like the user_expressions / zmq frontend prompts still has some discussion to resolve?

@fperez
IPython member

@takluyver, just noticed this one needs a rebase... I can work on it later if you want, we might be able to flush this and your other two before 0.12...

@takluyver
IPython member

Rebased, run the tests on 2.7 and 3.2.

@fperez
IPython member

@takluyver, I was going to start testing this puppy but a conflict seems to have appeared between yesterday and today, sorry... The conflict is tiny and I can fix it here while testing, but you may want to rebase this guy.

@fperez
IPython member

Also note that the test suite doesn't pass anymore. Make sure to clear your .pyc files first to see the problem. It's the same reason as the conflict, simply that lib.pylabtools got moved to core.pylabtools.

@fperez
IPython member

Sorry about those, it was my emergency merge last night that caused these. Fortunately it's a very easy fix.

@fperez
IPython member

Other than the above small problems, here's a proposal for moving forward with this one. The code looks solid, and it's indeed already an improvement and cleanup of the prompt machinery, so I don't see a reason for holding it back further, risking more (and possibly non-trivial) merge conflicts popping up.

The only thing I see missing is a good description of the changes in the docs, since the format of the special expressions for prompts has now changed as we move away from itpl.

As far as the larger discussion on the whole frontend_ns ideas, we should probably not rush into it. @ellisonbg's caution is often wisely guided, and I find @minrk's argument about non-atomicity of a separate request and the possibility of it being 'backed up' in a busy kernel very compelling.

The whole question of how to let multiple clients interact with a kernel regarding 'sideband' information (i.e. stuff beyond the raw execution of pure code blocks for the user) is a very tricky design problem, and we're just feeling our way through it. So it's probably best to go in small steps and see what works. We can keep thinking about how to serve those needs slowly, and for now the user_var,exp system will probably be sufficient in most cases. We might add some functionality to register variables protected of deletion in %reset so that people with special prompts can safely use %reset.

How does that sound? We could move forward with this (pending rebase and docs) and then we'll see how things work out in practice with the prompt system, and whether we can continue to work just with user_vars/exp or not. Given we have no actual use of even those components yet, it may be premature to extend the system even further.

@takluyver
IPython member

I've just rebased. I've done my best to clear .pyc files, and import IPython.lib.pylabtools fails, but I'm not seeing any test failure. I'll look into docs later.

@fperez
IPython member

Yes, pylabtools was moved to core. But that change was made in master already, so a rebase should pick it up for you; did it not?

@takluyver
IPython member
@takluyver
IPython member

Just to mention: I haven't got time to go through the docs tonight, but hopefully I can get to it tomorrow.

@fperez
IPython member

OK, yes: working correctly is what you should expect. No problem with the docs, thanks for taking care of them.!

@takluyver
IPython member

I've updated the docs; let me know if you spot any other bits that need to be updated.

Still to do:

  • Update pysh profile to use new prompt config system.
  • Work out why %config doesn't seem to see PromptManager. Probably a trivial fix.
@takluyver
IPython member

I've tidied those bits up - I think this is all OK now.

@fperez
IPython member

Is the test suite passing for you? I'm having some very bizarre problems with the test suite right now on my box at work, but they are odd enough that I think it's my problem and nothing with the codebase. What do you get?

@takluyver
IPython member
@fperez
IPython member

OK, great. Let's merge this then as-is, and we'll continue mulling the deeper frontend namespace questions later. No point in holding this any further, lots of work has gone into it. Thanks!

@fperez fperez merged commit 272152c into ipython:master Nov 30, 2011
@takluyver
IPython member

Thanks, Fernando. It's an odd feeling, not having an open pull request against IPython ;-).

@fperez
IPython member
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment