IPEP1: Cell magics and general cleanup of the Magic system #1611

Closed
fperez opened this issue Apr 16, 2012 · 12 comments
@fperez
Member

fperez commented Apr 16, 2012

Note: the final implementation of this was done in #1732, and there are some differences in the details with the API proposed here. But by and large this document was implemented as described here.

This issue is meant to be the place to gather all feedback and discussion about IPEP1, a proposal to clean up the magic system and add cell-level magics. As the discussion evolves I will continue updating the main document; here we only keep the most current version for convenience of reading.


IPEP 1: Cleanup and extension of the Magic system in IPython

This document reviews the status of the magic command system in IPython and
proposes an extension of magics to work in multiline contexts, at a 'cell'
level. The most obvious use of this proposed extension will be the notebook,
but the extension will similarly work in the Qt console and even at the
terminal.

In the spirit of Python PEPs, this document is marked as IPEP 1, the first
'IPython Enhancement Proposal'.

Background

Since early in its history, IPython has had a system of 'magic' commands, which in
its current incarnation uses (optionally) a % prefix to indicate special
commands that live in a separate namespace. These commands have single-line
scope: when IPython encounters the line

%foo --flags  arguments

it checks whether foo is registered as a magic command, and if so, it calls
it passing the entire rest of the line, as a string, as the only argument to
the magic.
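
A minimal sketch of this dispatch, for illustration only. The `line_magics` dictionary and the `dispatch_line` helper are hypothetical names, not IPython internals, and stripping the surrounding whitespace from the argument string is an assumption.

```python
# Hypothetical registry mapping magic names to callables (not IPython's real internals).
line_magics = {}

def foo(parameter_s=''):
    """Toy magic that just echoes the string it was given."""
    return 'foo got: %r' % parameter_s

line_magics['foo'] = foo

def dispatch_line(line):
    """Call the magic named on `line`, passing the rest of the line as one string."""
    if not line.startswith('%'):
        raise ValueError('not a magic line: %r' % line)
    # Everything after the first space is handed over unparsed.
    name, _, rest = line[1:].partition(' ')
    if name not in line_magics:
        raise KeyError('no such magic: %s' % name)
    return line_magics[name](rest.strip())

result = dispatch_line('%foo --flags  arguments')
```

The magic receives `'--flags  arguments'` as a single string; any flag parsing is left to the magic itself.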

For historical reasons, implementation-wise the magic system is a fairly nasty
hack: by default, a magic %foo is actually the method magic_foo of the main
IPython InteractiveShell object, with signature magic_foo(self, parameter_s).

We offer a mechanism for users to register new magics defined from standalone
functions, by using the define_magic method of the main IPython object, as
follows:

# Define your function here
def foo_impl(self, parameter_s=''):
    'My very own magic! (Use docstrings, IPython reads them.)'

# Register it as a magic
get_ipython().define_magic('foo', foo_impl)

Over the years we have reduced to a minimum the intertwining of the various
magic methods with the main object itself, hoping one day to completely
separate the magics into standalone objects, thereby reducing significantly the
footprint and complexity of the main object.

Specific goals

This proposal seeks to accomplish mainly two goals:

  1. Finish up the aforementioned separation of the magics away from the main
    IPython object. This will allow, amongst other good things, users to define
    their own magics by subclassing a lightweight base object. This is not
    possible today since the main magic object is enormous and contains every
    default magic method in its implementation.
  2. Extend the concept of magics to operate on multi-line blocks of text,
    introducing the concept of cell magics.

These two will be discussed separately, starting with the conceptually more
interesting cell magics (goal #1 is mostly just an implementation cleanup).

Cell-level magics

We propose to introduce the concept of a cell-level magic, akin to how Sage
uses the % syntax at the cell level. Sage uses the line-magic syntax from
IPython in its notebook with a cell-wide meaning; here we propose to keep
separate line- and cell-level magics, and our implementation will have a number
of details different from how Sage does it. But the user-facing behavior will
be very similar.

The idea is most easily illustrated with an example. Consider a cell (in the
notebook or Qt console, we'll discuss later how this can work in terminal
clients) that contains:

#!foo --flags  args
text - line 1
text - line 2
...
text - line N

In this case, if foo is a cell magic, it will be a function or method called
with two arguments as:

foo('--flags args', 'text - line 1\n...text - line N')

That is, a cell magic will be passed as a first argument the (possibly empty)
rest of the line on which it was called, and as a second the body of the cell
after the first line and until the end.
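
The splitting described above can be sketched as follows. `split_cell` is a hypothetical helper, and the `#!` sigil is simply the one used in the example; this is not the proposed implementation.

```python
def split_cell(cell_text, sigil='#!'):
    """Split a cell into (magic name, rest of first line, body)."""
    first, _, body = cell_text.partition('\n')
    if not first.startswith(sigil):
        raise ValueError('cell does not start with a cell magic')
    name, _, line = first[len(sigil):].partition(' ')
    return name, line.strip(), body

def foo(line, cell):
    """Toy cell magic: report what it received."""
    return (line, cell)

cell = '#!foo --flags  args\ntext - line 1\ntext - line 2'
name, line, body = split_cell(cell)
result = foo(line, body)
```

A cell consisting only of `#!foo` followed by a body yields an empty string as the first argument, matching the "possibly empty" rest of the line.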

Execution semantics

In practice, cell magics (just as line magics) will be methods of an object
that always has a self.shell attribute pointing to the main IPython
InteractiveShell instance. The execution logic will be the following: IPython
will return to the user, as the output of the cell, the result of the call above
to foo(...), with the only caveat being that it will trap any unhandled
exceptions.

This means that if a user implements a magic meant to only do some rewriting of
the input (for example to support an alternate syntax), this magic will still
be responsible for calling IPython's execution machinery with the transformed
output.

This choice of execution semantics is the only option if we want to allow these
magics to have complete freedom on what they do with their input text
implementation-wise. While there will likely be many magics meant to do simple
transformations of their input meant later for regular execution, others may
dispatch their input to be run by external programs, for example. Therefore
there is no generic output API we can impose on them.
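
As an illustration of these semantics, here is a sketch of a rewriting magic that hands its transformed text back for execution. `FakeShell` is a stand-in so the example runs without IPython, and the `let` syntax and `cmagic_let` name are invented for the example.

```python
class FakeShell(object):
    """Stand-in for InteractiveShell, only so the sketch runs without IPython."""
    def __init__(self):
        self.user_ns = {}

    def run_cell(self, source):
        exec(source, self.user_ns)

class RewritingMagics(object):
    def __init__(self, shell):
        self.shell = shell

    def cmagic_let(self, line, cell):
        """Hypothetical magic: rewrite `let x = 1` as `x = 1`, then execute."""
        rewritten = '\n'.join(
            l[len('let '):] if l.startswith('let ') else l
            for l in cell.splitlines()
        )
        # The magic itself must pass the transformed text to the shell;
        # nothing else will execute it.
        self.shell.run_cell(rewritten)

shell = FakeShell()
RewritingMagics(shell).cmagic_let('', 'let x = 20\nlet y = x + 1')
```

A magic that dispatches to an external program would skip the `run_cell` call entirely, which is why no common output API can be assumed.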

Choice of sigil

The sigil proposed above, #!, follows from the common pattern of unix scripts
whose first line may start with this same sigil (the 'shebang') to indicate
what program is meant to execute the rest of the file. In that regard, cell
magics behave very similarly and therefore it seemed appropriate to rely on
familiarity to make the concept easier to understand for new users.

The major downside of this sigil is that it requires two different
characters, and hence is more annoying to type in cases of repeated use of the
same magic.

Some other possible sigils we can consider: %%, //, >, &, $.

These are either binary operators or invalid syntax, hence they are all
meaningless at the start of a cell. I haven't listed every possible binary
operator, just the ones I felt could provide good readability and ease of
typing. Other possible alternatives can obviously be discussed.
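
A quick way to check how each candidate behaves at the start of a cell is to hand a sample cell to the built-in `compile`. The helper name `is_valid_python` and the sample cell text are made up for this sketch.

```python
def is_valid_python(source):
    """Return True if `source` compiles as a Python module."""
    try:
        compile(source, '<cell>', 'exec')
    except SyntaxError:
        return False
    return True

candidates = ['#!', '%%', '//', '>', '&', '$']
validity = {s: is_valid_python(s + 'foo --flags args\nx = 1\n') for s in candidates}
```

Only the `#!` cell compiles, because Python reads that first line as a comment; every other candidate is rejected with a `SyntaxError`.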

Of these, I find the following as particularly good candidates:

  • %% dovetails nicely with the current % for line magics.
  • $ is fully invalid Python syntax, easy to type and common in programming
    languages.

Implementation-wise, a single-character sigil is a bit more convenient.

Possibilities

If we adopt this proposal, a number of interesting possibilities can be
implemented, such as (ignoring the sigil choice here):

  • timeit, prun: extending these timing/profiling utilities to work on whole
    cells instead of requiring the user to cram everything in one line.
  • cython: allow the user to type cython code and load it automatically (this
    is extremely useful in Sage). Similar things can be done with cython.inline
    and f2py for inlining C/C++ and Fortran.
  • R: a magic could keep a connection to an R interpreter, and allow the user
    to type in blocks of R code, optionally pulling back results to the user's
    python namespace automatically.
  • sh: pass everything to the system shell for execution, without having to
    prepend each line with ! separately.

These are just a few simple examples to motivate the utility of the feature;
ultimately it will be up to the users to develop useful cell magics.

Terminal use

While the terminal client doesn't have the concept of a cell, we can still
accommodate cell level magics in this environment, as follows. If a cell level
magic is detected, the code path in the main IPython object that calls it will
check first if there's any content in the cell itself (terminal clients will
only have the first line, so they will have no cell content). In this case, it
can use raw_input() to ask the user to input the content of the cell, prior
to making the call.

Since this behavior is not desirable in the notebook or qt console, it will be
off by default, and turned on only by the in-process terminal client or out of
process console clients who initialize their own kernel. In all other cases it
will be off, which simply means that a console client who connects to an
existing kernel started by a notebook or Qt console will not have the ability
to type cell magics. This is a very small restriction that is a reasonable
compromise to keep the overall execution model simple and predictable.
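
A rough sketch of this code path, under several assumptions: the `prompt_for_body` flag and both function names are hypothetical, a blank line is assumed to end the cell, and the reader is passed in as a function so that scripted input can stand in for `raw_input()`.

```python
def read_cell_body(read_line, prompt='...: '):
    """Collect lines from `read_line` until a blank line or EOF ends the cell."""
    lines = []
    while True:
        try:
            text = read_line(prompt)
        except EOFError:
            break
        if not text.strip():
            break
        lines.append(text)
    return '\n'.join(lines)

def call_cell_magic(magic, line, cell, read_line, prompt_for_body=False):
    """Ask for the body only when the flag is on and no body was supplied."""
    if not cell and prompt_for_body:
        cell = read_cell_body(read_line)
    return magic(line, cell)

# Scripted input in place of raw_input() so the sketch runs unattended.
scripted = iter(['a = 1', 'b = 2', ''])
result = call_cell_magic(lambda line, cell: (line, cell), '--flags', '',
                         lambda prompt: next(scripted), prompt_for_body=True)
```

With the flag left at its default of `False`, the magic is simply called with an empty body and the reader is never consulted.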

Stacking cell magics

We consider the possibility of 'stacking' multiple cell magics akin to how
stacked decorators work in Python, e.g.:

#!magic1 args...
#!magic2 args...
#!magic3 args...
...
cell body
...

Semantically, these would be applied bottommost-first to match how stacked
decorators work in Python.

However, we must note an important difference here that complicates this idea.
The API of decorators is very simple: they take a function as input and they
return a function. In contrast, we've said that the input to a magic would be
the body of the cell, but the magic can return any kind of output it wants.
This means that, after one cell magic is applied, the result is not necessarily
textual anymore, but instead it can be anything returned by the magic.

For this reason, we will most likely defer the idea of stacked magics until we
have more experience with the basic system to better inform the decision.
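
To make the difficulty concrete, here is a toy sketch of bottommost-first application. The `apply_stacked` helper and both example magics are invented for illustration; raising `TypeError` on a non-text intermediate result is one possible policy, not a decision of this proposal.

```python
def apply_stacked(magics, cell):
    """Apply (magic, line) pairs bottommost-first, feeding each result to the next."""
    result = cell
    for magic, line in reversed(magics):
        if not isinstance(result, str):
            raise TypeError('%s cannot take %s as a cell body'
                            % (magic.__name__, type(result).__name__))
        result = magic(line, result)
    return result

def strip_blank(line, cell):
    """Text-to-text magic: drop blank lines."""
    return '\n'.join(l for l in cell.splitlines() if l.strip())

def count_lines(line, cell):
    """Magic with a non-textual result: an integer."""
    return len(cell.splitlines())

# count_lines is on top, so strip_blank runs first and the chain works.
ok = apply_stacked([(count_lines, ''), (strip_blank, '')], 'a\n\nb')

# Swapping the order feeds an integer to strip_blank, which cannot use it.
try:
    apply_stacked([(strip_blank, ''), (count_lines, '')], 'a\n\nb')
    failed = False
except TypeError:
    failed = True
```

The chain only composes while every intermediate result is still text, which is exactly the guarantee that decorators have and magics lack.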

Implementation details and separation from main IPython object

We propose to stop having the current Magic class be a mixin used in
InteractiveShell, and instead we will refactor the basic Magic to be a
simple class with all the machinery for magic functions, but none implemented.
Then, classes that wish to implement new magics can subclass this base class
and provide their own methods.

A single class can provide more than one line magic and more than one cell
magic if desired; this eliminates the need to create many unnecessary objects
when common functionality can be shared, as well as allowing stateful magics
(such as a hypothetical R one that would keep a live R interpreter) to expose
multiple user-facing entry points with a single copy of the state.

To register line and cell magics, the class will declare two attributes:
line_magics and cell_magics. Each of these will be a list of names, that
must correspond to methods with the actual implementation, using the convention
that line magic methods are named magic_$name and cell magic methods are
named cmagic_$name. A simple example should make it clear:

class MyMagics(Magic):
    line_magics = ['foo', 'bar']
    cell_magics = ['foo', 'baz']  # the same name can be used in both 
                                  # line and cell magics

    def magic_foo(self, line):
        "The line magic %foo"

    def magic_bar(self, line):
        "The line magic %bar"

    def cmagic_foo(self, line, cell):
        "The cell magic #!foo"

    def cmagic_baz(self, line, cell):
        "The cell magic #!baz"

The justification for having these lists is to avoid having to manually scan
the entire namespace of these objects at registration time. A small amount of
duplication of information at object creation time lets us do the registration
in a more efficient manner. We keep the implementation methods organized with
the magic_ and cmagic_ prefixes to ensure there will never be any name
collisions between the functionality of the base class (which may evolve over
time) and any methods users may choose to implement in their own magics.

The signature of the constructor will be such that by default, when a Magic
object is initialized all of its magics get registered, but this behavior can be
overridden to invoke the registration method manually later on.
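
A sketch of how list-driven registration could look. The constructor's `register` keyword, the `register_magics` method and the two `registered_*` dictionaries are assumptions made for this example; the text above does not fix the actual signature.

```python
class Magic(object):
    """Toy base class; the real constructor signature is not specified here."""
    line_magics = []
    cell_magics = []

    def __init__(self, shell=None, register=True):
        self.shell = shell
        self.registered_line = {}
        self.registered_cell = {}
        if register:
            self.register_magics()

    def register_magics(self):
        # Look up only the declared names instead of scanning the namespace.
        for name in self.line_magics:
            self.registered_line[name] = getattr(self, 'magic_' + name)
        for name in self.cell_magics:
            self.registered_cell[name] = getattr(self, 'cmagic_' + name)

class MyMagics(Magic):
    line_magics = ['foo']
    cell_magics = ['foo']

    def magic_foo(self, line):
        return 'line:' + line

    def cmagic_foo(self, line, cell):
        return 'cell:' + line + '|' + cell

m = MyMagics()                       # registered at construction time
deferred = MyMagics(register=False)  # registration postponed
```

Calling `deferred.register_magics()` later performs the same registration by hand.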

Furthermore, new magics can be added to an existing instance at runtime; these
will need to be registered manually. We will update our implementation of the
define_magic method to do this with the same signature (so user code will not
need to be modified in this transition). We will also add a partner
define_cell_magic to do the same thing with cell magics. These two methods
will operate on an instance of the Magic class that will carry no other
manually defined magics, and hence can be used to store all user-added magics
that call these functional entry points.
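
One way these functional entry points might be wired up is sketched below. `UserMagics` and `Shell` are stand-ins invented for the example, and binding the function to the holder instance (so that it receives it as `self`) is an assumption.

```python
import types

class UserMagics(object):
    """Hypothetical holder for magics added through the functional entry points."""
    def __init__(self, shell=None):
        self.shell = shell
        self.line = {}
        self.cell = {}

class Shell(object):
    """Stand-in for the main IPython object."""
    def __init__(self):
        self.user_magics = UserMagics(self)

    def define_magic(self, name, func):
        # Bind func so it receives the holder as `self`, as a method would.
        self.user_magics.line[name] = types.MethodType(func, self.user_magics)

    def define_cell_magic(self, name, func):
        self.user_magics.cell[name] = types.MethodType(func, self.user_magics)

def foo_impl(self, parameter_s=''):
    'My very own magic.'
    return 'foo:' + parameter_s

def bar_impl(self, line, cell):
    return (line, cell, self.shell is not None)

ip = Shell()
ip.define_magic('foo', foo_impl)
ip.define_cell_magic('bar', bar_impl)
```

Existing calls of the form `define_magic('foo', foo_impl)` keep working unchanged under this arrangement.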

As an alternative to explicit lists of names, we could instead use decorators
to tag specific methods as line/cell magics:

class MyMagics(Magic):

    @magic
    def foo(self, line):
        "The line magic %foo"

    @cell_magic
    def foo(self, line, cell):
        "The cell magic #!foo"

Conversion of the current codebase

By now, our Magic objects only manipulate the main IPython object via their
self.shell attribute, so converting the current codebase to this architecture
should be fairly straightforward. We will break up the large Magic object into
the base class and a few (probably no more than 3 or 4) objects carrying all
our current builtin magics. Since we are preserving the magic_ naming
convention we already use, this conversion should be straightforward and very
low-risk.

@ghost ghost assigned fperez Apr 16, 2012
@minrk
Member

minrk commented Apr 16, 2012

Thanks for doing the writeup!

I'm not 100% fond of having magic/cmagic prefixes be separate. The signatures proposed would suggest that cell magics args are a clear superset of line magics, so for instance things like %timeit could be:

class TimingMagic:
    line_magics = ['timeit']
    cell_magics = ['timeit']

    def magic_timeit(self, line, cell=None):
        if cell is None:
            cell = parsed_out_of(line)
        ...

But it also sounds like derivatives would have no need to follow this naming convention - an explicit mapping of magicname:callable is made by define_magic, right? But the default in our Magic constructor would expect this naming convention.

Why not use decorators, instead of manually typed lists, so you could do:

class MyMagic(Magic):

    @magic('foo1')
    def foo(self, line):
        pass

    @cell_magic('cfoo')
    def bar(self, line, cell):
        pass

And these decorators would construct the lists to expose.
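
A rough sketch of how such decorators might build the tables, written for Python 3.6+ because it relies on `__init_subclass__`. The `_magic` attribute and the dictionary layout are assumptions made for the example.

```python
def magic(name):
    """Tag a method as the line magic `name`."""
    def tag(func):
        func._magic = ('line', name)
        return func
    return tag

def cell_magic(name):
    """Tag a method as the cell magic `name`."""
    def tag(func):
        func._magic = ('cell', name)
        return func
    return tag

class Magic(object):
    line_magics = {}
    cell_magics = {}

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Build the magic-name -> method-name tables once, at class creation.
        cls.line_magics = dict(cls.line_magics)
        cls.cell_magics = dict(cls.cell_magics)
        for attr, value in vars(cls).items():
            kind, name = getattr(value, '_magic', (None, None))
            if kind == 'line':
                cls.line_magics[name] = attr
            elif kind == 'cell':
                cls.cell_magics[name] = attr

class MyMagic(Magic):
    @magic('foo1')
    def foo(self, line):
        return 'foo1 ' + line

    @cell_magic('cfoo')
    def bar(self, line, cell):
        return 'cfoo ' + cell
```

The class body is scanned once when the class is defined, so registering an instance only needs to read the two prebuilt tables.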

@takluyver
Member

Another use case I ran into the other day: someone wanted to use a pandas dataframe like this (in the syntax I vaguely envisaged):

%indf mydata[mydata.age > 20]:
    plot(weight, height)    # these are columns from the dataframe

I'm not convinced by the shebang sigil. It makes sense for something like #!R, where the text is actually being run by that executable, but not so much for transforming Python code, to my mind. Users who see #!R might also expect that they can put any shebang line to send the code to that executable. Come to think of it, that idea might be worth considering separately.

Would %% work as a sigil? It makes it clear that line and cell magics are related concepts, and typing a repeated character isn't much harder than a single character.

@jstenar
Member

jstenar commented Apr 16, 2012

How about piped cellmagics? I mean it might be convenient to be able to apply cellmagics after each other.

Will cell magics also work on non-code cells?

@minrk
Member

minrk commented Apr 16, 2012

@jstenar - I would not think magics would apply to anything but code cells, as code cells are the only ones that involve any kernel communication at all. Everything else is strictly in the browser.

@ellisonbg
Member

I am also not convinced that #! is the best sigil. For technical reasons I am assuming we can't use the single % that we use for line magics? I do sort of like the %% idea of @takluyver.

In terms of the pipelining of the cell level magics, I am not sure how that would work. Would you just feed the output of one to the next? This would allow users to build chains of magics similar to how decorators work. Can anyone think of a good use case for this pattern?

@takluyver
Member

In terms of the pipelining of the cell level magics, I am not sure how that would work. Would you just feed the output of one to the next? This would allow users to build chains of magics similar to how decorators work. Can anyone think of a good use case for this pattern?

Perhaps it makes sense to look at the different things cell-level magics might do:

  • Transform Python (or Python-like) code before the cell is executed, like a preprocessor.
  • Affect the context in which the cell runs, like a decorator or context manager (but with extra powers, like manipulating the namespace). timeit and indf are examples of this. Like decorators and context managers, it would make sense to stack these.
  • Handle the cell as something other than Python code, like compiling it with Cython, or running it in R.

@fperez
Member Author

fperez commented Apr 16, 2012

Thanks everyone for the feedback! I'll reply to each now...

On Sun, Apr 15, 2012 at 11:16 PM, Min RK wrote:

Thanks for doing the writeup!

I'm not 100% fond of having magic/cmagic prefixes be separate.  The signatures proposed would suggest that cell magics args are a clear superset of line magics, so for instance things like %timeit could be:

class TimingMagic:
   line_magics = ['timeit']
   cell_magics = ['timeit']

   def magic_timeit(self, line, cell=None):
       if cell is None:
           cell = parsed_out_of(line)
       ...

I think a good justification for keeping them separate is that it's
perfectly OK to have a magic that is only line-oriented, and whose
signature therefore will be (self, line). It seemed to me cleaner
to keep the two types of signatures separate rather than blending them
like this, to simplify the implementation logic needed for people who
may only do one thing or the other. Furthermore, if someone wants the
semantics of interpreting the line to be a little different when
called in a cell manner (which people could have a valid reason to
want), merging the signatures into one makes that more complex.

So my vote is to keep them separate, but I'm happy to reconsider upon
further discussion.

But it also sounds like derivatives would have no need to follow this naming convention - an explicit mapping of magicname:callable is made by define_magic, right?  But the default in our Magic constructor would expect this naming convention.

Correct, the choice of convention was to minimize the amount of typing
needed: if we allowed arbitrary naming in the class version, users
would have to supply a mapping from magic name to method name.

Why not use decorators, instead of manually typed lists, so you could do:

class MyMagic(Magic):

   @magic('foo1')
   def foo(self, line):
       pass

   @cell_magic('cfoo')
   def bar(self, line, cell):
       pass

And these decorators would construct the lists to expose.

Yes, that could be done as well. In fact, with a bit of care, the
argument could even be optional so the decorators could be invoked
'naked', reading the name from the method name itself.

Would people in general prefer this? It does have the issue of
potential name conflicts with the base class, which are completely
avoided if we follow the prefix convention.
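
For illustration, a decorator that accepts an optional name might look like the sketch below. The `_magic` attribute is a made-up tagging convention, not part of any agreed API.

```python
def magic(arg=None):
    """Usable as @magic, @magic() or @magic('name')."""
    def tag(func, name=None):
        func._magic = ('line', name or func.__name__)
        return func
    if callable(arg):
        # Naked use: `arg` is the decorated function itself.
        return tag(arg)
    return lambda func: tag(func, arg)

class MyMagic(object):
    @magic
    def foo(self, line):
        pass

    @magic('renamed')
    def bar(self, line):
        pass
```

Since the naked form takes the magic name from the method name, a magic that happens to share a name with a base-class method would shadow it, which is the conflict the prefix convention avoids.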

As we iterate on these questions, I'll do my best to update the main doc.

@fperez
Member Author

fperez commented Apr 16, 2012

On Mon, Apr 16, 2012 at 3:26 AM, Thomas Kluyver wrote:

Would %% work as a sigil? It makes it clear that line and cell magics are related concepts, and typing a repeated character isn't much harder than a single character.

True, and that's another one I'd considered but forgot to put in.
I'll add it to the list of candidates. It does make the parsing a bit
more complicated since we can't assume that %??? is automatically a
line magic, but it's doable (having a separate sigil makes that part
easier).
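
The extra parsing step amounts to testing the longer sigil first. A minimal sketch, with `classify` as a hypothetical helper name:

```python
def classify(first_line):
    """Return (kind, name, rest), where kind is 'cell', 'line' or None."""
    # Test the longer sigil first; otherwise '%%foo' would be read as a
    # line magic whose name starts with '%'.
    for kind, sigil in (('cell', '%%'), ('line', '%')):
        if first_line.startswith(sigil):
            name, _, rest = first_line[len(sigil):].partition(' ')
            return kind, name, rest.strip()
    return None, None, first_line
```

For example, `classify('%%timeit -n 3')` reports a cell magic named `timeit`, while `classify('%timeit x')` reports a line magic.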

@fperez
Member Author

fperez commented Apr 16, 2012

On Mon, Apr 16, 2012 at 9:01 AM, Jörgen Stenarson wrote:

How about piped cellmagics? I mean it might be convenient to be able to apply cellmagics after each other.

Very good point, I've been thinking about this but also forgot to put
it in the doc. I'll add some thoughts about it in the doc. It would
make the implementation a bit trickier, but it's doable. Though I
might punt on this particular feature on the first implementation,
it's something we can always add later.

Will cell magics also work on non-code cells?

No, I concur with @minrk on this one. This is meant strictly for
cells that go to the kernel for execution, non-code cells are purely
client-side stuff.

@fperez
Member Author

fperez commented Apr 16, 2012

On Mon, Apr 16, 2012 at 12:20 PM, Brian E. Granger wrote:

I am also not convinced that #! is the best sigil. For technical reasons I am assuming we can't use the single % that we use for line magics? I do sort of like the %% idea of @takluyver.

Yes, I'll add %% to the list. Once we settle the main ideas down,
we can have a quick discussion on the sigil choice.

In terms of the pipelining of the cell level magics, I am not sure how that would work. Would you just feed the output of one to the next? This would allow users to build chains of magics similar to how decorators work. Can anyone think of a good use case for this pattern?

Yes, I was exactly thinking of stacked-decorator-style execution. I could imagine writing

%%timeit
%%cython
...

for example. Actually using timing/profiling magics on top of anything else is probably a really useful feature to have, and it would need the ability to stack the cell magics.

@fperez
Member Author

fperez commented Apr 17, 2012

Folks, I've added the new %% sigil as well as a section on stacked magics. Note that upon further reflection, it's not clear at all to me how that would work, see the new text for details.

@ellisonbg
Member

I agree that the stacked magics are a nice idea but that it is not
clear what the return value would be. To be really useful, we would
probably need it to be a uniform return value similar to how
decorators work.

On Mon, Apr 16, 2012 at 6:00 PM, Fernando Perez wrote:

Folks, I've added the new %% sigil as well as a section on stacked magics.  Note that upon further reflection, it's not clear at all to me how that would work, see the new text for details.



Brian E. Granger
Cal Poly State University, San Luis Obispo
bgranger@calpoly.edu and ellisonbg@gmail.com

@fperez fperez closed this as completed in 61eb2ff May 27, 2012
mattvonrocketstein pushed a commit to mattvonrocketstein/ipython that referenced this issue Nov 3, 2014
Refactoring of the magics system and implementation of cell magics.

This PR completely refactors the magic system, finally moving the magic objects to standalone, independent objects instead of being the mixin class we'd had since the beginning of IPython.  Now, a separate base class is provided in IPython.core.magic.Magics that users can subclass to create their own magics.  Decorators are also provided to create magics from simple functions without the need for object orientation.

All builtin magics now exist in a few subclasses that group together related functionality, and the new IPython.core.magics package has been created to organize this into smaller files.

This cleanup was the last major piece of deep refactoring needed from the original 2001 codebase.

Secondly, this PR introduces a new type of magic function, prefixed with `%%` instead of `%`, which operates at the cell level.  A cell magic receives two arguments: the line it is called on (like a line magic) and the body of the cell below it.

Cell magics are most natural in the notebook, but they also work in the terminal and qt console, with the usual approach of using a blank line to signal cell termination.

This PR closes ipython#1611, or IPEP 1, where the design had been discussed.