%%script and %%file magics #1855

Merged
merged 11 commits into from Jun 11, 2012

Conversation

minrk
Member

@minrk minrk commented Jun 4, 2012

As discussed on the ML, a few more basic cell magics, as requested:

  • %%file writes to a file (-f to force overwrite)
  • %%script runs a cell with a particular script

ScriptMagics also defines, by default, a few magics that wrap %%script with common interpreters, such as %%bash. This list, as well as the full path for each interpreter, is configurable.
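
For readers unfamiliar with the idea, here is a minimal sketch of what running a cell with a particular script amounts to: the cell text is piped to an interpreter's stdin. This is not the code from this PR; `run_cell_with_script` is a hypothetical helper, and the running Python interpreter stands in for bash so the sketch works on any platform.

```python
import subprocess
import sys


def run_cell_with_script(command, cell):
    """Feed the cell text to an interpreter on stdin; return (stdout, stderr)."""
    proc = subprocess.run(
        command,
        input=cell,
        capture_output=True,
        text=True,
    )
    return proc.stdout, proc.stderr


# Hypothetical usage: the current Python interpreter plays the role of the script.
out, err = run_cell_with_script([sys.executable], "print('hello from a script cell')")
```

Swapping `[sys.executable]` for `["bash"]` or `["perl"]` would give the behaviour the wrapped magics are described as providing, assuming those programs are on the path.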

I still have to do some testing, particularly on Windows, but it seems ready for public eyes anyway.

For fun, the %%script magic is also presented as %%!, but I am happy to remove that if we don't like it.

minrk added 3 commits June 4, 2012 16:52
Configurables don't allow positional args to init, and HasTraits classes don't identify as `type`.
Base %%script magic, and add wrappers for a few common interpreters.

The list of wrapped magics is configurable.
@Carreau
Member

Carreau commented Jun 5, 2012

About %%file: wouldn't it be counterintuitive not to be able to rerun a cell unless -f is given?
I would have guessed that the intent of the %%file magic was interactive file editing... but then people will start to always use the -f option (which is bad...)

I don't see a nice workaround; all have drawbacks:

  • specific extension that does not require -f to be overwritten
  • special header in the file
  • special lock file near original.
  • keep track in kernel of edited files...

Also, maybe we can add an option to %load that loads the file into the next cell without stripping the first two lines, and prepends %%file (-f?) foo.py.

I guess the best would be not a %%file magic, but a real file cell that could compare its content to the content of an on-disk file.

@minrk
Member Author

minrk commented Jun 5, 2012

%%file is implemented at the request of @ellisonbg on the list, and meant for quickly writing csv data, etc. I imagine it would be like %loadpy, in that it's usually a one-time use magic, that likely wouldn't persist across many runs of a notebook.

Prompt for overwrite makes the most sense as default action, but the notebook still doesn't support stdin, so it won't help there.

@Carreau
Member

Carreau commented Jun 5, 2012

It does not prevent it from being misused by users to write anything to the disk.
Thinking about it, I would add %fopen filename [filename [...]] as a magic to whitelist some files, and %fclose [--all] | [filename [filename ...]] to un-whitelist them, with eventually %%file foo auto-whitelisting foo if it does not exist.

@minrk
Member Author

minrk commented Jun 5, 2012

I think that adds a great deal of unnecessary complexity and state with no real benefit. It would also be confusing to use open/close to mean something other than opening and closing files.

Prompt to overwrite covers safety and efficiency in a standard, totally expected way. We will just have to wait for the notebook to support stdin before it can behave as well as other Interfaces.

@Carreau
Member

Carreau commented Jun 5, 2012

> Prompt to overwrite covers safety and efficiency in a standard, totally expected way. We will just have to wait for the notebook to support stdin before it can behave as well as other Interfaces.

I still think people will see %%file filename as a file editor, and will start always using it as %%file -f. I'm fine with the -f option, but I think we should strongly support updating the same file without having to use -f explicitly, as long as the file has been created by %%file.

What made me come up with %fopen is the question of how to keep track of these open %%file targets across the notebook.

@takluyver
Member

I agree that people will probably start using it as a lightweight editor, but I don't much mind them using -f to achieve that.

@ellisonbg
Member

My initial thought is to do away with the -f option to %%file and always rewrite the file. These files are going to tend to be small, so I don't see the problem in rewriting them each time. I should note that the other problem with the -f flag is that it makes the %%file magic stateful depending on whether or not the file has been written previously. I think that in general we should try to make cell magics stateless.

@minrk
Member Author

minrk commented Jun 5, 2012

I could go the same way as %reset, using ask_yes_no, defaulting to yes on StdinNotImplemented. That seems to make the most sense averaging across environments.
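
A minimal sketch of that fallback, assuming a frontend signals a missing stdin by raising an exception. `StdinNotSupported`, `confirm`, and the two `ask` callables are hypothetical stand-ins for illustration, not IPython's own `ask_yes_no` or its error class.

```python
class StdinNotSupported(Exception):
    """Hypothetical stand-in for the error a frontend raises when it cannot prompt."""


def confirm(prompt, ask, default=True):
    """Ask a yes/no question, falling back to `default` when prompting is unavailable."""
    try:
        return ask(prompt)
    except StdinNotSupported:
        return default


def terminal_ask(prompt):
    # Pretend the user answered "no" at a terminal prompt.
    return False


def notebook_ask(prompt):
    # Pretend the frontend has no stdin channel at all.
    raise StdinNotSupported(prompt)


terminal_answer = confirm("Overwrite data.csv?", terminal_ask)
notebook_answer = confirm("Overwrite data.csv?", notebook_ask)
```

With a default of yes, a frontend that cannot prompt behaves as if the overwrite had been confirmed, while a terminal user still gets to decline.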

I don't expect people would use %%file as a lightweight editor, since it's write-only. It's the Python equivalent of echo "$cell" > $filename in bash.

I also agree with @ellisonbg and @takluyver that force overwrite with this magic is not problematic, because intent is clear.

@minrk
Member Author

minrk commented Jun 5, 2012

I added ask_yes_no like %reset, so %%file implies -f in the notebook.

I also added -a as a flag for amending.
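
The two write modes can be sketched in plain Python, assuming -a simply maps to append mode. `write_cell` is a hypothetical helper for illustration, not the magic's actual implementation.

```python
import io
import os
import tempfile


def write_cell(filename, cell, append=False):
    """Write the cell text to `filename`, replacing it unless `append` is set."""
    mode = "a" if append else "w"
    with io.open(filename, mode, encoding="utf-8") as f:
        f.write(cell)


# Work in a scratch directory so the example never touches real files.
path = os.path.join(tempfile.mkdtemp(), "data.csv")
write_cell(path, "a,b\n1,2\n")            # first run creates the file
write_cell(path, "3,4\n", append=True)    # amending adds to the end
with io.open(path, encoding="utf-8") as f:
    contents = f.read()
```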

@Carreau
Member

Carreau commented Jun 5, 2012

> I also agree with @ellisonbg and @takluyver that force overwrite with this magic is not problematic, because intent is clear

Only if you read the docs. To me, %%file is not that clear.
I'm more concerned about someone downloading a .ipynb from a colleague and 'run all' ...

If it is so temporary, maybe it could then write the file to a real temp location, and return a handle or a path to it.
Then call it %%tempfile, and let's use it as

filehandle, filename = %%tempfile
...
...

or

%%tempfile variable_to_which_assign_file_handle variable_to_which_assign_file_path
...
...

If you give the user a magic that takes a filename to write to as an argument, they will use it to edit any kind of file, by doing a %load and then prepending %%file to the first line, and there will be data loss.

@minrk
Member Author

minrk commented Jun 5, 2012

> I'm more concerned about someone downloading a .ipynb from a colleague and 'run all' ...

If you give someone a notebook with the cell:

with open("filename", "w") as f:
    f.write("garbage")

or !rm -rf foo

You have the exact same data loss problem. Should we disallow write-access to the filesystem? I really don't see a need to protect people from themselves this much.

> If it is so temporary, maybe it could then write the file to a real temp location

The file is not temporary, but the use of the magic is, just like %loadpy. %loadpy is a run-once magic for initializing a cell. It doesn't make sense to keep it in your notebook (except for demonstration purposes, after deleting its results). The exact same is true of %%file, but in the opposite direction (cell to file, rather than file/url to cell).

I think your %%tempfile magic idea is a good one, but it is not what %%file is for.

> If you give the user a magic that takes a filename to write on as argument, they will use it to edit any kind of files
> by doing a %load [then] [prepending] %%file to the first line and [there] will be data loss.

Where is the data loss? There will be no change in the file unless the user makes edits, in which case those edits are written to the file in place of what was there, exactly as expected.

@tkf
Contributor

tkf commented Jun 8, 2012

How about adding a --bg flag to %%script? It would allow you to run a server process from IPython, which should be useful when experimenting with server/client code.

@fperez
Member

fperez commented Jun 9, 2012

My interpretation of %%file had also been precisely that it would always hard-overwrite, and I think that's OK. The intent is simply 'dump the content of this cell to the filesystem as-is, every time this cell is run', end of story. If somehow people want it not to overwrite once they've created the file (maybe because they intend to later edit it manually), they can just delete the cell once it's done its job.

So I think %%file is perfectly OK with being always, unconditionally destructive, and I think it would be an API mistake to add any complexity of any other kind.

I also like the idea of --bg, but I don't know if it'll handle output correctly in that case. Min, would it know to continue redirecting stdout/err back to the originating cell as the user continues?

At the Strata conference, there was a demo from Microsoft of a JS web shell (not full notebook, just more terminal-like flow but in a browser). It wasn't very impressive overall, but it did handle multiple long-running jobs extremely well. Each job would keep a little JS spinner wheel at the end of its stdout/err, cleanly and discreetly indicating the job was still pulling output, and it would update and move the wheel to the end as new output arrived from several jobs. Clean, slick, not distracting and very informative, really good UI/X.

Finally, just a thought: I can see the shell one being used a lot. Should we go for purity and call it bash or convenience and shorten it to sh?

@takluyver
Member

If we go with sh, should it refer to the user's default $SHELL?

In favour: if the user has chosen a different default shell, they might reasonably expect 'shell' to refer to that. Against: if I write a notebook and send it to someone with a different default shell, it could break.

If we go with sh referring to $SHELL as a convenience, bash could still be used for definitely-bash cells. But perhaps that would be confusing, as for many users there would be no difference.

@minrk
Member Author

minrk commented Jun 9, 2012

> My interpretation of %%file had also been precisely that it would always hard-overwrite, and I think that's OK. The intent is simply 'dump the content of this cell to the filesystem as-is, every time this cell is run', end of story. If somehow people want it not to overwrite once they've created the file (maybe because they intend to later edit it manually), they can just delete the cell once it's done its job.

So should I remove the ask y/n from frontends which support it?

> So I think %%file is perfectly OK with being always, unconditionally destructive, and I think it would be an API mistake to add any complexity of any other kind.
>
> I also like the idea of --bg, but I don't know if it'll handle output correctly in that case. Min, would it know to continue redirecting stdout/err back to the originating cell as the user continues?

The way we associate outputs with cells makes this impossible. Output is always associated with the most recent cell, so a background command will output to the current cell, whichever that is, totally regardless of which cell originated it. The only sensible option to me is either to suppress the output entirely, or store it in a variable.

> At the Strata conference, there was a demo from Microsoft of a JS web shell (not full notebook, just more terminal-like flow but in a browser). It wasn't very impressive overall, but it did handle multiple long-running jobs extremely well. Each job would keep a little JS spinner wheel at the end of its stdout/err, cleanly and discreetly indicating the job was still pulling output, and it would update and move the wheel to the end as new output arrived from several jobs. Clean, slick, not distracting and very informative, really good UI/X.
>
> Finally, just a thought: I can see the shell one being used a lot. Should we go for purity and call it bash or convenience and shorten it to sh?

There is no 'shell' one; I don't know what you are referring to. There are already bash, sh, perl, etc., which map directly to their respective commands. The list of exposed script magics is configurable, so it's one line of config to add zsh or any other command-line program.
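
As a rough illustration of how a configurable list of names could turn into per-interpreter wrappers, here is a sketch. `run_script`, `make_script_magics`, and the `paths` override are hypothetical names, not the ScriptMagics API; the running Python interpreter is used for the call so the example does not depend on bash or perl being installed.

```python
import subprocess
import sys
from functools import partial


def run_script(command, cell):
    """Run the cell text through the given command and return its stdout."""
    proc = subprocess.run(command, input=cell, capture_output=True, text=True)
    return proc.stdout


def make_script_magics(names, paths=None):
    """Build one wrapper per interpreter name; `paths` optionally overrides the command."""
    paths = paths or {}
    return {name: partial(run_script, [paths.get(name, name)]) for name in names}


# Hypothetical configuration: three names, with an explicit path for one of them.
magics = make_script_magics(
    ["bash", "perl", "python"],
    paths={"python": sys.executable},
)
result = magics["python"]("print(6 * 7)")
```

Adding another interpreter in this sketch is a matter of appending its name to the list, which mirrors the one-line-of-config point made above.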

I can add shell that maps to $SHELL, but I haven't done this. I don't think we should have sh map to something other than sh though.

@fperez
Member

fperez commented Jun 9, 2012

On Sat, Jun 9, 2012 at 5:14 AM, Thomas Kluyver wrote:

> Against: if I write a notebook and send it to someone with a different default shell, it could break.

Good point, and probably enough to shut it down.

Every potentially convenient feature that has a silent, surprising and
nasty failure mode should always be shot down on those grounds, even
if the convenience appears tempting.

Cheers,

f

@fperez
Member

fperez commented Jun 9, 2012

On Sat, Jun 9, 2012 at 9:58 AM, Min RK wrote:

> So should I remove the ask y/n from frontends which support it?

That would be my vote.

In fact, now that we have the notebook so front and center, I've been
mulling if we shouldn't make all our default aliases for things like
'cp' and 'rm' default to their non-interactive forms. Only the
in-process terminal client should override that explicitly to switch
to '-i' forms. Thoughts? It would help for example with the second
issue @ctb raises here:
http://ivory.idyll.org/blog/jun-12/teaching-with-ipynb.html.

> The way we associate outputs with cells makes this impossible. Output is always associated with the most recent cell, so a background command will output to the current cell, whichever that is, totally regardless of which cell originated it. The only sensible option to me is either to suppress the output entirely, or store it in a variable.

That's what I recalled, bummer. I have the sketch of an idea for
this, but such thoughts will have to be post-0.13.

There may be an intermediate solution for the --bg idea though, if
we follow the asyncresult pattern: we could offer --bg X as a flag,
where X is a name that will hold the wrapper of the backgrounded
process. The code for most of this already exists in
lib/backgroundjobs.py, so perhaps that's a good option.

In the future if we find a cleaner solution, we can always then offer
--bg without names as the in-place fancy solution, and it would be
backwards compatible.
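
A minimal sketch of the named-background idea, assuming the handle is simply stored in a namespace dictionary. `run_in_background` and `user_ns` are hypothetical; the real code would presumably build on lib/backgroundjobs.py rather than a bare `subprocess.Popen`.

```python
import subprocess
import sys


def run_in_background(command, cell, name, namespace):
    """Start the script without waiting and store the process handle under `name`."""
    proc = subprocess.Popen(
        command,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # merge stderr so only one pipe needs draining
        text=True,
    )
    proc.stdin.write(cell)
    proc.stdin.close()  # end of input, so the interpreter starts running the cell
    namespace[name] = proc
    return proc


user_ns = {}
run_in_background([sys.executable], "print('server ready')", "server", user_ns)

# Later, from any cell, the caller decides when to collect the output.
output = user_ns["server"].stdout.read()
returncode = user_ns["server"].wait()
```

Because the output is reached through the stored handle, it never has to be attached to whichever cell happens to be current.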

How does this sound?

> > Finally, just a thought: I can see the shell one being used a lot. Should we go for purity and call it bash or convenience and shorten it to sh?
>
> There is not a shell one, I don't know what you are referring to. There is already bash, sh, perl, etc. which map directly to their respective commands. The list of exposed script magics is configurable, so it's one line of config to add zsh, any other command line program.
>
> I can add shell that maps to $SHELL, but I haven't done this. I don't think we should have sh map to something other than sh though.

The advantage of 'shell' would be, I guess, to be a simple way of
writing somewhat-portable (posix-windows) cell magics to call to the
OS. There's a small but non-zero overlap between windows and posix
commands that work, and this would basically be our way to spell
os.system. I'm OK if you like that idea, but won't push for it.

I agree with you and Thomas on not aliasing sh ambiguously.

Cheers,

f

@minrk
Member Author

minrk commented Jun 9, 2012

On Jun 9, 2012, at 13:38, Fernando Perez wrote:

> On Sat, Jun 9, 2012 at 9:58 AM, Min RK wrote:
>
> > So should I remove the ask y/n from frontends which support it?
>
> That would be my vote.
>
> In fact, now that we have the notebook so front and center, I've been
> mulling if we shouldn't make all our default aliases for things like
> 'cp' and 'rm' default to their non-interactive forms. Only the
> in-process terminal client should override that explicitly to switch
> to '-i' forms. Thoughts? It would help for example with the second
> issue @ctb raises here:
> http://ivory.idyll.org/blog/jun-12/teaching-with-ipynb.html.
>
> > The way we associate outputs with cells makes this impossible. Output is always associated with the most recent cell, so a background command will output to the current cell, whichever that is, totally regardless of which cell originated it. The only sensible option to me is either to suppress the output entirely, or store it in a variable.
>
> That's what I recalled, bummer. I have the sketch of an idea for
> this, but such thoughts will have to be post-0.13.
>
> There may be an intermediate solution for the --bg idea though, if
> we follow the asyncresult pattern: we could offer --bg X as a flag,
> where X is a name that will hold the wrapper of the backgrounded
> process. The code for most of this already exists in
> lib/backgroundjobs.py, so perhaps that's a good option.
>
> In the future if we find a cleaner solution, we can always then offer
> --bg without names as the in-place fancy solution, and it would be
> backwards compatible.
>
> How does this sound?

Sure, this is exactly what I meant above by "store it in a variable". I will give it a try, and follow the growing convention of --out foo, so storing output is independent of backgrounding.
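
A sketch of what storing output by name could look like, independent of backgrounding. `run_script` and its `out`/`err` parameters are hypothetical stand-ins for the proposed --out flag (plus an analogous one for stderr, which is an assumption here), and `user_ns` is a plain dictionary standing in for the user namespace.

```python
import subprocess
import sys


def run_script(command, cell, namespace, out=None, err=None):
    """Run the cell and store stdout/stderr under the given names, if any."""
    proc = subprocess.run(command, input=cell, capture_output=True, text=True)
    if out is not None:
        namespace[out] = proc.stdout
    if err is not None:
        namespace[err] = proc.stderr
    return proc.returncode


user_ns = {}
cell = "import sys\nprint('to stdout')\nsys.stderr.write('to stderr')\n"
status = run_script([sys.executable], cell, user_ns, out="foo", err="bar")
```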

> > > Finally, just a thought: I can see the shell one being used a lot. Should we go for purity and call it bash or convenience and shorten it to sh?
>
> > There is not a shell one, I don't know what you are referring to. There is already bash, sh, perl, etc. which map directly to their respective commands. The list of exposed script magics is configurable, so it's one line of config to add zsh, any other command line program.
>
> > I can add shell that maps to $SHELL, but I haven't done this. I don't think we should have sh map to something other than sh though.
>
> The advantage of 'shell' would be, I guess, to be a simple way of
> writing somewhat-portable (posix-windows) cell magics to call to the
> OS. There's a small but non-zero overlap between windows and posix
> commands that work, and this would basically be our way to spell
> os.system. I'm OK if you like that idea, but won't push for it.

We already have %sx/! for this, and that can be made available as a cell magic for calling os.system. If we want a longer name, 'system' seems most logical for this one, as opposed to %%script and its descendants, all of which so far are explicit in which interpreter to use.

> I agree with you and Thomas on not aliasing sh ambiguously.
>
> Cheers,
>
> f



@fperez
Member

fperez commented Jun 9, 2012

On Sat, Jun 9, 2012 at 3:05 PM, Min RK wrote:

> Sure, this is exactly what I meant above by "store it in a variable". I will give it a try, and follow the growing convention of --out foo, so storing output is independent of backgrounding.

Great, thanks!

> We already have %sx/! for this, and that can be made available as a cell magic for calling os.system. If we want a longer name, 'system' seems most logical for this one, as opposed to %%script and its descendants, all of which so far are explicit in which interpreter to use.

Back in the terminal days we pushed pretty hard for super-short names
b/c we were trying to save typing, and all input was ephemeral. The
notebook changes that design brief quite a bit, though obviously we
don't want to break backwards compatibility willy-nilly. I guess
extending %sx for now is the least disruptive change.

There's a twist with the 'system' name: we have our own ip.system
because of the whole pexpect issue. And as I mentioned recently, that
can be a life-saver when talking to code that goes crazy if not in a
tty (like git shortlog). So I don't know if we want to say 'system
is like os.system but not really' :)
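
To make the tty point concrete: a child process whose output is connected to a pipe does not see a terminal, which is why some tools change their behaviour. The sketch below only demonstrates that check with the standard library; it does not use pexpect or ip.system, and the variable names are illustrative.

```python
import subprocess
import sys

# Ask a child process whether its stdout looks like a terminal.
probe = "import sys; print(sys.stdout.isatty())"
piped = subprocess.run(
    [sys.executable, "-c", probe],
    capture_output=True,  # stdout is a pipe here, not a terminal
    text=True,
)
child_saw_tty = piped.stdout.strip()
```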

Cheers,

f

@takluyver
Member

On 9 June 2012 21:38, Fernando Perez wrote:

> In fact, now that we have the notebook so front and center, I've been
> mulling if we shouldn't make all our default aliases for things like
> 'cp' and 'rm' default to their non-interactive forms. Only the
> in-process terminal client should override that explicitly to switch
> to '-i' forms. Thoughts?

I'm not sure if even the terminal should have the -i flag by
default. If the user doesn't need to confirm "rm foo" at a bash
prompt, it seems OK to have the same level of danger at an ipython
prompt. What was the reason for using -i in the first place?

@fperez
Member

fperez commented Jun 10, 2012

I'd be ok with that too. I added -i over 10 years ago, just thinking of being 'beginner friendly', whatever that meant in my mind back then :) Not a decision we need to consider ourselves beholden to at this point, by any means.

@minrk
Member Author

minrk commented Jun 10, 2012

I actually have alias rm="rm -i" in my bash profile, because I think deletion without confirmation is just a bad idea altogether, but it's certainly appropriate for IPython to match the underlying behavior.

@fperez - I do not intend to use the pexpect code for %%script here, so perhaps the istty-based output would be a good reason for people to use %sx/%system instead of %%script in some cases.

I'm working on this right now, so just to clarify what I should be doing:

  • remove -i from %%file, so it always clobbers / appends (%%file essentially maps directly to io.open)
  • add --out/err for %%script magics to store stdout/err
  • add --bg for backgrounding, using lib.backgroundjobs (the ScriptMagics object should have a job_manager, if I read correctly)
  • alias %sx to %system, and make it a cell magic
  • bring in updated %quickref to show short-hand for %sc and %sx #1215 changes to sx docstrings

@fperez
Member

fperez commented Jun 10, 2012

On Sun, Jun 10, 2012 at 2:08 PM, Min RK wrote:

> I actually have alias rm="rm -i" in my bash profile, because I think deletion without confirmation is just a bad idea altogether, but it's certainly appropriate for IPython to match the underlying behavior.

Sounds good. People can always alias themselves (like you did). I
should note I also have aliased rm to rm -i in my bashrc,
forcing me to manually use -f when I mean to. I'm typically not a
'safety everywhere' guy, but rm is destructive enough on *nix that I
made this concession to safety years ago, and I don't regret it.

Let's not forget to warn of this change loudly though; some people may
get bitten by it. A note in the what's-changed doc would be good so we
don't miss it in the release notes.

> @fperez - I do not intend to use the pexpect code for %%script here, so perhaps the istty-based output would be a good reason for people to use %sx/%system instead of %%script in some cases.

Sounds good.

> I'm working on this right now, so just to clarify what I should be doing:
>
>   • remove -i from %%file, so it always clobbers / appends (%%file essentially maps directly to io.open)
>   • add --out/err for %%script magics to store stdout/err
>   • add --bg for backgrounding, using lib.backgroundjobs (the ScriptMagics object should have a job_manager, if I read correctly)
>   • alias %sx to %system, and make it a cell magic
>   • bring in updated %quickref to show short-hand for %sc and %sx #1215 changes to sx docstrings

All of that sounds perfect. Thanks!!

minrk and others added 4 commits June 10, 2012 15:16
* add --out/err for storing output
* add --bg for backgrounding scripts
also exposed as aliases %%system and %%!
@minrk
Member Author

minrk commented Jun 10, 2012

Okay, I think all bullets are addressed, and demo notebook added. Should be just docs/tests left.

@fperez
Member

fperez commented Jun 11, 2012

Beautiful! Merging now, awesome...

fperez added a commit that referenced this pull request Jun 11, 2012
%%script and %%file magics

* `%%file` writes to a file (-f to force overwrite)
* `%%script` runs a cell with a particular script

The ScriptMagics also defines a few common magics that wrap `%%script` with common interpreters, such as `%%bash` by default, and this list, as well as the full path for each, is configurable.

For fun, the `%%script` magic is also presented as `%%!`.
@fperez fperez merged commit 18f728c into ipython:master Jun 11, 2012
@minrk minrk deleted the moremagics branch March 31, 2014 23:36
mattvonrocketstein pushed a commit to mattvonrocketstein/ipython that referenced this pull request Nov 3, 2014

7 participants