Always save the .py file to disk next to the .ipynb #1060

Closed
fperez opened this issue Nov 28, 2011 · 16 comments

@fperez
Member

fperez commented Nov 28, 2011

The more I think about it, the more it seems to me that we should always write the .py file next to the notebook. This would allow more library-like uses of the code with a much easier workflow than now (where one must go through the download button, which puts the files in a different place, etc.). Furthermore, by putting at the top:

script = __name__ == '__main__'

one could then write the 'script parts' of the notebook with

if script:
  ... code

and the notebook could then be 'imported'. Since it would be a normal .py file, it would make it much easier to reuse code from one notebook to another via normal import mechanisms.
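
As a minimal sketch of what such an exported file might look like: the file name `foo.py` and the function `double` below are made up for illustration, and only the `script` flag comes from the proposal above.

```python
# foo.py: hypothetical export of a notebook named foo.ipynb

# True when the file is executed as a script, False when it is imported.
script = __name__ == '__main__'


def double(x):
    """Library-style code: always defined, so it is safe to import."""
    return 2 * x


if script:
    # The 'script parts' of the notebook: these are skipped on import.
    print(double(21))
```

Another notebook or script in the same directory could then run `from foo import double` without triggering the guarded block.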

We may want later to consider other forms of cross-notebook referencing and loading, but I think that having an out-of-the-box mechanism to reuse notebooks as normal libraries would be extremely valuable.

@ellisonbg, @minrk, do you think it's easy to add a flag to do this automatically? I'd vote for having it always on, but I guess one could also want to turn it off, in which case it would need to be configurable. Ideally per-notebook, stored in the metadata, but initially it could be a server-level flag stored in the config file.

@takluyver
Member

Similar to the discussion about ipynbo files, I'd be wary of creating extra files in the working directory by default. But I agree that there should be an easy way to make importable .py files.

@rkern rkern closed this as completed Nov 28, 2011
@rkern rkern reopened this Nov 28, 2011
@rkern
Contributor

rkern commented Nov 28, 2011

Sorry. Wrong button.

@ellisonbg
Member

I think this idea is worth thinking about as we really do want the notebooks to be usable for more "library" style code. But I do worry about what @takluyver brought up about adding extra files to the working directory. We would have to be very careful about not overwriting existing .py files a user has. Are there other ways we could enable this type of library mode?

But the implementation of this idea should not be too difficult if we want it on always. Making it configurable from the notebook UI would add a bit of work. Making it configurable using the regular IPython config system would not be too bad.
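
One possible way to address the overwrite concern, sketched under assumptions: generated scripts start with a marker line, and a file without that marker is treated as user-authored and left alone. The marker text and the helper name `write_script_safely` are hypothetical and not part of IPython.

```python
from pathlib import Path

# Hypothetical first line identifying files generated from a notebook.
MARKER = "# Generated from a notebook; edits will be overwritten.\n"


def write_script_safely(py_path, source):
    """Write `source` to `py_path` unless a user-authored file is there.

    Returns True if the file was written, False if it was left untouched.
    """
    py_path = Path(py_path)
    if py_path.exists():
        with py_path.open(encoding="utf-8") as f:
            first_line = f.readline()
        if first_line != MARKER:
            # No marker: assume the user wrote this file, so do not clobber it.
            return False
    py_path.write_text(MARKER + source, encoding="utf-8")
    return True
```

A marker check is only a heuristic; a stricter design could refuse to write whenever the target already exists and require an explicit override.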

@minrk
Member

minrk commented Nov 28, 2011

Yes, implementing this, especially if it's server-configurable or not configurable, should be very easy (~2 extra lines in the save routine). I do really worry about clutter, but at the same time this could be quite useful.
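
A rough sketch of what those extra lines could amount to, not the actual IPython save routine. It assumes the present-day notebook JSON layout (a top-level `cells` list whose entries have `cell_type` and `source`), which differs from the format in use when this thread was written; `save_notebook`, `notebook_to_source`, and the `save_script` flag are illustrative names.

```python
import json
from pathlib import Path


def notebook_to_source(nb):
    """Join the code-cell sources of a notebook dict into one script."""
    chunks = []
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code":
            source = cell.get("source", "")
            if isinstance(source, list):  # JSON may store source as a list of lines
                source = "".join(source)
            chunks.append(source)
    return "\n\n".join(chunks) + "\n"


def save_notebook(nb, ipynb_path, save_script=False):
    """Save the notebook and, if requested, a .py file right next to it."""
    ipynb_path = Path(ipynb_path)
    ipynb_path.write_text(json.dumps(nb, indent=1), encoding="utf-8")
    if save_script:
        # The 'extra lines': foo.ipynb -> foo.py in the same directory.
        py_path = ipynb_path.with_suffix(".py")
        py_path.write_text(notebook_to_source(nb), encoding="utf-8")
```

Cells containing IPython-only syntax such as `%magic` lines would not be valid Python in the resulting file; this sketch does not handle them.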

I know that for my work, if every IPython notebook created 4 separate files next to each other (currently proposed: ipynb, ipynbo, autosaved backup, python script), I would be very angry. That level of clutter is just not acceptable. We should consider the cost of adding unrequested files in the working dir very high indeed.

Just like @takluyver recommended using something akin to the __pycache__ for output caching, and I hope very much that we do something similar for autosaved backups, perhaps the .py scripts should go in a location that we add to sys.path? That's obviously adding a layer between using notebooks as modules and using scripts as modules, so I don't know if it would be ideal.

Another thing we should probably do, regardless of where we come down on this - support %run notebook.ipynb. That certainly shouldn't rely on the .py file already having been created.
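
To illustrate running a notebook without any .py file on disk, here is a small sketch that executes the code cells straight from the .ipynb JSON. It is not how `%run` is implemented; `run_notebook` is a hypothetical helper and the present-day `cells` layout is assumed.

```python
import json


def run_notebook(path, namespace=None):
    """Execute the code cells of a notebook in order; return the namespace."""
    with open(path, encoding="utf-8") as f:
        nb = json.load(f)
    ns = {"__name__": "__main__"} if namespace is None else namespace
    for index, cell in enumerate(nb.get("cells", [])):
        if cell.get("cell_type") != "code":
            continue
        source = cell.get("source", "")
        if isinstance(source, list):
            source = "".join(source)
        # A per-cell pseudo-filename lets tracebacks name the failing cell.
        code = compile(source, f"{path}#cell{index}", "exec")
        exec(code, ns)
    return ns
```

Because nothing is written to disk, this avoids the clutter question entirely, though tracebacks then refer to a pseudo-filename rather than a file an editor can open.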

Of course, since I would imagine this to be a minority use case (if not an uncommon one), if putting the script adjacent to the notebook is off by default, I wouldn't object to that being the behavior.

@ellisonbg
Member

I agree with @minrk 's comments about not wanting to clutter the
working directory. I guess the thing that keeps coming back to me is
if there is a better way of allowing notebooks to function as
importable modules.

On Mon, Nov 28, 2011 at 3:44 PM, Min RK wrote:

> yes, implementing this, especially if it's server-configurable or not configurable should be very easy (~2 extra lines in the save routine).  I do really worry about clutter, but at the same time this could be quite useful.

> I know that for my work, if every IPython notebook created 4 separate files next to each other (currently proposed: ipynb, ipynbo, autosaved backup, python script), I would be very angry.  That level of clutter is just not acceptable.  We should consider the cost of adding unrequested files in the working dir very high indeed.

One option would be to create a visible subdirectory for all of the
.py files. Then at least the clutter consists of a single entry per
working directory rather than a file per notebook.

> Just like @takluyver recommended using something akin to the __pycache__ for output caching, and I hope very much that we do something similar for autosaved backups, perhaps the .py scripts should go in a location that we add to sys.path?  That's obviously adding a layer between using notebooks as modules and using scripts as modules, so I don't know if it would be ideal.

Having such a cache for the .py files also makes it difficult to
version control the entire thing in a clean way. If someone was using
the .py files as library modules, they would certainly want those to
be visible in version control.

> Another thing we should probably do, regardless of where we come down on this - support %run notebook.ipynb.  That certainly shouldn't rely on the .py file already having been created.

Yep, doing that won't be too difficult.

> Of course, since I would imagine this to be a minority use case (if not an uncommon one), if putting the script adjacent to the notebook is off by default, I wouldn't object to that being the behavior.

I agree it should be off by default, but I guess I am not quite
convinced this is the way to go. It feels like we are still missing
something about how all of this would work.


@fperez
Member Author

fperez commented Nov 29, 2011

On Mon, Nov 28, 2011 at 3:42 AM, Robert Kern wrote:

> I would probably opt for providing a script that extracts a .py file from the notebook file and a GUI option for exporting to a .py file.

Yes, we should certainly have a command-line form of this. Note that
the GUI option already exists: if you click on the download button you can
select 'py' as the format.

@fperez
Member Author

fperez commented Nov 29, 2011

On Mon, Nov 28, 2011 at 3:19 PM, Brian E. Granger wrote:

> I think this idea is worth thinking about as we really do want the notebooks to be usable for more "library" style code.  But I do worry about what @takluyver brought up about adding extra files to the working directory.  We would have to be very careful about not overwriting existing .py files a user has.  Are there other ways we could enable this type of library mode?

The clutter issue is indeed a problem, and this should certainly never
be the default behavior. One of the reasons I went with this approach
(rather than files hidden in a subdir or similar) was that I'd like to
see a very low barrier to code reuse between notebooks. I'm already
finding myself doing a lot more copy/pasting than I'd like between
notebooks, simply because it's hard to import from a notebook. So I
think we need to find a way to make that process as natural as
importing from a script, and I'd rather not introduce special
functions to have to spell it like

from IPython.something import importnb
foo = importnb('foo')

also because that would force us to reimplement a fair amount of machinery.

> But the implementation of this idea should not be too difficult if we want it on always.  Making it configurable from the notebook UI would add a bit of work.  Making it configurable using the regular IPython config system would not be too bad.

I think that going with just our config system for now is perfectly
OK. Eventually we'll work out a system for configuring UI stuff, but
for now a regular server-side flag is more than enough.

@fperez
Member Author

fperez commented Nov 29, 2011

On Mon, Nov 28, 2011 at 3:44 PM, Min RK wrote:

> I know that for my work, if every IPython notebook created 4 separate files next to each other (currently proposed: ipynb, ipynbo, autosaved backup, python script), I would be very angry.  That level of clutter is just not acceptable.  We should consider the cost of adding unrequested files in the working dir very high indeed.

I think the .py files have a stronger justification for being put next
to the notebooks, and that is to be importable from other notebooks.
If we put them somewhere else, the import logic gets more complicated.
The others could certainly go in directories elsewhere, and all of
this should certainly be off by default.

> Just like @takluyver recommended using something akin to the __pycache__ for output caching, and I hope very much that we do something similar for autosaved backups, perhaps the .py scripts should go in a location that we add to sys.path?  That's obviously adding a layer between using notebooks as modules and using scripts as modules, so I don't know if it would be ideal.

Yes, that's why I don't much like the idea of putting them elsewhere;
breaking the natural flow of the import logic seems like a real
drawback to me.

> Another thing we should probably do, regardless of where we come down on this - support %run notebook.ipynb.  That certainly shouldn't rely on the .py file already having been created.

That, I completely agree on.

@fperez
Member Author

fperez commented Nov 29, 2011

On Mon, Nov 28, 2011 at 7:15 PM, Brian E. Granger wrote:

> I agree with @minrk 's comments about not wanting to clutter the
> working directory.  I guess the thing that keeps coming back to me is
> if there is a better way of allowing notebooks to function as
> importable modules.

I still haven't really been able to find a way that makes import
semantics natural, and I don't like the idea of messing with sys.path
under the hood. If we put the .py next to the .ipynb, all the natural
intuition we have about how import works continues to be valid,
without needing to add extra complexity.

> I agree it should be off by default, but I guess I am not quite
> convinced this is the way to go.  It feels like we are still missing
> something about how all of this would work.

I'd certainly love to find a solution that makes everyone happy, since
I see some reservations still from you guys. I do think that in the
long run, having this (in some form) would really be a killer feature.
The more I use the nb for 'real work', the more I find myself wanting
this, so it's probably an indication that in the long run it will
really be important.

@minrk
Member

minrk commented Nov 29, 2011

> I still haven't really been able to find a way that makes import
> semantics natural, and I don't like the idea of messing with sys.path
> under the hood. If we put the .py next to the .ipynb, all the natural
> intuition we have about how import works continues to be valid,
> without needing to add extra complexity.

I think what's probably best, if we want to do anything magic, is just save the script in a special location that IPython has prepended to sys.path, just like we do with extensions. Anything more complicated than that may not be worth it.
Otherwise, saving the script adjacent to the notebook as a non-default behavior is the easiest and most easily understood by users, as Fernando points out.
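
The "special location on sys.path" option could look roughly like the sketch below. The directory name `nbscripts`, the module name `nb_demo_mod`, and the helper `publish_module` are assumptions for illustration; a temporary directory stands in for whatever location IPython would manage.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def publish_module(name, source, script_dir):
    """Write `source` as <script_dir>/<name>.py and make it importable."""
    script_dir = Path(script_dir)
    script_dir.mkdir(parents=True, exist_ok=True)
    (script_dir / f"{name}.py").write_text(source, encoding="utf-8")
    if str(script_dir) not in sys.path:
        # Prepend, so the directory is searched before the rest of sys.path.
        sys.path.insert(0, str(script_dir))
    # Make sure the import system notices the newly written file.
    importlib.invalidate_caches()


# Demo: a temporary directory plays the role of the managed location.
script_dir = Path(tempfile.mkdtemp()) / "nbscripts"
publish_module("nb_demo_mod", "def greet():\n    return 'hello'\n", script_dir)

import nb_demo_mod

print(nb_demo_mod.greet())  # prints "hello"
```

The trade-off raised in the thread still applies: the file being imported no longer sits next to the notebook, so relative paths and version control behave differently from a plain adjacent .py file.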

@rkern
Contributor

rkern commented Nov 29, 2011

Writing a sys.meta_path import hook that imports directly from the notebook file wouldn't be too bad. Unlike many of the older forms of import hook, the sys.meta_path method works fairly robustly and plays nicely with others.

http://www.python.org/dev/peps/pep-0302/
http://docs.python.org/library/sys#sys.meta_path
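
A minimal sketch of such a hook, written against the current `importlib` API (`find_spec`, `exec_module`) rather than the PEP 302 `find_module`/`load_module` methods that were current when this was posted. The class names are hypothetical, and the present-day notebook JSON layout (a `cells` list) is assumed.

```python
import importlib.abc
import importlib.util
import json
import sys
from pathlib import Path


class NotebookLoader(importlib.abc.Loader):
    """Run the code cells of one .ipynb file inside a module namespace."""

    def __init__(self, path):
        self.path = Path(path)

    def create_module(self, spec):
        return None  # fall back to the default module object

    def exec_module(self, module):
        nb = json.loads(self.path.read_text(encoding="utf-8"))
        for cell in nb.get("cells", []):
            if cell.get("cell_type") != "code":
                continue
            source = cell.get("source", "")
            if isinstance(source, list):
                source = "".join(source)
            exec(compile(source, str(self.path), "exec"), module.__dict__)


class NotebookFinder(importlib.abc.MetaPathFinder):
    """Look for <name>.ipynb on the search path."""

    def find_spec(self, fullname, path=None, target=None):
        name = fullname.rpartition(".")[2]
        for entry in (path or sys.path):
            candidate = Path(entry or ".") / f"{name}.ipynb"
            if candidate.is_file():
                return importlib.util.spec_from_file_location(
                    fullname, str(candidate), loader=NotebookLoader(candidate)
                )
        return None  # not ours: let later finders try


# Appended last, so ordinary .py modules keep precedence over notebooks.
sys.meta_path.append(NotebookFinder())
```

With the finder installed, `import foo` would pick up `foo.ipynb` from any `sys.path` entry when no regular module named `foo` is found first.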

@takluyver
Member

We could potentially save .py files in a subdirectory from the pwd called
something like nblib or localnb, with a __init__.py file in there. Then
importing from one would look like from localnb.foo import myfunc.
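
The subdirectory idea can be sketched as follows. The package name `localnb`, the module `foo`, and the function `myfunc` follow the wording above; the helper `export_to_package` is hypothetical, and a temporary directory stands in for the working directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def export_to_package(name, source, workdir, package="localnb"):
    """Write `source` to <workdir>/<package>/<name>.py, creating the package."""
    pkg_dir = Path(workdir) / package
    pkg_dir.mkdir(exist_ok=True)
    init_file = pkg_dir / "__init__.py"
    if not init_file.exists():
        init_file.write_text("", encoding="utf-8")
    target = pkg_dir / f"{name}.py"
    target.write_text(source, encoding="utf-8")
    return target


# Demo: a temporary directory plays the role of the working directory.
workdir = tempfile.mkdtemp()
export_to_package("foo", "def myfunc():\n    return 'from foo'\n", workdir)
sys.path.insert(0, workdir)  # stands in for running Python from that directory
importlib.invalidate_caches()

from localnb.foo import myfunc

print(myfunc())  # prints "from foo"
```

This keeps the clutter to one visible entry per working directory, at the cost of the extra `localnb.` prefix in every import.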

I think the bigger issue is that we need a good settings panel in the
notebook. Then it's as simple as having a checkbox for 'Make importable
module on save'.

@fperez
Member Author

fperez commented Nov 29, 2011

On Tue, Nov 29, 2011 at 2:17 AM, Robert Kern wrote:

> Writing a sys.meta_path import hook that imports directly from the notebook file wouldn't be too bad. Unlike many of the older forms of import hook, the sys.meta_path method works fairly robustly and plays nicely with others.
>
> http://www.python.org/dev/peps/pep-0302/
> http://docs.python.org/library/sys#sys.meta_path

I'd thought about that (Ilan told me about these guys, which I didn't
know about before), but what worries me about this approach is
debuggability. I think the simple foo.ipynb -> foo.py approach will
be easier for users to understand if they need to debug anything
coming from the notebook. %debug will work in a familiar fashion, the
relative paths will all be the same, and they can easily open the .py
file for inspection in an editor.

With an import hook, we'd probably want to create
semi-hidden/temporary files from the notebook, which adds complexity
and makes debugging/understanding less transparent. Do we make them
with tempfile, hence having random-looking names? If so, do we track
their lifetime and try to clean them up on exit? If not, do we create
a bigger clutter problem by having all these junk-looking filenames
around? If we put them somewhere else, the logic with relative
imports gets even worse...

I'd love to be proven wrong, but every time I look at this, any
solution other than a straight mapping of X.ipynb to X.py and using
the standard, familiar import machinery and semantics seems to
introduce more new problems than it solves.

@ellisonbg
Member

I don't think we should do anything on this front yet. We have a lot of redesigning of the notebook coming up and that will give us a chance to think about this further.

@fperez
Member Author

fperez commented Jan 5, 2012

I just realized that we should probably close this issue. We now have the --script option that does precisely what this issue was about, and I think that's a pretty good solution for now. We can revisit more esoteric approaches later...

@fperez
Member Author

fperez commented Jan 6, 2012

Closing now; we can open a new one later if we want fancier features, but for now the simple --script takes care of this particular point.

@fperez fperez closed this as completed Jan 6, 2012