Implement autosave in notebook #1378

Closed
v923z opened this Issue Feb 5, 2012 · 46 comments

Contributor

v923z commented Feb 5, 2012

I would just like to bring an older issue to the fore. This has been discussed on the mailing list (http://mail.scipy.org/pipermail/ipython-user/2012-January/009160.html) and it is also mentioned in one of the issues (#977), but I couldn't find a dedicated issue for it. I hope it is OK to raise it again.

The consensus on the mailing list seemed to be that a simple autosave would be useful even before the implementation of a full-fledged revision control system, and it should have the following features:

  1. For a start, the timeout would not have to be configurable, something like 5-10 minutes should be OK.
  2. The backup file for foo_notebook.ipynb should be named .foo_notebook.ipynb, so that it is invisible, but can be loaded as normal notebook, when recovery is necessary.
  3. Backup files should be removed on clean exit.

Are these criteria still OK?
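
For concreteness, criteria 2 and 3 could be sketched roughly as below. This is only an illustration of the naming and cleanup rules listed above, not notebook server code; the function names `backup_path`, `write_backup` and `remove_backup` are made up for this example.

```python
import os

def backup_path(notebook_path):
    """Map foo_notebook.ipynb to the hidden .foo_notebook.ipynb next to it."""
    directory, name = os.path.split(notebook_path)
    return os.path.join(directory, "." + name)

def write_backup(notebook_path, contents):
    """Write the current notebook contents to the hidden backup file."""
    path = backup_path(notebook_path)
    with open(path, "w", encoding="utf-8") as f:
        f.write(contents)
    return path

def remove_backup(notebook_path):
    """Remove the backup on clean exit; a missing backup is not an error."""
    try:
        os.remove(backup_path(notebook_path))
    except FileNotFoundError:
        pass
```

Because the backup keeps the .ipynb extension, it could be opened like any other notebook when recovery is needed.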

Owner

fperez commented Feb 6, 2012

Thanks for the summary, this is indeed useful to track as it's a fairly self-contained question we've already discussed a fair bit.

Owner

ellisonbg commented Feb 6, 2012

Yes, I think the basic features above would work fine for a first go.

Contributor

tkf commented May 8, 2012

Some points I couldn't find in the threads (I only read them quickly, so sorry if I missed something):

  1. The autosave file must be easy to ignore via .gitignore or .hgignore. To ignore .foo_notebook.ipynb you need .*.ipynb. Well, that's fine.
  2. What happens when you open multiple clients? Can another client overwrite the autosave file?
Owner

ellisonbg commented May 8, 2012

On Mon, May 7, 2012 at 6:35 PM, Takafumi Arakaki
reply@reply.github.com
wrote:

Some points I couldn't find in the threads (Just quickly read them. Sorry if I miss something):

  1. Auto save file must be easier to ignore by .gitignore or .hgignore. To ignore .foo_notebook.ipynb you need .*.ipynb. Well, that's fine.

We are not even sure we want to use hidden files like this. Not too
fond of cluttering the file system.

  2. What happens when you open multiple clients? Can another client overwrite the autosave file?

Currently yes, this is unavoidable.


Brian E. Granger
Cal Poly State University, San Luis Obispo
bgranger@calpoly.edu and ellisonbg@gmail.com

Contributor

tkf commented May 8, 2012

  2. What happens when you open multiple clients? Can another client overwrite the autosave file?

Currently yes, this is unavoidable.

Just an idea: how about appending the session id to the autosave file name?

Owner

ellisonbg commented May 8, 2012

On Tue, May 8, 2012 at 11:41 AM, Takafumi Arakaki
reply@reply.github.com
wrote:

  2. What happens when you open multiple clients? Can another client overwrite the autosave file?

Currently yes, this is unavoidable.

Just an idea.  How about append the session id to the auto save file?

That is the problem though: we don't have any notion of sessions or
users currently. We will have to grow that before we can properly
handle these things. Part of the reason we are moving slowly on these
things is that there are numerous complex design problems that are all
interrelated.



Contributor

tkf commented May 8, 2012

Can't we use Kernel.session_id?

this.session_id = utils.uuid();

Contributor

tkf commented May 8, 2012

Ah, maybe you want to separate kernel stuff from the notebook. In that case, you will need a "client id". But I guess defining it on the client side and sending it to the server won't be too hard. Well, if you want more sophisticated session/user control, it will be hard though.
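
A rough sketch of the "client id" idea, assuming each client generates a random id once and the autosave file name includes it. The names `new_client_id` and `autosave_name`, and the ".autosave" suffix, are hypothetical; nothing like this exists in the notebook server as discussed here.

```python
import uuid

def new_client_id():
    """Generate a random id, similar in spirit to utils.uuid() on the JS side."""
    return uuid.uuid4().hex

def autosave_name(notebook_name, client_id):
    """Build a per-client autosave file name (the naming scheme is an assumption)."""
    return ".{0}.{1}.autosave".format(notebook_name, client_id)

# Two clients editing the same notebook get distinct autosave files.
first = autosave_name("foo_notebook.ipynb", new_client_id())
second = autosave_name("foo_notebook.ipynb", new_client_id())
```

This only keeps clients from overwriting each other's autosave file; it does not say which copy is the right one to recover.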

Owner

ellisonbg commented May 8, 2012

On Tue, May 8, 2012 at 11:55 AM, Takafumi Arakaki
reply@reply.github.com
wrote:

Ah, maybe you want to separate kernel stuff from notebook.  In that case, you will need "client id".  But I guess defining it in client side and sending it to server won't be too hard.  Well, if you want some more sophisticated session/user control it will be hard though.

Yes, the important session information is that of the notebook client,
not the kernel. The kernel is not involved in the synching in any
way. The notebook client session will have to be coupled with the
user at some level, so other users can see who is editing the notebook
currently. Again, not simple.



tmbdev commented Jun 11, 2012

I can only speak for UNIX/Linux, but on those systems, there is a long tradition and a lot of experience with how editors handle this. GNU Emacs started the conventions, but a lot of programs follow them now.

Based on my experience, automatic backups, crash recovery, and source control are three different issues. Source control systems are not a replacement for backup or crash recovery files (otherwise people would have stopped using the latter long ago).

There are conventions for how to do this: crash recovery files are of the form "#something#", and backup files are of the form "something~" or "something.1". There are reasons for this. "#something#" is hard to remove or even access accidentally, and it sorts first in a directory listing. Using "~" and/or ".1" for backups means that the backup files directly follow the files they back up, and they can be removed easily with commands like "rm mynotebook.ipynb*" or moved in bulk with "mv mynotebook.ipynb* ../new-project". You can remove all backup files with "rm *~". Furthermore, because these conventions change the extension, people won't accidentally open a backup file and start working on it, which is a really important feature. People know where to look for these files, tools like "ls" understand them, and to experienced users, dealing with them is pretty much reflexive. People set up cron jobs to clean them up, set up "ls" and file managers not to show them, and have revision control and indexing programs ignore them. And just as importantly, lots of tools know not to use file names like those for anything else.

You may find these conventions ugly, and I might even agree, but it's both common usage, and it's a compromise between conflicting design goals. If you come up with your own convention, people won't know where to look for backup files, they won't know what the files are that you are leaving around, and they'll have to set up all their tools to deal with a different convention yet again. You also run the risk of breaking something; for example, using ".foo.ipynb" as a backup for "foo.ipynb" is potentially dangerous.
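
To make the naming conventions described above concrete, here is a small sketch of how the three file names would be derived for a notebook. The helper names are invented for illustration, and the ".1" form follows the description in this comment rather than any particular editor's exact scheme.

```python
import os

def recovery_name(path):
    """Crash-recovery file: '#name#' in the same directory."""
    directory, name = os.path.split(path)
    return os.path.join(directory, "#" + name + "#")

def backup_name(path):
    """Simple backup file: the original name with a trailing '~'."""
    return path + "~"

def numbered_backup_name(path, number):
    """Numbered backup file of the 'something.1' form."""
    return "{0}.{1}".format(path, number)
```

Both backup forms change the extension, so a tool that opens "*.ipynb" would not pick them up by accident.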

Owner

fperez commented Jun 11, 2012

@tmbdev, thanks for that comment; having used these features on *nix/emacs for over 15 years, I'd never given much thought to how they actually reflect a careful design process. Indeed a bit ugly filesystem-wise, but certainly not accidental.

It's worth keeping this in mind, because if we want to break away from these conventions, at least we should do so aware of it and of the changes we introduce.

Owner

takluyver commented Jun 11, 2012

I'd favour something more like how LibreOffice saves recovery files - out of the way, but readily accessible through the application (and presumably documented somewhere - we should document our location, anyway). Like us, it's a cross platform app, and Unix conventions don't necessarily work so well on Windows.

Owner

fperez commented Jun 11, 2012

As long as it's not as nagging as the libreoffice one is, that quite often has false positives that attempt recovery of non-existent stuff, and drives me nuts :)

I'm not saying we must follow the *nix approach, just that this design background was very useful so we can keep them in mind as we approach the problem, and try to offer solutions to all the problem spectrum (even if our solutions end up being different today).

Owner

takluyver commented Jun 11, 2012

I think the main annoyance with the LO design is that it's a modal dialog when it launches. I imagine something integrated with the file list, so if the autosave file was more recent than the saved version, it would offer:

My notebook title (saved 12:31)
  - autosave from 12:38

So you can click the title for the last saved version, or the autosave line for the autosaved version. Obviously it would take a bit of tweaking to work out what's clearest - e.g. we might make the title line refer to the autosave, because 95% of the time that's what you want.
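
A minimal sketch of the listing rule, assuming the autosave lives in a separate file and modification times decide whether it is offered. `file_list_entries` is a hypothetical helper, not part of the notebook server.

```python
import os

def file_list_entries(notebook_path, autosave_path):
    """Return (label, path) rows for one notebook in the file list.

    The autosave row is only offered when an autosave file exists and is
    more recent than the saved notebook.
    """
    rows = [("saved", notebook_path)]
    if os.path.exists(autosave_path):
        if os.path.getmtime(autosave_path) > os.path.getmtime(notebook_path):
            rows.append(("autosave", autosave_path))
    return rows
```

A stale autosave (older than the last manual save) is simply not shown, which avoids the false-positive recovery prompts mentioned above.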

Owner

ellisonbg commented Jun 14, 2012

For the current file based storage system we are currently using for notebooks, I actually like using something like "~" files in that directory. I agree that it is not pretty, but it is dead simple and matches the current philosophy of the notebook server that has absolutely no special databases or directories.

Owner

ellisonbg commented Jun 14, 2012

If we can confirm that send2trash really works like it claims to, I think we should put it in externals (we also need to check py3 compat) and use it instead of deleting files.

Owner

takluyver commented Jun 15, 2012

send2trash appears to have gone for separate python 2/3 codebases:

http://pypi.python.org/pypi/Send2Trash3k

Although it looks like most of the changes could be handled by 2to3:

http://hg.hardcoded.net/send2trash/compare/py2k..default

Owner

fperez commented Nov 12, 2012

Since we've been discussing this lately on the list, I figured I'd post here the easy python/JS solutions for user-driven autosave, for those who do want a regular autosave:

One can enable autosave by putting this into the ipython startup file:

def autosave(interval=5):
    """Autosave the notebook every interval (in minutes)"""
    from IPython.core.display import Javascript
    interval *= 60*1000 # JS wants intervals in milliseconds
    tpl = 'setInterval ( "IPython.notebook.save_notebook()", %i );'
    return Javascript(tpl % interval)

and call it anywhere in the notebook to activate it.

Alternately, just put this (in this case, using 5 minutes):

<script type="text/javascript">
setInterval ( "IPython.notebook.save_notebook()", 5*60*1000 );
</script>

in the first cell of a notebook and make it a markdown cell. Done, autosave is on for that notebook every time you open it.

@shazow may be able to give us some tips on how to plug this into the browser saving machinery.

shazow commented Nov 12, 2012

My suggestion would be this: Implement the notebook auto-saving in the browser, and keep writing to a disk file on-demand. Here's how I'd do it:

Add a little snippet of JavaScript which basically writes the current state of the notebook into the browser's LocalStorage every so often (assuming anything has changed). Not sure what the best way to dump the state is, could write the entire notebook JSON into a LocalStorage value blob—could even store revisions based on a timestamp key.

If the browser crashes, we should still be able to recover the latest auto-saved version from LocalStorage and load the browser state accordingly. Then the user can manually save to a disk file as normal.

Thoughts?

Owner

Carreau commented Nov 12, 2012

Just be careful: what Fernando gives will overwrite your current saved file.
If you delete a cell and it autosaves, you are screwed (not anymore on dev, which now has undo for cell deletion).
To have a regular backup file with a different extension (no need to put anything in a markdown cell),
use what is after my signature in custom.js (0.13+).

Matthias

// ~/.ipython/profile_default/static/js/custom.js
var make_backup = function(){
    // Serialize the current notebook as nbformat 3.0 JSON.
    var json = IPython.notebook.toJSON();
    json.nbformat=3;
    json.nbformat_minor=0;
    var s = JSON.stringify(json);
    var settings = {
        processData : false,
        cache : false,
        type : 'POST',
        dataType : 'json',
        data : s,
        headers : {'Content-Type': 'application/json' },
        success : function (data, status, xhr) {
            console.log('save success');
        }
    };

    // Post the copy under the notebook name plus '.bkp'.
    var qs = $.param({name:IPython.notebook.notebook_name+'.bkp', format:'json'});
    var url = $('body').data('baseProjectUrl') + 'notebooks?' + qs;
    console.log(settings);
    $.ajax(url, settings);
};
var seconde = 1000;
var minute = 60*seconde;

// Comment out the following line to disable.
// setInterval repeats the backup every 5 minutes; setTimeout would run it only once.
setInterval(make_backup, 5*minute);


Owner

takluyver commented Nov 12, 2012

Having autosave purely in the client avoids the potential for conflicts
when different frontends are autosaving, but I think it might get confusing
for the user that saving happens in two different places.

E.g. I'm working on a notebook at work, when my computer locks up. I recall
that it has autosaved, do a hard shutdown, and go home for the evening.
Loading up the notebook, I find that it's still an old version - looks like
the autosave didn't work. I spend a while redoing some of the bits I
remember, taking care to save manually. The next morning, my work computer
can see both the autosaved version from before the crash, and the notebook
as I'd saved it from home.

I also see that the limits of localStorage are 2.5-10 MB, depending on the
browser. So we'd have to take some care about what we save, as people build
more and larger notebooks.
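
As a rough illustration of "taking care about what we save", the sketch below estimates the serialized size of a notebook and falls back to a copy without outputs when it would not fit. The quota value, the choice to drop outputs, and the function names are all assumptions made for this example; the layout follows nbformat 3 (worksheets containing cells), and browsers may count storage differently from UTF-8 bytes.

```python
import copy
import json

QUOTA_BYTES = 2500000  # assumed: lower end of the 2.5-10 MB range

def serialized_size(notebook):
    """Size in bytes of the notebook when dumped as JSON."""
    return len(json.dumps(notebook).encode("utf-8"))

def without_outputs(notebook):
    """Return a copy of the notebook with code cell outputs removed."""
    slim = copy.deepcopy(notebook)
    for worksheet in slim.get("worksheets", []):
        for cell in worksheet.get("cells", []):
            if cell.get("cell_type") == "code":
                cell["outputs"] = []
    return slim

def autosave_payload(notebook, quota=QUOTA_BYTES):
    """Pick what to autosave: the full notebook, a slim copy, or None."""
    if serialized_size(notebook) <= quota:
        return notebook
    slim = without_outputs(notebook)
    if serialized_size(slim) <= quota:
        return slim
    return None
```

Dropping outputs keeps the source, which is usually the part that cannot be regenerated, at the cost of having to re-run cells after recovery.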

shazow commented Nov 12, 2012

@takluyver, that's a good example, thank you. That is indeed confusing. Though perhaps better than losing your data altogether?

The LocalStorage limits can be increased at the prompt of the user.

tmbdev commented Nov 12, 2012

"Having autosave purely in the client avoids the potential for conflicts when
different frontends are autosaving"

As far as I can tell, that case is already not working: saving in one
client overwrites changes previously saved in another one.

The two choices for dealing with this are either to keep all the windows in
sync in real time, or to lock all but one; as far as I can tell, IPython
has neither mechanism.

Tom


Owner

fperez commented Nov 12, 2012

@tmbdev, what @shazow meant was saving in the client's private storage area, which is a browser-local storage.

Owner

takluyver commented Nov 12, 2012

@tmbdev: Agreed, we don't do anything to deal with such conflicts at present. But the problem would be more acute with autosaves. Imagine you left a notebook open on another computer (or even in another tab - I often have so many open that I forget some), and both the copy you're working on and the unedited copy have a naive autosave firing every minute. If you do suffer a crash/powercut/meteorite strike, it's pot luck whether the one you want got the last autosave.

Come to think of it, using local storage actually wouldn't solve the case with two tabs open in the same browser. But maybe the solution to that is for me to pay attention to my tabs. ;-)

tmbdev commented Nov 13, 2012

With multiple windows open, the autosave would alternate between one and
the other. That doesn't seem like a big problem. You could append a random
number derived from each window to the autosave file (it's only temporary
anyway) to prevent even that (the same trick works with local storage).

Data loss from multiple browser windows however is a real problem, that's
why almost every common app implements either locking or real-time
sync. "Paying attention" isn't a solution, because it remains a support
problem when you use the tool with students and researchers.


Owner

Carreau commented Nov 13, 2012

We will get to live syncing between browsers/users. It is just not straightforward.

Owner

ellisonbg commented Nov 13, 2012

I think we do need a simple autosave approach that will work for now. But,
once we have live multi-user notebook syncing, the server will always have
the latest state of the notebook and autosave will possibly look very
different.


Contributor

drevicko commented Jan 30, 2013

I'm new to IPython Notebooks and am very pleased with what I find - again I am indebted to an open source community - many thanks!

For what it's worth, I have to agree with @tmbdev 's sentiments in #2553 - there is a conceptual simplicity with approaches like GoogleDocs and Dropbox: you are dealing with ONE virtual object, the same one wherever and however you access it. An important feature is the ability to rollback. This solves the issues around multiple versions and how/where to autosave as discussed here (both design complexity and confusion for the user).

The traditional model of "don't hit save until you feel it's ready" is more suited to a version control paradigm like git (eg: @minrk's comment mentioned that 'save' could do a git add && commit)

Thank you @fperez for a way to set up autosave for those who want it now.

Owner

minrk commented Jan 30, 2013

Indeed, and there's even an extension I use regularly, so in my notebooks I just do:

%load_ext autosave
%autosave 300

to autosave every five minutes. It's quite trivial.

Owner

Carreau commented Jan 30, 2013

I'll repost this, as everybody forgets it:

Autosave overwrites the current notebook, which is bad. Even with a VCS, we shouldn't do it.
If you want a swap file like vim/emacs with a different extension, use the following in your custom.js.
It does not even need to be launched manually.

// ~/.ipython/profile_default/static/js/custom.js
var make_backup = function(){
    var json = IPython.notebook.toJSON();
    json.nbformat=3;
    json.nbformat_minor=0;
    var s = JSON.stringify(json)
    var settings = {
                    processData : false,
                    cache : false,
                    type : 'POST',
                    dataType : 'json',
                    data : s,
                    headers : {'Content-Type': 'application/json' },
                    success : function (data, status, xhr) {
                                console.log('save success');
                            }
    };

    var qs = $.param({name:IPython.notebook.notebook_name+'.bkp', format:'json'});
    var url = $('body').data('baseProjectUrl') + 'notebooks?' + qs;
    console.log(settings);
    $.ajax(url, settings);
}
var seconde = 1000;
var minute = 60*seconde;

// Comment out the following line to disable.
// setInterval repeats the backup every 5 minutes; setTimeout would run it only once.
setInterval(make_backup, 5*minute);

Rather than auto-save, what about save-on-run? Every time you hit shift-enter (or some other shortcut) it saves the notebook and runs the cell. This would be the functional equivalent of working on the desktop, where you have to save your work before you execute it.

Owner

Carreau commented Mar 11, 2013

Rather than auto-save, what about save-on-run? Every time you hit shift-enter (or some other shortcut) it saves the notebook and runs the cell. This would be the functional equivalent of working on the desktop, where you have to save your work before you execute it.

It would be better to have "save on result back".

The problem is every save can send a lot on the wire, so on every run could be really expensive.

Owner

minrk commented Mar 11, 2013

@fonnesbeck your idea can be implemented with this in an extension (or anywhere else):

from IPython.display import display, Javascript

def savenb():
    display(Javascript("IPython.notebook.save_notebook()"))

get_ipython().register_post_execute(savenb)

(this will save after output is done, rather than immediately on execute)

schwehr commented Mar 19, 2013

+1 to stash (aka mv) at least the prior version (perhaps hidden in the .ipython profile directory). Use case: wife deletes an ipynb file by accident while on a long trip, with a lot of work done between backups. This is more important for the novice than the expert. More experienced users are likely to have other strategies (and might want to turn off the feature). On the Mac, she would have bumped into the backup when desperately searching via spotlight/mdfind/find.

JDWarner commented Apr 9, 2013

I'm trying to implement something similar but slightly different: I want an auto-save option which saves a .py script, not the .ipynb file. The thought is to .gitignore *.ipynb files and track changes through these Python scripts (much more readable in diffs) during rapid development/prototyping.

I have tried but cannot easily find a modification of any of the common Javascript hacks that allow .ipynb autosaving, nor can I find the Javascript API reference behind IPython.notebook. to see if there's a save_script() command or similar.

Owner

minrk commented Apr 9, 2013

If you start the notebook with --script, every notebook save will imply an export to an adjacent .py script. You will probably never be able to trigger this behavior without saving the nb as well with public APIs, as .py is a lossy export not a save.

JDWarner commented Apr 9, 2013

So if I use one of the standard Javascript hacks after starting with --script, will the autosave from IPython.notebook.save_notebook() export just the .py script I'm looking for, or will it overwrite both the .ipynb and the .py script?

Owner

minrk commented Apr 9, 2013

It will overwrite both. Exporting a notebook to a script via public APIs without actually saving the notebook itself will probably never be supported, because the .py script is not a notebook: it is a lossy export of a notebook,
so the notebook has to exist first.

Owner

Carreau commented Apr 9, 2013

It will overwrite both.


JDWarner commented Apr 9, 2013

Thanks for the clarification, even if it's not what I'd hoped. Since the option exists to download the .py script individually (or it seemed so) from the interface, I incorrectly assumed this lossy export occurred in memory and could operate independently.

Owner

takluyver commented Apr 9, 2013

I think it's possible for the Python code (the server) to export a .py file
directly, without saving the .ipynb file. But because we haven't seen any
use for that, there's no way for the Javascript code (the client) to
request it. The server only knows the current contents of the notebook when
the client sends it to be saved. So to implement this kind of autosave,
you'd need to modify both parts.
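
To show why the .py file is a lossy export, here is a minimal sketch that turns notebook JSON into script text by keeping only the code cells. It assumes the nbformat 3 layout (worksheets containing cells, with code held in the "input" field) and is not the server's actual export code; `notebook_to_script` is a name invented for this example.

```python
def notebook_to_script(notebook):
    """Concatenate the source of all code cells into one script string.

    Markdown cells, outputs and metadata are dropped, which is why the
    notebook cannot be rebuilt from the script.
    """
    chunks = []
    for worksheet in notebook.get("worksheets", []):
        for cell in worksheet.get("cells", []):
            if cell.get("cell_type") != "code":
                continue
            source = cell.get("input", "")
            if isinstance(source, list):
                # On disk, multi-line source may be stored as a list of lines.
                source = "".join(source)
            chunks.append(source.rstrip("\n"))
    return "\n\n".join(chunks) + "\n"
```

The server would still need the current notebook contents from the client before it could run anything like this, which is the point made above.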


Contributor

astrofrog commented Jun 17, 2013

+1 to an integrated/by default auto-save. The solutions above are fine, but I think that this should still be implemented as something that is on by default.

Owner

Carreau commented Jun 17, 2013

+1 to an integrated/by default auto-save. The solutions above are fine, but I think that this should still be implemented as something that is on by default.

Autosave should be on by default on master.

So should this issue still be open ?

Contributor

astrofrog commented Jun 17, 2013

@Carreau - ok thanks, I'm using the latest stable so hadn't noticed. Great that it's implemented now!

Owner

minrk commented Jun 17, 2013

Nope, closing.

@minrk minrk closed this Jun 17, 2013
