Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Cell/Worksheet metadata #1934

Merged
merged 7 commits into from Jun 18, 2012
Merged

Cell/Worksheet metadata #1934

merged 7 commits into from Jun 18, 2012

Conversation

minrk
Copy link
Member

@minrk minrk commented Jun 13, 2012

This implements a few forward-looking changes in the nbformat, which are unused.

closes #1915

  • metadata dicts are attached to cells and worksheets
  • restores collapsed flag to the nbformat - this change happened in the refactor, and was undocumented, and possibly accidental. But we should either document it or fix it, and this includes a fix.
  • nbformat is a float, and the integer cast is used to denote readability. So we can potentially have 3.1 notebooks that must be readable by 3.0, even if not all capabilities are exposed.

These are proposed changes and meant for discussion, not to be merged immediately.

These are unused for now, but will allow adding data to the notebook without a full version bump.
js refactor removed this, but did not document the change.  This restores it, but it would be fine to revert this and update the docs instead.
int(nbformat) denotes readability
@fperez
Copy link
Member

fperez commented Jun 13, 2012

Only a couple of small questions:

  • should this PR re-create all the notebooks in the examples dir with their metadata updated, so users don't see update messages when they open them?
  • I'm pondering what to do about the format: should we use a float, or instead either a version string of 'x.y' format or a tuple (x, y)? I'm just a bit uncomfortable with the fact that floats are really meant for computation and have all kinds of issues with 'fuzziness'. I just wonder if in the long term we won't be more easily served by a representation where we can more easily separate the major/minor parts than by doing casting and floating point operations. Right now I'm leaning towards the tuple, so we don't even have to do string splits...

What do you think?

@minrk
Copy link
Member Author

minrk commented Jun 13, 2012

should this PR re-create all the notebooks in the examples dir with their metadata updated, so users don't see update messages when they open them?

What update message? There should be no update message, it will just work. Notebooks without metadata fields will just end up with empty metadata dicts on the next save. All current notebooks are still valid in this format.

I'm pondering what to do about the format: should we use a float, or instead either a version string of 'x.y' format or a tuple (x, y)? I'm just a bit uncomfortable with the fact that floats are really meant for computation and have all kinds of issues with 'fuzziness'. I just wonder if in the long term we won't be more easily served by a representation where we can more easily separate the major/minor parts than by doing casting and floating point operations. Right now I'm leaning towards the tuple, so we don't even have to do string splits...

If we commit to having two fields (I don't see why we would have more), then there is no functional difference between float and tuple. We definitely shouldn't have string versions, as those are nothing but a pain. My main reason for float was minimal change from the existing pattern. If we go to tuple, then any code that switches on nbformat must have extra type checks to support v2/v3, because the version check would be a different type, and you cannot even try to compare tuples and ints in Python 3, and all current notebooks would not be valid v3, and we would need conversion code.

So tuple is better in general / in principal, but the problems it solves are not ones I would expect to have here with floats, and changing the type of nbformat mid-version, or even cross-version is at least a little bit problematic.

Yet another option that has the drawbacks of neither floats nor changing the basic type of nbformat would be to add this info in a different field altogether (revision, version_info, mdformat, etc.), and leaving nbformat as the integer determining basic structure necessary for backward-readability. This field would determine the growing level of capabilities, within the general structure strictly defined by the nbformat. Given the current model, a likely name would be mdformat, since all such functionality would be implemented in the various metadata dicts.

I have no strong feeling among any of these, but I would lean slightly towards this third solution.

Back to you :)

@minrk
Copy link
Member Author

minrk commented Jun 13, 2012

I should note that the new solution is essentially a tuple as you proposed, and could indeed be a tuple itself, if not just a second integer. With the only (and important) difference that it does not change the type of an existing key.

@fperez
Copy link
Member

fperez commented Jun 13, 2012

Let's go with the last idea, I agree with your reasoning above. Not sure mdformat is the best name: you're correct right now, but I can imagine bumps in that number that could happen for reasons other than metadata. nbsubformat? nbformat_minor? nbformat_revision?

@minrk
Copy link
Member Author

minrk commented Jun 13, 2012

If we make it a tuple, nbformat_info seems to follow the Python convention. In this case, I would give nbformat as the first element.

But maybe nbformat_minor, and just an int.

@fperez
Copy link
Member

fperez commented Jun 13, 2012

Oh yes, I was just thinking of going with another int. I really don't think we should get in the business of having a 3-level format spec! So we have format/format_minor and that's it. How does that sound?

@minrk
Copy link
Member Author

minrk commented Jun 13, 2012

sounds good to me.

@minrk
Copy link
Member Author

minrk commented Jun 13, 2012

What do we want in terms of UI for minor version mismatch?

This cannot (by definition of the minor version) cause breakage, so should we just ignore it?

Case 1: older->newer: nothing can be lost, but you might have access to minor new features that will be unavailable if you give the file back to the original author (which may be you).
Case 2: newer->older: some minor features visible to the author are not visible to you, but the notebook remains perfectly functional.

I'm inclined for case 1 to be silent, but I'm not 100% sure about 2.

@fperez
Copy link
Member

fperez commented Jun 13, 2012

On Wed, Jun 13, 2012 at 1:17 AM, Min RK
reply@reply.github.com
wrote:

I'm inclined for case 1 to be silent, but I'm not 100% sure about 2.

We're starting to want that notification area... But that's too much
for now. Ugly modal dialog in the time being? Or go silent for now?
I'm not too crazy about the silent option.

@minrk
Copy link
Member Author

minrk commented Jun 13, 2012

We've already got a notification area, where the "notebook saved", etc. messages go. But not a growl-style semi-persistent place for longer messages.

One more question:

What happens when a 3.0 user opens and then saves a 3.1 notebook? Does it downgrade the minor version? If it doesn't, they will get the warning about downgrading every time they open the notebook. But if it does, then the version number isn't particularly meaningful anymore, as the metadata written by 3.1 will still be in tact, but the 3.0 tag suggests that it is empty.

@minrk
Copy link
Member Author

minrk commented Jun 13, 2012

Okay, I have a simple message similar to the other nbformat one. It needs looking over by eyes less tired than mine right now, so I think I am done for the night.

The current behavior is that the notebook always saves with its version, so if you get a 3.2 notebook, but you only support 3.0, the next time you save it will be 3.0. Anything 3.2 wrote to the metadata will remain untouched, so there is no data loss, but this does mean that it is not a valid assumption that a 3.0 notebook has empty metadata - any 3.x notebook could have metadata written by any 3.(y≥x).

@minrk
Copy link
Member Author

minrk commented Jun 13, 2012

After reading over and playing with it, I don't have anything left to add.

Question still outstanding:

  • do we still want to store collapsed, and if so, do we put it in the new metadata dict?

It hasn't been stored since the js refactor, and I don't think anyone has even noticed, but I can see it being useful for demos.

@ellisonbg
Copy link
Member

I like the idea of having cell and worksheet metadata and I think this implementation is solid. Not so sure about introducing the notion of the minor version. Can you summarize our policies on when the major and minor version will be incremented?

@minrk
Copy link
Member Author

minrk commented Jun 17, 2012

Major versions are updated when notebooks are no longer readable (new cell types, etc.). minor versions are updated as capabilities are added that can be safely ignored (metadata).

For instance, if we add cell metadata in 3.1, 3.2 (such as timestamps, hashes for unique links, etc.), but make no other changes, a client that can read 3.0 can read these notebooks just fine, but may lack some minor niceties introduced after the release. In the case of the metadata dicts, arbitrary amounts of information can go in there, and 3.0 will be able to read just fine, just ignoring that data.

One edge case right now: multiple worksheets - nbformat v3 supports multiple worksheets, but the javascript does not. If you open a notebook with multiple worksheets in the current interface, your next save will discard any sheets beyond the first (which you were not able to see, anyway).

@minrk
Copy link
Member Author

minrk commented Jun 17, 2012

Here is a question: the nbformat supports multiple worksheets just fine, but the javascript does not. How should we handle that? Should the javascript give a big warning dialog if multiple worksheets are detected? There's no logical reason for the nbformat itself to be revved if we add multiple worksheet support to the js in 0.14, but we need some way to indicate that 0.13 will discard a bunch of data if you try to open/re-save multi-sheet notebooks in 0.13.

@minrk
Copy link
Member Author

minrk commented Jun 17, 2012

I added a dialog warning that the extra worksheets will be lost, if you open a multi-worksheet notebook with the current JS.

@fperez
Copy link
Member

fperez commented Jun 18, 2012

Great. This is looking merge-ready for the beta, no?

@minrk
Copy link
Member Author

minrk commented Jun 18, 2012

yes, I think so.

@fperez
Copy link
Member

fperez commented Jun 18, 2012

OK, merging now. One more down...

fperez added a commit that referenced this pull request Jun 18, 2012
Cell/Worksheet metadata

* metadata dicts are attached to cells and worksheets
* restores collapsed flag to the nbformat - this change happened in the refactor, and was undocumented, and possibly accidental.  But we should either document it or fix it, and this includes a fix.
* adds a new field, `nbformat_minor`, used to denote minor bumps of the notebook format that expose new capabilities but don't prevent loading by older clients.
* Add a warning in Javascript if loading a multiworksheet notebook (which will exist in the future) as current JS code will only save the first.


closes #1915
@fperez fperez merged commit ca7d2a9 into ipython:master Jun 18, 2012
@minrk minrk deleted the cellmd branch March 31, 2014 23:36
mattvonrocketstein pushed a commit to mattvonrocketstein/ipython that referenced this pull request Nov 3, 2014
Cell/Worksheet metadata

* metadata dicts are attached to cells and worksheets
* restores collapsed flag to the nbformat - this change happened in the refactor, and was undocumented, and possibly accidental.  But we should either document it or fix it, and this includes a fix.
* adds a new field, `nbformat_minor`, used to denote minor bumps of the notebook format that expose new capabilities but don't prevent loading by older clients.
* Add a warning in Javascript if loading a multiworksheet notebook (which will exist in the future) as current JS code will only save the first.


closes ipython#1915
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

add cell-level metadata
3 participants