Cell/Worksheet metadata #1934

minrk · 2012-06-13T02:28:20Z

This implements a few forward-looking changes in the nbformat, which are unused.

closes #1915

metadata dicts are attached to cells and worksheets
restores collapsed flag to the nbformat - this change happened in the refactor, and was undocumented, and possibly accidental. But we should either document it or fix it, and this includes a fix.
nbformat is a float, and the integer cast is used to denote readability. So we can potentially have 3.1 notebooks that must be readable by 3.0, even if not all capabilities are exposed.

These are proposed changes and meant for discussion, not to be merged immediately.

These are unused for now, but will allow adding data to the notebook without a full version bump.

js refactor removed this, but did not document the change. This restores it, but it would be fine to revert this and update the docs instead.

int(nbformat) denotes readability

fperez · 2012-06-13T06:24:42Z

Only a couple of small questions:

should this PR re-create all the notebooks in the examples dir with their metadata updated, so users don't see update messages when they open them?
I'm pondering what to do about the format: should we use a float, or instead either a version string of 'x.y' format or a tuple (x, y)? I'm just a bit uncomfortable with the fact that floats are really meant for computation and have all kinds of issues with 'fuzziness'. I just wonder if in the long term we won't be more easily served by a representation where we can more easily separate the major/minor parts than by doing casting and floating point operations. Right now I'm leaning towards the tuple, so we don't even have to do string splits...

What do you think?

minrk · 2012-06-13T06:55:20Z

should this PR re-create all the notebooks in the examples dir with their metadata updated, so users don't see update messages when they open them?

What update message? There should be no update message, it will just work. Notebooks without metadata fields will just end up with empty metadata dicts on the next save. All current notebooks are still valid in this format.

I'm pondering what to do about the format: should we use a float, or instead either a version string of 'x.y' format or a tuple (x, y)? I'm just a bit uncomfortable with the fact that floats are really meant for computation and have all kinds of issues with 'fuzziness'. I just wonder if in the long term we won't be more easily served by a representation where we can more easily separate the major/minor parts than by doing casting and floating point operations. Right now I'm leaning towards the tuple, so we don't even have to do string splits...

If we commit to having two fields (I don't see why we would have more), then there is no functional difference between float and tuple. We definitely shouldn't have string versions, as those are nothing but a pain. My main reason for float was minimal change from the existing pattern. If we go to tuple, then any code that switches on nbformat must have extra type checks to support v2/v3, because the version check would be a different type, and you cannot even try to compare tuples and ints in Python 3, and all current notebooks would not be valid v3, and we would need conversion code.

So tuple is better in general / in principal, but the problems it solves are not ones I would expect to have here with floats, and changing the type of nbformat mid-version, or even cross-version is at least a little bit problematic.

Yet another option that has the drawbacks of neither floats nor changing the basic type of nbformat would be to add this info in a different field altogether (revision, version_info, mdformat, etc.), and leaving nbformat as the integer determining basic structure necessary for backward-readability. This field would determine the growing level of capabilities, within the general structure strictly defined by the nbformat. Given the current model, a likely name would be mdformat, since all such functionality would be implemented in the various metadata dicts.

I have no strong feeling among any of these, but I would lean slightly towards this third solution.

Back to you :)

minrk · 2012-06-13T06:57:13Z

I should note that the new solution is essentially a tuple as you proposed, and could indeed be a tuple itself, if not just a second integer. With the only (and important) difference that it does not change the type of an existing key.

fperez · 2012-06-13T07:47:07Z

Let's go with the last idea, I agree with your reasoning above. Not sure mdformat is the best name: you're correct right now, but I can imagine bumps in that number that could happen for reasons other than metadata. nbsubformat? nbformat_minor? nbformat_revision?

minrk · 2012-06-13T07:53:25Z

If we make it a tuple, nbformat_info seems to follow the Python convention. In this case, I would give nbformat as the first element.

But maybe nbformat_minor, and just an int.

fperez · 2012-06-13T07:56:33Z

Oh yes, I was just thinking of going with another int. I really don't think we should get in the business of having a 3-level format spec! So we have format/format_minor and that's it. How does that sound?

minrk · 2012-06-13T07:57:07Z

sounds good to me.

minrk · 2012-06-13T08:17:53Z

What do we want in terms of UI for minor version mismatch?

This cannot (by definition of the minor version) cause breakage, so should we just ignore it?

Case 1: older->newer: nothing can be lost, but you might have access to minor new features that will be unavailable if you give the file back to the original author (which may be you).
Case 2: newer->older: some minor features visible to the author are not visible to you, but the notebook remains perfectly functional.

I'm inclined for case 1 to be silent, but I'm not 100% sure about 2.

fperez · 2012-06-13T08:29:23Z

On Wed, Jun 13, 2012 at 1:17 AM, Min RK
reply@reply.github.com
wrote:

I'm inclined for case 1 to be silent, but I'm not 100% sure about 2.

We're starting to want that notification area... But that's too much
for now. Ugly modal dialog in the time being? Or go silent for now?
I'm not too crazy about the silent option.

minrk · 2012-06-13T08:48:17Z

We've already got a notification area, where the "notebook saved", etc. messages go. But not a growl-style semi-persistent place for longer messages.

One more question:

What happens when a 3.0 user opens and then saves a 3.1 notebook? Does it downgrade the minor version? If it doesn't, they will get the warning about downgrading every time they open the notebook. But if it does, then the version number isn't particularly meaningful anymore, as the metadata written by 3.1 will still be in tact, but the 3.0 tag suggests that it is empty.

minrk · 2012-06-13T09:36:44Z

Okay, I have a simple message similar to the other nbformat one. It needs looking over by eyes less tired than mine right now, so I think I am done for the night.

The current behavior is that the notebook always saves with its version, so if you get a 3.2 notebook, but you only support 3.0, the next time you save it will be 3.0. Anything 3.2 wrote to the metadata will remain untouched, so there is no data loss, but this does mean that it is not a valid assumption that a 3.0 notebook has empty metadata - any 3.x notebook could have metadata written by any 3.(y≥x).

minrk · 2012-06-13T22:11:40Z

After reading over and playing with it, I don't have anything left to add.

Question still outstanding:

do we still want to store collapsed, and if so, do we put it in the new metadata dict?

It hasn't been stored since the js refactor, and I don't think anyone has even noticed, but I can see it being useful for demos.

ellisonbg · 2012-06-16T06:47:54Z

I like the idea of having cell and worksheet metadata and I think this implementation is solid. Not so sure about introducing the notion of the minor version. Can you summarize our policies on when the major and minor version will be incremented?

minrk · 2012-06-17T03:48:02Z

Major versions are updated when notebooks are no longer readable (new cell types, etc.). minor versions are updated as capabilities are added that can be safely ignored (metadata).

For instance, if we add cell metadata in 3.1, 3.2 (such as timestamps, hashes for unique links, etc.), but make no other changes, a client that can read 3.0 can read these notebooks just fine, but may lack some minor niceties introduced after the release. In the case of the metadata dicts, arbitrary amounts of information can go in there, and 3.0 will be able to read just fine, just ignoring that data.

One edge case right now: multiple worksheets - nbformat v3 supports multiple worksheets, but the javascript does not. If you open a notebook with multiple worksheets in the current interface, your next save will discard any sheets beyond the first (which you were not able to see, anyway).

minrk · 2012-06-17T08:48:13Z

Here is a question: the nbformat supports multiple worksheets just fine, but the javascript does not. How should we handle that? Should the javascript give a big warning dialog if multiple worksheets are detected? There's no logical reason for the nbformat itself to be revved if we add multiple worksheet support to the js in 0.14, but we need some way to indicate that 0.13 will discard a bunch of data if you try to open/re-save multi-sheet notebooks in 0.13.

minrk · 2012-06-17T23:33:31Z

I added a dialog warning that the extra worksheets will be lost, if you open a multi-worksheet notebook with the current JS.

fperez · 2012-06-18T00:14:02Z

Great. This is looking merge-ready for the beta, no?

minrk · 2012-06-18T00:18:15Z

yes, I think so.

fperez · 2012-06-18T00:19:13Z

OK, merging now. One more down...

Cell/Worksheet metadata * metadata dicts are attached to cells and worksheets * restores collapsed flag to the nbformat - this change happened in the refactor, and was undocumented, and possibly accidental. But we should either document it or fix it, and this includes a fix. * adds a new field, `nbformat_minor`, used to denote minor bumps of the notebook format that expose new capabilities but don't prevent loading by older clients. * Add a warning in Javascript if loading a multiworksheet notebook (which will exist in the future) as current JS code will only save the first. closes #1915

Cell/Worksheet metadata * metadata dicts are attached to cells and worksheets * restores collapsed flag to the nbformat - this change happened in the refactor, and was undocumented, and possibly accidental. But we should either document it or fix it, and this includes a fix. * adds a new field, `nbformat_minor`, used to denote minor bumps of the notebook format that expose new capabilities but don't prevent loading by older clients. * Add a warning in Javascript if loading a multiworksheet notebook (which will exist in the future) as current JS code will only save the first. closes ipython#1915

minrk added 3 commits June 12, 2012 19:12

add empty metadata field on cells/worksheets

6f3a3b2

These are unused for now, but will allow adding data to the notebook without a full version bump.

restore collapsed state for cells

081f278

js refactor removed this, but did not document the change. This restores it, but it would be fine to revert this and update the docs instead.

add nbformat is float

f9ccb76

int(nbformat) denotes readability

minrk added 2 commits June 13, 2012 02:20

add nbformat_minor for minor revisions to nbformat

59c2d87

add dialog when loading newer minor-version notebooks

b860834

tkf mentioned this pull request Jun 13, 2012

Add --no-pager argument to pycat #1947

Closed

minrk mentioned this pull request Jun 13, 2012

update some example notebooks #1954

Merged

add minor-version support to .py export

5d4da6e

add future-warning about multiple notebooks

383ef0c

fperez merged commit ca7d2a9 into ipython:master Jun 18, 2012

minrk deleted the cellmd branch March 31, 2014 23:36

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Cell/Worksheet metadata #1934

Cell/Worksheet metadata #1934

minrk commented Jun 13, 2012

fperez commented Jun 13, 2012

minrk commented Jun 13, 2012

minrk commented Jun 13, 2012

fperez commented Jun 13, 2012

minrk commented Jun 13, 2012

fperez commented Jun 13, 2012

minrk commented Jun 13, 2012

minrk commented Jun 13, 2012

fperez commented Jun 13, 2012

minrk commented Jun 13, 2012

minrk commented Jun 13, 2012

minrk commented Jun 13, 2012

ellisonbg commented Jun 16, 2012

minrk commented Jun 17, 2012

minrk commented Jun 17, 2012

minrk commented Jun 17, 2012

fperez commented Jun 18, 2012

minrk commented Jun 18, 2012

fperez commented Jun 18, 2012

Cell/Worksheet metadata #1934

Cell/Worksheet metadata #1934

Conversation

minrk commented Jun 13, 2012

fperez commented Jun 13, 2012

minrk commented Jun 13, 2012

minrk commented Jun 13, 2012

fperez commented Jun 13, 2012

minrk commented Jun 13, 2012

fperez commented Jun 13, 2012

minrk commented Jun 13, 2012

minrk commented Jun 13, 2012

fperez commented Jun 13, 2012

minrk commented Jun 13, 2012

minrk commented Jun 13, 2012

minrk commented Jun 13, 2012

ellisonbg commented Jun 16, 2012

minrk commented Jun 17, 2012

minrk commented Jun 17, 2012

minrk commented Jun 17, 2012

fperez commented Jun 18, 2012

minrk commented Jun 18, 2012

fperez commented Jun 18, 2012