IPEP 17: Notebook Format 4

Min RK edited this page Sep 27, 2013 · 8 revisions
Status: Active
Author: Min RK <benjaminrk@gmail.com>
Created: April 29, 2013
Updated: September 27, 2013

There are a few changes we need to make to the notebook format that will not be backward compatible. We do not intend to make these changes for 1.0, because nbformat changes are quite painful. This is a catalog of the changes we intend to make the next time we rev the nbformat.

Remove multiple worksheets

The worksheets field is a list, but we have no UI to support multiple worksheets. Our design has since shifted to a heading-cell-based structure, so we never intend to support the multiple-worksheet model. The worksheets list of lists shall be replaced with a single list, called cells.
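
As a rough illustration of this change, the sketch below collapses worksheets into one top-level cells list. It assumes a notebook held as a plain dict in which each worksheet is a dict with its own cells list; the function name flatten_worksheets is hypothetical and not part of nbformat.

```python
def flatten_worksheets(nb):
    """Return a copy of a notebook dict with one top-level ``cells`` list.

    Assumes each entry of ``nb["worksheets"]`` is a dict holding a
    ``cells`` list. Cells keep their order, worksheet by worksheet.
    """
    # Copy every top-level key except the one being replaced.
    flat = {key: value for key, value in nb.items() if key != "worksheets"}
    flat["cells"] = [
        cell
        for worksheet in nb.get("worksheets", [])
        for cell in worksheet.get("cells", [])
    ]
    return flat
```

Worksheet boundaries are discarded here, which is consistent with the decision not to support the multiple-worksheet model.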

Use mime-type output keys

We transform mimetype output data to short names, like json or png. These should be restored to the proper mimetype values used by the message spec, such as image/png and application/json. The output should be generated by a simple passthrough of the messages, rather than a whitelist transform.
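
For notebooks that already contain short names, a conversion step would need to map them back. The following is a minimal sketch of such a mapping, not the planned implementation: only png and json come from the text above, the other entries are assumed for illustration, and the table is deliberately incomplete. The name restore_mime_keys is hypothetical.

```python
# Short output keys mapped to mimetypes. "png" and "json" are named in
# this proposal; the remaining entries are illustrative assumptions.
SHORT_TO_MIME = {
    "png": "image/png",
    "json": "application/json",
    "jpeg": "image/jpeg",
    "html": "text/html",
    "svg": "image/svg+xml",
    "latex": "text/latex",
}


def restore_mime_keys(output):
    """Return a copy of one output dict with short keys renamed.

    Keys that are not in the table (for example ``output_type``) pass
    through unchanged.
    """
    return {SHORT_TO_MIME.get(key, key): value for key, value in output.items()}
```

New outputs would not need this table at all, since they are meant to be passed through from the messages with their mimetype keys intact.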

Remove Python-centric names

Following IPEP 13, Python-specific keys in the message spec and notebook will be removed. The changes affecting the notebook format are:

  • pyout will become execute_output
  • pyerr will become error
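
A sketch of how a converter might apply these two renames to a single output. It assumes the names appear as the value of an output_type field on each output dict; the helper name rename_output_type is hypothetical.

```python
# Renames listed in this proposal.
OUTPUT_TYPE_RENAMES = {
    "pyout": "execute_output",
    "pyerr": "error",
}


def rename_output_type(output):
    """Return a copy of an output dict with Python-specific types renamed.

    Assumes the type is stored under ``output_type``. Other types, and
    outputs without that field, are returned unchanged.
    """
    renamed = dict(output)
    old_type = renamed.get("output_type")
    if old_type in OUTPUT_TYPE_RENAMES:
        renamed["output_type"] = OUTPUT_TYPE_RENAMES[old_type]
    return renamed
```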

Make cell content key uniform

Currently text cells have a source key, which contains the text, and code cells have an input key. There is no reason for the two cell types to use different names for their content:

  • CodeCell.input will become CodeCell.source, matching TextCell.source.
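
The rename could be sketched as follows, assuming cells are plain dicts with a cell_type field whose value is "code" for code cells. The function name unify_source_key is hypothetical.

```python
def unify_source_key(cell):
    """Return a copy of a cell dict whose content lives under ``source``.

    Only code cells are touched; text cells already use ``source``.
    """
    cell = dict(cell)
    if cell.get("cell_type") == "code" and "input" in cell:
        cell["source"] = cell.pop("input")
    return cell
```

After this step, code that reads cell content can use cell["source"] without checking the cell type first.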

Metadata changes

  • remove notebook name from metadata
  • move language key from code cells to top-level notebook metadata
  • add kernel info to top-level notebook metadata in some form
  • add format key to raw_cell metadata
  • add state for show/hide (which we already have) and auto-scroll
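
Two of these changes (dropping the notebook name and moving language up from the code cells) can be sketched together. This is only an illustration under several assumptions: cells already live in a top-level cells list, the name is stored under metadata["name"], the top-level key is also called language, and the first code cell's language wins if cells disagree. The function name hoist_metadata is hypothetical.

```python
import copy


def hoist_metadata(nb):
    """Return a copy of a notebook dict with two metadata changes applied.

    * the notebook name is dropped from the top-level metadata
    * ``language`` is removed from each code cell and recorded once in
      the top-level metadata (first value seen is kept; an assumption)
    """
    nb = copy.deepcopy(nb)
    metadata = nb.setdefault("metadata", {})
    metadata.pop("name", None)
    for cell in nb.get("cells", []):
        if cell.get("cell_type") == "code" and "language" in cell:
            language = cell.pop("language")
            metadata.setdefault("language", language)
    return nb
```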

Implementation and Coordination

Tasks involved in creating nbformat v4:

  • thoroughly define the v4 spec
  • update message spec keys (pyout, pyerr, etc.)
  • mime-type keys for output (affects nbconvert, nbformat, javascript)
  • remove worksheets, move cells to top-level list
  • add conversions to nbformat: v3->v2, v4->v3, v3->v4
  • metadata changes
  • widget-related changes (TBD)
  • we may need a v4->v4 conversion to track changes to v4 during development. If so, it should probably not be included in the release, right?

I think this is the logical order of these tasks:

  1. Define v4 in a doc (not just changes, full spec - v3 was never fully defined)
  2. add downgrade API to nbformat (or nbconvert, unclear which), and implement v3->v2
  3. copy v3 to v4, adding empty v4->v3 and v3->v4, removing the py/json distinction (nbconvert is responsible for .py now)
  4. remove worksheet in v4
  5. update msg spec keys that are reflected in notebook
  6. use mime-type output keys
  7. update various metadata keys (this mainly affects javascript code)

v2<->v3 conversion APIs can be done while v4 is being defined, but no part of v4 should be implemented until the spec is documented. Incremental implementations of v4 features, starting with step 4, can be implemented in discrete PRs, probably on a v4 feature branch. Their order relative to each other isn't critically important.
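
One possible shape for the upgrade and downgrade API is a table of single-step converters that are chained to reach the target version. The sketch below is an assumption about structure, not the nbformat API: CONVERTERS, convert and _retag are hypothetical names, and the converters are the "empty" kind from step 3 that only rewrite the nbformat number.

```python
def _retag(version):
    """Build an empty converter that only rewrites the version number."""
    def converter(nb):
        return dict(nb, nbformat=version)
    return converter


# One entry per adjacent-version conversion listed in the tasks above.
# Real converters would replace these placeholders one PR at a time.
CONVERTERS = {
    (2, 3): _retag(3),
    (3, 2): _retag(2),
    (3, 4): _retag(4),
    (4, 3): _retag(3),
}


def convert(nb, to_version):
    """Convert a notebook dict one version step at a time."""
    current = nb["nbformat"]
    while current != to_version:
        step = current + 1 if to_version > current else current - 1
        if (current, step) not in CONVERTERS:
            raise ValueError("no conversion from v%d to v%d" % (current, step))
        nb = CONVERTERS[(current, step)](nb)
        current = step
    return nb
```

Chaining single steps keeps each converter small: a v4->v2 downgrade reuses v4->v3 and v3->v2 instead of needing its own code path.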

Each time a change is made to the in-development v4 spec:

  • update spec doc
  • update nbformat.v4
  • update v4->v3 and v3->v4
  • update v4->v4?
  • update javascript, if affected
  • update nbconvert, if affected
  • TEST EVERY NEW CHANGE
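
One inexpensive check to run after each change is a round trip: upgrade a v3 notebook, downgrade the result, and compare with the original. The sketch below covers only the worksheet change and assumes plain dicts with an nbformat field; upgrade, downgrade and roundtrips are hypothetical names, not nbformat functions.

```python
import copy


def upgrade(nb):
    """v3 -> v4 (sketch): move cells from worksheets to a top-level list."""
    nb = copy.deepcopy(nb)
    worksheets = nb.pop("worksheets", [])
    nb["cells"] = [cell for ws in worksheets for cell in ws.get("cells", [])]
    nb["nbformat"] = 4
    return nb


def downgrade(nb):
    """v4 -> v3 (sketch): wrap the top-level cells in a single worksheet."""
    nb = copy.deepcopy(nb)
    nb["worksheets"] = [{"cells": nb.pop("cells", [])}]
    nb["nbformat"] = 3
    return nb


def roundtrips(nb_v3):
    """True if upgrading and then downgrading reproduces the input."""
    return downgrade(upgrade(nb_v3)) == nb_v3
```

A notebook with more than one worksheet will not round-trip under this sketch, because the worksheet boundaries are intentionally dropped. A test suite would need to treat that loss as expected.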