IPEP 20: Informal structure of cell metadata
Clone this wiki locally
|Author||Matthias Bussonnier <firstname.lastname@example.org>|
|Created||August 7, 2013|
This IPEP proposes to add informal documentation for expected metadata fields and their structure.
Metadata are optional
Unlike the rest of the fields of the
.ipynb format, which should be strictly
defined, metadata can support arbitrary JSON structures. Hence, it is difficult
for third-party plugins to know what they can write/read and what the
expected structure of data is that they are supposed to handle.
The 3 types of metadata available are:
- notebook-level metadata;
- cell-level metadata;
- mimetype-level metadata.
[Note that there is no worksheet-level metadata, since the worksheet structure will be removed in 2.0.]
This proposal describes the use of cell-level metadata, but we should bear in mind that such a description should also be done for the 2 other types of metadata.
General advice on metadata
IPython notebook file are meant to be cross-language, hence the file format should help with this as much as possible.
For cross-language compatibility and efficiency, it is advised to avoid having the same field in the metadata structure being able to store data of different type.
For the same reason, it is advised in general to avoid when possible to rely on
None values, as well as value that could be considered as
such. In particular some languages might not make the distinction between the
absence of the field and its value being set to undefined or fact that empty
'', the number
0 (zero) and the boolean
false might be
Implementation should not make any assumption in the metadata presence or the value contained in it. The metadata filed itself should always be present in the relevant places, but should be set to at least empty dictionary when no metadata is present. Application/plugin should try as much as possible to keep the structure of metadata they don't know how to handle unless if requested by user interaction. This encompass both unknown field, as well as known field with unknown structure/type.
avoid name conflict
To avoid name collision, it is advised to avoid generic name like
..., and recommended to store plugin/application specific value
under a subkey with an explicit name.
Cell metadata fields:
For IPython Notebook 2.0, it has been decided to include two optional fields in
metadata that could be expected by the Notebook and NbViewer,
tags. The current proposal will explain the goal and usage of the
tags fields, as well as a few good practices when reading/writing them.
These two fields will be added because of the need when converting a notebook to be able to identify an individual cell, as well as groups of cells. This is not needed for the correct use of the notebook, so such data belong in the metadata field of a given cell.
name field is designed to contain a value of type
String which is a
priori unique in a notebook, and which should allow to reference a cell in a
unique way. This field (if present) will mainly be used in the user interface,
so it is recommended to be meaningful to the user, easy to type and easy to
The naming of the key as
name was chosen in favor of
id as it is more
meaningful for the user;
id is reserved for future usage, as a mandatory
field, not in metadata that will not be user-modifiable. The concept of the
name field is similar to that of
id for HTML tags.
When reading these values, an implementation should be able to make the
assumption that the
name field is unique across a notebook. When writing this
value to a cell metadata field, implementations should do their best to avoid
names across cells, warning the user about
duplication and proposing an alternative.
As well as for other metadta, and in addition,
the value of the name field should be a non-empty String. If the
value is an empty string or
undefined, an implementation is able to
assume that the field is non-existent or not set. If the value is
something other than a string, implementations can also act as if
the field was not set, but are advised to do their best not to modify
the current value unless explicitly requested by the user.
Do we stick to ASCII? Do we recommend not to use spaces? Try to avoid quotes? -- single? double? Do we want case sensitivity?
- a list of strings
with the following properties:
- each string avoids commas, since these are often used as separators when a user types.
- the order doesn't count
- each string must be unique
- basically the same recommendation as for
name, except for uniqueness?