## Introduction: `*.ipynb` and nbformat

### `*.ipynb`

* JSON on-disk representation
* several format versions exist
    * current major version: 4
* each version is defined by a JSON Schema

Straightforward questions: 

1. **Minimal structure** needed to meet the schema?
2. **Validate** a notebook against the schema?

### nbformat

Python library for simple programmatic notebook operations.

#### Minimal structure 

In [None]:
import nbformat
from nbformat.v4 import new_notebook
nb = new_notebook()
display(nb)

* `cells`: list
* `metadata`: dict
* `nbformat`, `nbformat_minor`: int, int
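The same four fields are what end up on disk. A minimal sketch, assuming `nbformat` is installed: serialize the empty notebook with `nbformat.writes` and parse the text back with the standard `json` module to look at the top-level keys. The variable names (`text`, `on_disk`, `top_level_keys`) are only illustrative.

```python
import json

import nbformat
from nbformat.v4 import new_notebook

nb = new_notebook()

# Serialize to the JSON text that would be stored in an .ipynb file.
text = nbformat.writes(nb)

# Parse it with the standard library to inspect the raw structure.
on_disk = json.loads(text)
top_level_keys = sorted(on_disk)

print(top_level_keys)
print(on_disk["nbformat"], on_disk["nbformat_minor"])
```

The minor version printed depends on the installed `nbformat` release.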

#### Validate

In [None]:
nbformat.validate(nb)

What happens if it's invalid?

In [None]:
# `pizza` is not a field the schema allows at the top level
nb.pizza = True
nbformat.validate(nb)
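In a script you would usually catch the failure rather than let it propagate. A minimal sketch, assuming that `nbformat.validate` raises `nbformat.ValidationError` for a notebook that does not meet the schema. The names `bad`, `is_valid` and `problem` are illustrative.

```python
import nbformat
from nbformat import ValidationError
from nbformat.v4 import new_notebook

bad = new_notebook()
bad["pizza"] = True  # same as `bad.pizza = True`: an unexpected top-level key

try:
    nbformat.validate(bad)
    is_valid = True
    problem = None
except ValidationError as err:
    is_valid = False
    problem = str(err)  # human-readable description of the schema violation

print(is_valid)
```

The exact wording of the error message depends on the validator backend in use, so it is safer to branch on the exception type than on its text.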

#### Cells and their `source`s

Before we can add cells, we need to create them.

* Three types of cells:
    * code_cell
    * markdown_cell
    * raw_cell

In [None]:
from nbformat.v4 import new_code_cell, new_markdown_cell, new_raw_cell

#### Markdown cells

In [None]:
nb = new_notebook() # lose the pizza
md = new_markdown_cell("First argument is the source.",
                       attachments={})
display(md)
nb.cells.append(md)

* `cell_type`: str, "markdown"
* `metadata`: dict
* `source`: str or list of strings
* `attachments`: (optional) dict of mimebundles

#### Raw cells

In [None]:
raw = new_raw_cell("Sources can be one (multi-line)\nstring.", 
                   attachments={})
display(raw)
nb.cells.append(raw)

* `cell_type`: str, "raw"
* `metadata`: dict
* `source`: str or list of strings
* `attachments`: (optional) dict of mimebundles

#### Code cells

In [None]:
code = new_code_cell(["# Sources can also be a list of strings.\n", 
                      "print('like this example')"])
display(code)
nb.cells.append(code)

* `cell_type`: str, "code"
* `execution_count`: `None` or int
* `metadata`: dict
* `outputs`: list
* `source`: str or list of strings
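The two forms of `source` show up when a notebook is written and read back. A hedged sketch, assuming the default behavior of `nbformat.writes` (multi-line strings are split into a list of lines in the JSON) and `nbformat.reads` (lists are joined back into one string). The cell contents here are made up for illustration.

```python
import json

import nbformat
from nbformat.v4 import new_code_cell, new_markdown_cell, new_notebook

nb = new_notebook()
nb.cells.append(new_markdown_cell("line one\nline two"))
nb.cells.append(new_code_cell("print('hello')"))
nbformat.validate(nb)  # raises if the assembled notebook is invalid

# On disk: the multi-line string is stored as a list of lines.
text = nbformat.writes(nb)
on_disk_source = json.loads(text)["cells"][0]["source"]

# In memory, after reading: the list is joined back into a single string.
reloaded = nbformat.reads(text, as_version=4)
in_memory_source = reloaded.cells[0].source

print(on_disk_source)
print(repr(in_memory_source))
```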

## Creating outputs

Need to specify the output type:

* stream
* display_data
* execute_result
* error

NB: `display_data` and `execute_result` have data that are keyed by mimetypes. This allows a single output to have more than one mimetype. 
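A minimal sketch of building one output of each type with `nbformat.v4.new_output` and attaching them to a code cell. The texts, mimebundle contents and traceback lines are invented for illustration and are not what a real kernel would produce.

```python
from nbformat.v4 import new_code_cell, new_output

# stream: plain text written to stdout or stderr
stream = new_output("stream", name="stdout", text="hello\n")

# display_data: one output, two representations keyed by mimetype
rich = new_output(
    "display_data",
    data={"text/plain": "<Figure>", "text/html": "<b>Figure</b>"},
)

# execute_result: like display_data, plus the execution count
result = new_output(
    "execute_result",
    data={"text/plain": "2"},
    execution_count=1,
)

# error: exception name, value and traceback lines
error = new_output(
    "error",
    ename="ZeroDivisionError",
    evalue="division by zero",
    traceback=["Traceback (most recent call last):", "ZeroDivisionError: division by zero"],
)

cell = new_code_cell(
    "1 + 1",
    execution_count=1,
    outputs=[stream, rich, result, error],
)
output_types = [out.output_type for out in cell.outputs]
print(output_types)
```

`new_output` validates each output as it is created, so a missing or misspelled field fails early rather than when the whole notebook is validated.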

For more on messages and output types, check out Matthias Bussonnier and Paul Ivanov's talk:  
[Jupyter: Kernels, protocols, and the IPython reference implementation](https://conferences.oreilly.com/jupyter/jup-ny/public/schedule/detail/63159)

## Metadata

Three types
* notebook-level, `nb.metadata`
* cell-level, `nb.cells[0].metadata`
* output-level (for display_data and execute_result types), `nb.cells[0].outputs[0].metadata`


Arbitrary content, with some reserved Jupyter-specific fields.
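The three levels can be sketched as follows. The key `my_project` is a hypothetical, non-reserved field used only to show that arbitrary content is accepted. `collapsed` and `isolated` are reserved fields from the lists in this section.

```python
import nbformat
from nbformat.v4 import new_code_cell, new_notebook, new_output

nb = new_notebook()
out = new_output("display_data", data={"text/plain": "<Figure>"})
nb.cells.append(new_code_cell("fig", outputs=[out]))

# Notebook level: arbitrary content (hypothetical key)
nb.metadata["my_project"] = {"reviewed": True}

# Cell level: a reserved field
nb.cells[0].metadata["collapsed"] = True

# Output level: only display_data and execute_result outputs carry metadata
nb.cells[0].outputs[0].metadata["isolated"] = True

nbformat.validate(nb)  # the notebook is still valid with all three set
```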

### Notes on notebook metadata

#### Reserved notebook metadata fields:
* `kernelspec`
* `language_info`
* `authors` 
* `title`
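A hedged sketch of filling in the reserved notebook-level fields. It assumes the v4 schema requirements that `kernelspec` needs `name` and `display_name` and that `language_info` needs `name`. All values shown are illustrative.

```python
import nbformat
from nbformat.v4 import new_notebook

nb = new_notebook()

# Which kernel to start for this notebook (illustrative values)
nb.metadata["kernelspec"] = {
    "name": "python3",
    "display_name": "Python 3",
    "language": "python",
}

# Information about the language of the code cells
nb.metadata["language_info"] = {"name": "python"}

# Document-level descriptive fields
nb.metadata["title"] = "An example notebook"
nb.metadata["authors"] = [{"name": "A. Author"}]

nbformat.validate(nb)  # raises if a reserved field has the wrong shape
```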

### Notes on cell metadata

#### Reserved cell metadata fields (all are optional):
* `deletable`
* `collapsed`
* `autoscroll`
* `jupyter` (jupyter metadata namespace, for internal use)
* `tags` (useful for semantic customization)
* `name` (should be unique in notebook) 


For **raw** cells:
* `format`
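As one example of what `tags` make possible, cells can be selected by tag with a plain list comprehension. A minimal sketch: the tag name `setup` and the cell contents are made up for illustration.

```python
from nbformat.v4 import new_code_cell, new_markdown_cell, new_notebook

nb = new_notebook()
nb.cells.append(new_markdown_cell("# Title"))
nb.cells.append(new_code_cell("x = 1", metadata={"tags": ["setup"]}))
nb.cells.append(new_code_cell("x + 1"))

# `tags` is optional, so fall back to an empty list when it is absent.
setup_cells = [
    cell for cell in nb.cells
    if "setup" in cell.metadata.get("tags", [])
]

print(len(setup_cells))
```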

### Notes on output metadata

#### Reserved output metadata
* `isolated`

#### NB: on `display_data` and `execute_result` metadata
Just like their `data` objects, these values are objects keyed by their mimetypes.
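A sketch of mimetype-keyed metadata, assuming the common convention of attaching `width` and `height` to an `image/png` entry. The payload below is placeholder bytes, not a real PNG, and the dimensions are invented.

```python
import base64

from nbformat.v4 import new_output

# Placeholder payload: base64 text standing in for real PNG bytes.
png_b64 = base64.b64encode(b"not a real PNG").decode("ascii")

out = new_output(
    "display_data",
    data={
        "image/png": png_b64,
        "text/plain": "<Figure>",
    },
    # Metadata uses the same mimetype keys as `data`.
    metadata={"image/png": {"width": 640, "height": 480}},
)

shared_keys = sorted(set(out.data) & set(out.metadata))
print(shared_keys)
```

Only the mimetypes that need extra information get a metadata entry; here `text/plain` has none.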