---
title: 
tags: 小书匠,
grammar_cjkRuby: true
renderNumberedHeading: true
---

[toc]

# Using nbconvert as a library

Converting a notebook with nbconvert happens in a few steps:

1. Retrieve the notebook and its accompanying resources (you are responsible for this).
2. Feed the notebook into the Exporter, which:
    - Sequentially feeds the notebook into an array of Preprocessors. Preprocessors only act on the structure of the notebook, and have unrestricted access to it.
    - Feeds the notebook into the Jinja templating engine, which converts it to a particular format depending on which template is selected.
3. The exporter returns the converted notebook and other relevant resources as a tuple.
4. You write the data to the disk using the built-in FilesWriter (which writes the notebook and any extracted files to disk), or elsewhere using a custom Writer.

## Retrieve notebook

In [1]:
notebook_name = 'test.ipynb'

In [2]:
import nbformat

# Read as UTF-8 explicitly; the notebook contains non-ASCII text
with open(notebook_name, encoding='utf-8') as f:
    content = f.read()

notebook = nbformat.reads(content, as_version=4)
notebook.cells[0]

{'cell_type': 'markdown', 'metadata': {}, 'source': '这是个测试文档'}
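
As a variation on the cell above, `nbformat.read` accepts an open file handle directly, which avoids the intermediate string. The sketch below builds a small notebook in memory, writes it to a temporary file, and reads it back. The file name `demo.ipynb` and the cell contents are made up for illustration and are not part of the original notebook.

```python
import os
import tempfile

import nbformat
from nbformat.v4 import new_code_cell, new_markdown_cell, new_notebook

# Build a tiny notebook in memory (cell contents are placeholders).
demo_nb = new_notebook(cells=[
    new_markdown_cell("A test document"),
    new_code_cell("print('hello')"),
])

# Round-trip through a temporary file, always using UTF-8.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.ipynb")
with open(demo_path, "w", encoding="utf-8") as f:
    nbformat.write(demo_nb, f)

with open(demo_path, encoding="utf-8") as f:
    loaded = nbformat.read(f, as_version=4)

# Cells are NotebookNode objects: dicts that also allow attribute access.
first_cell = loaded.cells[0]
print(first_cell.cell_type, "|", first_cell["source"])
```

Both `first_cell.source` and `first_cell["source"]` refer to the same value, which is why the preprocessors later in this note can use attribute access on cells.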

## Use Exporter

In [3]:
from traitlets.config import Config
from nbconvert import MarkdownExporter

# 2. Instantiate the exporter. MarkdownExporter uses its default Markdown template
# (`classic` is an HTML template, so it is not set here); customizing the exporter
# with preprocessors is covered later.
md_exporter = MarkdownExporter()

# 3. Process the notebook we loaded earlier
(body, resources) = md_exporter.from_notebook_node(notebook)

In [4]:
print(body)  # body is the converted Markdown text

这是个测试文档


```python
import matplotlib.pyplot as plt
import numpy as np

plt.plot(np.arange(10))
```




    [<matplotlib.lines.Line2D at 0x1159415c0>]




    
![png](output_1_1.png)
    


# References

test


```python
!ls
```

    Jupyter nbconvert.ipynb output_1_1.png          test.ipynb
    out                     test




In [6]:
resources  # a dict of additional resources produced by the export
resources['outputs']

{'output_1_1.png': b'\x89PNG\r\n\x1a\n\x00\x00\x00\rIHDR...'}
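
To see where the entries in `resources['outputs']` come from without running matplotlib, the sketch below attaches a fake image output to an in-memory notebook and exports it. The bytes are a placeholder that only starts with the PNG signature, and `demo_nb`, `demo_body`, and `demo_resources` are names chosen for this illustration. It assumes the Markdown exporter's default output extraction is left enabled.

```python
import base64

from nbformat.v4 import new_code_cell, new_notebook, new_output
from nbconvert import MarkdownExporter

# Placeholder bytes that only start with the PNG signature; not a real image.
fake_png = b"\x89PNG\r\n\x1a\n" + b"placeholder"

# Notebook files store image outputs as base64 text under their MIME type.
image_output = new_output(
    "display_data",
    data={"image/png": base64.b64encode(fake_png).decode("ascii")},
)
demo_nb = new_notebook(cells=[new_code_cell("plot()", outputs=[image_output])])

demo_body, demo_resources = MarkdownExporter().from_notebook_node(demo_nb)

# resources['outputs'] maps generated file names to the decoded bytes.
for name, data in demo_resources["outputs"].items():
    print(name, len(data), "bytes")
```

The exporter decodes the base64 text back into raw bytes, so each value can be written to disk in binary mode as-is.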

### Custom preprocessor

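To write a custom preprocessor, simply subclass [Preprocessor](https://github.com/jupyter/nbconvert/blob/d795db507db1a507343bfa7fe3c0e62d97bd6a87/nbconvert/preprocessors/base.py). Two methods can be overridden: `preprocess`, which receives the whole notebook, and `preprocess_cell`, which is called once for each cell.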

In [None]:
from nbconvert.preprocessors import Preprocessor


class RemoveEmptyCell(Preprocessor):
    """Drop cells whose source is empty or whitespace only."""

    def preprocess(self, nb, resources):
        removed_index = set()
        for i, cell in enumerate(nb.cells):
            if cell.source.strip() == "":
                removed_index.add(i)  # sets use add(), not append()
        nb.cells = [cell for i, cell in enumerate(nb.cells) if i not in removed_index]
        return nb, resources


class AddReferences(Preprocessor):
    """Insert a link right below a Markdown '# References' heading."""

    def preprocess_cell(self, cell, resources, index):
        if cell.cell_type == "markdown" and cell.source.startswith("# References"):
            self.log.debug("Adding a reference link to cell %d", index)
            lines = cell.source.split('\n')
            lines.insert(1, "- http://localhost:8888/lab/tree/learnPython/Jupyter/Jupyter%20nbconvert.ipynb")
            cell.source = "\n".join(lines)
        return cell, resources


class ProcessBashCell(Preprocessor):
    """Turn code cells made up only of `!command` lines into bash cells."""

    def preprocess_cell(self, cell, resources, index):
        # Only code cells qualify; a Markdown cell may start with "!" (an image).
        if cell.cell_type != "code":
            return cell, resources
        lines = cell.source.strip().split('\n')
        if all(line.startswith("!") for line in lines):
            cell.source = "\n".join(line.lstrip("!") for line in lines)
            # Set a single key instead of replacing the whole metadata dict
            cell.metadata['magics_language'] = 'bash'
        return cell, resources


config = Config()
config.MarkdownExporter.preprocessors = [AddReferences, RemoveEmptyCell, ProcessBashCell]
exporter = MarkdownExporter(config=config)

# Re-export with the custom preprocessors so that `body` and `resources`
# hold the processed result
(body, resources) = exporter.from_notebook_node(notebook)
print(body)
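
When developing a preprocessor, it can help to run it on a small in-memory notebook before wiring it into an exporter. The sketch below is an illustration under stated assumptions, not part of the original notebook: `DropEmptyCells` and `MarkShellCells` are hypothetical stand-ins for the classes above. Calling `.preprocess()` directly sidesteps the `enabled` flag, which is off by default on a bare `Preprocessor` and is normally switched on by the exporter when the class is listed in `preprocessors`.

```python
from nbformat.v4 import new_code_cell, new_markdown_cell, new_notebook
from nbconvert.preprocessors import Preprocessor


class DropEmptyCells(Preprocessor):
    """Notebook-level hook: filter the whole cell list at once."""

    def preprocess(self, nb, resources):
        nb.cells = [cell for cell in nb.cells if cell.source.strip()]
        return nb, resources


class MarkShellCells(Preprocessor):
    """Cell-level hook: called once per cell by the default preprocess()."""

    def preprocess_cell(self, cell, resources, index):
        lines = cell.source.strip().split("\n")
        if cell.cell_type == "code" and all(line.startswith("!") for line in lines):
            cell.source = "\n".join(line[1:] for line in lines)
            cell.metadata["magics_language"] = "bash"
        return cell, resources


demo_nb = new_notebook(cells=[
    new_markdown_cell("# Title"),
    new_code_cell(""),
    new_code_cell("!ls\n!pwd"),
])

# Call preprocess() directly, in the same order an exporter would apply them.
demo_resources = {}
for preprocessor in (DropEmptyCells(), MarkShellCells()):
    demo_nb, demo_resources = preprocessor.preprocess(demo_nb, demo_resources)

print([cell.source for cell in demo_nb.cells])
```

Checking the cell list after each step makes it easy to see which preprocessor is responsible for a change, which is harder to do from the rendered Markdown alone.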

In [None]:
import os
output_dir = os.path.splitext(notebook_name)[0]
notebook_basename = os.path.basename(output_dir)

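os.makedirs(output_dir, exist_ok=True)

# Write the Markdown body as UTF-8 so non-ASCII text survives on every platform
with open(os.path.join(output_dir, "{}.md".format(notebook_basename)), 'w', encoding='utf-8') as f:
    f.write(body)

# Extracted outputs (such as images) are raw bytes, so write them in binary mode
for filename, content in resources['outputs'].items():
    with open(os.path.join(output_dir, filename), 'wb') as f:
        f.write(content)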
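
The built-in `FilesWriter` mentioned in the overview can replace the manual loop above. A minimal sketch, assuming a throwaway notebook and a temporary build directory (both made up for illustration): the writer saves the converted body together with everything in `resources['outputs']`.

```python
import os
import tempfile

from nbformat.v4 import new_markdown_cell, new_notebook
from nbconvert import MarkdownExporter
from nbconvert.writers import FilesWriter

# Throwaway notebook so the sketch runs on its own.
demo_nb = new_notebook(cells=[new_markdown_cell("# Demo")])
demo_body, demo_resources = MarkdownExporter().from_notebook_node(demo_nb)

# build_directory is where the writer puts the files; a temp dir is used here.
build_dir = tempfile.mkdtemp()
writer = FilesWriter(build_directory=build_dir)

# The file extension comes from demo_resources['output_extension'].
writer.write(demo_body, demo_resources, notebook_name="demo")

print(os.listdir(build_dir))
```

Because the extension is taken from the exporter's resources, the same writer call works unchanged if the exporter is swapped for another format.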

# References
- [Using nbconvert as a library — nbconvert 6.0.8.dev0 documentation](https://nbconvert.readthedocs.io/en/latest/nbconvert_library.html#Example)