# Writing nbdev plugins

> How to customize nbdev processors to do what you want

- order: 9

In [None]:
#| hide
from __future__ import annotations
from fastcore.test import *
from nbdev.showdoc import *
from nbdev.qmd import *

## What will this cover?

With `nbdev`, customizing the build to your own needs is *exceedingly* easy. Does your particular library or setup require you to inject some custom quarto bits in certain cells? Or do you want something simpler, such as a shortcut that replaces the `---` annotation used in quarto?

Writing custom plugins is exactly how you can achieve this, and it's quite trivial to do so! 

This tutorial *won't* cover the basics of nbdev; it assumes you already know how to navigate nbdev (what directives are, what `export` does, etc.).

## Getting started, how does `nbdev` make this easy?

First, let's define the problem we want to solve. For the sake of simplicity, let's say that instead of writing:

```
::: {.column-margin}
Some text
:::
```

we want a shorter way to write this, making use of how nbdev and quarto write their *directives*.

This tutorial will show how to end up with an API that looks like:

```
#| div .column-margin

Some text
```

And this will include cases where a div should span *multiple* cells as well, by specifying a `start` and an `end`.

This can be achieved in under 50 lines of code! 

`nbdev` lets us create what are called *processors* (this is how `export` will shove code into modules, for example). These processors are run on each **cell** of a notebook and can modify its contents. They can then be wrapped into a module the same way that nbdev does for `nbdev_export` or `nbdev_docs`; however, thanks to the power of extensions, going too deep into that actually isn't required!
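To make the idea concrete, here is a minimal sketch of the pattern in plain Python. `ShoutProc` and `apply_procs` are hypothetical stand-ins invented for this illustration, not nbdev's own classes; they only show the "call each processor on each cell and let it edit the cell in-place" idea.

```python
from types import SimpleNamespace

class ShoutProc:
    "Hypothetical processor: upper-cases the source of markdown cells"
    def cell(self, cell):
        # Modify the cell in-place, leaving non-markdown cells alone
        if cell.cell_type == "markdown":
            cell.source = cell.source.upper()

def apply_procs(cells, procs):
    "Hypothetical driver: run every processor over every cell, in order"
    for proc in procs:
        for cell in cells:
            proc.cell(cell)
    return cells

# Two stand-in cells; real notebook cells carry more fields than this
cells = [
    SimpleNamespace(cell_type="markdown", source="some text"),
    SimpleNamespace(cell_type="code", source="x = 1"),
]
apply_procs(cells, [ShoutProc()])
print([c.source for c in cells])
```

The real `Processor` and `NBProcessor` classes used later in this tutorial play these two roles.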

## Bringing in what we need

We truly don't need many imports from `nbdev`! Just two: `extract_directives`, which reads the list of `#|` directives written in a cell, and the `Processor` class, which will actually perform what we want. The rest of the imports are there to make our lives easier, as will be explained later.

In [None]:
#| exports
from nbdev.process import extract_directives
from nbdev.processors import Processor

from fastcore.basics import listify

from string import Template

Lastly, for *testing* purposes we'll use `nbdev`'s `mk_cell` and `NBProcessor` class, which let us mock running our processor on a real notebook!

In [None]:
from nbdev.processors import mk_cell, NBProcessor

## Writing a converter

The first step is creating a quick and easy way to take the quarto div we want to use (such as `.column-margin`) and convert it into something quarto will read (such as `::: {.column-margin}`). This one little string `Template` will do all of the dirty work for our layouts:

In [None]:
_LAYOUT_STR = Template("::: {$layout}\n${content}\n")

::: {.callout-tip}

This doesn't have to be a string template; I just found this the easiest to use!
:::
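As a quick check of what the template produces, here is a small standalone sketch. It repeats the same template string under the local name `layout_str` (a name chosen just for this example) so that it runs on its own.

```python
from string import Template

# Same template string as above, repeated so this snippet is self-contained
layout_str = Template("::: {$layout}\n${content}\n")

# `$layout` and `${content}` are the two placeholders being filled in;
# the surrounding `{` and `}` are literal characters
opened = layout_str.substitute(layout=".column-margin", content="Some text")
print(repr(opened))

# The template only opens the div; the closing `:::` is appended separately
closed = opened + ":::"
print(closed)
```

Leaving the closing `:::` out of the template is what makes it possible to keep a div open across several cells later on.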

Next we need to write a simple converter that operates at the *cell* level:

In [None]:
def convert_layout(
    cell:dict, # A single cell from a Jupyter Notebook
    is_multicell=False # Whether the div should be wrapped around multiple cells
):
    content = cell.source
    code = cell.source.splitlines(True)
    div_ = cell.directives_["div"]
    # We check if the `div` directive is marked with `end`
    if "end" in div_:
        # If it is just the end tag with nothing else
        if len(code) == 1:
            cell.source = ":::"
        else:
            cell.source += ":::"
    else:
        # Actually modify the code
        cell.source = _LAYOUT_STR.substitute(
            layout = " ".join(div_),
            content = content
        )
        if not is_multicell:
            cell.source += ":::"

Let's go into detail on what's happening here.

```python
    content = cell.source
```

The source text of whatever exists in a notebook cell will live in `.source`.

```python
    code = cell.source.splitlines(True)
```
Then I want to extract the content of the cell and split it into individual lines (passing `True` keeps each line's trailing newline). This lets us check if a cell just contains `#| div end`, which means that the div that was started earlier should stop.

```python
    div_ = cell.directives_["div"]
```
Any directives (the bits marked with `#| `) for any type of cell will exist in the `directives_` attribute as a dictionary that maps each directive name to a list of its arguments. For our particular processor we only care about the `div` directive.
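To picture the shape of that dictionary, here is a simplified stand-in written in plain Python. `parse_directives` is a hypothetical helper made up for this sketch; it is not how `extract_directives` is implemented and it ignores many cases the real function handles.

```python
def parse_directives(source, cmt="#"):
    "Simplified stand-in: collect leading `#| name args...` lines into a dict"
    prefix = cmt + "|"
    lines = source.splitlines()
    directives = {}
    i = 0
    # Only the leading run of directive lines is considered
    while i < len(lines) and lines[i].startswith(prefix):
        parts = lines[i][len(prefix):].split()
        if parts:
            # First word is the directive name, the rest are its arguments
            directives[parts[0]] = parts[1:]
        i += 1
    # Whatever follows the directive lines is the cell body
    return directives, "\n".join(lines[i:])

directives, body = parse_directives("#| div .column-margin start\nA test")
print(directives)
print(body)
```

The name `div` becomes the key, and everything after it on the line (`.column-margin`, `start`) ends up in the list that `convert_layout` reads as `div_`.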

```python
    if "end" in div_:
        # If it is just the end tag with nothing else
        if len(code) == 1:
            cell.source = ":::"
        else:
            cell.source += ":::"
    else:
        # Actually modify the code
        cell.source = _LAYOUT_STR.substitute(
            layout = " ".join(div_),
            content = content
        )
        if not is_multicell:
            cell.source += ":::"
```

From there, this last bit either appends the closing div marker (`:::`) to the cell, or uses the `_LAYOUT_STR` template made earlier to inject the boilerplate div markup for Quarto.

Let's see it in action:

In [None]:
cell = mk_cell(
    """#| div .margin-column
Here is something for the sidebar!""",
    cell_type="markdown"
)

`nbdev` will pull out those directives and store them in the cell `directives_` attribute using the `extract_directives` function:

In [None]:
cell.directives_ = extract_directives(cell, "#")
cell.directives_

{'div': ['.margin-column']}

And now we can test it out!

In [None]:
convert_layout(cell)
print(cell.source)

::: {.margin-column}
Here is something for the sidebar!
:::


Looks exactly like we wanted earlier! Great!
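The multi-cell path can be checked in the same spirit. The sketch below is a condensed copy of the converter (writing to `cell.source` in both branches) run against stub cells built with `SimpleNamespace`. The stubs are an assumption of this example: they stand in for real notebook cells whose directive lines have already been stripped from the source.

```python
from string import Template
from types import SimpleNamespace

layout_str = Template("::: {$layout}\n${content}\n")

def convert_layout(cell, is_multicell=False):
    "Condensed copy of the converter from this section"
    content = cell.source
    code = cell.source.splitlines(True)
    div_ = cell.directives_["div"]
    if "end" in div_:
        # Closing cell: emit or append the closing marker
        if len(code) == 1: cell.source = ":::"
        else: cell.source += ":::"
    else:
        # Opening cell: wrap the content, closing only for single-cell divs
        cell.source = layout_str.substitute(layout=" ".join(div_), content=content)
        if not is_multicell: cell.source += ":::"

# Stub for a cell written as `#| div .column-margin start`,
# with "start" already removed from the directive arguments
start = SimpleNamespace(source="First cell", directives_={"div": [".column-margin"]})
# Stub for a cell that held only `#| div end`
end = SimpleNamespace(source="", directives_={"div": ["end"]})

convert_layout(start, is_multicell=True)
convert_layout(end)
print(start.source + end.source)
```

The first cell opens the div and leaves it open; the second contributes only the closing `:::`.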

Now, how do we tell nbdev to use this and create this `Processor`?

## Writing a `Processor`

The second-to-last step is to create the actual `Processor` that nbdev uses to apply our changes. The basic idea is simple: create a class, have it inherit from `Processor`, and define any modifications in a `cell` method that takes in a `cell` and modifies it in-place:

In [None]:
class LayoutProc(Processor):
    "A processor that will turn `div` based tags into proper quarto ones"
    # A class attribute on whether a div should span multiple cells or not
    has_partial = False
    def cell(self, cell):
        # Only apply it to markdown cells that contain the "div" directive
        if cell.cell_type == "markdown" and "div" in cell.directives_:
            # Extract the directives
            div_ = cell.directives_["div"]
            # If it's the end of a multi-span div, convert the layout
            if self.has_partial and "end" in div_:
                convert_layout(cell)
            # Otherwise...
            else:
                is_start = div_[-1] == "start"
                # If we have "start" (which means multi-span div), 
                # change `has_partial` and remove start
                if is_start:
                    self.has_partial = True
                    div_.remove("start")
                # Finally apply the processor
                convert_layout(cell, is_start)

How can we test whether this works? Jupyter Notebooks are just specially formatted dictionaries, so we can write one by hand and use `nbdev`'s `dict2nb` to quickly convert it into a "true" notebook object. Then we can apply the processor to those cells through the `NBProcessor` class (what `nbdev` uses to apply these):
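For readers who have not looked inside an `.ipynb` file, the sketch below shows roughly what "specially formatted dictionary" means. It uses only the top-level keys of the version 4 notebook format and a single hand-written cell; real notebooks carry more metadata than this.

```python
import json

# A bare-bones notebook: a list of cells plus format/version bookkeeping
nb_dict = {
    "cells": [
        {
            "cell_type": "markdown",
            "metadata": {},
            "source": "#| div .column-margin\nA test",
        },
    ],
    "metadata": {},
    "nbformat": 4,
    "nbformat_minor": 5,
}

# On disk an .ipynb file is this dictionary serialized as JSON
text = json.dumps(nb_dict, indent=1)
round_tripped = json.loads(text)
print(round_tripped["cells"][0]["source"])
```

The test notebook built in the next cell only fills in the `cells` key, which is all this example needs.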

In [None]:
from nbdev.process import NBProcessor, dict2nb

In [None]:
nb = {
    "cells":[
    mk_cell("""#| div .column-margin
A test""", "markdown"),
    mk_cell("""#| div .column-margin start
A test""", "markdown"),
    mk_cell("""#| div end""", "markdown"),
]}

The `NBProcessor` takes in a list of procs (processors) that should be applied, and a notebook:

In [None]:
processor = NBProcessor(procs=LayoutProc, nb=dict2nb(nb))

The act of applying these processors is done through calling `.process()`:

In [None]:
processor.process()

And now we can run a small test to ensure that the cell contents have been changed to use the properly formatted quarto syntax:

In [None]:
test_eq(processor.nb.cells[0].source, "::: {.column-margin}\nA test\n:::")
test_eq(processor.nb.cells[1].source, "::: {.column-margin}\nA test\n")
test_eq(processor.nb.cells[2].source, ":::")

Great! We've successfully created a plugin for nbdev that lets us write quarto div directives in markdown with far less typing. Now how can we actually *use* this?

## How to enable the plugin on your project

This requires two simple changes to your `settings.ini`.

First, if, say, this were code that lived in `nbdev`, we could add a special `procs` key and specify where the processor comes from:

```ini
procs = 
    nbdev.extensions:LayoutProc
```

As you can see, it follows the format of `library.module:processor_name`.
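A string in this `module:name` shape can be turned into a Python object with the standard library alone. The sketch below illustrates the general technique; `resolve_proc` is a hypothetical helper, and this is not a claim about how nbdev itself resolves the `procs` entries.

```python
from importlib import import_module

def resolve_proc(spec):
    "Hypothetical helper: turn 'package.module:Name' into the object it names"
    module_name, _, attr = spec.partition(":")
    # Import the module on the left of the colon, then look up the name on the right
    return getattr(import_module(module_name), attr)

# A standard-library class is used here so the snippet runs without nbdev installed
cls = resolve_proc("string:Template")
print(cls)
```

With nbdev and the relevant library installed, the same pattern would apply to a spec such as `nbdev.extensions:LayoutProc`.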

If, say, this were something built in an external library (such as how this processor is based on the one that lives in [nbdev-extensions](https://muellerzr.github.io/nbdev-extensions)), you should add that library to the requirements of your project:
```ini
requirements = nbdev-extensions
```

And you're done! Now when calling `nbdev_docs` or `nbdev_preview` the processor we just made will be *automatically* applied to your notebooks and perform this conversion!
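If you want to double-check that both keys are written in a way an INI parser can read, a sketch like the following may help. It assumes the keys sit under a `[DEFAULT]` section and parses an inline string rather than your real `settings.ini`.

```python
from configparser import ConfigParser

# Inline stand-in for the relevant part of a settings.ini file
settings = """
[DEFAULT]
requirements = nbdev-extensions
procs =
    nbdev.extensions:LayoutProc
"""

cfg = ConfigParser()
cfg.read_string(settings)

# The indented lines under `procs =` are read as one multi-line value,
# so splitting on whitespace yields one entry per processor
procs = cfg["DEFAULT"]["procs"].split()
print(procs)
print(cfg["DEFAULT"]["requirements"])
```

Each additional processor would go on its own indented line under `procs =`.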

## Conclusion, nbdev-extensions and a bit about me!

Basically, if any part of a cell needs to look different, whether when exporting modules, building documentation, or running your own special post-processing command, it can be done quickly and efficiently with the `Processor` class nbdev provides!

If you're interested in seeing more examples of nbdev extensions and where you can take them, I (Zachary Mueller) have written a library dedicated to them called [nbdev-extensions](https://muellerzr.github.io/nbdev-extensions), where I turn any idea that may benefit how I approach nbdev into an extension for the world to use.

Thanks for reading!