Commit

Merge pull request #31 from innolitics/rich-text-extensions
Add test for weird side-case
johndgiese committed Jun 29, 2022
2 parents 25aaf0c + 6433faa commit 7855782
Showing 22 changed files with 1,377 additions and 459 deletions.
101 changes: 84 additions & 17 deletions README.md
@@ -52,30 +52,40 @@ n2y PAGE_LINK > page.md

## Plugins

At the core of n2y is a set of Python classes that subclass the `Block` class. These classes are responsible for converting the Notion data into pandoc abstract syntax tree objects. We use a python wrapper library that makes it easier to work with pandoc's AST. See [here](https://boisgera.github.io/pandoc/document/) for details. See the [Notion API documentation](https://developers.notion.com/reference/block) for details about their data structures.
At the core of n2y is a set of Python classes that represent the various parts of a Notion workspace:

The default implementation of these block classes can be modified using a plugin system. To create a plugin, follow these steps:
| Notion Object Type | Description |
| --- | --- |
| Page | Represents a Notion page (which may or may not be in a database) |
| Database | A Notion database, which can also be thought of as a set of Notion pages with some structured metadata, or properties |
| Property | A type descriptor for a property (or column) in a Notion database |
| PropertyValue | A particular value that a particular page in a database has for a particular Property |
| Block | A bit of content within a Page |
| RichTextArray | A sequence of formatted text in Notion; present in many blocks and property values |
| RichText | A segment of text with the same styling |
| Mention | A reference to another Notion object (e.g., a page, database, block, user, etc.) |
| User | A Notion user; used in property values and in page, block, and database metadata |
| File | A file |

1. Create a new Python file.
2. Subclass the various Block classes and modify the `to_pandoc` methods as desired
3. Run n2y with the `--plugins` argument pointing to your python module.

The `Property`, `PropertyValue`, `Block`, `RichText`, and `Mention` classes have subclasses that represent the various subtypes. E.g., there is a `ParagraphBlock` that represents a paragraph.

These classes are responsible for converting the Notion data into pandoc abstract syntax tree objects. We use a python wrapper library that makes it easier to work with pandoc's AST. See [here](https://boisgera.github.io/pandoc/document/) for details. See the [Notion API documentation](https://developers.notion.com/reference/block) for details about their data structures.

The default implementation of these classes can be modified using a plugin system. To create a plugin, follow these steps:

1. Create a new Python module
2. Subclass the various Notion classes, modifying their constructor or `to_pandoc` method as desired
3. Run n2y with the `--plugin` argument pointing to your Python module

See the [builtin plugins](https://github.com/innolitics/n2y/tree/rich-text-extensions/n2y/plugins) for examples.

### Example Plugin File

```python
from n2y.converter import ParagraphBlock


class ParagraphBlockOverride(ParagraphBlock):
    def to_pandoc(self):
        # Add custom code here. Call super().to_pandoc() to get default implementation.
        return super().to_pandoc()


# Add classes to override here
exports = {
    'ParagraphBlock': ParagraphBlockOverride
}
```
### Using Multiple Plugins

You can use multiple plugins. If two plugins provide classes for the same Notion object, then the last one that was loaded will be instantiated.

Often you'll want to use a different class only in certain situations. For example, you may want to use a different Page class with its own unique behavior only for pages in a particular database.

If your plugin class raises the `n2y.errors.UseNextClass` exception in its constructor, then n2y will move on to the next class (which may be the builtin class if only one plugin was used).
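
The fallback behavior described above can be illustrated with a small standalone sketch. It does not import n2y: `UseNextClass` here is a local stand-in for `n2y.errors.UseNextClass`, and `BuiltinPage`, `ReportPage`, and `instantiate` are hypothetical names invented for this example, not n2y's actual implementation.

```python
class UseNextClass(Exception):
    """Local stand-in for n2y.errors.UseNextClass so the sketch runs alone."""


class BuiltinPage:
    """Hypothetical default class."""

    def __init__(self, notion_data):
        self.title = notion_data["title"]


class ReportPage(BuiltinPage):
    """Hypothetical plugin class that only handles pages from one database."""

    def __init__(self, notion_data):
        if notion_data.get("database") != "reports":
            # Decline this page so that the next candidate class is tried.
            raise UseNextClass()
        super().__init__(notion_data)


def instantiate(candidates, notion_data):
    """Try the most recently loaded class first; fall back on UseNextClass."""
    for cls in reversed(candidates):
        try:
            return cls(notion_data)
        except UseNextClass:
            continue
    raise LookupError("no class accepted this object")


# Load order: builtin first, plugin last.
classes = [BuiltinPage, ReportPage]
report = instantiate(classes, {"title": "Q2", "database": "reports"})
other = instantiate(classes, {"title": "Notes", "database": "wiki"})
print(type(report).__name__, type(other).__name__)  # ReportPage BuiltinPage
```

Raising the exception in the constructor keeps the decision about which pages a class handles inside the plugin class itself.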

### Default Block Classes

@@ -93,6 +103,7 @@ Here are the default block classes that can be extended:
| HeadingTwoBlock | |
| HeadingThreeBlock | |
| ImageBlock | It uses the URL for external images, but downloads uploaded images to the `MEDIA_ROOT` and replaces the path with a relative url based off of `MEDIA_URL`. The "caption" is used for the alt text. |
| FileBlock | Acts the same way as the ImageBlock, except that in the documents it only ever shows the URL. |
| NumberedListItemBlock | |
| ParagraphBlock | |
| QuoteBlock | |
@@ -104,6 +115,30 @@ Here are the default block classes that can be extended:

Most of the Notion blocks can generate their pandoc AST from _only_ their own data. The one exception is the list item blocks; pandoc, unlike Notion, has an encompassing node in the AST for the entire list. The `ListItemBlock.list_to_pandoc` class method is responsible for generating this top-level node.
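
To illustrate why list items need special handling, here is a rough standalone sketch that groups consecutive list items under a single enclosing list node. It is not how `ListItemBlock.list_to_pandoc` is implemented: blocks are modeled as plain `(kind, text)` tuples, and `group_list_items` is a hypothetical helper. The strings `BulletList` and `OrderedList` mirror the names of pandoc's list nodes.

```python
from itertools import groupby

# Notion list item block types mapped to the enclosing pandoc list node name.
LIST_KINDS = {
    "bulleted_list_item": "BulletList",
    "numbered_list_item": "OrderedList",
}


def group_list_items(blocks):
    """Wrap each run of consecutive list items in one list node.

    `blocks` is a list of (kind, text) tuples standing in for Notion blocks.
    """
    result = []
    for kind, run in groupby(blocks, key=lambda block: block[0]):
        texts = [text for _, text in run]
        if kind in LIST_KINDS:
            # One encompassing node for the whole run of items.
            result.append((LIST_KINDS[kind], texts))
        else:
            # Non-list blocks convert one-to-one.
            result.extend((kind, text) for text in texts)
    return result


blocks = [
    ("paragraph", "intro"),
    ("bulleted_list_item", "first"),
    ("bulleted_list_item", "second"),
    ("paragraph", "outro"),
]
grouped = group_list_items(blocks)
```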

## Built-in Plugins

N2y provides a few builtin plugins. Brief descriptions are provided below, but see [the code](https://github.com/innolitics/n2y/tree/rich-text-extensions/n2y/plugins) for details.

### Deep Headers

Notion only supports three levels of headers, but sometimes this is not enough. This plugin enables support for h4 and h5 headers in the documents exported from Notion. Any Notion h3 whose text begins with the characters "= " is converted to an h4, and any h3 that begins with "== " is converted to an h5, and so on.
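
The prefix rule can be sketched as a small function. This is an illustration of the mapping described above, not the plugin's actual code; `deep_header_level` is a hypothetical name, and it assumes the marker must be followed by a single space.

```python
import re


def deep_header_level(text, base_level=3):
    """Return (heading level, text without the marker) for an h3's text.

    Each leading "=" before the first space pushes the heading one level
    deeper; text without a marker stays at the base level.
    """
    match = re.match(r"^(=+) (.*)$", text, flags=re.DOTALL)
    if match is None:
        return base_level, text
    markers, rest = match.groups()
    return base_level + len(markers), rest


print(deep_header_level("= Scope"))     # (4, 'Scope')
print(deep_header_level("== Details"))  # (5, 'Details')
print(deep_header_level("Overview"))    # (3, 'Overview')
```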

### Remove Callouts

Completely remove all callout blocks. It's often helpful to include help text in callout blocks, but usually this help text should be stripped out of the final generated documents.

### Raw Fenced Code Blocks

Any code block whose caption begins with "{=language}" will be made into a raw block for pandoc to parse. This is useful if you need to drop into Raw HTML or other formats. See [the pandoc documentation](https://pandoc.org/MANUAL.html#generic-raw-attribute) for more details on the raw code blocks.
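
As a rough sketch of the caption check, the function below extracts the format name from a caption that starts with a `{=language}` marker. The name `raw_format` and the exact set of characters accepted in the format name are assumptions made for this example; the plugin's own parsing may differ.

```python
import re

# Matches a leading "{=format}" marker, e.g. "{=html}" or "{=latex}".
RAW_CAPTION = re.compile(r"^\{=([A-Za-z0-9_+-]+)\}")


def raw_format(caption):
    """Return the raw format named at the start of the caption, or None."""
    match = RAW_CAPTION.match(caption)
    return match.group(1) if match else None


print(raw_format("{=html} site banner"))  # html
print(raw_format("Listing 1"))            # None
```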

### Mermaid Fenced Code Blocks

Adds support for generating mermaid diagrams from codeblocks with the "mermaid" language, as supported in the Notion UI.

This plugin assumes that the `mmdc` mermaid commandline tool is available, and will throw an exception if it is not.

If there are errors in the mermaid syntax, the block is treated as a normal code block and a warning is logged.
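
The two behaviors above (failing when `mmdc` is missing, and falling back to a plain code block on a syntax error) might look roughly like the sketch below. The function names, the exception types, and the `render` callable are all hypothetical; the plugin's real code may be structured differently.

```python
import logging
import shutil

logger = logging.getLogger(__name__)


def require_tool(name="mmdc"):
    """Return the tool's path, raising if it is not on the PATH."""
    path = shutil.which(name)
    if path is None:
        raise FileNotFoundError(f"required command-line tool not found: {name}")
    return path


def diagram_or_code(source, render):
    """Render a diagram, or keep the source as a code block on failure.

    `render` is a hypothetical callable that raises ValueError when the
    mermaid syntax is invalid.
    """
    try:
        return ("diagram", render(source))
    except ValueError as err:
        logger.warning("mermaid rendering failed, keeping code block: %s", err)
        return ("code", source)
```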

## Architecture

N2y's architecture is divided into four main steps:
@@ -134,6 +169,38 @@ Here are some features we're planning to add in the future:

## Changelog

### v0.4.2

- Sanitize filenames (so that a Notion page called "HFE/UE Report" won't attempt to create a directory).
- Remove styling that is tracked in Notion but is not visible in their UI, so as
to avoid generating confusing output. In particular, remove styling from page
titles and bolding for header blocks.
- Ignore unsupported blocks (printing warnings and links).
- Fix issue where images with the same name would collide with each other
- Add a mermaid diagram plugin
- Make page and database mentions more efficient; fix bug related to circular references with page mentions
- Fix pagination bug that occurred with databases with more than 100 pages
- Make it easier to use multiple plugins for the same class


### v0.4.1

- Add the ability to customize where database page content is stored
(including providing the option not to export the content).
- Add support for the FileBlock
- Add `n2y.plugins.removecallouts` plugin
- Fix a bug that would occur if you had nested paragraphs or callout blocks
- Drop Notion code highlighting language if it's not supported
- Ignore table of contents, breadcrumb, template, and unsupported blocks

### v0.4.0

- Split out the various rich_text and mention types into their own classes
- Add plugin support for all notion classes
- Improve error handling when the pandoc conversion fails
- Add a builtin "deep header" plugin which makes it possible to use h4 and h5
headers in Notion

### v0.3.0

- Add support for exporting sets of linked YAML files