Fetching latest commit…
Cannot retrieve the latest commit at this time.
Permalink
..
Failed to load latest commit information.
CODE_OF_CONDUCT.md
README.md
example.md

README.md

Documentation

This project started as a Python port of the draftjs_exporter ruby library.

Other Draft.js exporter implementations

Full list on https://github.com/nikgraf/awesome-draft-js

Exporter behavior

  • HTML attributes are added in alphabetical order.
  • unstyled blocks without text render as an empty element.
  • Inline ranges aiming the same offset and length are always rendered in the same order (alphabetical order of the style type - BOLD, CODE, ITALIC).
  • style prop is rendered as-is if it is a string, or can also be a dict in which case its properties are converted into a string using camel_to_dash.
  • Invalid attributes are left for the BeautifulSoup / html5lib parser to handle.
  • HTML escaping is automatically done by BeautifulSoup / html5lib.
  • The string engine escapes single quotes.

R&D notes

Useful resources

Draft.js

Generating HTML in Python

Limitations of lxml

  • Succinct documentation (http://lxml.de/tutorial.html is the best)
  • XML parser first, not really meant to build HTML.
  • No support for document fragments.
  • No support for text nodes (elt.text and elt.tail attributes instead).

Limitations of html5lib

  • Generated HTML is always "made valid" by being wrapped in ''.

Limitations of BeautifulSoup4

  • The API to parse/generate HTML is clunky.

Constraints (to be weighted)

  • Keeping the coupling between the CMS and the editor low so that we can change the editor's capabilities without having to write back-end code
  • Giving the CMS enough control over the generated HTML. Not sure if "the CMS" just means "back-end" or if some JS code next to the editor code counts as well.
  • Maintaining this over time with changing Draft.js output formats and unmaintained or wildly changing AST formats
  • Regardless of implementation, being on the right spot between: "over the lifetime of the site/CMS, a given piece of content should always render the same HTML" and "a piece of content should always render the best HTML"
  • Keeping the coupling between CMS and editor low so that we can swap editors easily (bigger scope than the one of a single site/project here)
  • Favour logical completeness over raw speed. A regex parser will be faster than building a DOM. But caching makes that mostly irrelevant. The bottlenecks will all likely be in the DB lookups for inline entities anyway.

Other approaches to exporting from Draft.js

We have so far discussed a few different strategies to solve the rendering problem. Below are some of our notes.

We could create the HTML in JS when you save it in draft, then apply a similar strategy to Wagtail’s Rich Text, by iterating over the output with regexes? Still get the benefits of a much improved quality of editor and we can make use of existing JS-based draft exporters.

  • contentState to JSON, Python doing JSON to HTML, with this contentState -> HTML exporter
  • contentState to AST to JSON, with AST to HTML in Python
  • contentState to AST to HTML, with extra Python regexes
  • contentState to HTML, with extra Python regexes

and it's not just contentState but contentState + some config

Support

Python support

Python versions as defined in setup.py classifiers.

Which version combinations to include in Travis test matrix?

All supported Python versions should be tested.

Each version should be tested with the lower and upper bounds of supported version ranges for all dependencies.

Troubleshooting

Install

pip install draftjs_exporter
# [...]
# *********************************************************************************
# Could not find function xmlCheckVersion in library libxml2. Is libxml2 installed?
#*********************************************************************************

Solution: see http://stackoverflow.com/a/6504860/1798491

apt-get install libxml2-dev libxslt1-dev python-dev