stencila/convert : Stencila converters
Stencila Converters allow you to convert between a range of formats commonly used for "executable documents" (those containing some type of source code or calculation).
These converters are available as a sub-command in the Stencila CLI (command line tool) e.g.
stencila convert document.md document.jats.xml. The Stencila CLI is a standalone binary and is the easiest way to use these converters.
If you want to go ahead and install the converters separately you'll need Node.js and
npm. Once you have them set up...
- on Mac OS X and Linux, open the terminal,
- on Windows, start the Command Prompt or PowerShell,
and type:
npm install stencila-convert -g
Many of the text document converters rely on a recent version of Pandoc. This package will use an existing installation of Pandoc if it is new enough. If not, it will automatically download the required Pandoc version to the Stencila directory in your home folder. See
pandoc.json for the necessary Pandoc version and download URLs. At times it may be necessary to use our custom Pandoc build available at https://github.com/stencila/pandoc/releases.
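The "new enough" check described above can be pictured as a comparison of dotted version strings. The following is only a minimal sketch under that assumption: the function name `isNewEnough` and the version numbers are hypothetical, and the package may implement the check differently. The actual required version is the one listed in pandoc.json.

```javascript
// Hypothetical helper, not taken from the package: compares dotted version
// strings such as "2.1.3" component by component.
function isNewEnough (installed, required) {
  const a = installed.split('.').map(Number)
  const b = required.split('.').map(Number)
  const length = Math.max(a.length, b.length)
  for (let i = 0; i < length; i++) {
    const x = a[i] || 0 // missing components count as 0, so "2.1" equals "2.1.0"
    const y = b[i] || 0
    if (x > y) return true
    if (x < y) return false
  }
  return true // all components are equal
}

// Made-up version numbers, for illustration only
console.log(isNewEnough('2.1.3', '2.0.5')) // true
console.log(isNewEnough('1.19.2', '2.0.5')) // false
```

Comparing numerically per component matters here: a plain string comparison would wrongly rank "1.19" above "1.2".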
stencila-convert document.md document.jats.xml
API documentation is available at https://stencila.github.io/convert.
The following table lists the status of converters that have been developed, are in development, or are being considered for development. We'll be developing converters based on demand from users. So if you'd like to see a converter for your favorite format, look at the listed issues and comment under the relevant one. If there is no issue regarding the converter you need, create one.
When the converters have been better tested, the plan is to integrate them into Stencila Desktop as a menu item e.g.
Save as... > Jupyter Notebook
|YAML front matter for CSV (CSVY)||#25||#26|
|Tabular Data Package|
|Open Document Spreadsheet|
Clone the repository and install a development environment (again, you need Node.js to do it):
git clone https://github.com/stencila/convert.git
cd convert
npm install
Check how to contribute back to the project. All PRs are most welcome! Thank you!
Run the test suite:
npm test # or make test
Or, run a single test file:
To get coverage statistics:
npm run cover # or make cover
Or, manually test conversion using the bin script on test cases:
./bin/stencila-convert.js tests/fixtures/paragraphs.md temp.pdf
There's also a
Makefile if you prefer to run tasks that way e.g.
make lint cover check
You can also test using the Docker image for a self-contained, host-independent test environment:
docker build --tag stencila/convert .
docker run stencila/convert
The tests are currently doing a "round trip" conversion. That is, a test case is:
- Converted to a temporary file in pandoc JSON format.
- The temporary file in pandoc JSON format is converted into a [pandoc document](https://pandoc.org/using-the-pandoc-api.html) (pandoc's intermediate representation of the document, the AST).
- The pandoc document is converted into an executable document.
- The executable document is converted into a pandoc document in JSON format (compare Stencila API).
- The pandoc document is converted into a temporary file in pandoc JSON format.
- The temporary file in pandoc JSON format is converted into a file in the same format as the test case (with -out added to the original name).
- The test case (input file) is then compared with the result of the round trip conversion. For example, input.md is compared with input-out.md, input.ipynb with input-out.ipynb, and so on.
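The naming rule in the last two steps can be sketched in a few lines. This is an illustration only: `roundTripName` is a hypothetical helper, not part of the package, and it assumes that -out goes before the first dot of the file name and that paths use forward slashes.

```javascript
// Hypothetical helper, not part of the package: builds the name of the
// round trip result by adding "-out" before the first dot of the file name.
// Assumes forward slashes as path separators.
function roundTripName (file) {
  const slash = file.lastIndexOf('/')
  const dir = file.slice(0, slash + 1) // '' when there is no directory part
  const base = file.slice(slash + 1)
  const dot = base.indexOf('.')
  const stem = dot === -1 ? base : base.slice(0, dot)
  const ext = dot === -1 ? '' : base.slice(dot)
  return dir + stem + '-out' + ext
}

console.log(roundTripName('input.md')) // input-out.md
console.log(roundTripName('input.ipynb')) // input-out.ipynb
```

Splitting at the first dot rather than the last keeps compound extensions such as .jats.xml together. Whether the test suite does the same is an assumption.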
To look up the details of the above conversion steps, see the
Test case alternatives
"Round trip" testing allows for testing conversions of alternatives for input formats. For example, Atx-style headers (tagged with #) in a Markdown file should be treated the same as Setext-style headers (underlined with ---- below the header text) during conversion. That is,
# Atx-style Header example
is converted into the same output as the one below:
Setext-style Header example
-------------
To make testing more succinct we have a convention of extending test case file names with "-alternative" (dash alternative), for example,
heading-setext.md. Since the headings should be converted the same way,
heading-setext-out.md is compared to
(Another way to think about this approach is to treat the conversion as
standardising the input formats to pandoc Reader / Writer options.)
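One way to picture the "-alternative" convention is a helper that maps a round trip output file back to the base test case it would be checked against. This is a sketch under stated assumptions, not the test suite's actual logic: `baseCaseFor` is a hypothetical name, and the code assumes that base test case names contain no dashes, so that anything between the first dash and -out is the alternative label.

```javascript
// Hypothetical helper, not part of the package. Assumes base test case names
// contain no dashes, so everything from the first dash up to "-out" is the
// alternative label (for example "-setext").
function baseCaseFor (outFile) {
  const match = /^([^-.]+)(-[^.]*)?-out(\..+)$/.exec(outFile)
  if (!match) throw new Error('Not a round trip output file: ' + outFile)
  return match[1] + match[3] // base name plus the original extension
}

console.log(baseCaseFor('heading-setext-out.md')) // heading.md
console.log(baseCaseFor('heading-out.md')) // heading.md
```

Under these assumptions, both the plain test case and its alternative resolve to the same expected file, which is the point of the convention.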
Note: The above test regime primarily tests pandoc conventions for reading from and writing to different formats ([pandoc Reader and Writer options](https://pandoc.org/MANUAL.html#options)).
Create new test cases
You can create a new test case for a particular format by converting an existing test case from another format. For example, to create a nested lists test case for JATS, you could use the existing test case for Markdown:
./bin/stencila-convert.js tests/fixtures/list_nested.md tests/fixtures/list_nested.jats.xml