↔️ Codecs for executable document formats
## [0.59.1](v0.59.0...v0.59.1) (2019-07-19)

### Bug Fixes

* **HTML:** Move slugger reset to the right place ([a845948](a845948))
* **HTTP utility:** Use synchronous cache ([bbad587](bbad587))

Encoda


Encoda allows you to convert between a range of formats commonly used for "executable documents" (those containing some type of source code or calculation).

Formats

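| Format | Codec | Approach | Status | Issues | Coverage |
| --- | --- | --- | --- | --- | --- |
| Text | | | | | |
| Plain text | txt | None | β | | |
| Markdown | md | Extens. | α | | |
| LaTeX | latex | - | α | | |
| Microsoft Word | docx | rPNG | α | | |
| Google Docs | gdoc | rPNG | α | | |
| Open Document Text | odt | rPNG | α | | |
| HTML | html | Extens. | α | | |
| Journal Archiving Tag Set | jats | Extens. | α | | |
| Portable Document Format | pdf | rPNG | α | | |
| Notebooks | | | | | |
| Jupyter | ipynb | Native | α | | |
| RMarkdown | xmd | Native | α | | |
| Presentations | | | | | |
| Microsoft Powerpoint | pptx | rPNG | | | |
| Demo Magic | dmagic | Native | β | | |
| Spreadsheets | | | | | |
| Microsoft Excel | xlsx | Formula | β | | |
| Google Sheets | gsheet | Formula | | | |
| Open Document Spreadsheet | ods | Formula | β | | |
| Tabular data | | | | | |
| CSV | csv | None | β | | |
| CSVY | csvy | None | | | |
| Tabular Data Package | tdp | None | β | | |
| Collections | | | | | |
| Document Archive | dar | Extens. | ω | | |
| Filesystem Directory | dir | Extens. | ω | | |
| Data interchange, other | | | | | |
| JSON | json | Native | | | |
| JSON5 | json5 | Native | | | |
| YAML | yaml | Native | | | |
| Pandoc | pandoc | Native | β | | |
| Reproducible PNG | rpng | Native | β | | |
| Transport | | | | | |
| HTTP | http | | | | |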

Key

Approach... How executable nodes (e.g. `CodeChunk` and `CodeExpr` nodes) are represented
  • Native: the format natively supports executable nodes
  • Extens.: executable nodes are supported via extensions to the format e.g. in HTML and DAR, a CodeChunk is represented using a <stencila-chunk> element
  • rPNG: executable nodes are supported via reproducible PNG images inserted into the document
  • Formula: executable CodeExpr nodes are represented using formulae
Status...
  • ✗: Not yet implemented
  • ω: Work in progress
  • α: Alpha, initial implementation
  • β: Beta, ready for user testing
  • ✔: Ready for production use
Issues... Link to open issues and PRs for the format (please check there before submitting a new issue 🙏)

If you'd like to see a converter for your favorite format, look at the listed issues and comment under the relevant one. If there is no issue regarding the converter you need, create one.

Install

The easiest way to use Encoda is to install the stencila command line tool. Encoda powers stencila convert, and other commands, in that CLI. However, the version of Encoda in stencila can lag behind the version in this repo, so if you want the latest functionality, install Encoda as a Node.js package:

npm install @stencila/encoda --global

Use

Encoda is intended to be used primarily as a library for other applications. However, it comes with a simple command line script which allows you to use the convert function directly e.g.

encoda convert notebook.ipynb notebook.docx

Encoda will determine the input and output formats based on the file extensions. You can override these using the --from and --to options, e.g.

encoda convert notebook.ipynb notebook.xml --to jats
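
The rule described above (use the file extension unless an explicit format is given) can be sketched roughly as follows. This is only an illustration: `resolveFormat` is a hypothetical helper written for this example, not a function exported by Encoda, and the real lookup may handle more cases, such as aliases or content sniffing.

```typescript
// Illustrative only: a hypothetical helper, not part of Encoda's API.
function resolveFormat(filePath: string, override?: string): string | undefined {
  // An explicit override (as with --from or --to) takes priority
  if (override !== undefined && override !== '') return override.toLowerCase()
  // Otherwise fall back to the extension of the file name
  const segments = filePath.split(/[\\/]/)
  const base = segments[segments.length - 1] || ''
  const dot = base.lastIndexOf('.')
  // No dot, or only a leading dot (e.g. ".nvmrc"), means no usable extension
  return dot > 0 ? base.slice(dot + 1).toLowerCase() : undefined
}

console.log(resolveFormat('notebook.ipynb'))       // ipynb
console.log(resolveFormat('notebook.xml', 'jats')) // jats
```

When neither an override nor an extension is available the sketch returns `undefined`, which is why a format option is needed when there is no file name to inspect.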

You can decode an entire directory into a Collection. Encoda will traverse the directory, including subdirectories, decoding each file matching your glob pattern. You can then encode the Collection using the dir codec into a tree of HTML files e.g.

encoda convert myproject myproject-published --to dir --pattern '**/*.{rmd,csv}'
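
To see what a brace group in such a pattern stands for, here is a minimal sketch of single-level brace expansion. `expandBraces` is a hypothetical function written for this example; it is not how Encoda or any particular glob library implements matching, and it does not handle nested braces.

```typescript
// Illustrative only: expands non-nested {a,b} groups into separate patterns.
function expandBraces(pattern: string): string[] {
  const match = /\{([^{}]*)\}/.exec(pattern)
  if (match === null) return [pattern]
  const before = pattern.slice(0, match.index)
  const after = pattern.slice(match.index + match[0].length)
  const results: string[] = []
  // Substitute each alternative, then recurse to expand any later groups
  for (const option of match[1].split(',')) {
    for (const expanded of expandBraces(before + option + after)) {
      results.push(expanded)
    }
  }
  return results
}

console.log(expandBraces('**/*.{rmd,csv}')) // [ '**/*.rmd', '**/*.csv' ]
```

In this sketch, whitespace inside the braces is kept as part of the alternative, so it is safest to write the alternatives without spaces.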

You can also read content from the first argument. In that case, you'll need to specify the --from format e.g.

encoda convert "{type: 'Paragraph', content: ['Hello world!']}" --from json5 paragraph.md

You can send output to the console by using - as the second argument and specifying the --to format e.g.

encoda convert paragraph.md - --to yaml

| Option | Description |
| --- | --- |
| --from | The format of the input content e.g. --from md |
| --to | The format for the output content e.g. --to html |
| --theme | The theme for the output (only applies to HTML, PDF and RPNG output) e.g. --theme eLife |
| --standalone | Generate a standalone document, not a fragment (default true) |
| --bundle | Bundle all assets (e.g. images, CSS and JS) into the document (default false) |
| --debug | Print debugging information |
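
The defaults listed above can be pictured as an options object merged with whatever the caller supplies. The sketch below is an assumption-laden illustration: the `ConvertOptions` interface and `withDefaults` function are hypothetical names for this example, and the `debug: false` default is assumed since the table does not state one.

```typescript
// Illustrative only: hypothetical types, not Encoda's actual option handling.
interface ConvertOptions {
  from?: string
  to?: string
  theme?: string
  standalone: boolean
  bundle: boolean
  debug: boolean
}

// standalone and bundle defaults come from the table; debug is an assumption
const defaultOptions: ConvertOptions = { standalone: true, bundle: false, debug: false }

function withDefaults(options: Partial<ConvertOptions>): ConvertOptions {
  // Later spreads win, so supplied values override the defaults
  return { ...defaultOptions, ...options }
}

const options = withDefaults({ to: 'html', bundle: true })
console.log(options.standalone, options.bundle) // true true
```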

Documentation

Self-hoisted (documentation converted from various formats to html) and API documentation (generated from source code) is available at: https://stencila.github.io/encoda.

Develop

Check how to contribute back to the project. All PRs are most welcome! Thank you!

Clone the repository and install a development environment:

git clone https://github.com/stencila/encoda.git
cd encoda
npm install

Run the test suite:

npm test

Or, run a single test file:

npx jest tests/xlsx.test.ts

To get coverage statistics:

npm run cover

Or, manually test conversion using ts-node and the cli.ts script:

npm run cli -- convert simple.md simple.html

There is a bash script to make that a little shorter and more like real life usage:

./encoda convert simple.md simple.html

If that is a bit slow, compile the TypeScript to JavaScript first and use node directly:

npm run build
node dist/cli convert simple.md simple.html

There's also a Makefile if you prefer to run tasks that way e.g.

make lint cover check

You can also test using the Docker image for a self-contained, host-independent test environment:

docker build --tag stencila/encoda .
docker run stencila/encoda

Contribute

We 💕 contributions! All contributions: ideas 🤔, examples 💡, bug reports 🐛, documentation 📖, code 💻, questions 💬. See CONTRIBUTING.md for more on where to start. You can also provide your feedback on the Community Forum and Gitter channel.

Contributors

  • Aleksandra Pawlik 💻 📖 🐛
  • Nokome Bentley 💻 📖 🐛
  • Jacqueline 📖 🎨
  • Hamish Mackenzie 💻 📖
  • Alex Ketch 💻 📖 🎨
  • Ben Shaw 💻 🐛
  • Phil Neff 🐛
  • Raniere Silva 📖

Add a contributor...

To add yourself, or someone else, to the above list, either:

  1. Ask the @all-contributors bot to do it for you by commenting on an issue or PR like this:

    @all-contributors please add @octocat for bugs, tests and code

  2. Use the all-contributors CLI to do it yourself:

    npx all-contributors add octocat bug,test,code

See the list of contribution types.

Acknowledgments

Encoda relies on many awesome open source tools (see package.json for the complete list). We are grateful ❤ to their developers and contributors for all their time and energy. In particular, these tools do a lot of the heavy lifting 💪 under the hood.

| Tool | Use |
| --- | --- |
| Ajv | Ajv is "the fastest JSON Schema validator for Node.js and browser". Ajv is not only fast, it also has an impressive breadth of functionality. We use Ajv for the validate() and coerce() functions to ensure that ingested data is valid against the Stencila schema. |
| Frictionless Data | datapackage-js from the team at Frictionless Data is a JavaScript library for working with Data Packages. It does a lot of the work in converting between Tabular Data Packages and Stencila Datatables. |
| Pandoc | Pandoc is a "universal document converter". It's able to convert between an impressive number of formats for textual documents. Our TypeScript definitions for Pandoc's AST allow us to leverage this functionality from within Node.js while maintaining type safety. Pandoc powers our converters for Word, JATS and LaTeX. We have contributed to Pandoc, including developing its JATS reader. |
| Puppeteer | Puppeteer is a Node library which provides a high-level API to control Chrome. We use it to take screenshots of HTML snippets as part of generating rPNGs and we plan to use it for generating PDFs. |
| Remark | Remark is an ecosystem of plugins for processing Markdown. It's part of the unified framework for processing text with syntax trees, a similar approach to Pandoc but in JavaScript. We use Remark as our Markdown parser because of its extensibility. |
| SheetJs | SheetJs is a JavaScript library for parsing and writing various spreadsheet formats. We use their community edition to power converters for CSV, Excel, and Open Document Spreadsheet formats. They also have a pro version if you need extra support and functionality. |

Many thanks ❤ to the Alfred P. Sloan Foundation and eLife for funding development of this tool.
