Serlo-Markup-Prototype

A prototype for a markup language on serlo.org.

Idea

In order to develop a markup language for serlo.org, we construct a prototype converter between the Serlo API and the markup language. This allows for testing the markup language with real contents by converting them back and forth. The conversion from markup language to the API will be used to display the edited contents.

To implement these goals, the following tools will be written: A library which handles the conversion and a few command line tools for interfacing the library.

Currently, the idea is to use AsciiDoc as the underlying markup language. This is done for two main reasons: Firstly it is a language which naturally allows for content type extensions. This allows the vanilla use of an already defined markup language. Secondly this vanilla use of a markup language gives rise to the hope that some (external) editors already provide syntax highlighting for the markup language.

Executables

Using the command line, one can convert an article from serlo.org to AsciiDoc via the executable serlo2adoc. You can find a manpage for it int the folder doc. You can use cabal to build and install the library and executable. To get an impression, on what the programm can convert, you can look at the test article which roughly contains everything which can be converted.

Installation instructions are provided in INSTALL.adoc.

For downloading articles one can use the helper python scripts in the folder helper. They take as argument an article ID and print the article’s contents to the standard output. The script fetch-article.py only fetches the article without extracting it’s contents from the API request. This contains more metadata. The script fetch-article-content.py only outputs the article’s contents. When you redirect the output from the latter script, you can feed the file directly into serlo2adoc to see a conversion of the article to AsciiDoc.

State of Implementation

The conversion is done in a three step process: First there is a conversion from the JSON created by the Serlo API to an internal representation for the Serlo content. In the second step this internal representation is converted to an internal representation of an AsciiDoc document. The third step consists of converting the internal AsciiDoc representation to the actual AsciiDoc document. For the conversion in the other direction, both steps get reversed. Hence the implemtation consists of two parts: A translation between the Serlo API and the internal model and a translation between the internal model and the markup language. Graphically:

Serlo API JSON <--> internal Serlo content model <--> internal AsciiDoc model <--> AsciiDoc

Translation between API JSON and internal model

The Serlo Editor constists of a collection of plugins. These plugins each have a particular state, which determines their content. To track progress it is therefore useful to collect the plugins with which the translator can deal with. These are:

text (partially, for details see below)
injection (only from JSON to internal model)
rows
spoiler
image (only from JSON to internal model)
important (only from JSON to internal model)

The text plugin itself is complicated enough to deserve its own treatment: The text plugin itself has a substructure which denotes the text formatting. Currently supported are

bold/italic text
headings
normal text
paragraphs
maths
unordered lists

AsciiDoc parsing and printing

The internal model to represent an AsciiDoc document is still under heavy development. Regarding the conversion, there are two directions to cover: The printing, that is converting the internal model to an actual document, and the parsing, that is converting the actual document to the internal model. On the printing side, the following parts are working:

document header
sections
paragraph content (at least some parts)

The printable parts of the paragraph content are

plain text
bold/italic/highlighted text
subscript and superscript
maths
inline macro calls

Some representation for maths contents does exist, but it cant be rendered right now because it is unclear how to render it.

The parser is currently only capable of parsing the header and some blocks. These are

section
paragraph (for details see below)

Not all syntax elements of a paragraph are parseable right now. Currently the parser recognises the following syntax elements inside a paragraph

plain text (anything which is never a special char)
hard line breaks
end of a paragraph
macro calls
unconstrained bold/italic/monospace/highlighted text
sub- and superscript text
inline maths (asciimath and latex)

The unconstrained bold/italic/monospace implementation deviates from the specification by allowing stacking the formatting marks in any order.

Mapping between the internal models

There need to be two conversions for the internal model. Right now, there is a partial implementation to convert the Serlo internal model to the AsciiDoc internal model. In that, the following can be converted to AsciiDoc

rows
paragraphs (partially, see below)
spoilers
injections
images
important

For paragraphs, the following parts can be converted

plain text
bold/italic text
maths

Caveats

The Serlo editor encoding is not one of the best documented pieces of software. Nevertheless the source code gives understandable state descriptions for some plugins. Sadly, the text plugin, which is the bread and butter plugin for article creation, is so complicated, that the editor code itself is not a good documentation for the underlying data structure. Hence the API-side implementation of the text plugin is highly explorative.

Similar caveats concern the overall code quality. It is explorative code which probably ignores a bunch of best practices.

Name		Name	Last commit message	Last commit date
Latest commit History 53 Commits
doc		doc
executables		executables
helper		helper
src		src
.gitignore		.gitignore
INSTALL.adoc		INSTALL.adoc
LICENSE		LICENSE
README.adoc		README.adoc
Setup.hs		Setup.hs
serlo-markup.cabal		serlo-markup.cabal

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

Serlo-Markup-Prototype

Idea

Executables

State of Implementation

Translation between API JSON and internal model

AsciiDoc parsing and printing

Mapping between the internal models

Caveats

About

Releases

Packages

Languages

License

gruenerBogen/Serlo-Markup-Prototype

Folders and files

Latest commit

History

Repository files navigation

Serlo-Markup-Prototype

Idea

Executables

State of Implementation

Translation between API JSON and internal model

AsciiDoc parsing and printing

Mapping between the internal models

Caveats

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages