
Openwords Problem Markup Language v0.2

Marc-Bogonovich edited this page Nov 23, 2018 · 88 revisions


OWML is a straightforward, simple, yet extensive markup language for specifying the language-learning problems developed by Openwords. It follows the simplicity of Markdown and the elegance of Python: everything should be human-readable, human-editable, and pleasant to look at, with a minimal set of syntax keywords and characters.

Problem Structure

Each "problem" consists of four parts: Problem Type, Problem Lines, Problem Answers, and Problem Marplots (incorrect answers). The first two parts are compulsory for defining a problem; the last two are optional, depending on the kind of problem you want to make. Problem Type is currently just two letters preceded by a "=" symbol, used for identification. The other parts consist entirely of "Items", the atomic units for building a problem.

Structure Example

=fb This first line is the problem type.
*[A problem line]
#[An answer line] Answer lines are optional.
%[A marplot line][with two items] Marplot lines are optional.

Please note that we do not support empty lines between those data lines within a problem.
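As a rough illustration of the four-part structure, the following Python sketch groups the lines of one problem by their leading marker. It is not the Openwords parser; the names `Problem` and `split_problem` are hypothetical, and it assumes the type is exactly the two letters after "=".

```python
from dataclasses import dataclass, field

# Marker characters taken from the structure example above.
MARKERS = {"*": "problem", "#": "answer", "%": "marplot"}


@dataclass
class Problem:
    kind: str = ""                               # the two-letter Problem Type
    problem: list = field(default_factory=list)  # raw "*" lines
    answer: list = field(default_factory=list)   # raw "#" lines
    marplot: list = field(default_factory=list)  # raw "%" lines


def split_problem(text: str) -> Problem:
    """Group the lines of one problem by their leading marker."""
    result = Problem()
    for line in text.splitlines():
        if line.startswith("="):
            # Assumption: the type is the two letters right after "=".
            result.kind = line[1:3]
        elif line[:1] in MARKERS:
            getattr(result, MARKERS[line[0]]).append(line[1:])
        # Lines without a known marker are ignored in this sketch.
    return result


sample = (
    "=fb This first line is the problem type.\n"
    "*[A problem line]\n"
    "#[An answer line] Answer lines are optional.\n"
    "%[A marplot line][with two items] Marplot lines are optional."
)
parsed = split_problem(sample)
print(parsed.kind, len(parsed.problem), len(parsed.answer), len(parsed.marplot))
```

The sketch keeps each line's raw text after the marker, so item extraction can be handled as a separate step.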


An Item is displayable UTF-8 text (we may support other Unicode encodings in the future). It is defined by a pair of square brackets [ ]; everything inside the brackets is treated as raw text, so what you write is what you see in the problem.

For example, if you write a "[ ]" you'll see a space, if you write a "[😁]" you'll see a smiling face, if you write a "[a]" you will see an "a", if you write a "[好]" you will see a "好", if you write a "[مرحبا]" you will see a "مرحبا".

Blank Item

However, if you write "[]", which is a pair of brackets with nothing in it (literally!), this means something special in our syntax, specifically a "blank" for filling in an Answer item. This is one core concept of this format enabling complex, versatile and interactive language problems (you'll see some examples to get the idea).

Item Structure

An Item has no internal structure, because everything inside the brackets is raw text. When multiple items are joined into a line, however, there must be nothing between them.
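A minimal Python sketch of item extraction follows. The helper names `parse_items` and `is_blank` are hypothetical, and the sketch assumes that item text never contains square brackets, since no escaping mechanism is described here.

```python
import re

# Assumption: item text cannot itself contain "[" or "]".
ITEM = re.compile(r"\[([^\[\]]*)\]")


def parse_items(line: str) -> list:
    """Return the raw text of every item on a line, in order."""
    return ITEM.findall(line)


def is_blank(item: str) -> bool:
    """A blank is a pair of brackets with literally nothing inside."""
    return item == ""


items = parse_items("*[ ][😁][a][好][مرحبا][]")
print(items)
print([is_blank(item) for item in items])
```

Note that `[ ]` (a space) is ordinary text, while `[]` is the only form treated as a blank.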

Problem Lines (*)

One element of our philosophy is that this format should only conserve language problem "data": the information and logic, and nothing about the user interface. Problem designers are thus free from concerns about how their problems appear on any potential device (including real paper!), and developers/painters have some flexibility in how they represent the language data, especially when designers intentionally provide implicit rather than explicit data logic. We do not intend to "monopolize" the way people understand knowledge, especially this kind of language knowledge; we just present the facts in front of you, and you can comprehend them in any way that follows your instincts.

The main body of a problem is its problem lines. Each line starts with a "*" symbol to indicate that it is a problem line, followed by one or more items enclosed in square brackets [ ]. If a line is broken by a line break (not by spaces), the part after the break is not considered part of the line. For example:

*[Hello, ]

[how are you?]

The second part is not valid and will be omitted when parsing. The correct format is either a single line or two separate problem lines, as in the two alternatives here:

*[Hello, ][how are you?]


*[Hello, ]
*[how are you?]
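The line-break rule can be sketched in Python as a simple filter: only lines that begin with "*" count as problem lines, so a continuation without a marker is dropped. The function name `problem_lines` is hypothetical.

```python
def problem_lines(text: str) -> list:
    """Keep only lines starting with "*"; anything else is dropped."""
    return [line[1:] for line in text.splitlines() if line.startswith("*")]


broken = "*[Hello, ]\n\n[how are you?]"
one_line = "*[Hello, ][how are you?]"
two_lines = "*[Hello, ]\n*[how are you?]"

print(problem_lines(broken))     # the second part is lost
print(problem_lines(one_line))   # one problem line with two items
print(problem_lines(two_lines))  # two separate problem lines
```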

You can have any number of problem lines within one problem as you wish, and any number of items within one line as you wish. Human capability is the limit, but please consider that

one problem consists of all information that a learner should see during one instance of study.

The power of multiple problem lines is to let UI developers know those lines need to be displayed separately in a visual sense, such as either in different "paragraphs" or different "lines" or different "chunks". This is the only part that the problem designer can explicitly tell UI developers regarding the visual arrangement.

Better Itemization

You may notice that a space is manually written inside the first item in the above example. That is because we assume "nothing" between items: only the data within the brackets is acknowledged and parsed. If a writing system requires spaces between words, you need to design the spaces into your items carefully; otherwise our implementation will display no spaces between words.

We rank this property as one of the core foundations of this format, because the purpose of this format is to accommodate all historical and extant languages on this planet if possible.

Additionally, we want the format to be extensible for other purposes as well (such as Natural Language Processing), so each item text should be exactly what it is. Although our design belief requires manually placed spaces between words, anyone can feel free to make their own implementations (programs/software) to display the words and spaces according to specific languages, as we said before this format aims not to monopolize how people think, but aims only to capture the minimal essences of human language learning problems. One recommendation is that one can extend the Problem Type value to specify different language types, and so the display of word 'betweenness' can behave differently.

However, the details of problem items depend on what the problem designer is trying to do. One informational problem line can be very simple, such as:

*[Hello, how are you?]

but in other cases a problem designer may want some words to be blanks, or may need to attach other information to some words (explained in the Item Attachments section), so a fairly itemized line looks like this:

*[Hello][,][ ][how][ ][are][ ][you][?]
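To make the "nothing between items" rule concrete, here is a small Python sketch that renders a line by concatenating its item texts with no separator. The function name `render` is hypothetical, and a real UI is free to display items differently.

```python
import re


def render(line: str) -> str:
    """Concatenate item texts exactly as written, adding nothing between them."""
    items = re.findall(r"\[([^\[\]]*)\]", line)
    return "".join(items)


print(render("*[Hello][,][ ][how][ ][are][ ][you][?]"))
print(render("*[Hello][how]"))  # no space item, so no space is shown
```

Both the fully itemized line and the single-item line `*[Hello, how are you?]` render to the same text under this rule.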

Answer Lines (#)

Answer lines always come after the Problem lines and start with a "#" symbol; the rest of the line is just items, as in a problem line. However, there is one very important piece of implicit syntax:

Each Answer Line, in linear order, is matched to the Blank items in the linear order in which they are defined in the Problem lines.

For example, suppose we have defined 3 blanks in the problem lines (no matter how many lines you have). We then need to define 3 answer lines to provide correct answers to all 3 blanks in linear order: the 1st answer line is for the 1st blank item, the 2nd answer line is for the 2nd blank item, and so on. Please note that if you do not provide any correct answer for a blank item, the user will never be able to finish the problem, because how can you answer a question without saying anything (maybe you can...😓)! Another piece of implicit syntax is that each answer line can contain any number of items, which means a blank can have multiple correct answers! Which makes sense, right?!
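The matching rule can be sketched in Python as follows. The function name `match_answers` and the sample problem are hypothetical, and the sketch raises an error when a blank has no answer line, which is one possible way to handle that case.

```python
import re

ITEM = re.compile(r"\[([^\[\]]*)\]")


def match_answers(problem_lines, answer_lines):
    """Pair the n-th blank with the accepted answers on the n-th answer line."""
    blanks = []
    for line_no, line in enumerate(problem_lines):
        for item_no, text in enumerate(ITEM.findall(line)):
            if text == "":  # "[]" marks a blank
                blanks.append((line_no, item_no))
    if len(answer_lines) < len(blanks):
        raise ValueError("every blank needs its own answer line")
    return {
        blank: ITEM.findall(answers)
        for blank, answers in zip(blanks, answer_lines)
    }


# Hypothetical problem with two blanks spread over two problem lines.
matched = match_answers(
    ["*[I ][][ a cat.]", "*[You ][][ a dog.]"],
    ["#[am]", "#[are][were]"],
)
print(matched)
```

Each key is a (line index, item index) position of a blank, and each value lists every answer accepted for it.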

Marplot Line (%)

The Marplot line always comes after the Answer lines and starts with a "%" symbol. The Marplot line, like Answer lines and Problem lines, consists mainly of items.

However, we currently support only one Marplot line, which means you need to provide all marplot items for all your blanks within this one line.

You may write multiple Marplot lines, but we will still treat them as one line. Why? Because we believe a "problem" should be simple enough for a learner to comprehend as quickly as possible. If the blanks are unrelated to each other, the problem is better divided into several smaller problems; if the blanks are related, one marplot line fulfills its exact purpose (i.e. mingling all the marplot items together). Our format does want to accommodate complex problems, but that does not mean unnecessary complexity is welcome; one problem should not cost a learner half an hour!

Marplot is not a common English word, so it is worthwhile to provide a definition. A Marplot is a meddlesome person who interferes in the plans of others. In our code individual Marplot line items or marplots essentially function as incorrect answers.


The Free Dictionary

A problem designer may design a problem without any marplots, with one marplot, or with two or more marplots. The marplots will appear intermingled with correct answers in the user interface as options for problem blanks. See Example 1 below. A problem designer will generally find that there is no need for more than one Marplot line, because any number of marplot items (zero or more) can be placed in one Marplot line.
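The intermingling of correct answers and marplots can be sketched in Python like this. The function name `build_options` and its `seed` parameter are hypothetical, and shuffling is only one way a UI might mix the options.

```python
import random
import re

ITEM = re.compile(r"\[([^\[\]]*)\]")


def build_options(answer_lines, marplot_lines, seed=None):
    """Mix correct answers and marplots into one pool of options."""
    correct = [item for line in answer_lines for item in ITEM.findall(line)]
    # Several "%" lines are treated as if they were a single marplot line.
    marplots = [item for line in marplot_lines for item in ITEM.findall(line)]
    options = correct + marplots
    random.Random(seed).shuffle(options)  # seeded only to make runs repeatable
    return options


options = build_options(["#[am]"], ["%[is]", "%[are]"], seed=0)
print(options)
```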

How This Thing Works?

The simple idea behind this markup language strategy is that "designer digs a hole and provides options, then learner fills up the hole with a provided option". ......

Item Attachments

Item attachments can refer to external resources, both static output resources and dynamic input resources, such as audio, image, video, hyperlink, keyboard input, microphone input, and canvas drawing input. We therefore categorize attachments into two generic types: Output and Input. Output means the resources are provided by the problem designer and are presented to the learner statically during study. Input means the resources are generated by the learner dynamically during study.

Item attachments are specified with parentheses "()" and are placed as postfixes on items, with the general structure shown here. Please note that there should be nothing between the item's closing bracket and the attachment's opening parenthesis (and likewise nothing between multiple attachments).

*[item text](attachment)(more attachments)
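One possible way to read this structure is sketched in Python here. The function name `parse_line` is hypothetical, and the sketch assumes that attachment values contain no parentheses and item text contains no square brackets.

```python
import re

# One item followed directly by zero or more "(...)" attachments.
ITEM_WITH_ATTACHMENTS = re.compile(r"\[([^\[\]]*)\]((?:\([^()]*\))*)")


def parse_line(line: str) -> list:
    """Return (item text, [raw attachment strings]) for each item on a line."""
    parsed = []
    for text, tail in ITEM_WITH_ATTACHMENTS.findall(line):
        attachments = re.findall(r"\(([^()]*)\)", tail)
        parsed.append((text, attachments))
    return parsed


print(parse_line("*[item text](attachment)(more attachments)"))
print(parse_line("*[item text] (not attached)"))  # a gap breaks the attachment
```

Because the pattern requires each attachment to follow immediately, anything separated from the item by a space is not treated as an attachment.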

Attachment Structure

Item attachments may be the most complex part of our format, but they are less essential than Items, because anyone should be able to make a problem without any attachment. In layman's terms, an attachment is a set of values lined up in a queue: each value has its own seat, and no other value can occupy that seat even when it is empty. Two seats are reserved so far. The first seat is for the Attachment Type, and the second is for the Attachment URI.

In more technical terms, an attachment is a sequence of text values separated by a double colon "::" in linear order. If you do not specify a value you can leave it blank, but you still need to keep the colons. For example, if you want to specify a third value for your own use without a URI value, you need to write something like this:
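The seat layout can be sketched in Python with a hypothetical attachment string, `sound-out::::my-value`, whose URI seat is left empty. Both that string and the function name `split_attachment` are assumptions made for illustration, and the sketch assumes the values themselves never contain "::".

```python
def split_attachment(raw: str) -> dict:
    """Split the text inside one "(...)" attachment into its seats."""
    seats = raw.split("::")
    # Pad so the two reserved seats (type, URI) always exist.
    seats += [""] * (2 - len(seats))
    return {"type": seats[0], "uri": seats[1], "extra": seats[2:]}


# Hypothetical attachment: type given, URI seat empty, one custom third value.
print(split_attachment("sound-out::::my-value"))
print(split_attachment("sound-out::URL"))
print(split_attachment("type-in"))
```

The empty string in the URI seat shows why the colons must be kept: without them the third value would slide into the second seat.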


As we said, any developers can make their own parser/implementation to translate/explain the language data within our format, so we are open for the discussion of the future work of Attachment, but right now we just want to leave this open.

Attachment Type

Currently six types are reserved: sound-out, sound-in, image-out, image-in, link-out, type-in. Please note that the same attachment type may look or behave differently across implementations; we are, however, doing our best to finish a "de facto" reference implementation for these types. Anyone is welcome to suggest or implement their own attachment types.

Static Output Attachment:

  • sound-out

    Specifies an attachment that is a sound that the user can hear by tapping/clicking the item. A sound should be able to be attached to a problem item, an answer item or a marplot item.

  • image-out

    Specifies an attachment that is an image that the user can see. Because an image is a visual object, just like a piece of text, the item text is normally not displayed during study when an image is attached. In that case the item text serves for identification rather than display.

  • link-out

    Specifies an attachment that is a hyperlink that the user can click on, which is handled by the operating system of the user's device or by the application runtime environment. When a link is clicked, the application should take the user to an external web page or to an information interface.

Dynamic Input Attachment:

  • sound-in

    This attachment can only take effect on a problem item. It records one piece of sound from the microphone of the user's device. The technology to evaluate human pronunciation is not there yet, but we believe people are capable of comparing two sounds or pieces of speech by themselves.

  • image-in

    This attachment can only be placed on an answer item. It captures one image file provided by the user. We do not limit how the image file is created; our current implementation provides a drawing canvas on which the user can draw anything by hand and then save it as an image file.

  • type-in

    This attachment on answer items compels a learner to recall and type an answer rather than recognize an answer from an assortment of possible answers. This attachment only makes sense in the context of an answer item. A problem designer would not place a "(type-in)" attachment on a marplot item. Indeed, a problem where all blanks have type-in answers would not require a Marplot line at all.
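The split of the six reserved types into Output and Input can be sketched in Python as follows. The function name `direction` and the "custom" label for unreserved types are assumptions, not part of OWML.

```python
# The six reserved types, grouped as in the two lists above.
OUTPUT_TYPES = {"sound-out", "image-out", "link-out"}
INPUT_TYPES = {"sound-in", "image-in", "type-in"}


def direction(attachment_type: str) -> str:
    """Classify an attachment type as static output or dynamic input."""
    if attachment_type in OUTPUT_TYPES:
        return "output"
    if attachment_type in INPUT_TYPES:
        return "input"
    # Assumption: anything else is an implementation-specific type.
    return "custom"


for name in ["sound-out", "type-in", "video-out"]:
    print(name, direction(name))
```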

Attachment Exclusion

In OWML, any item can have attachments of any type and in any number; there is no "wrong" way to put an attachment behind an item. We believe OWML is not a language for constraining how language problems are represented, but rather one for encouraging the recording and gathering of all written information about language problems. We do, however, make choices in our own learning UI about which attachments are better presented at the current stage. These "choices" are described below. Again, please note that they are not enforced in OWML; we only enforce attachment exclusion in our own UI implementation.

types     | sound-out | sound-in | image-out | image-in | link-out | type-in
sound-out | both      | both     | both      | both     | both     | both
sound-in  |           | first    | both      | first    | ⬅️       | first
image-out |           |          | first     | ⬆️       | both     | ⬆️
image-in  |           |          |           | first    | ⬅️       | first
link-out  |           |          |           |          | first    | ⬆️
type-in   |           |          |           |          |          | first

Attachment URI

URI (Uniform Resource Identifier) ...

Attachment Overriding


Our Implicit Syntax


In this section we provide several examples of the OWML together with a GUI representation of the problem.

Example 1

Simple sentence construction problem.

*[I am a cat.]

Example 2

Simple sentence construction problem, requiring learner to type in correct answer. Notice that there are two possible correct answers and there are no marplots (incorrect answers).

*[I am a cat.]

Example 3

A fill in the blank problem designed to teach conjugation. Similar problems can be developed to teach/test other aspects of language morphology.

*[Je ][][ pour aller au travail.]

Example 4

This example is a simple vocabulary review problem. Notice the audio image/button. This image is connected to an audio file. There is a URL for both the image file and the audio file.

*[māo][ ](sound-out::URL)(image-out::URL)

Example 5

This is a simple audio review problem. An audio file is attached to the item in the problem line with the raw text "theAudioText", and the audio file is played when the learner taps the text. Audio files can be attached to answer items in the same way.


Example 6

This is a simple audio review problem. In this example, no text is displayed. The blank item simply acts as a placeholder rather than as a problem blank, and an audio file is "attached" to it. An image attachment is also attached to both the audio attachment and the placeholder; in a sense they are attached to each other. The audio file is played when the learner taps the image (an audio button image). The image file in this case is a default audio button image. Other images could have been referenced with the audio attachment.

*[ ](sound-out::URL)(image-out::URL)
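As a hedged sketch of how a UI might treat Examples 4 and 6, the following Python decides which item texts stay visible. It assumes the image-out behavior described under Attachment Type (an attached image replaces the item text); the function name `visible_text` is hypothetical, and the literal `URL` is the placeholder used in the examples.

```python
import re

ITEM = re.compile(r"\[([^\[\]]*)\]((?:\([^()]*\))*)")


def visible_text(line: str) -> list:
    """Return the text a UI might show for each item, or None if hidden."""
    shown = []
    for text, tail in ITEM.findall(line):
        # The first seat of each attachment is its type.
        types = [a.split("::")[0] for a in re.findall(r"\(([^()]*)\)", tail)]
        # Assumption: an image-out attachment replaces the item text on screen.
        shown.append(None if "image-out" in types else text)
    return shown


print(visible_text("*[māo][ ](sound-out::URL)(image-out::URL)"))  # Example 4
print(visible_text("*[ ](sound-out::URL)(image-out::URL)"))       # Example 6
```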


Syntax Glossary














Future Work


The Openwords Problem Markup Language (OWML) is licensed under terms of the Creative Commons Attribution-Share Alike 3.0 unported (CC-BY-SA 3.0) license.

Copyright (c) 2017 Openwords. Authors, Shenshen Han, Marc Bogonovich.
