Add documentation for the yaml tests
egli committed Aug 26, 2015
1 parent 565d72a commit c9f0ffe
Showing 1 changed file with 225 additions and 56 deletions.
281 changes: 225 additions & 56 deletions doc/liblouis.texi
@@ -1,6 +1,7 @@
\input texinfo
@c %**start of header
@setfilename liblouis.info
@documentencoding UTF-8
@include version.texi
@settitle Liblouis User's and Programmer's Manual

@@ -139,8 +140,9 @@ Testing Translation Tables interactively
Automated Testing of Translation Tables
* Translation Table Test Harness::
* Translation Table Doctests::
* YAML Tests::
* Test Harness::
* Doctests::
Programming with liblouis
@@ -1875,81 +1877,223 @@ There are a number of automated tests for liblouis and they are
proving to be of tremendous value. When changing the code the
developers can run the tests to see if anything broke.

For testing the translation tables there are basically two approaches:
there are the harness tests and the doctests. They were created at
roughly the same time using different technologies, have influenced
each other and have gone through improvements and technology changes.
For now they are both based on Python so you need to have that
installed. The philosophies of the two are slightly different:
The easiest way to test the translation tables is to write a YAML file
where you define the table that is to be tested and any number of
words or phrases to translate together with their respective expected
translation.

The YAML based tests replace the two older methods for testing the
translation tables, the harness tests and the doctests, which only
work with Python and when liblouis is compiled with UCS4. These older
methods are deprecated and will be removed in a future release.

@table @asis
@item YAML tests
The YAML tests are data driven, i.e. you give the test data: a string
to translate and the expected output. The data is in a standard
format, namely YAML. If you have @file{libyaml} installed, they will
automatically be invoked as part of the standard @command{make check}
command.


@item Harness tests
The harness tests are data driven, i.e. you give the test data: a
string to translate and the expected output. The data is in a standard
format, namely JSON. They work with both Python2 and Python3. However,
since the format is JSON, it is conceivable that somebody would write
some C code which takes the data in the harness file and runs it through
liblouis so they could also run without Python and without UCS4.
The harness tests are also data driven, like the YAML tests. However,
the data is given in JSON and is quite a bit more verbose than in the
YAML tests. They work with both Python2 and Python3 but not from plain
C, so you need Python and you will have to compile with UCS4.

@item Doctests
The doctests on the other hand are based on a technology used in Python
where you define your tests as if you were sitting at a terminal session
with a Python interpreter. So the tests look like you typed a command
and got some output, e.g.
The doctests are based on a technology used in Python where you define
your tests as if you were sitting at a terminal session with a Python
interpreter. Again, they only work with either Python2 or Python3 and
not from plain C, so you need Python and you will have to compile with
UCS4.
@end table

@menu
* YAML Tests::
* Test Harness::
* Doctests::
@end menu

@node YAML Tests
@section YAML Tests

@url{http://yaml.org/,YAML} is a human readable data serialization
format that allows for an easy and compact way to define tests.

A YAML file first defines which tables are to be used for the tests.
Then it optionally defines flags such as the @samp{testmode}. Finally
all the tests are defined.

Let's look at a simple example of how tests could be defined:

@example
>>> translate(['table.ctb'], "Hello", mode=compbrlLeftCursor)
("HELLO", [0,1,2,3], [0,1,2,3], 0)
# comments start with '#' anywhere on a line
# first define which tables will be used for your tests
tables: [unicode.dis, en-ueb-g1.ctb]
# then optionally define flags such as testmode. If no flags are
# defined forward translation is assumed
# now define the tests
tests:
- # each test is a list.
# The first item is the string to translate. Quoting of strings is
# optional
- hello
# The second item is the expected translation
- ⠓⠑⠇⠇⠕
- # optionally you can define additional parameters in a third
# item such as typeform or expected failure, etc
- Hello
- ⠨⠶⠠⠓⠑⠇⠇⠕⠨⠄
- @{typeform: '11110', xfail: true@}
- # a simple, no-frills test
- Good bye
- ⠠⠛⠕⠕⠙ ⠃⠽⠑
# same as above using "inline notation"
- [Good bye, ⠠⠛⠕⠕⠙ ⠃⠽⠑]
@end example

There is a convenience wrapper which hides away much of the complexity
of the above example so you can write things like
The three basic components of a test file are as follows:

@table @samp
@item tables
A list containing table names, which the tests should be run against.
This is usually just one table, but for some situations more than one
table can be required.

To test the @file{en-ueb-g1.ctb} table using unicode braille you could
use the following definition:

@example
>>> t.braille('the cat sat on the mat')
u'! cat sat on ! mat'
tables: [unicode.dis, en-ueb-g1.ctb]
@end example

If you wanted to test the @file{eo-g1.ctb} table using brf notation
then you would use the following definition:

@example
tables: [en-us-brf.dis, eo-g1.ctb]
@end example

But essentially you are writing code, so the doctests allow you to do
more flexible tests that are much closer to the raw iron. For technical
reasons the doctests will probably only ever work in either Python2 or
Python3 but not both and they will never run from C.
@item flags
The flags that apply for all tests in this file. At the moment only
the @samp{testmode} flag is supported. It can have three possible
values:

@table @samp
@item forward
This indicates that the tests are for forward translation.
@item backward
This indicates that the tests are for backward translation.
@item hyphenate
This indicates that the tests are for hyphenation.
@end table

To sum it up, the recommendation is that for normal table testing you
should use the test harness. It has a lot of momentum and the format
is a standard. If you want to be closer to the raw Python API of
liblouis, if you want to test some more intricate scenarios (involving
inpos, modes, etc) then the doctests are for you.
If no flags are defined forward translation is assumed.

@menu
* Translation Table Test Harness::
* Translation Table Doctests::
@end menu
@item tests
A list of tests. Each test consists of a list of two or three items.
The first item is the Unicode text to be tested. The second item is
the expected braille output. This can be either Unicode braille or an
ASCII-braille like encoding. Quoting strings is optional. Comments can
be inserted almost anywhere using the @samp{#} sign. A simple test
would look as follows:

@node Translation Table Test Harness
@section Translation Table Test Harness
@example
- # a simple, no-frills test
- Good bye
- ⠠⠛⠕⠕⠙ ⠃⠽⠑
@end example

Using the more compact "inline notation" it would look like the
following:

Each harness file is a simple UTF8 encoded json file, which has two entries.
@example
- [Good bye, ⠠⠛⠕⠕⠙ ⠃⠽⠑]
@end example

An optional third item can contain additional options for a test such
as the typeform, or whether a test is expected to fail. The following
shows a typical example:

@example
-
- Hello
- ⠨⠶⠠⠓⠑⠇⠇⠕⠨⠄
- @{typeform: '11110', xfail: true@}
# same test more compact
- [Hello, ⠨⠶⠠⠓⠑⠇⠇⠕⠨⠄, @{typeform: '11110', xfail: true@}]
@end example

The valid additional options for a test are as follows:

@table @samp
@item xfail
Whether a test is expected to fail. Valid options are @samp{true},
@samp{Y}, @samp{Yes} or @samp{ON}. Anything else is considered false.

@item typeform
The typeform used for a translation. The typeform is passed in the
form of a string.

@item cursorPos
A list of cursor positions for each input position. Useful when
simulating screenreader interaction, to debug contraction and cursor
behavior as in the following example from the @file{en-GB-g2.ctb}
test:

@example
-
- you went to
- ⠽ ⠺⠑⠝⠞ ⠞⠕
- @{mode: [compbrlAtCursor], cursorPos: [0,1,2,3,4,5,6,7,8,9,10]@}
@end example

@item mode
A list of translation modes that should be used for this test. If not
defined, it defaults to 0. Valid mode values are @samp{noContractions},
@samp{compbrlAtCursor}, @samp{dotsIO}, @samp{comp8Dots},
@samp{pass1Only}, @samp{compbrlLeftCursor}, @samp{otherTrans} or
@samp{ucBrl}.
@end table

@end table
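
As a rough sketch, the three components could be combined into one
complete test file as shown below. The tables and translations are
taken from the examples above. Writing the flags as a YAML mapping is
an assumption modeled on the per-test options, and since
@samp{forward} is the default, the @samp{flags} line could equally be
left out.

@example
# sketch of a complete test file; the flags mapping syntax is an assumption
tables: [unicode.dis, en-ueb-g1.ctb]
flags: @{testmode: forward@}
tests:
  - [hello, ⠓⠑⠇⠇⠕]
  - [Good bye, ⠠⠛⠕⠕⠙ ⠃⠽⠑]
  - [Hello, ⠨⠶⠠⠓⠑⠇⠇⠕⠨⠄, @{typeform: '11110', xfail: true@}]
@end example

The inline notation keeps each test on a single line, which tends to
be easier to scan once a file holds many tests.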

For more examples and inspiration please see the YAML tests
(@file{*.yaml}) in the @file{tests} directory of the source
distribution.

@node Test Harness
@section Test Harness

Each harness file is a simple UTF-8 encoded JSON file, which has two entries.
@table @code
@item tables
A list containing table names, which the tests should be run against.
This is usually just one table, but for some situations more than one table is required.
This is usually just one table, but for some situations more than one
table is required.
@item tests
A list of sections of tests, which should be processed independently.
Each test section is a dictionary of two items.
@item flags
The flags that apply for all the test cases in this section.
For example, they could all be forward translation tests, or they should all be run as computer braille tests.
The flags that apply for all the test cases in this section. For
example, they could all be forward translation tests, or they should
all be run as computer braille tests.

@item data
A list of test cases, each one containing the specific test data needed to perform a test.
A list of test cases, each one containing the specific test data
needed to perform a test.
@end table

These are the valid fields for the flags section:
@table @code
@item comment
A field describing the reason for the tests, the transformation rule or any useful info that might be needed in case the test breaks (optional).
A field describing the reason for the tests, the transformation rule
or any useful info that might be needed in case the test breaks
(optional).
@item cursorPos
The position of the cursor within the given text (optional). Useful
when simulating screenreader interaction, to debug contraction and
@@ -1958,11 +2102,13 @@ cursor behavior.
The liblouis translation mode that should be used for this test
(optional). If not defined defaults to 0.
@item outputUniBrl
For a forward translation test, the output should be in Unicode braille.
For a backward translation test, the input is in Unicode braille.
For a forward translation test, the output should be in Unicode
braille. For a backward translation test, the input is in Unicode
braille.
@item testmode
The optional testmode field can have three values: "translate" (default if undeclared), "backtranslate" or "hyphenate".
Declares what tests should be performed on the test data.
The optional testmode field can have three values: "translate"
(default if undeclared), "backtranslate" or "hyphenate". Declares what
tests should be performed on the test data.
@end table


@@ -1980,14 +2126,36 @@ The expected position of the braille cursor in the braille output
contraction and cursor behavior.
@end table

Variables defined in the flags section can be overridden by individual test cases, but if several tests need the same options, they should
ideally be split into their own section, complete with their own flags and data.
Variables defined in the flags section can be overridden by individual
test cases, but if several tests need the same options, they should
ideally be split into their own section, complete with their own flags
and data.

For examples please see @file{*_harness.txt} in the harness directory
in the source distribution.

@node Doctests
@section Doctests
A doctest looks like you typed a command at the Python command line
and got some output, e.g.

For examples please see @file{*_harness.txt} in the
harness directory in the source distribution.
@example
>>> translate(['table.ctb'], "Hello", mode=compbrlLeftCursor)
("HELLO", [0,1,2,3], [0,1,2,3], 0)
@end example

There is a convenience wrapper which hides away much of the complexity
of the above example so you can write things like

@example
>>> t.braille('the cat sat on the mat')
u'! cat sat on ! mat'
@end example

@node Translation Table Doctests
@section Translation Table Doctests
Essentially you are writing code, so the doctests allow you to do more
flexible tests that are much closer to the raw iron than any of the
other tests. However the doctests will only work in either Python2 or
Python3.

For examples on how to create doctests please see @file{*_test.txt} in
the doctest directory in the source distribution.
@@ -2683,4 +2851,5 @@ directory. Usage information is included in the Python module itself.
@c LocalWords: inlen compbrlAtCursor compbrlLeftCursor trantab stderr endian
@c LocalWords: tablelist fileName printindex deprecatedopcode setDataPath
@c LocalWords: getDataPath MathML suboperands logEnd liblouisutdml whitespace
@c LocalWords: xhhhh yhhhhh zhhhhhhhh OpenOffice
@c LocalWords: xhhhh yhhhhh zhhhhhhhh OpenOffice documentencoding
@c LocalWords: YAML Doctests JSON logLevels
