Note: this will be updated with lemontest-specific details; it is mostly there but missing details on some of the more specific lemontest features, such as containerisation configuration.
Refer to the thesis paper for details.
lemontest is a redesign and reorganisation of the existing Autotest written by Andrew Taylor et al., updated to exploit newer developments in software design: a modular architecture, containerisation to alleviate security concerns, and parallelisation of tests to alleviate performance concerns.
lemontest runs a series of tests on one or more programs, comparing their behaviour to specified expected behaviour.
lemontest focuses on producing output comprehensible to a novice programmer, perhaps in their first coding course.
Tests are typically specified in a single file, named tests.txt by default.
The lemontest syntax is designed to allow succinct, convenient specification of tests, e.g.:
files=is_prime.c
1 stdin="39" expected_stdout="39 is not prime\n"
2 stdin="42" expected_stdout="42 is not prime\n"
3 stdin="47" expected_stdout="47 is prime\n"
lemontest allows flexible specification of command-line arguments, so it can be comfortably used by novices who have little experience with command-line programs.
lemontest will typically be run via a wrapper shell script which specifies arguments and parameter values appropriate for a class, for example specifying the base directory to search for autotests, e.g.:

```sh
#!/bin/sh
parameters="
default_compilers = {'c' : [['clang', '-Werror', '-std=gnu11', '-g', '-lm']]}
upload_url = https://example.com/autotest.cgi
"
exec /usr/local/lemontest/lemontest.py --exercise_directory /home/class/activities --parameters "$parameters" "$@"
```

Students can then run the wrapper script, simply specifying the particular class exercise they wish to autotest, perhaps:

```bash
$ autotest.sh is_prime
```

Some command-line options are useful when developing test specifications, including:
-a AUTOTEST_DIRECTORY, --autotest_directory AUTOTEST_DIRECTORY specify directly the location of the autotest specification.
-D DIRECTORY, --directory DIRECTORY copy files in the specified directory to the test directory.
-g, --generate_expected_output generate expected output for the tests by executing the supplied files.
For example, this will update the test specification in the directory my_autotest using a model solution in my_solution:
$ lemontest.py --generate_expected_output=update --directory my_solution --autotest_directory my_autotest
A temporary container (effectively a hardened chroot jail) is created for each autotest, with the program and any supporting files centralised in a shared directory from which each test either links or copies files.
By default, any other files in the test specification directory are also copied to the shared directory (see the supplied_files_directory parameter),
but they are not necessarily available in the test container at the current time (optional_files currently needs to be set, but this is due to change soon).
By default, tests are executed in an environment stripped of most environment variables; this can be adjusted with test parameters.
By default, tests are executed with resource limits, which can be specified with test parameters. (This is subject to change once cgroups v2 libraries arrive for Python, or I get around to writing one myself.)
A test consists of a label and a set of parameter values.
Every test must have a unique label consisting of alphanumeric characters and underscores ([a-zA-Z0-9_]+).
The file is read sequentially, and when a test label is reached a test is created with the current values of the parameters.
An assignment to a parameter applies to all following tests, until a different value is assigned to that parameter.
The exception is that assignments on the same line as a test label apply only to that test. For example, in the following example the CPU limit for test1 is 5 seconds and the CPU limit for test2 is 10 seconds.
max_cpu_seconds=10
test1 max_cpu_seconds=5 command=./prime.py 41 expected_stdout="True\n"
test2 command=./prime.py 42 expected_stdout="False\n"
If a command is a single string, it is passed to a shell for evaluation.
A test label may be used multiple times to supply the values of different parameters for a test.
max_cpu_seconds=10
program=prime.py
test1 max_cpu_seconds=5 arguments=41 expected_stdout="True\n"
test2 arguments=42 expected_stdout="False\n"
Tests are specified by assigning values to parameters, for example:
max_cpu_seconds=10
Parameter names start with an alphabetic character and can contain alphanumeric characters and underscores ([a-zA-Z0-9_]+).
The values assigned to parameters can use Python syntax, including single-quotes, double-quotes, triple-quotes, raw strings and f-strings. Values can also be lists or dicts specified in Python syntax.
Triple-quoted strings, lists and dicts can span multiple lines.
Parameters specified on previous lines are available in f-strings. Parameters specified on the current line are not available in the evaluation of f-strings.
Assignment to a parameter name which is not a builtin parameter listed in the section below will produce an error, unless the parameter name begins with an underscore ('_'). Parameter names beginning with '_' can be given values for use in later f-strings.
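To illustrate, a hypothetical specification might define an underscore-prefixed parameter and reuse it in later f-strings (the name and values here are invented for illustration):

```
_n=42
test1 stdin=f"{_n}" expected_stdout=f"{_n} is not prime\n"
```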
For convenience, values can also be written as shell-like unquoted strings if they contain only non-whitespace ASCII and none of the characters `\ = [ ] { } " '`. So, for example, these are equivalent parameter assignments:
command=./a.out
command="./a.out"
Multiple unquoted strings are aggregated into a list so these are equivalent commands:
command=./a.out --print example.txt
command=['./a.out', '--print', 'example.txt']
Parameter values are coerced to an appropriate type if possible.
If a boolean is expected, values are converted to True or False following Python truthiness rules, so for example 0, '', [] and {} will all become False.
As an exception, strings whose first character is '0', 'f' or 'F' are also considered False.
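Under these rules, for example, each of the following assignments (illustrative values) would set `colorize_output` to False:

```
colorize_output=0
colorize_output=false
colorize_output=False
```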
files=prime.c
# specifying command-lines arguments
test1 arguments=41 expected_stdout="41 is prime.\n"
# specifying stdin
test2 stdin="42" expected_stdout="42 is not prime.\n"
# using files to specify stdin and expected_stdout
test3 stdin=['43.txt'] expected_stdout=['43_expected_output.txt']
# running a Shell command
test4 command="echo 44 | prime" expected_stdout="44 is not prime.\n"
# using two lines to specify a test, plus a triple-quote for a multi-line string
test5 arguments=45
test5 expected_stdout="""45 is not prime.
"""
# specify more flexibility in test acceptance
# by ignoring white space, some punctuation characters (",.!") and extra new lines
test6 ignore_whitespace=True ignore_blank_lines=True ignore_characters=",.!"
test6 arguments=46 expected_stdout="46 is not prime.\n"
# make the test succeed if the output has just the right digits
test7 arguments=47 compare_only_characters="0123456789" expected_stdout="47 is not prime.\n"
`program`
The name of the script/binary to run for this test. If `program` is not specified, it is inferred heuristically from `command`, if possible.

`arguments` = []
Command-line arguments for this test. Used only if `command` is not specified.

`command`
Command to run for this test. If `command` is a string, it is passed to a shell. If `command` is a list, it is executed directly. If `command` is not specified and `program` is specified, `command` is set to a list containing `program` with `arguments` appended. Otherwise `command` is inferred heuristically from the first filename specified by the parameter `files`, if possible.
`files`
Input files required to be supplied for a test. If `files` is not specified, it is set to the parameter `program` with `.c` appended iff `program` does not contain a '.'. For example, if `files` is not specified and `program` == `hello`, `files` will be set to `hello.c`, but if `program` == `hello.sh`, `files` will be set to `hello.sh`.

`optional_files` = []
Input files which may optionally be supplied for a test.

`supplied_files` = []
Helper files which may be supplied for a test.
NOTE: All helper files copied in via `supplied_files_directory` must still be declared here, or they won't be linked into each test's containerised directory.

`check_hash_bang_line` = True
Check that Perl, Python and shell scripts have an appropriate `#!` line.
`pre_compile_command`
If set, `pre_compile_command` is executed once before compilation. This is invisible to the user, unless `pre_compile_command` produces output. Compilation does not occur if `pre_compile_command` has a non-zero exit status. If `pre_compile_command` is a string, it is passed to a shell. If `pre_compile_command` is a list, it is executed directly.
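A hypothetical use, running make to generate a header file before compilation (the target name is invented for illustration):

```
pre_compile_command="make tables.h"
```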
`default_checkers` = {'js': [['node', '--check']], 'pl': [['perl', '-cw']], 'py': [['python3', '-B', '-m', 'py_compile']], 'sh': [['bash', '-n']]}
A dict which supplies a default value for the parameter `checkers`, based on the suffix of the first file specified by the parameter `files`.

`checkers`
List of checkers. Each checker is run once for each file supplied for a test, with the filename appended as an argument. Checkers are only run once for each file. If a checker is a string, it is run by passing it to a shell.
Deprecated: if the value is a string containing ':', a list is formed by splitting the string at the ':'s.
`default_compilers` = {'c': [[['dcc'], ['clang', '-Wall'], ['gcc', '-Wall']]], 'cc': [['g++', '-Wall']], 'java': [['javac']], 'rs': [['rustc']]}
A dict which supplies a default value for the parameter `compilers`, based on the suffix of the first file specified by the parameter `files`. If '%' is present in a list, it is replaced by `program`.

`compilers`
List of compilers plus arguments. `files` are compiled with each member of the list, and the tests are run once for each member of the list. For example, given:
# run all tests twice: once compiled with gcc -fsanitize=address, once with clang -fsanitize=memory
compilers = [['gcc', '-fsanitize=address'], ['clang', '-fsanitize=memory']]
An element of the list of compilers can itself be a list specifying alternative compilers. For example:
# use dcc if available, otherwise clang, otherwise gcc
compilers = [[['dcc'], ['clang', '-Wall'], ['gcc', '-Wall']]]
The first element of such a sub-list whose compiler can be found in PATH is used. If a compiler is a string, it is run by passing it to a shell.
Deprecated: if the value is a string containing ':', a list is formed by splitting the string at the ':'s.
`default_compiler_args` = {'c': [['-o', '%']], 'cc': [['-o', '%']]}
A dict which supplies a default value for the parameter `compiler_args`, based on the suffix of the first file specified by the parameter `files`. If '%' is present in a list, it is replaced by `program`.

`compiler_args` = []
List of argument strings added to every compilation. If '%' is present in a list, it is replaced by `program`.
`compile_commands`
List of compile commands. The tests are run once for each member of the list. If a command is a string, it is run by passing it to a shell.
`compile_commands` is not normally set directly. If not set, it is formed from `compilers`, `compiler_args` and `files`. In most cases, setting those parameters will be more appropriate.
`setup_command`
If set, `setup_command` is executed once before a test. This is invisible to the user, unless `setup_command` produces output. The test is not run if `setup_command` has a non-zero exit status. If `setup_command` is a string, it is passed to a shell. If `setup_command` is a list, it is executed directly.
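For example, a test needing an input file created immediately before it runs might use (the label, filenames and values are invented for illustration):

```
test_copy setup_command="cp 43.txt input.txt" arguments=input.txt expected_stdout="43 is prime.\n"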
`global_setup_command`
If set, `global_setup_command` is executed once, within the shared test directory, before all tests run and after all necessary files have been copied. This is invisible to the user, unless `global_setup_command` has a non-zero exit status. No tests are run if `global_setup_command` has a non-zero exit status. Any `global_clean_command` is aborted/not run if the exit status of `global_setup_command` is not zero or one. If `global_setup_command` is a string, it is passed to a shell. If `global_setup_command` is a list, it is executed directly.

`global_clean_command`
If set, `global_clean_command` is executed once, within the shared test directory, after all tests have run. This is invisible to the user, unless `global_clean_command` has a non-zero exit status. If `global_clean_command` is a string, it is passed to a shell. If `global_clean_command` is a list, it is executed directly.
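A hypothetical pairing that builds a shared data file before any test runs and removes it afterwards (the commands and filenames are invented for illustration):

```
global_setup_command="python3 generate_data.py > data.txt"
global_clean_command="rm -f data.txt"
```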
`supplied_files_directory`
If set to a non-empty string, any files in this directory are copied to the shared test directory before testing. This directory is also prepended to any relative file pathnames in test specifications. Its default value is the directory containing the test specification file (`tests.txt`). Only one directory is copied for all tests; this parameter must be set as a global parameter. It is usually specified in a wrapper shell script via `-P`.
`stdin`
Bytes supplied on stdin for the test.
Deprecated: if `stdin` is not specified and the file `test_label.stdin` exists, its contents are used.
Not yet implemented: if the value is a list, it is treated as a list of pathnames of file(s) containing the bytes.
`unicode_stdin` = True
Whether the specified stdin should be treated as Unicode. Default is True.

`unicode_stdout` = True
Whether the program's stdout should be treated as Unicode. Default is True.

`unicode_stderr` = True
Whether the program's stderr should be treated as Unicode. Default is True.

`unicode_files` = True
Whether output files should be treated as Unicode. Default is True.
`environment_kept` = 'ARCH|C_CHECK_.|DCC_.|DRYRUN_.|LANG|LANGUAGE|LC_.|LOGNAME|USER'
Environment variables are by default deleted to avoid them affecting testing. Environment variables whose entire name matches this regex are not deleted; all other environment variables are deleted.

`environment_base`
Dict specifying values for environment variables. Default:
{
'LC_COLLATE' : 'POSIX',
'LC_NUMERIC' : 'POSIX',
'PERL5LIB' : '.',
'HOME' : '.',
'PATH' : '/bin:/usr/bin:/usr/local/bin:.:$PATH',
}
where `$PATH` is the original value of PATH. The environment variables in `environment_base` are set first, then the environment variables specified in `environment_set` are set. This parameter should not normally be used; the parameter `environment_set` should normally be used instead. It is only necessary to specify `environment_base` if these variables need to be unset, rather than given different values, for a test.
`environment_set` = {}
Dict specifying environment variables to be set for this test. For example: environment_set={'answer' : 42}. This is the parameter that should normally be used to manipulate environment variables.
`environment`
Dict specifying all environment variables for this test. This parameter should not normally be specified; `environment_set` will serve most purposes. By default, `environment` is formed by taking the original environment variables provided to autotest, removing all but those matching the regex in `environment_kept`, setting any variables specified in `environment_base`, and then setting any variables specified in `environment_set`.
`global_user_environment_vars` = []
List of environment variables whose values must be provided by the user for use in all tests. User input is reflected on the terminal. For example, use this if a user needs to provide a value for a pre-specified environment variable. This shouldn't be used unless necessary.
NOTE: There is no duplicate checking implemented at the moment, so please use a variable that does not already exist.

`global_user_protected_environment_vars` = []
List of environment variables whose values must be provided by the user for use in all tests. User input is not reflected on the terminal. For example, if a user needs to provide their LDAP password for UNSW LDAP authentication, to be stored in $PASS, provide ["PASS"].
NOTE: There is no duplicate checking implemented at the moment, so please use a variable that does not already exist.
`expected_stdout`
Bytes expected on stdout for this test. If the value is a list, it is treated as a list of pathnames of file(s) containing the expected bytes.
Deprecated: if `expected_stdout` is not specified and the file `test_label.expected_stdout` exists, its contents are used.
Not yet implemented: handling of non-Unicode output.

`expected_stderr`
Bytes expected on stderr for this test. If the value is a list, it is treated as a list of pathnames of file(s) containing the expected bytes.
Deprecated: if `expected_stderr` is not specified and the file `test_label.stderr` exists, its contents are used.
Not yet implemented: handling of non-Unicode output.
`expected_file_name` = ''
Pathname of a file expected to exist after this test. Its expected contents are specified by `expected_file_contents`. Use `expected_files` to specify the creation of multiple files.

`expected_file_contents` = ''
Bytes expected in the file `expected_file_name` after this test.
Not yet implemented: handling of non-Unicode output.

`expected_files` = {}
Dict specifying bytes expected to be written to files. If a value is a string, it specifies the bytes expected to be written to that filename. If a value is a list, it is treated as a list of pathnames of file(s) containing the expected bytes. For example, this indicates a file named `answer.txt` should be created containing "42\n":
expected_files={"answer.txt" : "42\n"}
Not yet implemented: handling of non-Unicode output.
Resource limits are mostly implemented via setrlimit, and more information can be found in its documentation.
If a resource limit is exceeded, the test fails with an explanatory message.
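As an illustration, a specification might tighten the default limits for a test expected to run quickly (the label and values are invented for illustration):

```
test_fast max_cpu_seconds=2 max_stdout_bytes=10000 arguments=41 expected_stdout="41 is prime.\n"
```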
`max_stdout_bytes`
Maximum number of bytes that can be written to stdout.

`max_stderr_bytes`
Maximum number of bytes that can be written to stderr.

`max_real_seconds`
Maximum elapsed real time in seconds (0 for no limit). If not specified, defaults to 20 * `max_cpu_seconds`.

`max_cpu_seconds` = 60
Maximum CPU time in seconds (0 for no limit).

`max_core_size` = 0
Maximum size in bytes of any core file written.

`max_stack_bytes` = 32000000
Maximum stack size in bytes.

`max_rss_bytes` = 100000000
Maximum resident set size in bytes.

`max_file_size_bytes` = 8192000
Maximum size in bytes of any file created.

`max_processes` = 4096
Maximum number of processes the current process may create. Note: unfortunately this is the total number of processes per user, not child processes.

`max_open_files` = 256
Maximum number of files that can be simultaneously open.
These apply to comparison of stdout, stderr, and files.
`ignore_case` = False
Ignore case when comparing actual & expected output.

`ignore_whitespace` = False
Ignore white space when comparing actual & expected output.

`ignore_trailing_whitespace` = True
Ignore white space at the end of lines when comparing actual & expected output.

`ignore_blank_lines` = False
Ignore lines containing only white space when comparing actual & expected output.

`ignore_characters` = ''
Ignore these characters when comparing actual & expected output. Ignoring '\n' has no effect; use `ignore_blank_lines` to ignore empty lines.
Unimplemented: handling of Unicode.

`compare_only_characters`
Ignore all but these characters and newline when comparing actual & expected output.
Unimplemented: handling of Unicode.
`postprocess_output_command`
Pass expected and actual output through this command before comparison. If the command is a string, it is passed to a shell. If it is a list, it is executed directly.
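For instance, both outputs could be passed through sort so a test accepts output lines in any order (an illustrative use; the label and values are invented):

```
test_any_order postprocess_output_command="sort" arguments=10 expected_stdout="2\n3\n5\n7\n"
```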
`allow_unexpected_stderr` = False
Do not fail a test if there is unexpected output on stderr but the other expected outputs are correct. This means warning messages don't cause a test to fail.

`colorize_output`
If true, highlight parts of the output using ANSI colour sequences. Default is true if stdout is a terminal.
`description`
String describing the test, printed with its execution; defaults to `command`.

`show_actual_output` = True
If true, the actual output is included in a test failure explanation.

`show_expected_output` = True
If true, the expected output is included in a test failure explanation.

`show_diff` = True
If true, a description of the difference between expected and actual output is included in a test failure explanation.

`show_stdout_if_errors` = False
Unless true, the actual output is not included in a test failure explanation when there are unexpected bytes on stderr.

`show_reproduce_command` = True
If true, the command to reproduce the test is included in a test failure explanation.

`show_compile_command` = True
If true, the command to compile the binary for a test is included in the test output.

`show_stdin` = True
If true, the stdin is included in a test failure explanation.

`max_lines_shown` = 32
Maximum number of lines included in components of test explanations; any further lines are elided. Likely to be replaced with improved controls.

`max_line_length_shown` = 1024
Maximum line length included in components of test explanations; any further characters are elided.

`no_replace_semicolon_reproduce_command` = False
If true, semicolons are not replaced with newlines in the command to reproduce the test, if it is included in a test failure explanation. Likely to be replaced with improved controls.
Use dcc's builtin output checking to check for tests's expected output. This is done by setting several environment variables for the test
upload_url
= ''
Files tested and the output of the tests are uploaded to this URL using a POST request.
No more than upload_max_bytes
will be uploaded with the test output log not included in this number.
Any field/values specified in upload_max_bytes
will be included in the POST request.
In addition 'exercise', 'hostname' and 'login' fields are included in the POST request.
A zip archive containing the files tested is passed as the field zip
.
This zip archive includes the output of the test in a file named autotest.log
.
Only one upload is done for all tests. This parameter must be set as a global parameter.
upload_max_bytes
= 2048000
Maximum number of bytes uploaded if `upload_url is set.
upload_fields
= {}
Any specified fields/values are added to upload requests.
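For example, a class's wrapper script might tag every upload with extra identifying fields (the field names and values here are hypothetical):

```
upload_fields={'course' : 'COMP1511', 'activity' : 'is_prime'}
```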
`debug` = 0
Level of internal debugging output to print.

`worker_count` = 3
Number of autotest workers, i.e. how many unshare container processes to run autotests in (multi-processing). Having more should improve performance, depending on the core count and other tasks running.

`worker_isolate_network` = True
If running in a worker, isolate the network.
`worker_read_only_mount` = []
Pathnames of files or directories mounted read-only in the worker, in addition to the files or directories specified by `worker_read_only_mount_base`. A tuple can be used to specify a different mount point in the worker.
Note: Files and custom mount points are not yet supported.

`worker_read_write_mount` = []
Pathnames of files or directories mounted read-write in the worker, in addition to the files or directories specified by `worker_read_write_mount_base`. A tuple can be used to specify a different mount point in the worker. `/tmp`, `/proc`, `/sys` and `/dev` are always mounted directly read-write in the worker.
Note: Files and custom mount points are not yet supported.

`worker_read_only_mount_base` = ['/bin', '/etc', '/lib', '/lib32', '/lib64', '/libx32', '/sbin', '/usr']
Pathnames of files or directories mounted read-only in the worker. The parameter `worker_read_only_mount` should be used to add extra pathnames; this parameter need only be set to stop one of these pathnames being mounted.
Note: Files and custom mount points are not yet supported.
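For example, to make an extra course-wide directory readable inside each worker container, a wrapper script's parameters might include (the pathname is illustrative):

```
worker_read_only_mount=['/home/class/lib']
```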