Sources for the test161 Go libraries and command line tool.
Clone or download
Fetching latest commit…
Cannot retrieve the latest commit at this time.
Type Name Latest commit message Commit time
Failed to load latest commit information.


test161: A Testing Tool for OS/161

test161 is a command line tool for automated testing of OS/161 instructional operating system kernels that run inside the sys161 (System/161) MIPS R3000 simulator. You are probably not interested in test161 unless you are a student taking or an instructor teaching a course that uses OS/161.

test161: Library, Client, and Server

The test161 source tree consists of a library along with client and server utilities, which are in large part wrappers around the library. Configuration and Usage examples below that mention test161 commands are referring to the test161 utility, which is what students generally interact with. There is also a test161-server utility that students indirectly interact with when submitting assignments. Much of the documentation refers simply to test161, which refers to the system as a whole.

Installation and Environment

test161 is written in Go, and the instructions below assume you are installing from source and/or setting up your development environment to work on test161. Alternatively, the current stable binary version of test161 is included in the PPA.

Installing Go

Many Linux distributions package fairly out-of-date versions of Go. Instead, we encourage you to install the Go Version Manager (GVM):

sudo apt-get install -y curl bison # Install requirement
bash < <(curl -s -S -L
source $HOME/.gvm/scripts/gvm

At this point you are ready to start using GVM. We are currently building and testing test161 with Go version go1.5.3. However, because the Go compiler is now written in Go, installing versions of Go past 1.5 require install Go version 1.4 first.

gvm install go1.4
gvm use 1.4
gvm install go1.5.3
gvm use 1.5.3 --default


Note that gvm will set your GOPATH and PATH variables properly to allow you to run Go binaries that you install. However, if you are interested in writing Go code you should set a more accessible GOPATH as described as described here.

Installing test161

Once you have Go installed, upgrading or installing test161 is simple:

go get -u


Out of the box, test161 can’t do much without a set of test scripts and an OS/161 root directory which contains your kernel and user binaries. If you are starting from the ops-class OS/161 sources, as soon as you configure, compile, and install your userland binaries and kernel, test161 will work in either your root or source trees. If you are starting from other OS/161 sources, see Custom Configuration.

Custom Configuration

The ops-class sources create symlinks [1] between your OS/161 source and root directories in order to infer your environment, which may not always be what you want. To support partial environments where either source or root cannot be inferred from the other, or you want to use a specific set of test161 configuration files, you can use test161 config to set the test161 directory [2].

test161 config test161dir <path>

This allows you to run test161 tests from your root directory using the test files in test161dir. If you do not have symlinks created for environment inference, submitting will need to be done from your source directory.

Environment Variables

In addition to configuring the test161 directory, test161 supports environment variables that may be useful during development testing and in advanced and use cases.

  • TEST161_SERVER: This variable allows you to set a custom back-end server, for example:

export TEST161_SERVER=http://localhost:4000
  • TEST161_OVERLAY: The test161 server uses an overlay directory containing trusted files for each assignment. As a security measure, these trusted files — make files, user and kernel test code — replace students' versions when testing their submissions (see Security). Setting this environment variable allows you to test an overlay using the test161 client, allowing you to test overlay changes without submitting to a test161 server.

Submission Configuration

In order to submit to the test161 server, you need to configure your username/token [3], which can be done with:

test161 config add-user <username> <token>

Removing users and changing configured tokens can be done with:

test161 config del-user <username>              # Delete user information
test161 config change-token <username> <token>  # Change token

Printing Configuration

To view the current test161 configuration, simply run:

test161 config


test161 is designed around two main tasks: running tests and submitting targets. Additionally, sub-commands exist for configuring test161 and listing existing tests. Running test161 with no arguments will print usage information, and for a more detailed description of test161 sub-commands, use test161 help.

Tests, Targets, and Tags

The test161 sub-commands often take one or more tests, targets, and tags as arguments. A test consists of one or more OS/161 commands along with metadata, sys161 configuration, and possibly some additional test161 runtime configuration. Tests can optionally include tags, which allow related tests to be grouped and run together. Targets consist of a set of tests that are run together, and allow point values to be assigned to each test.

Listing Tests, Targets, and Tags

Available tests, targets, and tags can easily be listed with the test161 list sub-command:

test161 list tests        # List all tests with descriptions
test161 list tags         # List all tags and which tests share each tag
test161 list targets      # List all targets
test161 list targets -r   # List all targets available for submission on the server

Running Tests

To run a single test, group of tests, or single target, use the test161 run <names> sub-command. Here, <names> can be a single target, one or more tests, or one or more tags.[4] For test files, <names> is a list of globstar style file names, with paths specified relative to the root of the test directory. The following are all valid commands:

test161 run synch/*.t         # Run all tests in the tests/sync/
test161 run **/l*.t           # Run all tests in all sub-directories beginning with 'l'
test161 run synchprobs/sp*.t  # Run the synchprobs
test161 run synch/lt1.t       # Run lock test 1
test161 run locks             # Run all lock tests (tests tagged with 'locks')
test161 run asst1             # Run the asst1 target

Test Concurrency

By default, test161 runs all tests in parallel to speed up processing. As a result, the output produced by each test will be interleaved, which can be difficult to debug. It is possible to run tests sequentially using the -sequential (-s) flag.

Test Dependencies

Each test specifies a list of dependencies, tests that must pass in order for that test to run. For example, our condition variable tests depend on our lock tests since locks must work for CVs to work. Internally, test161 creates a dependency graph for all the tests it needs to run and will short-circuit any children in the dependency graph in case of failure. By default, all dependencies are run when running any group of tests. For targets, this is unavoidable. For other groups of tests, this behavior can be suppressed with the -no-dependencies (-n) flag. This can save a lot of time when debugging a particular test that has a lot of dependencies.

Command Line Flags

There are several command line flags that can be specified to customize how test161 runs tests.

  • -dry-run (-r): Show the tests that would be run, but don’t run them.

  • -explain (-x): Show test detail, such as name, descriptions, sys161 configuration, commands, and expected output.

  • -sequential (-s): By default the output of all tests are interleaved, which can be hard to debug. Specify this option to run tests one at a time.

  • -no-dependencies (-n): Run the given tests without also running their dependencies.

  • -verbose (-v): There are three levels of output: loud (default), quiet (no test output), and whisper (only final summary, no per-test status).


Solutions are submitted with the test161 submit sub-command. In the most common case, you will use the following command from your source or root directory, where <target> is the target you wish to submit:

test161 submit <target>

By default, test161 submit will use the commit associated with the tip of your current Git branch. This behavior can be overridden by specifying a tree-ish argument after the target argument. For example, all of the following are valid commands:

test161 submit asst1            # Submit the current branch to the asst1 target
test161 submit asst2 working    # Submit the working branch/tag to the asst2 target
test161 submit asst3 3df3dd59a  # Submit the commit 3df3dd59a to the asst3 target

Command Line Flags

test161 submit has a few useful command line flags:

  • -debug: Print debug output when submitting, namely all Git commands used to determine repository details.

  • -verify: Check for local and remote issues without submitting, i.e. verify that the submission would be accepted. This option is useful for verifying that your configuration —  users, tokens, keys, etc. — is correct.

  • -no-cache: As an optimization, test161 caches a cloned copy of your repo in the same way the server does in order to improve the performance of subsequent submissions. In some cases, it is useful to override this behavior.


  • sys161 and disk161 in the path.

  • Git version >= 2.3.0.

Commands, Tests, and Targets

test161 uses a YAML-based configuration system, with configuration files located across subdirectories of your test161 directory. The anatomy of this configuration directory is as follows:

  • commands/: *.tc files containing OS/161 command specification. Each .tc file usually contains multiple related commands.

  • targets/: *.tt files containing target definitions, one per target.

  • tests/: *.t files containing test specification, one per test. This directory will contain subdirectories used to organize related tests.


The basic unit in test161 is a command, such as lt1 for running Lock Test 1, or sp1 to run the whalemating test. Information about what to expect when running these commands, as well as what input/output they expect is specified in the commands directory in your test161 root directory. All .tc files in this directory will be loaded, and commands must only be specified once.

The following is the full syntax for a commands file:

# Each commands file consists of a collection of templates. An * indicates the
# default option.
    # Command name/id. For userland tests, include the path and binary name.
  - name: /testbin/forktest

    # An array of input arguments (optional). This should be included if the
    # command needs default arguments.
      - arg1
      - arg2

    # An array of expected output lines (optional). This should be specified
    # if the output differs from the default <name>: SUCESS.
        # The expected output text
      - text: "You should see this message if the test passed!"

        # true* if the output is secured, false if not
        trusted:  true

        # true if <text> references an external command, false* if not
        external: false

    # Whether or not the command panics - yes, no*, or maybe
    panics: no

    # Whether or not the command is expected to time out - yes, no*, maybe
    timesout: no

    # Time (s) after which the test is terminated - 0.0* indicates no timeout
    timeout: 0.0

Minimally, any command that is to be evaluated for correctness needs to be present in exactly one commands (.tc) file with the name property specified. If no output is specified, the default expected output is <command name>: SUCCESS.


In the following example, several commands are specified all of which expect the default output.

  - name: lt1
  - name: lt2
  - name: sem1

Some commands might be designed to cause a kernel panic.

  - name: lt2
    panics: yes
      - text: "lt2: Should panic..."

Some OS/161 tests are composed of other tests, in which case the command output will reference an external command. In the following example, the 'triplehuge' command expects three instances of the 'huge' command’s output:

 - name: /testbin/triplehuge
      - {text: /testbin/huge, external: true, trusted: true}
      - {text: /testbin/huge, external: true, trusted: true}
      - {text: /testbin/huge, external: true, trusted: true}

Input and output can use Go’s text templates to specify more complex text. The arguments and argument length are available in the text templates as .Args and .ArgLen, respectively. Custom functions are also provided; see the funcMap in commands.go for details.

The following example illustrates how the add test’s output can be determined from random integer inputs:

  - name: /testbin/add
      - "{{randInt 2 1000}}"
      - "{{randInt 2 4000}}"
      - text: "/testbin/add: {{$x:= index .Args 0 | atoi}}{{$y := index .Args 1 | atoi}}{{add $x $y}}"


Test files (*.t) are located in the tests/ directory in your test161 root directory. This directory can contain subdirectories to help organize tests. Each test consists of one or more commands, and each test can have its own sys161 configuration. Tests are run in their own sandboxed environment, but commands within the test are executed within the same sys161 session.

The following is an example of a test161 test file:

name: "Semaphore Test"
  Tests core semaphore logic through cyclic signaling.
tags: [synch, semaphores, kleaks]
depends: [boot]
  cpus: 32

Front Matter

The test consist of two parts. The header in between the first and second --- is YAML front matter that provides test metadata and configuration. The following describes the syntax and semantics of the test metadata:

name: "Test Name"            # The name this is displayed in test161 commands
description: "Description"   # Longer test description, used in test161 list tests
tags: [tag1, tag2]           # All tests with the same tag can be run with test161 run <tag>
depends: [dep1, dep2]        # Specify dependencies. If these fail, the test is skipped
Configuration Options

In addition to metadata, the test file syntax supports various configuration options for both test161 and the underlying sys161 instance. The following provides both the syntax and semantics, as well as the default values for all configuration options.

# sys161 configuration
  # 1-32 are valid
  cpus: 8

  # Number of bytes of memory, with optional K or M prefix
  ram: 1M

  # Random number generated at runtime. This can be overridden by specifying an
  # unsigned 32 bit integer to use as the random seed.
  random: seed=random

  # Disabled by default, but should be enabled when you want swap disk
    enabled: false
    rpm: 7200
    bytes: 32M
    nodoom: true

  # Disabled by default, but uses these defaults if configured
    enabled: false
    rpm: 7200
    bytes: 32M
    nodoom: false

# stat161 configuration. The window specifies the number of stat objects we
# keep around, while the resolution represents the interval (s) that we
# request stats from stat161.
  resolution: 0.01
  window: 1

# Monitor configuration
  enabled: true

  # Number of samples to include in kernel and user calculations
  window: 400

  # Monitor configuration for tracking kernel cycles. The ratio of kernel
  # cycles to total cycles to be >= min (if enabled) and <= max.
    enablemin: false
    min: 0.001
    max: 1.0

  # Monitor configuration for tracking user cycles. The ratio of user cycles
  # to total cycles to be >= min (if enabled) and <= max.
    enablemin: false
    min: 0.0001
    max: 1.0

  # Sim time (s) without character output before the test is stopped
  progresstimeout: 10.0

  # Sim time (s) a command is allowed to execute before it is stopped
  commandtimeout: 60.0

# Miscelleneous configuration
  # The next three configuration parameters deal with sys161 occasionally
  # dropping input characters.

  # Time (ms) to wait for a character to appear in the output stream after
  # sending it.
  charactertimeout: 1000

  # Whether or not to retry sending characters if the character timeout is
  # exceeded.
  retrycharacters: true

  # Number of times a command is retried if the number of character retries is
  # exceeded.
  commandretries: 5

  # Time (s) before halting a test if the current prompt is not seen in the
  # output stream.
  prompttimeout: 1800.0

  # If true, send the kill signal to sys161. This should not generally be
  # needed.
  killonexit: false
Command Override

In addition to the configuration options, command behavior can be overridden in the YAML front matter. A partial command template can be specified using the commandoverrides property, which will be merged with the command definition found in the commands files.

For example, the following changes the command timeout for a particular test:

  - name: /testbin/forkbomb
    timeout: 20.0

Note that the name must be specified in order to distinguish between commands in the test.

Test Commands

The second part of the test file is a listing of the commands that make up the test. In the example at the top of the section, the test file specifies that a single test should be run, namely sem1. It is important to note that the command name provided here must match what is specified in the commands files.

Test File Syntactic Sugar

A line starting with $ will be run in the shell and start the shell as needed. Lines not starting with $ are run from the kernel prompt and get there if necessary by exiting the shell. sys161 shuts down cleanly without requiring the test manually exit the shell and kernel, as needed.

So this test:

$ /bin/true

Expands to:


Note that commands run in the shell must be prefixed with $. Otherwise test161 will consider them a kernel command and exit the shell before running them. For example:

This test is probably not what you want:


Because it will expand to:

/bin/true # not a kernel command

But this is so much simpler, right?

$ /bin/true

test161 also supports syntactic sugar for memory leak detection.

| p /testbin/forktest

expands to:

p /testbin/forktest


Target files (*.tt) are located in the targets/ directory in your test161 root directory. Targets specify which tests are run for each assignment, and how the scoring is distributed. When you test161 submit your assignments, you will specify which target to submit to.

The following example provides the full syntax of a target file:

# The name must be unique across all targets
name: example_target

# The print name, description, and leaderboard are used by the test161 front-end website
print_name: EXAMPLE
description: An example to illustrate target syntax and semantics
leaderboard: true

# Only active targets can be submitted to
active: true

# The test161 server uses the target version internally. The version number
# must be incremented when any test or points details change.
version: 1

# Target types can be asst or perf (assignment and performance).
type: asst

# The total number of points for the target. The sum of the individual test
# points must equal this number.
points: 10

# The associated kernel configuration file. test161 uses this value to
# configure and compile your kernel.
kconfig: ASST1

# true if the userland binaries should be compiled, false (default) if not.
userland: false

# Specify a commit hash id that must be included in the Git history of the
# submitted repo.

# The list of tests that are to be run and evaluated as part of this target.
    # ID is the path relative to the tests directory
  - id: path/to/test.t
    # entire or partial. With the "entire" (default) method, all commands in
    # the test must pass to earn any points. With partial, each command in the
    # test can earn points.
    scoring: entire

    # The points for this test
    points: 10

    # The number of points to deduct if a memory leak was detected.
    mem_leak_points: 2  # default is 0

    # A list of commands whose behavior needs to be individually specified.
    # This is only necessary when argument overrides need to be provided, or
    # when partial command credit is given.
      - id: sem1
        # Particular instance of the command in the test. Only useful if the
        # command is listed multiple times in the test.
        index: 0
        points: 0
        # Override default command arguments.
        args: [arg1, arg2,...]


test161-server is a command line utility that implements the test161 back-end functionality. Its main responsibilities include accepting submission requests, evaluating these requests, and persisting test output and results.

Our test161-server uses mongoDB as its storage layer, which is also how it communicates with the front-end. The interface of the back-end utility is less mature than the test161 command line utility, mostly due to its audience.

test161-server Configuration

The test161-server configuration is in YAML configuration file, ~/.test161-server.conf. The following example provides the syntax for this configuration file:

overlaydir: /path/to/overlay/directory
test161dir: /path/to/test161/root/directory
cachedir: /path/to/student/repo/cache
keydir: /path/to/student/deploy/keys

# The maximum concurrency for executing test161 tests. This can also be changed
# dynamically from the command line with test161-server set-capacity N.
max_tests: 20

# The mongoDB database name
dbname: "test161"

# The mongoDB database server and port
dbsever: "localhost:27017"

# Database credentials
dbuser: user
dbpw: password

# The port that test161-server is configured to listen on for API requests
api_port: 4000

# The minimum test161 client version that the server will accept submissions from
  major: 1
  minor: 2
  revision: 5

Key Directory

As part of its API, test161-server can generate public/private RSA keypairs. The front-end issues these requests from a student’s settings page. The keypairs are stored in the keydir specified in test161-server.conf. The student is required to add the public key to their private Git repository as a deploy key so test161 can clone their OS/161 sources.

Cache Directory

test161-server caches students' source code so that it can fetch updates rather than re-clone on subsequent submissions.

test161-server Usage

test161-server should be launched as a daemon during boot, but occasionally you may need to communicate with the running instance.

test161-server status          # Get the status of the running instance
test161-server pause           # Stop the server from accepting new submissions, but
                               # finish processing pending submissions
test161-server resume          # Resume accepting submissions
test161-server set-capacity N  # Set the max number of concurrent tests
test161-server get-capacity    # Get the max number of concurrent tests


Progress Tracking Using stat161 Output

test161 uses the collected stat161 output produced by the running kernel to detect deadlocks, livelocks, and other forms of stalls. We do this using several different strategies:

  1. Progress and prompt timeouts. Test files can configure both progress (monitorconf.timeouts.progress) and prompt (monitorconf.timeouts.prompt) timeouts. The former is used to kill the test if no output has appeared, while the latter is passed to expect and used to kill the test of the prompt is delayed. Ideally OS/161 tests should produce some output while they run to help keep the progress timeout from firing, but the other progress tracking strategies described below should also help.

  2. User and kernel maximum and minimum cycles. test161 maintains a buffer of statistics over a configurable number of stat161 intervals. Limits on the minimum and maximum number of kernel and user cycles (expressed as fractions) over this buffer can help detect deadlocks (minimum) and livelocks (maximum). User limits are only applied when running in userspace.

  3. Note that test161 also checks to ensure that there are no user cycles generated when we are running in kernel mode, which could be caused by a hung progress.

Memory Leak Detection

In addition for checking for test correctness, test161 can also check for memory leaks. To implement this feature, a few changes were made to our ops-class OS/161 sources, which implies this feature will be unavailable or source modification is required if you are starting from other OS/161 sources.

A new command, khu, has been added to our OS/161 kernel menu. When run, this command prints the number of bytes of memory currently allocated, between both the kmalloc subpage allocator, and the VM subsystem. test161 parses this output and calculates the difference between successive invocations to determine memory leaks. test161 targets can optionally deduct points for memory leaks.

Correctness vs. Grading

The concepts of correctness and grading are purposely separated in test161. Correctness is first established at the command granularity — each command has specific criteria that defines correct execution. Since tests are composed of commands, it follows that test correctness can be determined from command correctness. Grading, however, is handled independently by targets. The partial grading method allows for points to be awarded when only some of the commands in a single test are correct. In the entire scoring method, points are only awarded of the all of the commands in the test are correct.

Partial Credit

test161 allows for partial credit at the command level. This is different from the partial scoring method for tests. With partial credit, a command can earn a fraction of the points it is assigned in the target. This is implemented by looking for the (secured) string, PARTIAL CREDIT X OF N. If X == N, full credit is awarded and the test is marked correct. Otherwise, a fraction of the points (X/N) are awarded and the test is marked as incorrect.


Given that students are modifying the operating system itself, the attack surface for gaming the system is quite large. Given that we have modified user test programs to output <name>: SUCCESS when they succeed, it would be particularly easy for students to fake this output if security measures were not put in place. Therefore, security has been built into test161 to create a secure testing environment. This was accomplished through both test161 features and additions to our ops-class OS/161 sources. In particular, we ensure that our trusted tests are running, and that to a very high degree, we trust the output.

libtest161 and secprintf

Our OS/161 sources add a test161 library, libtest161 with the important function, secprintf:

int secprintf(const char * secret, const char * msg, const char * name);

When SECRET_TESTING is disabled, which it normally is, this function simply outputs the message, such as <name>: SUCCESS. Even though the test161 command is expecting trusted output, it knows that SECRET_TESTING has been disabled and will ignore this requirement. This allows students to test their code using test161 in an unsecured environment.

When SECRET_TESTING is enabled, this function secures the output string by computing the SHA256 HMAC of the message using the provided secret key and a random salt value. It outputs a 4-tuple of (name, hash, salt, name: message). test161 uses this information to verify the authenticity of the message.

Source Overlay

test161 allows the specification of an overlay directory that contains trusted source files. These trusted files, such as make files and anything that prints SUCCESS, overwrite students' untrusted source files on the test161 server during compilation. During testing, an overlay can be specified using the TEST161_OVERLAY environment variable (see Environment Variables).

Key Injection

When an overlay is specified, the process of key injection is triggered. In our OS/161 source code, a placeholder (SECRET) for the secret key was added anywhere a key was required for the secprintf function. Key injection replaces instances of SECRET in the source code with a randomly generated key, one per command. During compile time, a map of command to key is created so test161 can authenticate messages output by test code. Importantly, this process repeats itself each time a student submits to a target, which means no information from previous submissions can be replayed. Additionally, unique salt values are required during testing, preventing replay attacks from previously seen command output, such in the case of triplehuge.

Multiple Output Strategies

test161 supports different output strategies through its PersistenceManager interface. Each TestEnvironment has a PersistenceManager which receives callbacks when events happen, like when scores changes, statuses change, or when output lines are added. This allows multiple implementations to handle output as they wish. The test161 client utility implements the interface through its ConsolePersistence type, which writes all input to stdout. The server uses a MongoPersistence type which outputs JSON data to our mongo back-end server. This feature allows test161-server forks to easily use whatever back-end storage system they desire.



  • sys161 version checks

  • Check for repository problems:

    • Check and fail if it has inappropriate files (.o), or is too large. (Prevent back-end storage DOS attacks.)

  • Use URL associated with the tree-ish id provided to test161 submit

  • Fix directory bash completion for test161 config test161dir. It’s unfortunately adding a space instead of /.

  • Populate man pages

Performance Tracking

Most of the infrastructure is in place to handle performance targets, but we still need finish this and test it. Specifically, we need set the performance number in the Test object and use it properly in the Submission.

Parallel Testing Output

It would be cool to be able to print serial output from one test while queuing the output from other tests. Maybe using curses to maintain a line at the end of the screen showing the tests that are being run.

Output Frequency

For long running tests, OS/161 tests generate periodic output, usually in the form of a string of '.' characters. This output is used as a keep-alive mechanism, resetting test161’s progress timeout. Because this output is in a single line, and it would create more unnecessary DB output and server load to break these into multiple lines, it would be nice to refactor things in such a way that the current output line is periodically persisted. This would give students a better indication of progress, as opposed to tests looking "stuck".

Server Nits

  • Environment inference with environment variable overrides, similar to the test161 client

    • test161-server config to both show the configuration and modify it

  • Log the configuration on startup

  • Usability cleanup

    • Usage

    • Help

    • Bash completion

  • Moving window for stats API

  • Periodically persist server stats, either in mongoDB or through the logger. We currently lose these on restart.

  • Move collaboration messages into their own files instead of hard-coding

1. .root is in your source directory and points to your root directory, .src is in your root directory and points to your source directory
2. The directory containing the tests, targets, and commands subdirectories. For the ops-class sources, the test161 directory is named test161 and is a subdirectory of the OS/161 source directory.
3. The task of creating tokens belongs to the front-end, which students need log in for.
4. In the case that tag and target names conflict, specify -tag if you mean tag.