Writing Tests

Siddhartha Kasivajhula edited this page Jun 13, 2024 · 8 revisions

Where to Put Tests?

See the guide to qi-test in the Source Code Overview.

Test Public Interfaces Not Private Functions

The purpose of unit tests is to validate what a function does, not how it does it. That's why we test only public interfaces rather than private functions used within.

Private functions should have no dedicated tests, as such tests make the codebase rigid and resistant to changes in implementation.

If some lines in a private function are uncovered by tests, write more thorough tests for public interfaces using the function so that those lines become covered. If public interfaces don't use those lines in the private function, then those lines are unnecessary and should be removed.
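
As a minimal sketch of this idea, written in Python rather than Racket purely for illustration: the names `_normalize` and `count_words` are hypothetical and not part of Qi. The private helper has no tests of its own, and its punctuation-stripping logic is covered only because a test of the public function exercises it.

```python
import unittest


def _normalize(word):
    # Private helper (hypothetical): an implementation detail, so it gets
    # no dedicated tests.
    return word.lower().rstrip(".,")


def count_words(text):
    """Public interface (hypothetical): count each normalized word."""
    counts = {}
    for raw in text.split():
        word = _normalize(raw)
        counts[word] = counts.get(word, 0) + 1
    return counts


class CountWordsTest(unittest.TestCase):
    # The helper is exercised only through the public function, so it can
    # be rewritten or inlined without touching these tests.
    def test_counts_ignore_case_and_punctuation(self):
        self.assertEqual(count_words("Qi, qi."), {"qi": 2})

    def test_empty_text_has_no_counts(self):
        self.assertEqual(count_words(""), {})
```

If `_normalize` contained a branch that no call to `count_words` could reach, that branch would be a candidate for removal rather than for a dedicated test.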

One Assertion Per Test

If a test involves some setup and we'd like to assert several things once this environment has been set up, then abstract the setup component into a helper function and write multiple tests that invoke it and then each assert a single property.

This ensures that a failure points to a single failing expectation rather than leaving several possibilities to investigate.

This may not always be possible or advisable, but it is a good rule of thumb: aim for one assertion per test.
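
The pattern might look like the following sketch, in Python for illustration only. The cart helpers are hypothetical and exist just to give the tests something to set up. The shared setup lives in `make_cart_with_two_items`, and each test asserts exactly one property of the result.

```python
import unittest


def add_item(cart, name, price):
    # Hypothetical function under test.
    cart["items"].append(name)
    cart["total"] += price


def make_cart_with_two_items():
    # Shared setup, factored into a helper so that every test can reuse it.
    cart = {"items": [], "total": 0}
    add_item(cart, "apple", 2)
    add_item(cart, "pear", 3)
    return cart


class CartTest(unittest.TestCase):
    def test_item_count(self):
        cart = make_cart_with_two_items()
        self.assertEqual(len(cart["items"]), 2)

    def test_total(self):
        cart = make_cart_with_two_items()
        self.assertEqual(cart["total"], 5)
```

If `add_item` stopped updating the total, only `test_total` would fail, which names the broken expectation directly.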

Tests for Triangulating Issues

In many cases, what's "public" is a matter of perspective. From the user's perspective, only the surface language is "public," and from this perspective we only need unit tests at the level of the surface syntax. But from a Qi developer's perspective, there are many levels in the codebase and each must fulfill certain well-specified contracts. Tests aren't just for catching bugs but are also useful as debugging tools to help us isolate issues as narrowly as possible. For this reason, we also write tests for "public" interfaces from a developer perspective, such as the expander as a whole, the compiler as a whole, individual compiler passes, individual rules in the compiler, etc.

Although it is a matter of perspective, hopefully it will be clear in context whether a function deserves dedicated tests or if it should only be tested via well-scoped higher-level functionality using it.
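
To make the layering concrete, here is a toy sketch in Python. It is not Qi's expander or compiler: `expand`, `fuse_adds`, and `run` are hypothetical stand-ins for "a stage with a contract". The point is only that tests exist at more than one level, so a user-level failure can be narrowed to a stage.

```python
def expand(source):
    # Hypothetical "expander": rewrite the surface form "inc" as ("add", 1)
    # and a bare number n as ("add", n).
    return [("add", 1) if token == "inc" else ("add", int(token))
            for token in source.split()]


def fuse_adds(ops):
    # Hypothetical compiler pass: merge all additions into a single one.
    return [("add", sum(n for _, n in ops))] if ops else []


def run(source, value):
    # Surface-level entry point: expand, optimize, then evaluate.
    for _, n in fuse_adds(expand(source)):
        value += n
    return value


# User-level test: only the observable result matters.
assert run("inc 2 inc", 10) == 14

# Developer-level tests: each stage has its own contract, so a failure
# above can be traced to the stage whose test also fails.
assert expand("inc 2") == [("add", 1), ("add", 2)]
assert fuse_adds([("add", 1), ("add", 2)]) == [("add", 3)]
```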

Valid Regression Tests

For an identified bug or potential bug, in addition to ensuring that the test passes with the appropriate fix, also ensure that the test fails in the absence of the fix. This ensures that the test is valid and capable of detecting a regression.
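
A small Python sketch of this check, under the assumption that both versions of the code can be run side by side. In practice it usually means temporarily reverting the fix and rerunning the test. The functions `mean_buggy` and `mean_fixed` are hypothetical.

```python
def mean_buggy(values):
    # Hypothetical pre-fix version: integer division truncates the result.
    return sum(values) // len(values)


def mean_fixed(values):
    # Hypothetical post-fix version: true division.
    return sum(values) / len(values)


def regression_check(mean):
    # The regression test, parameterized over the implementation under test.
    return mean([1, 2]) == 1.5


assert regression_check(mean_fixed)      # passes with the fix
assert not regression_check(mean_buggy)  # fails without the fix
```

A test such as `mean([2, 4]) == 3` would pass against both versions, so it would not be a valid regression test for this bug.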

Test Coverage

Coverage Strategies

Qi aims for 100% test coverage. In some instances, this can be tricky to achieve, and the strategies in this section are meant for such cases. There are also cases where code either need not, or cannot, be covered by tests. These are rare, but examples include:

  • Code that is only hit when using a custom evaluator (e.g. a sandboxed evaluator), which does not count for coverage
  • Contract code that is never hit due to arity fast-tracking (i.e. the Racket evaluator checking arity first before asserting a contract)

These could still have remedies in some cases, but otherwise could be excluded from coverage, as discussed below.

Replace lambdas with named functions

If there is a lambda whose body is never evaluated in tests (e.g. due to the contract arity fast-tracking mentioned above), then consider replacing the lambda with a named function. If this named function is a standard function such as identity, or a function defined and tested in another module, then that should be all you need to do. Otherwise, if it is a new function you will be defining locally, then write a separate unit test for this function to validate its behavior. This should now result in the code being fully covered.
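
The refactoring can be sketched as follows, in Python as a loose analogy only: the Racket contract case described above has no direct Python equivalent, and `ignore` and `parse_int` are hypothetical names. An inline `lambda _error: None` default is replaced by a named function, which then gets its own unit test.

```python
def ignore(_error):
    # Named replacement for an inline `lambda _error: None`.
    return None


def parse_int(text, on_error=ignore):
    # Hypothetical public function that previously took the lambda
    # as its default error handler.
    try:
        return int(text)
    except ValueError as error:
        return on_error(error)


# Dedicated unit test for the newly named local function.
assert ignore(ValueError("bad input")) is None

# The public interface is still tested as before.
assert parse_int("12") == 12
```

Had the replacement been a standard function already tested elsewhere, the dedicated test would be unnecessary.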

Declare exclusions using submodules

If the code cannot or need not be covered and you need to add an exclusion, you can do this by placing the code that should not be covered in a submodule. Submodules are, by default, ignored by the coverage checker.

Important: Remember to add a comment explaining why the code cannot/need not be covered, so that if there are any changes to the code in the future, understanding this reason will help reveal whether the changes are subject to the same limitation or not.