All new code, or changes to existing code, should have new or updated tests before being merged into master. This document gives some guidelines for developers who are writing tests or reviewing code for CKAN.
CKAN is an old code base with a large legacy test suite in ckan.tests. The legacy tests are difficult to maintain and extend, but there are too many of them to replace all at once in a single effort. So we're following this strategy:
- A new test suite has been started in ckan.new_tests.
- For now, we'll run both the legacy tests and the new tests before merging something into the master branch.
- Whenever we add new code, or change existing code, we'll add new-style tests for it.
- If you change the behavior of some code and break some legacy tests, consider adding new tests for that code and deleting the legacy tests, rather than updating the legacy tests.
- Now and then, we'll write a set of new tests to cover part of the code, and delete the relevant legacy tests. For example if you want to refactor some code that doesn't have good tests, write a set of new-style tests for it first, refactor, then delete the relevant legacy tests.
In this way we can incrementally extend the new tests to cover CKAN one "island of code" at a time, and eventually we can delete the legacy ckan.tests directory entirely.
We want the tests in ckan.new_tests to be:
- Fast
  - Use the mock library to avoid pulling in other parts of CKAN (especially the database); see the section on mock below.
  - Don't share setup code between tests (e.g. in test class setup() or setup_class() methods, saved against the self attribute of test classes, or in test helper modules). Instead, write helper functions that create test objects and return them, and have each test method call just the helpers it needs to do the setup that it needs.
- Independent
  - Each test module, class and method should be able to be run on its own.
  - Tests shouldn't be tightly coupled to each other; changing a test shouldn't affect other tests.
- Clear
  It should be quick and easy to see what went wrong when a test fails, or to see what a test does and how it works if you have to debug or update it. You shouldn't have to figure out what a complex test method does, or go and look up a lot of code in other files, to understand a test method.
  - Tests should follow the canonical form for a unit test; see the test recipe below.
  - Write lots of small, simple test methods, not a few big, complex tests.
  - Each test method should test just One Thing.
  - The name of a test method should clearly explain the intent of the test; see the section on naming below.
  - Test methods and helper functions should have docstrings.
- Easy to find
  It should be easy to know where to add new tests for some new or changed code, or to find the existing tests for some code.
  - See the sections on organization and naming below.
- Easy to write
  Writing lots of small, clear and simple tests that all follow similar recipes and organization should make tests easy to write, as well as easy to read.
The following sections give some more specific guidelines and tips for writing CKAN tests.
The organization of test modules in ckan.new_tests mirrors the organization of the source modules in ckan:
ckan/
  new_tests/
    controllers/
      test_package.py <-- Tests for ckan/controllers/package.py
      ...
    lib/
      test_helpers.py <-- Tests for ckan/lib/helpers.py
      ...
    logic/
      action/
        test_get.py
        ...
      auth/
        test_get.py
        ...
      test_converters.py
      test_validators.py
    migration/
      versions/
        test_001_add_existing_tables.py
        ...
    model/
      test_package.py
      ...
    ...
There are a few exceptional test modules that don't fit into this structure, for example PEP8 tests and coding standards tests. These modules can just go in the top-level ckan/new_tests/
directory. There shouldn't be too many of these.
The name of a test method should clearly explain the intent of the test.
Test method names are printed out when tests fail, so the user can often see what went wrong without having to look into the test file. When they do need to look into the file to debug or update a test, the test name helps to clarify the test.
Do this even if it means your method name gets really long; since we don't write code that calls our test methods, there's no advantage to having short test method names.
Some modules in CKAN contain large numbers of loosely related functions. For example, ckan.logic.action.update contains all the functions for updating things in CKAN. This means that ckan.new_tests.logic.action.test_update is going to contain an even larger number of test functions.
So as well as the name of each test method explaining the intent of the test, it's important to name the test function after the function it's testing. For example, all the tests for user_update should be named test_user_update_*.
Good test names:
test_user_update_with_id_that_does_not_exist
test_user_update_with_no_id
test_user_update_with_invalid_name
Bad test names:
test_user_update
test_update_pkg_1
test_package
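As a sketch of the convention, here is what one of the good names above might look like in practice. The user_update function below is a simplified hypothetical stand-in, not the real CKAN action function:

```python
# The test name starts with the name of the function under test
# (user_update) and then states the specific case being tested,
# however long that makes the name.

class ValidationError(Exception):
    pass

def user_update(data_dict):
    """Simplified stand-in for ckan.logic.action.update.user_update."""
    if "id" not in data_dict:
        raise ValidationError("id is required")
    return data_dict

def test_user_update_with_no_id():
    """user_update should raise ValidationError if given no id."""
    try:
        user_update({"name": "fred"})
        assert False, "expected ValidationError"
    except ValidationError:
        pass
```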
The Pylons Unit Testing Guidelines give the following recipe for all unit test methods to follow:
- Set up the preconditions for the method / function being tested.
- Call the method / function exactly one time, passing in the values established in the first step.
- Make assertions about the return value, and / or any side effects.
- Do absolutely nothing else.
Most CKAN tests should follow this form.
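The four-step recipe can be sketched as follows; slugify here is a hypothetical helper function, not real CKAN code:

```python
def slugify(title):
    """Hypothetical helper: turn a title into a URL-friendly slug."""
    return title.lower().replace(" ", "-")

def test_slugify_replaces_spaces_with_hyphens():
    # 1. Set up the preconditions.
    title = "My Dataset Title"
    # 2. Call the function exactly one time.
    result = slugify(title)
    # 3. Make assertions about the return value.
    assert result == "my-dataset-title"
    # 4. Do absolutely nothing else.
```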
One common exception is when you want to use a for loop to call the function being tested multiple times, passing in lots of different arguments that should all produce the same return value and/or side effects. For example, this test from ckan.new_tests.logic.action.test_update:
../ckan/new_tests/logic/action/test_update.py
The behavior of ckan.logic.action.update.user_update is the same for every invalid value. We do want to test user_update with lots of different invalid names, but we obviously don't want to write a dozen separate test methods that are all the same apart from the value used for the invalid user name. We don't really want to define a helper method and a dozen test methods that call it, either. So we use a simple loop. Technically this test calls the function being tested more than once, but there's only one line of code that calls it.
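The loop pattern might look like the following sketch; user_update here is a hypothetical stand-in, not the real action function:

```python
# Many invalid inputs, one line of code calling the function under test.

class ValidationError(Exception):
    pass

def user_update(name=None):
    """Hypothetical stand-in: rejects empty or non-alphanumeric names."""
    if not name or not name.replace("_", "").isalnum():
        raise ValidationError("invalid name")

def test_user_update_with_invalid_name():
    invalid_names = ("", "a b c", "!@#$", "x" * 1000 + "!")
    for name in invalid_names:
        try:
            # The only line that calls the function being tested.
            user_update(name=name)
            assert False, "expected ValidationError for %r" % name
        except ValidationError:
            pass
```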
Generally, what we're trying to do is test the interfaces between modules in a way that supports modularization: if you change the code within a function, method, class or module without breaking any of that code's unit tests, you should be able to expect that CKAN as a whole will not be broken.
As a general guideline, the tests for a function or method should:
- Test for success:
- Test the function with typical, valid input values
- Test with valid, edge-case inputs
- If the function has multiple parameters, test them in different combinations
- Test for failure:
- Test that the function fails correctly (e.g. raises the expected type of exception) when given likely invalid inputs (for example, if the user passes an invalid user_id as a parameter)
- Test that the function fails correctly when given bizarre input
- Test that the function behaves correctly when given unicode characters as input
- Cover the interface of the function: test all the parameters and features of the function
ckan.new_tests.factories
ckan.new_tests.helpers
We use the mock library to replace parts of CKAN with mock objects. This allows a CKAN function to be tested independently of other parts of CKAN or third-party libraries that the function uses. This generally makes the test simpler and faster (especially when ckan.model is mocked out so that the tests don't touch the database). With mock objects we can also make assertions about what methods the function called on the mock object, and with which arguments.
A mock object is a special object that allows user code to access any attribute name or call any method name (and pass any parameters) on the object, and the code will always get another mock object back:
>>> import mock
>>> my_mock = mock.MagicMock()
>>> my_mock.foo
<MagicMock name='mock.foo' id='56032400'>
>>> my_mock.bar
<MagicMock name='mock.bar' id='54093968'>
>>> my_mock.foobar()
<MagicMock name='mock.foobar()' id='54115664'>
>>> my_mock.foobar(1, 2, 'barfoo')
<MagicMock name='mock.foobar()' id='54115664'>
When a test needs a mock object to actually have some behavior besides always returning other mock objects, it can set the value of a certain attribute on the mock object, set the return value of a certain method, specify that a certain method should raise a certain exception, etc.
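For example, behavior can be attached to a mock like this. The attribute and method names below (User, Session, query, commit) are hypothetical stand-ins, not real ckan.model calls; unittest.mock in the standard library provides the same API as the standalone mock package:

```python
from unittest import mock

model = mock.MagicMock()

# Set an attribute's value.
model.User.name = "fred"
# Set a method's return value.
model.Session.query.return_value = []
# Make a method raise an exception when called.
model.Session.commit.side_effect = IOError("disk full")

assert model.User.name == "fred"
assert model.Session.query() == []
try:
    model.Session.commit()
    assert False, "expected IOError"
except IOError:
    pass

# We can also assert how the mock was used.
model.Session.query.assert_called_once_with()
```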
You should read the mock library's documentation to really understand what's going on, but here's an example of a test from ckan.new_tests.logic.auth.test_update that tests the ckan.logic.auth.update.user_update authorization function and mocks out ckan.model:
../ckan/new_tests/logic/auth/test_update.py
Here's a much more complex example that patches a number of CKAN modules with mock modules, replaces CKAN functions with mock functions that have specified behavior, and makes assertions about how CKAN called the mock objects (you shouldn't have to do mocking as complex as this too often!):
../ckan/new_tests/logic/action/test_update.py
The following sections will give specific guidelines and examples for writing tests for each module in CKAN.
Note
When we say that all functions should have tests in the sections below, we mean all public functions that the module or class exports for use by other modules or classes in CKAN or by extensions or templates.
Private helper methods (with names beginning with _) never have to have their own tests, although they can have tests if helpful.
ckan.new_tests.logic.action
ckan.new_tests.logic.auth
All converter and validator functions should have unit tests.
Although these converter and validator functions are tested indirectly by the action function tests, this may not catch all the converters and validators and all their options, and converters and validators are not only used by the action functions but are also available to plugins. Having unit tests will also help to clarify the intended behavior of each converter and validator.
CKAN's action functions call ckan.lib.navl.dictization_functions.validate to validate data posted by the user. Each action function passes a schema from ckan.logic.schema to validate. The schema gives validate lists of validation and conversion functions to apply to the user data. These validation and conversion functions are defined in ckan.logic.validators, ckan.logic.converters and ckan.lib.navl.validators.
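The schema-driven flow can be sketched with a minimal, self-contained stand-in for validate. This is NOT the real ckan.lib.navl.dictization_functions.validate, which is considerably more capable; the function names here are illustrative only:

```python
# Toy illustration: a schema maps each key to a list of validator/converter
# functions, applied in order; failures are collected into an errors dict.

class Invalid(Exception):
    pass

def not_empty(value):
    if not value:
        raise Invalid("Missing value")
    return value

def strip_whitespace(value):
    return value.strip() if isinstance(value, str) else value

def validate(data_dict, schema):
    data, errors = {}, {}
    for key, validators in schema.items():
        value = data_dict.get(key)
        try:
            for validator in validators:
                value = validator(value)
            data[key] = value
        except Invalid as err:
            errors.setdefault(key, []).append(str(err))
    return data, errors

schema = {"name": [strip_whitespace, not_empty]}
data, errors = validate({"name": "  fred  "}, schema)
# data == {"name": "fred"}, errors == {}
```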
Most validator and converter tests should be unit tests that test the validator or converter function in isolation, without bringing in other parts of CKAN or touching the database. This requires using the mock library to mock ckan.model; see the section on mock above.
When testing validators, we often want to make the same assertions in many tests: assert that the validator didn't modify the data dict, assert that the validator didn't modify the errors dict, assert that the validator raised Invalid, etc. Decorator functions are defined at the top of validator test modules like ckan.new_tests.logic.test_validators to make these common asserts easy. To use one of these decorators you have to:
- Define a nested function inside your test method, that simply calls the validator function that you're trying to test.
- Apply the decorators that you want to this nested function.
- Call the nested function.
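The three steps above can be sketched as follows. The decorator, validator and test here are hypothetical, modeled on the pattern described, not copied from the real test module:

```python
import copy
import functools

class Invalid(Exception):
    """Stand-in for ckan.lib.navl.dictization_functions.Invalid."""
    pass

def does_not_modify_errors_dict(validator):
    """Decorator asserting the wrapped call leaves the errors dict unchanged."""
    @functools.wraps(validator)
    def call_and_assert(key, data, errors, context=None):
        original_errors = copy.deepcopy(errors)
        result = validator(key, data, errors, context)
        assert errors == original_errors, "Validator modified the errors dict"
        return result
    return call_and_assert

def name_validator(key, data, errors, context=None):
    """A hypothetical validator: raises Invalid if the value is empty."""
    if not data.get(key):
        raise Invalid("Missing value")

def test_name_validator_with_valid_name():
    """name_validator should accept a non-empty name and not touch errors."""
    # 1. Define a nested function that simply calls the validator.
    def call_validator(key, data, errors, context=None):
        return name_validator(key, data, errors, context)
    # 2. Apply the decorator(s) you want to the nested function.
    call_validator = does_not_modify_errors_dict(call_validator)
    # 3. Call the nested function.
    call_validator(key=("name",), data={("name",): "fred"},
                   errors={}, context={})
```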
Here's an example of a simple validator test that uses this technique:
../ckan/new_tests/logic/test_validators.py
ckan.new_tests.logic.test_schema
ckan.new_tests.controllers
ckan.new_tests.model
ckan.new_tests.lib
ckan.new_tests.plugins
ckan.new_tests.migration
Within extensions, follow the same guidelines as for CKAN core. For example if an extension adds an action function then the action function should have tests, etc.