Generators

To generate input data for properties, RapidCheck uses the concept of generators. A generator which generates values of type T has the type Gen<T>.

RapidCheck includes functions for creating generators for most common types and use cases but it also provides combinators that allow you to build custom generators by combining existing ones. These are generally located in the rc::gen namespace. See the generators reference for more information about the different generators including examples.

If for some reason you cannot use the built-in generators and combinators, you can also implement a generator from scratch. See the documentation for Gen for more information about this.

Usage

To explicitly use a generator in your property, use the prefix * operator (i.e. dereference). This overload returns a value of type T that was either generated from the given generator or found by shrinking a generated value. For example, to use the inRange generator to generate a value in the range [0, 10):

const auto i = *rc::gen::inRange(0, 10);

With this style of using generators, you can also have parameters of a generator depend on values previously generated by other generators. For example:

const auto max = *rc::gen::arbitrary<int>();
const auto i = *rc::gen::inRange(0, max);

In this example, we first generate an arbitrary int, max, which becomes the upper bound for i. The value of i is then an int between 0 (inclusive) and max (exclusive).

Please note that operator* is only valid in certain contexts, properties being one of them.

Arbitrary

The most important generator (or family of generators) is rc::gen::arbitrary<T>(). As can be deduced from its name, it returns a generator that will yield a completely arbitrary value of T. While it might not always be feasible to do so, the idea is that there should be no possible value of type T that cannot be generated by the returned generator.

Out of the box, RapidCheck has support for generating arbitrary values of the following types:

  • All primitive built-in types
  • std::array<T, N>
  • std::vector<T>
  • std::deque<T>
  • std::forward_list<T>
  • std::list<T>
  • std::set<T>
  • std::map<K, V>
  • std::multiset<T>
  • std::multimap<K, V>
  • std::unordered_set<T>
  • std::unordered_map<K, V>
  • std::unordered_multiset<T>
  • std::unordered_multimap<K, V>
  • std::basic_string<T>
  • std::tuple<Ts...>
  • std::pair<T1, T2>
  • std::chrono::duration<Rep, Period>
  • std::chrono::time_point<Clock, Duration>
  • rc::Maybe<T>

The caveat is, of course, that for template types, RapidCheck must know how to generate the template arguments.

However, it is simple to add support for your own types. To do this, add a specialization of the Arbitrary template struct in the namespace rc. The specialization should have a static member function named arbitrary that returns an appropriate generator. Given the following type:

struct Person {
  std::string firstName;
  std::string lastName;
  int age;
};

We can add arbitrary support for this type by making the following visible in the file that requests the arbitrary generator:

// NOTE: Must be in rc namespace!
namespace rc {

template<>
struct Arbitrary<Person> {
  static Gen<Person> arbitrary() {
    return gen::build<Person>(
        gen::set(&Person::firstName),
        gen::set(&Person::lastName),
        gen::set(&Person::age, gen::inRange(0, 100)));
  }
};

} // namespace rc

With this added, RapidCheck not only knows how to generate Person but also std::vector<Person> and std::pair<std::string, Person>, among other types.

Size

Generators in RapidCheck have an implicit size parameter that controls the size of the generated test data. Not all generators honor this parameter but most do where applicable. For example, when generating std::vector<T>, the size parameter controls the maximum length of the generated vector as well as the size that is passed to the generator that generates the elements of the vector. When generating primitive integral types, the size controls the maximum values that can be generated.

When RapidCheck runs the test cases for a given property, it starts with a zero size and increases it up to the maximum configured limit for the final test. There are several advantages to this approach:

  • Smaller data is (usually) cheaper data. If we can find bugs with small sizes, we will be able to find them more quickly. Why bring out the big guns before you've tried something simpler?
  • With smaller data, the selection of values is (usually) smaller. There are fewer values between -1 and 1 than there are between -1000 and 1000 and there are fewer possible std::vector<int> of length 2 than there are of length 100. This means that the chance of collisions and duplicates is higher at smaller sizes, something that may shake out particular categories of bugs.
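The collision argument can be made concrete with a toy model that uses only the standard library. It is not RapidCheck code and says nothing about how RapidCheck distributes values. It only assumes that "size" bounds the range of integers that can be drawn. The function name distinctValues and its parameters are invented for this sketch.

```cpp
#include <cassert>
#include <cstddef>
#include <random>
#include <set>

// Toy model, not RapidCheck code: draw `samples` ints uniformly from
// [-size, size] and count how many distinct values appear.
std::size_t distinctValues(int size, int samples, unsigned seed) {
  assert(size >= 0 && samples >= 0);
  std::mt19937 rng(seed);
  std::uniform_int_distribution<int> dist(-size, size);
  std::set<int> seen;
  for (int n = 0; n < samples; ++n) {
    seen.insert(dist(rng));
  }
  return seen.size();
}
```

With size 1 only three values exist, so 100 draws must repeat values many times over. With size 1000 there are 2001 candidates, so the same 100 draws are mostly distinct.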

In some cases, you may need to modify the size for performance reasons or to shift the distribution of generated values. For example, in the final test case, the number of elements in a generated std::vector might average 50 elements, which means that a std::vector<std::vector<T>> may contain around 50 x 50 = 2,500 elements in total. If T is also expensive to generate, this may lead to very slow properties. You can modify the size of a generator using the rc::gen::resize and rc::gen::scale combinators. In addition, you can use the rc::gen::withSize combinator to have even more control.

Naming

When printing a counterexample, RapidCheck will by default print the type of each value:

std::vector<int>:
[1, 3, 5, 10, 2]

int:
1

int:
4

While it is possible to determine which value in the counterexample corresponds to which value in the code by looking at the position (values are listed in the order they were picked), it can certainly be confusing.

RapidCheck allows you to attach a name to your generators using the as method:

auto elements = *gen::arbitrary<std::vector<int>>().as("elements");

const auto indexIntoElements =
    gen::inRange<std::size_t>(0, elements.size()).as("index into elements");
const auto a = *indexIntoElements;
const auto b = *indexIntoElements;

as returns a generator with the given name that is otherwise identical to the one it was called on. When a counterexample is printed, the type will be replaced with the name of the generator instead:

elements:
[1, 3, 5, 10, 2]

index into elements:
1

index into elements:
4

The name of a generator can be retrieved using the .name() method.