Skip to content

[EN] 5. Package with input generator

Alicja Kluczek edited this page May 25, 2021 · 1 revision

Problem

In all previous guides, the input structure was very simple. Preparation of input files could be done by manually entering values in the editor, or by writing an extremely primitive script in any language. Unfortunately, this is not always possible. In the vast majority of tasks, in the input we want to place arrays, graphs, matrices and other advanced structures. You can solve this problem by including in the package a program that generates the necessary input files.

Task

We have given a tree, i.e. a connected, undirected and acyclic graph. It consists of n vertices and n - 1 edges. The task will be to determine the number of leaves. A leaf is any vertex (except the root) of degree 1 (only one edge comes out of it). The task will be titled Leaves and the abbreviation will be lea.

Tests

Test preparation is undoubtedly the most important (and often the most difficult) part of preparing the task package. The tests should be created so as to differentiate individual classes of (correct) solutions. When writing tests, you should be guided by the following general guidelines:

  • Each time the generator is started, the tests created must be identical.
  • How many points the solution gains should reflect effort (conceptual and programming) necessary to develop it. In case of doubt, the points should be evenly distributed between more and more effective solutions.
  • How many tests a given solution should pass depend only on the type of solution, not on the programming language in which it is implemented.
  • As a rule, all correct solutions, also those less effective, should earn some partial points.
  • Heuristics or solutions based on incorrect assumptions should not receive points (or receive few).
  • In particular, solutions that always give this the same answer (e.g. 0, NO, or answer for the sample test) should receive 0 points (you can use test grouping for this).
  • The kit should contain extremely small and extremely large tests, even if they are trivially simple.
  • The program receives points for a given group if it passes all tests included. You should try to minimize the number of tests in the group as reasonably as possible.

Input generator

In the file leaingen.cpp we will have to put the code that will generate input tests. Frequently, task authors have their own template of such a file, which is specially prepared work on tests for their preferences. For the purposes of this guide, we'll show you a very simple input generator written in C++. As in the previous guide, we will use the oi.h library, which provides us with many tools for pseudo-randomness. To begin with, we'll create a function that returns a number from the given range.

unsigned randomUIntInRange(unsigned a, unsigned b) {
    unsigned res = RG.randUInt();
    return res % (b - a + 1) + a;
}

Now we will deal with the structure of our test. It is worth considering what data the test must have in order to be able to unambiguously print it. In our case it will be:

struct Test {
    int n;
    vector<pair<int, int>> edges;

Indeed, the number of vertices and the edges vector are sufficient to print the test. Now let's prepare a function that will be used to print a given test.

void print(int no, char *letter) {
    assert('a' <= *letter && *letter <= 'z');
    string name = id + to_string(no) + *letter + ".in";
    cerr << "Printing to file: " << name << "\n";
    ofstream outFile(name);
    outFile << n << "\n";
    for (const auto &edge : edges) {
        outFile << edge.first << " " << edge.second << "\n";
    }
    outFile.close();
    (*letter)++;
}

We have a function that takes the group number and the first letter (often useful if we want to generate some of the first tests from the group manually). Now if we create a function that returns the type Test, we can call the print method and a file containing the test will be created automatically. We will also create a shuffle function that shuffles the data in the vector (remember that we do not guarantee a specific order). It is worth noting that we use the function provided by the oi.h library for shuffling.

RG.randomShuffle(test->edges.begin(), test->edges.end());

We can now create several different types of tests. A more detailed description can be found in the generator file itself. The code for one group in main looks like this.

{ // Test group 1 - n <= 100
    int group = 1;
    char c = 'a';
    RG.setSeed(1);
    random(100).print(group, &c);
    random(92).print(group, &c);
    path(100, false).print(group, &c);
    star(100, true).print(group, &c);
    binaryTree(100).print(group, &c);
}

It is worth paying attention to the following line.

RG.setSeed(1);

Setting the seed for the generator guarantees that the tests will be deterministic. In this task, we change the seed in every group, it is worth noting, however, that a common practice is to set new seed for each test (then the change of a particular test does not affect the next one belonging to the same group). You can read more about grain and pseudo-randomness here. Now after running make ingen all tests should appear in the gue folder.

Input verification

When working on tests, we will often change the generator. Often, we will arrange one specific function to check a certain class of solutions. We can then run tests and check what are the results obtained by specific model, incorrect or slow solutions. However, how can we ensure that these tests are correct? The input verification program comes in handy. It will work on a very similar principle as the output verification program presented in the previous guide. With the help of the oi.h library we will read the input file character by character and check if it meets the requirements, which we guarantee in the task. In particular, remember to inform about the fulfilment of specific subtasks. Finished input verificator can be found in gueinwer.cpp. We can start the verifier by entering make inwer in the terminal. Writing an input verifier is a very important habit and it should be found in every package regardless of level of input complexity.

Cleaning

To clean all generated tests (except sample tests) we use make clean-gen.