Support interactive evaluator (#137)
Resolve #118
fushar committed May 24, 2017
1 parent 9d24a32 commit 5f2c281
Showing 37 changed files with 1,130 additions and 114 deletions.
12 changes: 12 additions & 0 deletions CMakeLists.txt
@@ -52,9 +52,14 @@ set(INCLUDE
include/tcframe/evaluator/EvaluationOptions.hpp
include/tcframe/evaluator/EvaluationResult.hpp
include/tcframe/evaluator/Evaluator.hpp
include/tcframe/evaluator/EvaluatorConfig.hpp
include/tcframe/evaluator/EvaluatorHelperRegistry.hpp
include/tcframe/evaluator/EvaluatorRegistry.hpp
include/tcframe/evaluator/InteractiveEvaluator.hpp
include/tcframe/evaluator/GenerationResult.hpp
include/tcframe/evaluator/communicator.hpp
include/tcframe/evaluator/communicator/CommunicationResult.hpp
include/tcframe/evaluator/communicator/Communicator.hpp
include/tcframe/evaluator/scorer.hpp
include/tcframe/evaluator/scorer/CustomScorer.hpp
include/tcframe/evaluator/scorer/DiffScorer.hpp
@@ -154,8 +159,11 @@ set(TEST_UNIT
test/unit/tcframe/aggregator/MockAggregatorRegistry.hpp
test/unit/tcframe/aggregator/SumAggregatorTests.cpp
test/unit/tcframe/evaluator/BatchEvaluatorTests.cpp
test/unit/tcframe/evaluator/InteractiveEvaluatorTests.cpp
test/unit/tcframe/evaluator/MockEvaluator.hpp
test/unit/tcframe/evaluator/MockEvaluatorRegistry.hpp
test/unit/tcframe/evaluator/communicator/CommunicatorTests.cpp
test/unit/tcframe/evaluator/communicator/MockCommunicator.hpp
test/unit/tcframe/evaluator/scorer/CustomScorerTests.cpp
test/unit/tcframe/evaluator/scorer/MockScorer.hpp
test/unit/tcframe/generator/GeneratorLoggerTests.cpp
@@ -176,6 +184,9 @@ set(TEST_UNIT
test/unit/tcframe/logger/SimpleLoggerEngineTests.cpp
test/unit/tcframe/os/MockOperatingSystem.hpp
test/unit/tcframe/runner/ArgsParserTests.cpp
test/unit/tcframe/runner/BatchRunnerTests.cpp
test/unit/tcframe/runner/BaseRunnerTests.hpp
test/unit/tcframe/runner/InteractiveRunnerTests.cpp
test/unit/tcframe/runner/MockRunnerLogger.hpp
test/unit/tcframe/runner/MockRunnerLoggerFactory.hpp
test/unit/tcframe/runner/RunnerTests.cpp
@@ -229,6 +240,7 @@ target_link_libraries(test_unit
)

set(TEST_INTEGRATION
test/integration/tcframe/evaluator/communicator/CommunicatorIntegrationTests.cpp
test/integration/tcframe/evaluator/scorer/CustomScorerIntegrationTests.cpp
test/integration/tcframe/evaluator/scorer/DiffScorerIntegrationTests.cpp
test/integration/tcframe/os/OperatingSystemIntegrationTests.cpp
12 changes: 12 additions & 0 deletions docs/api-ref/api-ref.rst
@@ -242,6 +242,14 @@ Problem styles

Defines the options to enable for problem styles. The following methods are exposed:

.. cpp:function:: BatchEvaluator()

Declares that the problem uses the batch evaluator.

.. cpp:function:: InteractiveEvaluator()

Declares that the problem uses the interactive evaluator.

.. cpp:function:: CustomScorer()

Declares that the problem needs a custom scorer.
@@ -606,6 +614,10 @@ Local grading
The custom scorer command to use. Default: ``./scorer``.

.. py:function:: --communicator=<command>
The communicator command to use. Default: ``./communicator``.

.. py:function:: --time-limit=<time-limit-in-seconds>
Overrides the time limit specified by ``TimeLimit()`` in grading config.
105 changes: 91 additions & 14 deletions docs/topic-guides/styles.rst
@@ -3,30 +3,54 @@
Problem Styles
==============

Currently, **tcframe** supports only **batch**-style problems, where the solution is expected to read the test cases from the standard input and write the answers to the standard output. There are some configurable options to this behavior, which can be specified in the ``StyleConfig()`` method of the problem spec class.
The ``StyleConfig()`` method of the problem spec class configures aspects of the problem's nature that are independent of the test spec.

.. sourcecode:: cpp

void StyleConfig() {
// option specifications
}

The available options are as follows.
Evaluator
---------

An evaluator specifies how to run a solution against a test case. Two types of evaluators are supported:

Batch
*****

Enabled by calling ``BatchEvaluator()``. This is the default evaluator if none is specified. The solution must read the test cases from the standard input and print the output to the standard output.

The following options are further configurable:

- ``CustomScorer()``

By default, the output will be checked with the default ``diff`` program, unless a custom **scorer** is specified. See the **Helper programs** section on how to write a scorer.

- ``NoOutput()``

If the problem uses a custom scorer that does not depend on the output of any test case, this option can be enabled. When enabled, ``.out`` files are not generated, and specifying ``Output()`` in sample test cases is not allowed.

Interactive
***********

Enabled by calling ``InteractiveEvaluator()``. The solution will participate in a 2-way communication with a special program called **communicator**, which will ultimately print the verdict of the solution. See the **Helper programs** section on how to write a communicator.

Custom scorer
-------------
----

Enabled by calling ``CustomScorer()`` inside ``StyleConfig()``.
Helper programs
---------------

A scorer is a program which decides the verdict of a test case. By default, the scorer is the simple ``diff`` program. If custom scorer is enabled, then you must provide the custom scorer program.
Scorer
******

The custom scorer will receive the following arguments:
A scorer is a program which decides the verdict of a test case. It will receive the following arguments:

- argv[1]: test case input filename
- argv[2]: test case output filename
- argv[3]: contestant's produced output filename

The custom scorer must print the test case verdict to the standard output, which is a line consisting of either:
It must print the test case verdict to the standard output, which is a line consisting of either:

- ``AC``: indicates that the contestant's output is correct. It will be given 100 / (number of test cases in its subtask) points.
- ``WA``: indicates that the contestant's output is incorrect. It will be given 0 points.
@@ -38,15 +62,15 @@ The custom scorer must print the test case verdict to the standard output, which
9


The custom scorer must be compiled prior test cases generation/local grading, and the execution command should be passed to the runner program as the ``--scorer`` option. For example:
The scorer must be compiled prior to test cases generation/local grading, and the execution command should be passed to the runner program as the ``--scorer`` option. For example:

::

./runner grade --solution=./solution_alt --scorer=./my_custom_scorer

The default scorer command is ``./scorer`` if not specified.

Here is an example custom scorer which gives AC if the contestant's output differs not more than 1e-9 with the official output.
Here is an example scorer which gives AC if the contestant's output differs by no more than 1e-9 from the official output.

.. sourcecode:: cpp

@@ -83,9 +107,62 @@ Here is an example custom scorer which gives AC if the contestant's output diffe
}
}

No output
---------
Communicator
************

A communicator is a program which performs 2-way communication with the solution program, and then decides the verdict. It receives a single argument:

- argv[1]: test case input filename

During the communication, the communicator reads the solution program's output from its standard input, and gives input to the solution program by writing to its standard output. Make sure the communicator flushes the standard output after every write. Ultimately, the communicator must print the test case verdict to the standard error, in the same format as a scorer's verdict, as described in the previous section.

Enabled by calling ``NoOutput()`` inside ``StyleConfig()``.
The communicator must be compiled prior to local grading, and the execution command should be passed to the runner program as the ``--communicator`` option. For example:

Sometimes, a problem does not need test case output files (``.out``) because the scoring is done by a custom score alone. If this option is enabled, then ``.out`` files will not be generated, and it is not allowed to specify ``Output()`` in sample test cases.
::

./runner grade --solution=./solution_alt --communicator=./my_communicator

The default communicator command is ``./communicator`` if not specified.

Here is an example communicator program in a typical binary search problem.

.. sourcecode:: cpp

#include <bits/stdc++.h>
using namespace std;

int ac() {
cerr << "AC" << endl;
return 0;
}

int wa() {
cerr << "WA" << endl;
return 0;
}

int main(int argc, char* argv[]) {
ifstream tc_in(argv[1]);

int N;
tc_in >> N;

int guesses_count = 0;

while (true) {
int guess;
cin >> guess;

guesses_count++;

if (guesses_count > 10) {
return wa();
} else if (guess < N) {
cout << "TOO_SMALL" << endl;
} else if (guess > N) {
cout << "TOO_LARGE" << endl;
} else {
return ac();
}
}
}
3 changes: 3 additions & 0 deletions include/tcframe/evaluator.hpp
@@ -4,7 +4,10 @@
#include "tcframe/evaluator/EvaluationOptions.hpp"
#include "tcframe/evaluator/EvaluationResult.hpp"
#include "tcframe/evaluator/Evaluator.hpp"
#include "tcframe/evaluator/EvaluatorConfig.hpp"
#include "tcframe/evaluator/EvaluatorHelperRegistry.hpp"
#include "tcframe/evaluator/EvaluatorRegistry.hpp"
#include "tcframe/evaluator/GenerationResult.hpp"
#include "tcframe/evaluator/InteractiveEvaluator.hpp"
#include "tcframe/evaluator/communicator.hpp"
#include "tcframe/evaluator/scorer.hpp"
11 changes: 9 additions & 2 deletions include/tcframe/evaluator/Evaluator.hpp
@@ -1,12 +1,14 @@
#pragma once

#include <stdexcept>
#include <string>

#include "EvaluationOptions.hpp"
#include "EvaluationResult.hpp"
#include "GenerationResult.hpp"
#include "scorer.hpp"

using std::logic_error;
using std::string;

namespace tcframe {
@@ -25,9 +27,14 @@ class Evaluator {
virtual GenerationResult generate(
const string& inputFilename,
const string& outputFilename,
const EvaluationOptions& options) = 0;
const EvaluationOptions& options) {

throw logic_error("unsupported");
}

virtual ScoringResult score(const string& inputFilename, const string& outputFilename) = 0;
virtual ScoringResult score(const string& inputFilename, const string& outputFilename) {
throw logic_error("unsupported");
}
};

}
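
The hunk above turns the pure virtual ``generate()`` and ``score()`` into defaults that throw ``logic_error("unsupported")``, presumably so that an evaluator can leave out operations it does not support. A minimal standalone sketch of that pattern follows; the types here are simplified stand-ins invented for illustration, not tcframe's real classes.

```cpp
#include <stdexcept>
#include <string>

// Simplified stand-in for the real result type (assumption of this sketch).
struct ScoringResult {
    std::string verdict;
};

class Evaluator {
public:
    virtual ~Evaluator() = default;

    // Default implementation: subclasses that cannot score simply inherit this.
    virtual ScoringResult score(const std::string&, const std::string&) {
        throw std::logic_error("unsupported");
    }
};

// Hypothetical evaluator that overrides score().
class BatchLikeEvaluator : public Evaluator {
public:
    ScoringResult score(const std::string&, const std::string&) override {
        return ScoringResult{"AC"};
    }
};

// Hypothetical evaluator that does not override score().
class InteractiveLikeEvaluator : public Evaluator {};

// Helper for demonstration only: probes whether score() is supported.
bool supportsScoring(Evaluator& evaluator) {
    try {
        evaluator.score("in.txt", "out.txt");
        return true;
    } catch (const std::logic_error&) {
        return false;
    }
}
```

The trade-off is that a missing override is no longer caught at compile time; calling an unsupported operation only fails at run time.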
45 changes: 45 additions & 0 deletions include/tcframe/evaluator/EvaluatorConfig.hpp
@@ -0,0 +1,45 @@
#pragma once

#include <utility>

using std::move;

namespace tcframe {

enum class TestCaseOutputType {
OPTIONAL,
NOT_REQUIRED
};

class EvaluatorConfig {
friend class EvaluatorConfigBuilder;

private:
TestCaseOutputType testCaseOutputType_;

public:
TestCaseOutputType testCaseOutputType() const {
return testCaseOutputType_;
}

bool operator==(const EvaluatorConfig& o) const {
return tie(testCaseOutputType_) == tie(o.testCaseOutputType_);
}
};

class EvaluatorConfigBuilder {
private:
EvaluatorConfig subject_;

public:
EvaluatorConfigBuilder& setTestCaseOutputType(TestCaseOutputType testCaseOutputType) {
subject_.testCaseOutputType_ = testCaseOutputType;
return *this;
}

EvaluatorConfig build() {
return move(subject_);
}
};

}
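
As a usage illustration, the builder above can be exercised as in the standalone sketch below. It re-declares the two classes so that it compiles on its own, includes ``<tuple>`` explicitly because that is where ``std::tie`` is declared, and adds a default member value that the original does not have. The free functions ``batchConfig()`` and ``interactiveConfig()`` are hypothetical names mirroring the ``getBatchConfig()`` and ``getInteractiveConfig()`` methods added to ``EvaluatorRegistry`` in this commit.

```cpp
#include <tuple>
#include <utility>

// Standalone copy of the pattern for illustration; not tcframe's header.
enum class TestCaseOutputType {
    OPTIONAL,
    NOT_REQUIRED
};

class EvaluatorConfig {
    friend class EvaluatorConfigBuilder;

private:
    // The default value is an assumption of this sketch.
    TestCaseOutputType testCaseOutputType_ = TestCaseOutputType::OPTIONAL;

public:
    TestCaseOutputType testCaseOutputType() const {
        return testCaseOutputType_;
    }

    bool operator==(const EvaluatorConfig& o) const {
        return std::tie(testCaseOutputType_) == std::tie(o.testCaseOutputType_);
    }
};

class EvaluatorConfigBuilder {
private:
    EvaluatorConfig subject_;

public:
    EvaluatorConfigBuilder& setTestCaseOutputType(TestCaseOutputType type) {
        subject_.testCaseOutputType_ = type;
        return *this;
    }

    EvaluatorConfig build() {
        return std::move(subject_);
    }
};

// Batch problems may or may not have test case output files.
EvaluatorConfig batchConfig() {
    return EvaluatorConfigBuilder()
            .setTestCaseOutputType(TestCaseOutputType::OPTIONAL)
            .build();
}

// Interactive problems do not require test case output files.
EvaluatorConfig interactiveConfig() {
    return EvaluatorConfigBuilder()
            .setTestCaseOutputType(TestCaseOutputType::NOT_REQUIRED)
            .build();
}
```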
5 changes: 5 additions & 0 deletions include/tcframe/evaluator/EvaluatorHelperRegistry.hpp
@@ -4,6 +4,7 @@

#include "tcframe/os.hpp"
#include "tcframe/util.hpp"
#include "communicator.hpp"
#include "scorer.hpp"

using std::string;
@@ -21,6 +22,10 @@ class EvaluatorHelperRegistry {
return new DiffScorer(os);
}
}

virtual Communicator* getCommunicator(OperatingSystem* os, const string& communicatorCommand) {
return new Communicator(os, new VerdictCreator(), communicatorCommand);
}
};

}
41 changes: 39 additions & 2 deletions include/tcframe/evaluator/EvaluatorRegistry.hpp
@@ -4,9 +4,13 @@
#include <string>

#include "BatchEvaluator.hpp"
#include "EvaluatorConfig.hpp"
#include "EvaluatorHelperRegistry.hpp"
#include "InteractiveEvaluator.hpp"
#include "communicator.hpp"
#include "scorer.hpp"
#include "tcframe/os.hpp"
#include "tcframe/spec.hpp"
#include "tcframe/util.hpp"
#include "tcframe/verdict.hpp"

@@ -25,8 +29,22 @@ class EvaluatorRegistry {
EvaluatorRegistry(EvaluatorHelperRegistry* helperRegistry)
: helperRegistry_(helperRegistry) {}

virtual Evaluator* get(OperatingSystem* os, const map<string, string>& helperCommands) {
return getBatch(os, helperCommands);
virtual Evaluator* get(EvaluationStyle style, OperatingSystem* os, const map<string, string>& helperCommands) {
switch (style) {
case EvaluationStyle::BATCH:
return getBatch(os, helperCommands);
case EvaluationStyle::INTERACTIVE:
return getInteractive(os, helperCommands);
}
}

virtual EvaluatorConfig getConfig(EvaluationStyle style) {
switch (style) {
case EvaluationStyle::BATCH:
return getBatchConfig();
case EvaluationStyle::INTERACTIVE:
return getInteractiveConfig();
}
}

private:
@@ -36,6 +54,25 @@ class EvaluatorRegistry {
return new BatchEvaluator(os, new VerdictCreator(), scorer);
}

EvaluatorConfig getBatchConfig() {
return EvaluatorConfigBuilder()
.setTestCaseOutputType(TestCaseOutputType::OPTIONAL)
.build();
}

Evaluator* getInteractive(OperatingSystem* os, const map<string, string>& helperCommands) {
string communicatorCommand = getHelperCommand(helperCommands, "communicator").value();
Communicator* communicator = helperRegistry_->getCommunicator(os, communicatorCommand);

return new InteractiveEvaluator(communicator);
}

EvaluatorConfig getInteractiveConfig() {
return EvaluatorConfigBuilder()
.setTestCaseOutputType(TestCaseOutputType::NOT_REQUIRED)
.build();
}

static optional<string> getHelperCommand(const map<string, string>& helperCommands, const string& key) {
if (helperCommands.count(key)) {
return optional<string>(helperCommands.at(key));