Structured format for grader replies #8

Open
Armael opened this issue Jun 13, 2019 · 5 comments
Labels: enhancement (New feature or request)

@Armael
Collaborator

Armael commented Jun 13, 2019

We should decide on a format for replies from a grader to the frontend that is a bit more structured than the current pair {0,1} × string.
In particular it should have support for partial successes (typically, when someone proved some of the lemmas and left some unfinished, we want to display "Partial success 3/5" instead of just "Error"). As far as I understand, this is currently implemented in the Isabelle case but in an ad-hoc way.

Here's a first proposal. The grader output would be a JSON dictionary of the following format:

{ "checks" : [
    { "name" : "lemma1", "result" : "ok" },
    { "name" : "lemma2", "result" : "ok_with_axioms" },
    { "name" : "lemma3", "result" : "error" },
    ...
  ],
  "log" : "<grader log here (including possible error messages)>"
}

The checks field should list the checks run by the grader (typically, there could be one check per lemma; but if a grader backend does not support fine-grained checking, it would simply return a list with a single check element for the whole document). For each check the result is either "ok" (everything went fine), "ok_with_axioms" (the check passes but the proof uses unexpected axioms / has been admitted), or "error" (typechecking error or whatnot).

To keep things simple, all error reporting and grader messages go in the free-form "log" field, which the frontend should simply display to the user.

Regarding partial successes and allow_sorry, this format has the nice property that allow_sorry does not need to be passed to the grader! The frontend can decide in a prover-agnostic way how to interpret the "ok_with_axioms" results: failure if allow_sorry is false, and, I imagine, partial success otherwise.
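
As a concrete illustration, a minimal sketch of that prover-agnostic mapping (Python, purely hypothetical, not existing frontend code):

# Hypothetical frontend helper: decide whether a single check counts as passed.
# The grader never sees allow_sorry; only this frontend-side mapping uses it.
def check_passes(result: str, allow_sorry: bool) -> bool:
    if result == "ok":
        return True
    if result == "ok_with_axioms":
        return allow_sorry  # pass only when sorry / extra axioms are tolerated
    return False  # "error"

Counting the checks for which check_passes returns True would then directly yield the "Partial success 3/5"-style display mentioned above.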

What do you think?

@Armael Armael added the enhancement New feature or request label Jun 13, 2019
@maxhaslbeck
Collaborator

I agree with the design of the "checks" field, and I also like the allow_sorry-agnostic idea.

We need to model failing submissions! Examples include:

  • ill-formatted submissions -> these lead to an error during proof checking,
  • a proof tactic takes too long or loops -> [in order to avoid DoS attacks] we need some timeout on the proof-checking phase,
  • internal error (something strange happened, conflicting versions?, ...),
  • others ...?

Maybe we need an extra field (e.g. "result") which is either "passed" or "failed X", with X being "Ill-formatted (line number)", "Timeout", or "Internal-Error".

Then we need a set of informative error messages that can be displayed to the users/contestants.
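
For illustration, a rough sketch (Python; purely hypothetical, run_prover and the timeout value are made-up placeholders, not existing backend code) of how these failure modes could be mapped onto such a "result" field:

import subprocess, traceback

CHECK_TIMEOUT = 60  # seconds; guards against looping tactics (arbitrary value)

def run_prover(path):
    # Placeholder: the real backend would invoke the actual prover here and
    # parse its output into the per-lemma checks.
    out = subprocess.run(["true", path], capture_output=True, text=True,
                         timeout=CHECK_TIMEOUT)
    return [{"name": "lemma1", "result": "ok"}], out.stdout

def grade(path):
    try:
        checks, log = run_prover(path)
        return {"result": "passed", "checks": checks, "log": log}
    except subprocess.TimeoutExpired:
        return {"result": "failed Timeout", "checks": [],
                "log": "proof checking exceeded %d seconds" % CHECK_TIMEOUT}
    except SyntaxError as e:  # stands in for an ill-formatted submission
        return {"result": "failed Ill-formatted (line %s)" % e.lineno,
                "checks": [], "log": str(e)}
    except Exception:
        return {"result": "failed Internal-Error", "checks": [],
                "log": traceback.format_exc()}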

@Armael
Collaborator Author

Armael commented Jun 26, 2019

Indeed we should add something to account for submissions that fail to even parse/typecheck/etc. As a first step, I think the simplest option is to have a result field as you suggest, and leave all the error reporting messages in the log.
The JSON format would then be:

{ "result" : "ok"/"error", 
  "checks" : [
    { "name" : "lemma1", "result" : "ok" },
    { "name" : "lemma2", "result" : "ok_with_axioms" },
    { "name" : "lemma3", "result" : "error" },
    ...
  ],
  "log" : "<grader log here (including possible error messages)>"
}

I think this would already be good to have, and then we can think about how to structure the error messages a bit more. But just having them displayed to the user should do most of the job.

@maxhaslbeck
Collaborator

For the error reporting I propose:

{ "result" : "ok"/"error", 
  "checks" : [  ... ],
  "log" : "<grader log here (including possible error messages)>",
  "error" : [  { "where": "Check.thy" , "what": "syntax error" } , ... ],
}

@Armael
Collaborator Author

Armael commented Jul 3, 2019

Hmm, with respect to the Coq backend, it would be easier to have each check include a message; the global log would then contain the remaining messages for the submission as a whole (e.g. it couldn't compile, or there was a parse error, in which case no checks can be run).

{ "result" : "ok"/"error", 
  "checks" : [
    { "name" : "lemma1", "result" : "ok", "message" : "OK" },
    { "name" : "lemma2", "result" : "ok_with_axioms", "message" : "Uses forbidden axiom foobar" },
    { "name" : "lemma3", "result" : "error", "message" : "type mismatch ..." },
    ...
  ],
  "log" : "<grader log here (error messages here are for the global submission, eg parse error)>"
}

Would that work for the Isabelle backend?
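
For illustration, a backend could assemble such a reply roughly as follows (a Python sketch under the assumption that per-lemma results are available; check_lemma is a made-up placeholder, not the actual Coq or Isabelle code):

import json

def check_lemma(name):
    # Placeholder: run the prover on this lemma and return (status, message).
    return "ok", "OK"

def build_reply(lemma_names, global_log, submission_ok=True):
    if not submission_ok:
        # e.g. the whole file failed to parse: no checks can be run,
        # and the parse error lives in the global log
        return json.dumps({"result": "error", "checks": [], "log": global_log})
    checks = []
    for name in lemma_names:
        status, message = check_lemma(name)
        checks.append({"name": name, "result": status, "message": message})
    return json.dumps({"result": "ok", "checks": checks, "log": global_log})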

@maxhaslbeck
Collaborator

maxhaslbeck commented Jul 3, 2019

Armael and I discussed the format and came to the following conclusion.
The idea is to do as much of the logic as possible in the backend and present only polished lists of pairs to the frontend, which can then have a uniform display routine.

{ "submission_is_valid" : True/False, 
  "checks" : [
    { "name" : "lemma1", "result" : "ok" },
    { "name" : "lemma2", "result" : "ok_with_axioms" },
    { "name" : "lemma3", "result" : "error" },
    ...
  ],
    "messages" : [  { "where": "Check.thy" , "what": "syntax error" } , ... ],
  "log" : "anything"
}

Possible values for "submission_is_valid": true/false

  • true signals that checking the submission went well
  • false signals that checking the submission failed; in that case there should be some error messages in the "messages" list

Possible values for "result" in "checks": {"ok", "ok_with_axioms", "error"}

  • "ok" says, the lemma was proven without an oracle
  • "ok_with_axioms" says, that the lemma was proven but used an oracle
  • "error" says, there was an error when checking for that specific "check"

The idea is that the grading process should be independent of whether or not we allow sorry for a task. Only the frontend then decides (with the allow_sorry flag) how to compute (partial) points; that way, one can also change the flag later.
-> issue: can we then stop passing the allow_sorry flag to the backends?

"log" can contain anything, maybe we display the full string as a toggle at the end of the submission display
