Structured format for grader replies #8

Open
Armael opened this issue Jun 13, 2019 · 5 comments
Labels: enhancement (New feature or request)

@Armael
Collaborator

Armael commented Jun 13, 2019

We should decide on a format for replies from a grader to the frontend that is a bit more structured than the current pair {0,1} × string.
In particular it should have support for partial successes (typically, when someone proved some of the lemmas and left some unfinished, we want to display "Partial success 3/5" instead of just "Error"). As far as I understand, this is currently implemented in the Isabelle case but in an ad-hoc way.

Here's a first proposal. The grader output would be a JSON dictionary of the following format:

{ "checks" : [
    { "name" : "lemma1", "result" : "ok" },
    { "name" : "lemma2", "result" : "ok_with_axioms" },
    { "name" : "lemma3", "result" : "error" },
    ...
  ],
  "log" : "<grader log here (including possible error messages)>"
}

The checks field should list the checks run by the grader (typically, there could be one check per lemma; but if a grader backend does not support fine-grained checking, it would simply return a list with a single check element for the whole document). For each check the result is either "ok" (everything went fine), "ok_with_axioms" (the check passes but the proof uses unexpected axioms / has been admitted), or "error" (typechecking error or whatnot).

To keep things simple, all error reporting and grader messages go in the free-form "log" field, which the frontend should simply display to the user.

Regarding partial successes and allow_sorry, this format has the nice property that allow_sorry does not need to be passed to the grader! The frontend can decide in a prover-agnostic way how to interpret the "ok_with_axioms" results: failure if allow_sorry is false, and, I imagine, partial success otherwise.
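
As a concrete illustration, a minimal sketch of that prover-agnostic mapping (Python, purely hypothetical, not existing frontend code):

# Hypothetical frontend helper: decide whether a single check counts as passed.
# The grader never sees allow_sorry; only this frontend-side mapping uses it.
def check_passes(result: str, allow_sorry: bool) -> bool:
    if result == "ok":
        return True
    if result == "ok_with_axioms":
        return allow_sorry  # pass only when sorry / extra axioms are tolerated
    return False  # "error"

Counting the checks for which check_passes returns True would then directly yield the "Partial success 3/5"-style display mentioned above.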

What do you think?

@Armael Armael added the enhancement New feature or request label Jun 13, 2019
@maxhaslbeck
Collaborator

I agree with the design of the "checks" field, and I also like the allow_sorry-agnostic idea.

We need to model failing submissions! Examples include:

  • ill-formatted submissions -> these lead to an error during proof checking,
  • a proof tactic takes too long or loops -> [in order to avoid DoS attacks] we need some timeout on the proof-checking phase,
  • internal error (something strange happened, conflicting versions?, ...),
  • others ...?

Maybe we need an extra field (e.g. "result") which is either "passed" or "failed X", with X being "Ill-formatted (line number)", "Timeout", or "Internal-Error".

Then we need a set of informative error messages that can be displayed to the users/contestants.
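
For illustration, a rough sketch (Python; purely hypothetical, run_prover and the timeout value are made-up placeholders, not existing backend code) of how these failure modes could be mapped onto such a "result" field:

import subprocess, traceback

CHECK_TIMEOUT = 60  # seconds; guards against looping tactics (arbitrary value)

def run_prover(path):
    # Placeholder: the real backend would invoke the actual prover here and
    # parse its output into the per-lemma checks.
    out = subprocess.run(["true", path], capture_output=True, text=True,
                         timeout=CHECK_TIMEOUT)
    return [{"name": "lemma1", "result": "ok"}], out.stdout

def grade(path):
    try:
        checks, log = run_prover(path)
        return {"result": "passed", "checks": checks, "log": log}
    except subprocess.TimeoutExpired:
        return {"result": "failed Timeout", "checks": [],
                "log": "proof checking exceeded %d seconds" % CHECK_TIMEOUT}
    except SyntaxError as e:  # stands in for an ill-formatted submission
        return {"result": "failed Ill-formatted (line %s)" % e.lineno,
                "checks": [], "log": str(e)}
    except Exception:
        return {"result": "failed Internal-Error", "checks": [],
                "log": traceback.format_exc()}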

@Armael
Collaborator Author

Armael commented Jun 26, 2019

Indeed we should add something to account for submissions that fail to even parse/typecheck/etc. As a first step, I think the simplest option is to have a result field as you suggest, and leave all the error reporting messages in the log.
The JSON format would then be:

{ "result" : "ok"/"error", 
  "checks" : [
    { "name" : "lemma1", "result" : "ok" },
    { "name" : "lemma2", "result" : "ok_with_axioms" },
    { "name" : "lemma3", "result" : "error" },
    ...
  ],
  "log" : "<grader log here (including possible error messages)>"
}

I think this would already be good to have, and then we can think about how to structure the error messages a bit more. But just having them displayed to the user should do most of the job.

@maxhaslbeck
Collaborator

For the error reporting I propose:

{ "result" : "ok"/"error", 
  "checks" : [  ... ],
  "log" : "<grader log here (including possible error messages)>",
  "error" : [  { "where": "Check.thy" , "what": "syntax error" } , ... ],
}

@Armael
Collaborator Author

Armael commented Jul 3, 2019

Hmm, with respect to the Coq backend, it would be easier to have each check include a message; the global log would then contain the remaining messages for the submission as a whole (e.g. it couldn't compile, or there was a parse error, in which case no checks can be run).

{ "result" : "ok"/"error", 
  "checks" : [
    { "name" : "lemma1", "result" : "ok", "message" : "OK" },
    { "name" : "lemma2", "result" : "ok_with_axioms", "message" : "Uses forbidden axiom foobar" },
    { "name" : "lemma3", "result" : "error", "message" : "type mismatch ..." },
    ...
  ],
  "log" : "<grader log here (error messages here are for the global submission, eg parse error)>"
}

Would that work for the Isabelle backend?
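
For illustration, a backend could assemble such a reply roughly as follows (a Python sketch under the assumption that per-lemma results are available; check_lemma is a made-up placeholder, not the actual Coq or Isabelle code):

import json

def check_lemma(name):
    # Placeholder: run the prover on this lemma and return (status, message).
    return "ok", "OK"

def build_reply(lemma_names, global_log, submission_ok=True):
    if not submission_ok:
        # e.g. the whole file failed to parse: no checks can be run,
        # and the parse error lives in the global log
        return json.dumps({"result": "error", "checks": [], "log": global_log})
    checks = []
    for name in lemma_names:
        status, message = check_lemma(name)
        checks.append({"name": name, "result": status, "message": message})
    return json.dumps({"result": "ok", "checks": checks, "log": global_log})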

@maxhaslbeck
Collaborator

maxhaslbeck commented Jul 3, 2019

Armael and I discussed the format and came to the following conclusion.
The idea is to do as much of the logic as possible in the backend and present only polished lists of pairs to the frontend, which can then have a uniform display routine.

{ "submission_is_valid" : True/False, 
  "checks" : [
    { "name" : "lemma1", "result" : "ok" },
    { "name" : "lemma2", "result" : "ok_with_axioms" },
    { "name" : "lemma3", "result" : "error" },
    ...
  ],
    "messages" : [  { "where": "Check.thy" , "what": "syntax error" } , ... ],
  "log" : "anything"
}

Possible values for "submission_is_valid": true/false

  • true signals that checking the submission went well
  • false signals that checking the submission failed; in that case there should be some error messages in the "messages" list

Possible values for "result" in "checks": {"ok", "ok_with_axioms", "error"}

  • "ok" says, the lemma was proven without an oracle
  • "ok_with_axioms" says, that the lemma was proven but used an oracle
  • "error" says, there was an error when checking for that specific "check"

The idea is that the grading process should be independent of whether or not we allow sorry for a task. Only the frontend then decides (with the allow_sorry flag) how to compute (partial) points; that way, one can also change the flag later.
-> issue: can we then stop passing the allow_sorry flag to the backends?

"log" can contain anything, maybe we display the full string as a toggle at the end of the submission display
