Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[stm] Error resilience behaves strangely with Stm.join #9204

Open
ejgallego opened this issue Dec 12, 2018 · 18 comments
Open

[stm] Error resilience behaves strangely with Stm.join #9204

ejgallego opened this issue Dec 12, 2018 · 18 comments
Labels
part: STM State Transition Machine, asynchronous proofs, etc.

Comments

@ejgallego
Copy link
Member

ejgallego commented Dec 12, 2018

Opening the following file https://github.com/ejgallego/coq-serapi/blob/b62ae5fc928fa2f098df1e2308d6e1244521d99e/tests/fail/assoc.v
in CoqIDE, then go to the end.

  • The first time, it emits no error [exception], but a message in the buffer;
  • the second time, click the gears, and CoqIDE tries to close the proof, an error "no such goal" is produced,
  • and the third time, click the gears again, the CoqIDE says "All proof terms checked by the kernel", the error persists.

The problem is that Stm.join / finish returns successfully even if the bad proof is there; this seems a bad interaction with the error recovery mode. Disabling it fixes the problem.

Another variation of the bug is with -async-proofs off in CoqIDE, in this case clicking the gear will always report "All proof terms checked by the kernel".

@ejgallego ejgallego added the part: STM State Transition Machine, asynchronous proofs, etc. label Dec 12, 2018
ejgallego added a commit to rocq-archive/coq-serapi that referenced this issue Dec 12, 2018
The semantics are a bit dubious see coq/coq#9204 and #94.

This makes the example provided in #94 fail, we update the test suite.
@gares
Copy link
Member

gares commented Dec 12, 2018

Thanks for the detailed report.

The problem is not in the error resiliency code.

The problem is that CoqIDE (and sertop I imagine) reopens the broken proof after the error is returned by join. The code of join was (wrongly) assuming the current branch was master. If you join the safe environment associated to the state in which you are while fixing the proof.. well, such an env does not even contain the broken proof (it is the one after the proof that does).

@gares
Copy link
Member

gares commented Dec 12, 2018

See #9206

@ejgallego
Copy link
Member Author

ejgallego commented Dec 12, 2018

The problem is that CoqIDE (and sertop I imagine) reopens the broken proof after the error is returned by join.

Note that join does not even return an error in sertop:

(ReadFile assoc.v)
(Exec 34)
(Feedback((doc_id 0)(span_id 34)(route 0)(contents Processed)))
Finish
Join
(Feedback((doc_id 0)(span_id 34)(route 0)(contents Processed)))

so in fact sertop doesn't even know that there is a problem in a subproof. Is that supposed to happen?

@gares
Copy link
Member

gares commented Dec 12, 2018

Hum, Stm.join may fail by raising an exception.
What "type" does it have in setop?

If CoqIDE reopens the proof (via an edit_at) then I believe the exception contains the right metadata.
I don't know what is 34 in your example.

@ejgallego
Copy link
Member Author

34 is the Stateid of the last sentence of the document, ReadFile just does Stm.add of the document.

Indeed Stm.join does not raise an exception in the ML part in this example, it seems. In sertop Stm.join does indeed have type success + exn.

@gares
Copy link
Member

gares commented Dec 12, 2018

Hum, I'm testing without disabling error resiliency. Are you sure you are testing the same?
If CoqIDE's join reports an error it means there was an exception in ML...

@ejgallego
Copy link
Member Author

Yup, the difference with my test and CoqIDE is that CoqIDE "opens" the failed proof. I don't.

I am testing with the default which is is:

    async_proofs_tac_error_resilience = `Only [ "curly" ];
    async_proofs_cmd_error_resilience = true;

@gares
Copy link
Member

gares commented Dec 12, 2018

If I do coqide -debug and then hit the wheels I get this in my terminal:

[pid 5114] <-- Status true
[pid 5114] --> FAIL 25 [No such goal.
frame @ file "stm/stm.ml", line 1528, characters 20-35
raise @ file "clib/exninfo.ml", line 65, characters 2-15
frame @ file "lib/future.ml", line 132, characters 19-23
frame @ file "stm/stm.ml", line 1513, characters 11-53
frame @ file "stm/stm.ml", line 2556, characters 4-105
raise @ file "clib/exninfo.ml", line 65, characters 2-15
frame @ file "stm/stm.ml", line 939, characters 6-10
frame @ file "stm/stm.ml", line 2406, characters 12-251
raise @ file "clib/exninfo.ml", line 65, characters 2-15
frame @ file "stm/stm.ml", line 2344, characters 10-14
frame @ file "stm/stm.ml", line 2410, characters 20-47
raise @ file "clib/exninfo.ml", line 65, characters 2-15
frame @ file "stm/stm.ml", line 1093, characters 12-91
raise @ file "clib/exninfo.ml", line 65, characters 2-15
frame @ file "vernac/vernacentries.ml", line 2439, characters 4-36
raise @ file "clib/exninfo.ml", line 65, characters 2-15
frame @ file "vernac/vernacentries.ml", line 2410, characters 8-451
frame @ file "lib/flags.ml", line 96, characters 19-40
raise @ file "clib/exninfo.ml", line 65, characters 2-15
frame @ file "lib/flags.ml", line 17, characters 14-17
frame @ file "vernac/vernacentries.ml", line 2313, characters 30-66
raise @ file "clib/exninfo.ml", line 65, characters 2-15
frame @ file "vernac/vernacextend.ml", line 103, characters 4-18
frame @ file "plugins/ltac/g_ltac.mlg", line 447, characters 4-26
frame @ file "plugins/ltac/g_ltac.mlg", line 386, characters 15-536
frame @ file "proofs/proof_global.ml", line 170, characters 24-36
frame @ file "plugins/ltac/g_ltac.mlg", line 391, characters 6-82
raise @ unknown
frame @ file "proofs/pfedit.ml", line 110, characters 28-67
frame @ file "proofs/proof.ml", line 408, characters 4-30
raise @ file "clib/exninfo.ml", line 65, characters 2-15
frame @ file "ide/protocol/xmlprotocol.ml", line 670, characters 29-47
frame @ file "ide/idetop.ml", line 421, characters 12-15
frame @ file "ide/idetop.ml", line 271, characters 12-40
frame @ file "library/global.ml", line 61, characters 2-48
frame @ file "library/global.ml", line 32, characters 16-69
frame @ file "kernel/safe_typing.ml", line 344, characters 2-72
frame @ file "list.ml", line 106, characters 12-15
frame @ file "kernel/opaqueproof.ml", line 95, characters 14-30
frame @ file "lib/future.ml", line 167, characters 10-18
raise @ file "clib/exninfo.ml", line 65, characters 2-15
frame @ file "lib/future.ml", line 132, characters 19-23
frame @ file "lib/future.ml", line 148, characters 52-62
raise @ file "clib/exninfo.ml", line 65, characters 2-15
frame @ file "lib/future.ml", line 132, characters 19-23
frame @ file "lib/future.ml", line 148, characters 52-62
raise @ file "clib/exninfo.ml", line 65, characters 2-15
frame @ file "lib/future.ml", line 132, characters 19-23
frame @ file "lib/future.ml", line 148, characters 52-62
raise @ file "clib/exninfo.ml", line 65, characters 2-15
frame @ file "lib/future.ml", line 132, characters 19-23
frame @ file "lib/future.ml", line 148, characters 52-62
raise @ file "clib/exninfo.ml", line 65, characters 2-15
frame @ file "lib/future.ml", line 132, characters 19-23
frame @ file "lib/future.ml", line 148, characters 52-62
raise @ file "clib/exninfo.ml", line 65, characters 2-15
frame @ file "lib/future.ml", line 132, characters 19-23
frame @ file "lib/future.ml", line 148, characters 52-62
raise @ file "clib/exninfo.ml", line 65, characters 2-15
frame @ file "lib/future.ml", line 132, characters 19-23
frame @ file "lib/future.ml", line 148, characters 52-62
raise @ file "clib/exninfo.ml", line 65, characters 2-15]

@gares
Copy link
Member

gares commented Dec 12, 2018

The trace is the result of join, then CoqIDE decides to reopen the proof.
I see an exception there.

@ejgallego
Copy link
Member Author

So CoqIDE does something such as:

  • adds the file
  • calls status => note no exception is emitted here, but only a feedback
  • then I click the gears , and coqide decides to do edit_at
  • after edit at an exception is indeed raised.

So my case is when I don't do EditAt.

@gares
Copy link
Member

gares commented Dec 12, 2018

The exception happens at point 3:

  • each XML protocol message can succeed or fail
  • Staus true fails (calls join that raises an exception, that contains 25 as a valid state, see the beginning of the trace)
  • CoqIDE reacts by sending Edit_at 25

The bug, to me, is that the second time you click the gears join does not raise the same exception.
#9206 fixes that.

I don't believe that when you call Stm.join () in serapi you don't get an exception.
If you don't then you are setting up things differently, and what I see in CoqIDE has nothing to do with it ;-)

@ejgallego
Copy link
Member Author

@gares the same behavior happens in CoqIDE, if I read the trace correctly no exception is raised the first time join is done, XML returns success indeed.

@ejgallego
Copy link
Member Author

Well, it is more subtle than that, with the default setup, finish first returns GOOD but later JOIN false, a race? Log:

[DEBUG] Start eval_call Add (("Require Import List.",-1),(1,false))
[.... many add calls]
[pid 2178] <-- Add (("\nEnd assoc.",-33),(33,false))
2178:master:0] STM: 33: proof get_set_diff_default: asynch
2178:master:0] STM: 27: proof get_del_same: asynch
2178:master:0] STM: 18: proof get_set_diff: asynch
2178:master:0] Enqueue task proof: get_set_diff
2178:master:2] got task: proof: get_set_diff
2178:master:0] Enqueue task proof: get_del_same
2178:master:2] got execution token
[.... many processing and incomplete ]
2178:master:0] Enqueue task proof: get_set_diff_default
[pid 2178] --> GOOD (34,(Inl ,""))

*** EJGA: Stm.add returns GOOD here.

[pid 2178] <-- Status false
[pid 2178] --> GOOD Status: path=assoc;no proof;

*** EJGA: Note Stm.finish returns GOOD here.

2178:master:5] got task: proof: get_del_same
...
2178:master:7] got task: proof: get_set_diff_default
[DEBUG] Feedback Msg 2204:proofworker:1:0 Vernac Interpreter Executing command
         on 26
[DEBUG] Feedback ErrorMsg No such goal.
        frame @ file "stm/stm.ml", line 939, characters 6-10

*** EJGA: Note this is just a feedback from the worker, not an exception

*** End of log

However with -async-proofs off , the bug still persists, even with the patch; gears do nothing, and coqide reports "all proof terms checked by the kernel", which is not true as we have the error in the subproof.

As I pointed out at first, the problem seems to be in the error recovery code.

After doing a little bit more of research my recommendation would be to disable it by default in 8.9, as this seems quite dangerous.

cc: @silene

@gares
Copy link
Member

gares commented Dec 13, 2018

It is ok that finish says ok and join error. If finish was doing all the job, then there would be no reason for join.

@gares
Copy link
Member

gares commented Dec 13, 2018

Il try the other setup to see how it is still buggy.

gares added a commit to gares/coq that referenced this issue Dec 13, 2018
@gares
Copy link
Member

gares commented Dec 13, 2018

You are right about join not giving an error if resiliency is on (and async proofs off).

When I've added error resiliency I forgot to add a silly check that no error states are there when one joins. Addressed in the same PR.

If proofs are async then it is not error resiliency for command that makes Coq reach the end of the document (even if a proof is wrong) but the usual enclosure of the proof in a future in the env. Hence with async proofs on join was still finding the error. Il tombé en marche.

@ejgallego
Copy link
Member Author

ejgallego commented Dec 17, 2018

I am reopening to keep track of the issues noted above and in #9206 , in particular with regards to missing context about what definitions have been added to the kernel enviroment.

@ejgallego ejgallego reopened this Dec 17, 2018
@ejgallego
Copy link
Member Author

I will likely open a new bug with a more precise description, then close this one.

ejgallego added a commit to ejgallego/opam-repository that referenced this issue Jan 28, 2019
CHANGES:

* [general] support Coq 8.9,
 * [general] SerAPI now uses Dune as a build system,
 * [opam]   install `sertop.el`,
 * [serlib] support to serialize kernel environments,
 * [serapi] new query `Env` that tries to print the current kernel environment,
 * [serlib] correct field names for `CAst`,
 * [serlib] more robust support for opaque / non-serializable types (rocq-archive/coq-serapi#61, rocq-archive/coq-serapi#68).
            Thanks to @palmskog,
 * [serlib] new option `--exn_on_opaque` to raise an exception on
            non-serializable types; closes rocq-archive/coq-serapi#68, thanks to @palmskog,
 * [serlib] serialization test-suite from
            https://github.com/proofengineering/serapi-tests, thanks to
            @palmskog,
 * [sercomp] add `--mode` option to better control output,
 * [sercomp] add `compser` for deserialization (inverse of `sercomp`)
             (@palmskog),
 * [serapi]  Allow custom document creation using the `NewDoc` call.
             Use the `--no_init` option to avoid automatic creation
             on init. (@ejgallego)
 * [sercomp] Allow compilers to output `.vo` (@ejgallego , suggested by
             @palmskog)
 * [sercomp] Serialize top-level vernaculars with their syntactic
             attributes (such as location) (@ejallego)
 * [serapi]  Add `Assumptions` query, at the suggestion of @Armael
             (@ejgallego)
 * [sercomp] Disable error resilience mode in compilers; semantics are
             a bit dubious see coq/coq#9204 also rocq-archive/coq-serapi#94.
             (@ejgallego, report by @palmskog)
 * [sercomp] Add `check` mode to compilers to check all proofs without
             outputting `.vo`. (@palmskog)
 * [sercomp] Add "hacky" `--quick` option to skip checking of opaque
             proofs. (@ejgallego, request by @palmskog)
 * [sercomp] Add `--async_workers` option to set maximum number
             of parallel async workers. (@palmskog)
 * [sertop] Stop linking Coq plugins statically and load `serlib`
            plugins when Coq plugins are loaded instead (@ejgallego,
            review by @palmskog)

_Version 0.5.7_:

 * [serlib] Fixed serializers for more tactics data, add support for
   `ground` plugin (rocq-archive/coq-serapi#68). Thanks again to @palmskog for the report.

_Version 0.5.6_:

 * [serlib] Fixed serializers for some tactics data (rocq-archive/coq-serapi#66) Thanks to
   @palmskog for the report.

_Version 0.5.5_:

 * [serlib] Be more lenient when parsing back `Id.t` as to accommodate
   hacks in the Coq AST (rocq-archive/coq-serapi#64) Thanks to @palmskog for the report.

_Version 0.5.4_:

 * [serlib] Fix critical bug in handling of abstract type (rocq-archive/coq-serapi#60)

_Version 0.5.3_:

 * [sertop] Support for `-I` option (`--ml-include-path`).

_Version 0.5.2_:

 * [serlib] Compatibility with OCaml 4.07.0 [problems with `Stdlib` packing]

_Version 0.5.1_:

 * [serlib] (basic) support for serialization of the ssreflect grammar,
 * [serapi] `(Query () (Ast n))` is now `(Query ((sid n)) Ast)`,
 * [serapi] remove broken deprecated `SetOpt` and `LibAdd` commands,
 * [doc] Improved man page.
 * [js] Miscellaneous improvements on the js build.

_Version 0.5.0_:

 * [general] support Coq 8.8, use improved document API,
 * [sertop] By default `sertop` will create a new document with `doc_id` 0,
 * [sertop] new debug options, see `sertop --help`.

_Version 0.4_:

 * [general] support Coq 8.7 , make use of improved upstream API,
 * [sertop] support `-R` and `-Q` options, note the slightly different
   syntax wrt Coq upstream: `-R dir,path` in place of `-R dir path`,
 * [serlib] support serialization of generic arguments [rocq-archive/coq-serapi#41],
 * [serapi] `(ReadFile file)`: hack to load a completed file.

_Version 0.2_:

 * Better Query/Object system.

_Version 0.1_:

 * Serialization-independent protocol core,
 * [js] Javascript worker,
 * [lib] Better Prelude support,
 * [serlib] Full Serialization of generic arguments,
 * [proto] Add is not a synchronous call anymore,
 * [proto] Refactor into a flat command hierarchy,
 * [proto] More useful queries,
 * [proto] Guarantee initial state is 1,
 * [proto] Support for ltac profiling,
 * [proto] Printing: add depth limiting,
 * [proto] Better handling of options in the sexp backend.

_Version 0.03_:

 * **[done]** Implicit arguments.
 * **[done]** Coq Workers support.
 * **[done]** Advanced Sentence splitting `(Parse (Sentence string))`, which can handle the whole document.

_Version 0.02_:

 * **[done]** Serialization of the `Proof.proof` object.
 * **[done]** Improve API: add options.
 * **[done]** Improve and review printing workflow.
 * **[done]** `(Query ((Prefix "add") (Limit 10) (PpStr)) $ObjectType)`
 * **[done]** Basic Sentence splitting `(Parse num string))`, retuns the first num end of the sentences _without_ executing them.
              This has pitfalls as parsing is very stateful.
 * **[done]** Basic completion-oriented Search support `(Query () Names)`
 * **[done]** Better command line parsing (`Cmdliner`, `Core` ?)
 * **[partial]** Print Grammar tactic. `(Query ... (Tactics))`.
   Still we need to decide on:
   `Coq.Init.Notations.instantiate` vs `instantiate`, the issue of
   `Nametab.shortest_qualid_of_global` is a very sensible one for IDEs
ejgallego added a commit to ejgallego/opam-repository that referenced this issue Jan 28, 2019
CHANGES:

* [general] support Coq 8.9,
 * [general] SerAPI now uses Dune as a build system,
 * [opam]   install `sertop.el`,
 * [serlib] support to serialize kernel environments,
 * [serapi] new query `Env` that tries to print the current kernel environment,
 * [serlib] correct field names for `CAst`,
 * [serlib] more robust support for opaque / non-serializable types (rocq-archive/coq-serapi#61, rocq-archive/coq-serapi#68).
            Thanks to @palmskog,
 * [serlib] new option `--exn_on_opaque` to raise an exception on
            non-serializable types; closes rocq-archive/coq-serapi#68, thanks to @palmskog,
 * [serlib] serialization test-suite from
            https://github.com/proofengineering/serapi-tests, thanks to
            @palmskog,
 * [sercomp] add `--mode` option to better control output,
 * [sercomp] add `compser` for deserialization (inverse of `sercomp`)
             (@palmskog),
 * [serapi]  Allow custom document creation using the `NewDoc` call.
             Use the `--no_init` option to avoid automatic creation
             on init. (@ejgallego)
 * [sercomp] Allow compilers to output `.vo` (@ejgallego , suggested by
             @palmskog)
 * [sercomp] Serialize top-level vernaculars with their syntactic
             attributes (such as location) (@ejallego)
 * [serapi]  Add `Assumptions` query, at the suggestion of @Armael
             (@ejgallego)
 * [sercomp] Disable error resilience mode in compilers; semantics are
             a bit dubious see coq/coq#9204 also rocq-archive/coq-serapi#94.
             (@ejgallego, report by @palmskog)
 * [sercomp] Add `check` mode to compilers to check all proofs without
             outputting `.vo`. (@palmskog)
 * [sercomp] Add "hacky" `--quick` option to skip checking of opaque
             proofs. (@ejgallego, request by @palmskog)
 * [sercomp] Add `--async_workers` option to set maximum number
             of parallel async workers. (@palmskog)
 * [sertop] Stop linking Coq plugins statically and load `serlib`
            plugins when Coq plugins are loaded instead (@ejgallego,
            review by @palmskog)
silene pushed a commit that referenced this issue Apr 5, 2019
silene pushed a commit that referenced this issue Apr 5, 2019
(cherry picked from commit 00263f3)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
part: STM State Transition Machine, asynchronous proofs, etc.
Projects
None yet
Development

No branches or pull requests

2 participants