-
Notifications
You must be signed in to change notification settings - Fork 661
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
One-liner segfault #3831
Comments
Comment author: @maximedenes Welcome to Coq trunk (87f9997) Coq < Show Proof. |
Comment author: @maximedenes After more investigation (with the help of Gabriel), the bug occurs only under OCaml 4.02.1 (not 4.01), and is due to PMP's code for exception handling in lib/exninfo.ml. Turning these functions into identities makes the problem disappear. My guess is that any uncaught exception will trigger a segfault (nothing specific to Show Proof, which is just raising NoCurrentProof in this case). Gabriel says that the representation of exceptions has changed in 4.02. |
Comment author: @gares Maxime, while you have Gabriel at hand, please ask him how to obtain the same Having the possibility to load an exception with extra info is extremely exception WithLoc of Loc.t * exn and then your exception handling code becomes a huge mess with Foo | WithLoc (,Foo) | WithState(,Foo) | (that is what the code looks like in Matita...). Of course having such feature be integrated upstream would be awesome. If it is impossible or rejected, we could fix the code having just exception WithInfo of Dyn.t list ref * exn with Foo | WithInfo(_,Foo) -> ... Better than the code above but still very ugly. |
Comment author: @ppedrot I remember that Jacques-Henri coded such a thing, but it was using a global imperative table to register data, which I find rather ugly (and not working properly in the debugger any way, because it is not marshalled at the same time). What is the current implementation of exceptions in OCaml 4.02? |
Comment author: @gasche
You sound like a dilettante bank robber that meditates in jail, instead of his reinsertion, about how to rob it better next time. Jacques-Henri indeed worked on a similar topic last year, as explained in this blog post about his prototype library: (Note: the worrying performance numbers reported in the post are about what happens if you use syntactic preprocessing to store traces on all try..with constructs. A version that manually implements logging only in some well-chosen case, as already done in the Coq codebase, would control the cost in a much more fine-grained way.) The interface he relied on are now part of OCaml proper (the raw_backtrace part that allows it to be efficient is present since 4.01; 4.02 adds a way to get the call stack (not necessarily at exception-raising points) in the same raw format, and inspect it from user code), thanks to work by him and Alain Frisch that also needed this feature. I can't comment on the suitability of the global table as I don't know much about ocamldebug. I'd note that it can be thought of as a persistent mapping, however: keys are (linearily) looked up by physical comparison with the raised exception, so it should be robust to insertion side-effects (not) being backtracked. (On the other hand, his work assumes that two exceptions raised at different times are physically distinct, which I'm not sure is still true with the new representation of exceptions.) |
Comment author: @ppedrot
I better like to be thought of as a Bernard Madoff!
This is a problem of locality, not of backtrack. Everything running from the side of the program being debugged is not seen in the debugger, except for the values immediately accessible, that are actually remotely marshalled to the debugger. So any non-local info is lost. This is what happen with the current implementation of ephemera in Coq, which uses the same global table technique instead of a proper implementation in the GC of OCaml. So I would be better off with global hacks like this.
I had a quick look at the trunk code of OCaml. I believe we can port the current hack to the 4.02 branch, and dispatch the behaviour according to the version of OCaml at initialization time. |
Comment author: @gares The argument about ocamldebug is pretty void in my opinion. ocamldebug never works for me and nobody uses it as far as I know (for Coq). I would recommend using the implementation that makes the least set of Best, |
Comment author: @ppedrot The issue is deeper than I thought. As Gabriel made the remark, pointer uniqueness for exceptions does not hold anymore. Essentially, I cannot fix my code for exceptions without arguments, because since 4.02, any two such exceptions with the same constructor (e.g. Exit) share the same pointer, that is, Exit == Exit, which was not the case before 4.02. The semantics of the language rely on this fact, so that we cannot modify such argument-free exceptions in place by adding a new field as I did before. Even worse, this also prevents Jacques-Henri's global table solution, for the very same reason of sharing the same code. I do not know how to fix this properly. The simplest solution is to add a dummy unit argument to arguments-free exceptions that should wear information, and to tweak the enriching mechanism to ignore arguments-free exceptions otherwise. As we say, « Bref, c'est la merde. » @ Gabriel: do you know why the OCaml implementation was changed like this in the first place? It seems to me really strange that we handle separately no-arguments exceptions vs. arguments-wearing exceptions (and, since 4.02, open variants). Is it for efficiency reason? Would it be ever conceivable that OCaml could backtrack on this precise point? I am not argueing against the other changes of representation of exceptions, just the fact that at least distinct exceptions should create distinct pointers. |
Comment author: @gares Can we redefine raise and have a 1 slot info bucket somewhere? Something like "raise ~info e" setting the info and "raise e" just In other words having the exception api hold the value, and not the exception itself. |
Comment author: @ppedrot We can try to do that indeed. This is not thread-safe, but it should not be a problem. For the record, http://caml.inria.fr/mantis/view.php?id=6683 I think Xaxier never looked at the code of Coq, otherwise he would be much more horrified... |
Comment author: @gares Well, being thread safe is actually relevant since I do use threads for the STM. |
Comment author: @ppedrot Well, if you can do that, I would be pleased. The few other uses of enriched exceptions can be rewritten in an slightly unsound way with global state (that is, potentially mistaking argument-free exceptions), but this should be sufficient for debugging purposes. I am mainly concerned by the uses of the STM which seem nontrivial to me... |
Comment author: @gares Well, Xavier is right: if you depend on undocumented features then you are alone. We are in trouble. The approaches discussed here that try to keep the code
So I see no other option than going back to wrapping an exception into another The lesson we should all learn is to complain early when we see this kind of The lesson PMP should learn is to stop doing that once an forall. At the light of this episode I find compulsory to know the list of hacks that rely Best, For the records, the STM adds to the exception 2 Stateid.t values (the state we |
Comment author: @ppedrot In OCaml 4.02 we can discriminate arguments-free exceptions at runtime, so that we could do a hybrid approach.
For debugging info, ignoring exceptions should be sufficient. If there is code deeply relying on the content of the information, well, it has to be rewritten (STM ?). Wrt. the hack of Intmap, there is an easy solution, which is to import in the code our own version of the maps... |
Comment author: @gares STM use exninfo for error reporting, exactly as the parser or the pretyper attach to the exception a Loc.t to identify the subterm or substring that does not type or does not parse. STM adds to that the system state id. In other words, if you put a sentence per line, the state id is pretty much the line number while the Loc.t is the offsets in that line. I can use a wrapper exception in STM, and we can go back to a wrapper exception for Loc.t as well (after all, it was like that before exninfo). I can try to do that next week (for the STM) and see how it goes. For handling the backtrace I'm unsure not being able to deal with arguments free exceptions is good enough. Tracing Not_found seems important to me. I don't know know if there are other users of exninfo. I agree that embedding the code of map.ml is a solution, and should probably be done, or we risk Coq compiling but not working with a future OCaml version that changes the implementation of Map. |
Comment author: @gares I did experiment a bit, both the Wrap exception idea and the global table one. https://github.com/gares/coq/tree/try/wrapexn None of them just works but the second one is promising, since it almost works: The Wrap exception could probably work, but detecting where the code should I did not even try to measure the impact on performances. |
Comment author: @ppedrot I think the second patch is the way to go. With a little bit of modification, it should go smoothly and (almost) soundly. In particular, I would like to export the type of exception information in Exninfo, together with a user-defined raise of type val raise : ?exninfo -> exn -> 'a and tweaking the type of Errors.push to be val push : exn -> exn * exninfo so that we can easily see when enriching exceptions. Likewise, the `Exn constructor in Future should contain a pair (exn * exninfo) instead of just an exception. Do you want me to write it out an propose a pull request? |
Comment author: @gares Sure, go ahead an post your branch url |
Comment author: @ppedrot Here is an alternative and (I hope) sounder version. https://github.com/ppedrot/coq/tree/pure-exninfo Instead of adding information to arbitrary exceptions, we only do so at raise time. There is an instrumented (thread-safe) raise that sets a thread-specific reference to be the latest data set. When recovered in a try-with clause, the information is pushed back semi-automagically thanks to the Errors.push function. This is done through a dedicated enriched exception type, which is no more than an exception together with a dedicated enrichable type. A few functions had their signature changed, to cope with this. Future exception fixing functions now take enriched exceptions, and likewise, the tactic monad does everything through enriched exceptions. There may still be very rare cases of mismatching exceptions, but this is now more robust than just checking for pointer equality. The scenario leading to such a confusion is highly unlikely, see the Exninfo module for details. Any comments welcome. |
Comment author: @gasche That seems reasonable. I had a few comments while having a very quick look at the code:
|
Comment author: @ppedrot
Well, not respecting a specification is in particular unsound, but this is nitpicking, isn't it?
I took the first name that popped in my head. You can change it as you wish.
Erm, the iraise function is only defined in Util, which is at the end of the clib file. All files using Exninfo.raise are in lib/, hence before the iraise. But it is indeed possible to defined the iraise in the Exninfo file itself and reexporting it in Util.
Do as you like. This patch was a proof of concept, not a push-ready one. |
Comment author: @ppedrot BTW, after a glance at the STM code, I am unsure that we need the full power of built-in threads (and thus, all of the thread-safety boredom that comes with it). It seems that what we only need is essentially cooperative non-blocking IO. Enrico, do you think it is feasible to write a tiny (hopefully monadic) lightweight-thread library allowing to implement the STM without polluting all the code with the threaded behaviour? |
Comment author: @aspiwack Speaking of which. Can I ask, very ignorantly, why we need threads/mutexes to begin with? I thought the stm only used forks and stuff. Where do mutexes come into action? |
Comment author: @gares OCaml threads, being non-parallel, are good only for non blocking io, exactly as coroutines, but OCaml provides the former, not the latter (that I like too). I've used threads because it is simpler to have a thread managing a worker Worker managers are mostly pure. They interact with the rest of the world in the following ways: they pick tasks from a TQueue, they snapshot (read a !ref) the stm document, they marshall a piece of the document to the worker, they read back the proof terms, they assign a future value (a ref only the worker manager has write have access to), access RemoteCounter (a worker needs some fresh universe levels, he asks the master for them) and finally writes (forward) stuff on the feedback bus. Locks are needed in the TQueue (producer consumer), the remote counter and the feedback bus (read/write a shared var). Well, I'm sorry I'm repeating myself once again but Coq is written in OCaml I'm also curious to ear which problem a monadic interface would solve here. |
Comment author: @ppedrot
You seem to have suffered a childhood trauma involving monads, don't you? Monads by themselves do not require anything special in the target language to be actually used. Haskell does add syntactic sugar and typeclasses, but we need neither of both to write monadic code properly. In practice, you did already introduce monads in Coq under the form of futures, which (almost) follow a monadic interface. The almost is there because you prefer to use the dangerous map + run (chain + force, in your parlance) combination instead of the usual bind, but that's about all. If this is the word "monad" that bothers you, just change it into "computation" or whatever pleases you. Actually most of involved OCaml code out there uses some form of monad, but it does not usually advertise it with a huge blinking signal. I clearly remember that for instance, there are monadic structures in the ocamlnet library, without having the word "monad" spelled out once.
This is not about monads: this is about being able to know statically when we're in threaded code, and when we are not. I find it rather stupid to depend on the Thread library, and to change the whole model of execution of Coq, just for three calls or so to non-blocking IO, drowned in a sea of sequential code. Here, we could just embed those three calls into a small non-blocking IO library, using the standard CPS-style transformation, with a proper event-handling loop running that stuff at the very end. That would save us a lot of hassle about thread safety. And no, this is not about mutexes: this is about changing globally the semantics of the programming language. This is also the proper moment to finish with a trollish quotation about S(T)Ms. "Computer is a state machine. Threads are for people who can't program state machines." (Alan Cox) |
Comment author: @gares The fact the new monadic API has almost eaten up all the performance I don't like threads, in general they just mess up the code. I'm sorry fixing the exninfo mess turns out to be harder because I'm using threads. All you say about rewriting everything in a different style is surely If you really want the citation (you know who said that, I guess): |
Comment author: @ppedrot
I believe you are talking about the Ltac monadic API? Then that is untrue, or at least not because of the monad itself. From what I have seen in real code, the additional hotspots of the current trunk are (by decreasing order of magnitude):
Only the third point is somehow related to the change of interface, but this has nothing to do with the fact we are monadic. It is here for compatibility purpose, and is going to disappear when we switch all tactics to the monad. I actually believe the converse, that is, the monad (in its current form) is slightly quicker than the old hand-written monadic-ish code. This is due to its careful crafting, with the CPS-based encoding which is essentially free. (BTW, we should do something about 1. in Ssreflect, it is a pity to have 20%~25% of the time due to the ugly merge of universes when one could do it for free in state passing style.)
+1
It is, indeed. I am just advocating the fact that we may think of restricting it out of the STM.
Threads do change the semantics of OCaml. As delimited continuations from delimcc do (fwiw, there was an interesting thread on the Caml list about delimcc used together with the imperative trick of tail-rec list map, wreaking havoc). Being a library does not mean it does not mess with the operational semantics of the language.
This is not directly related to the Exninfo. This was just a side remark, considering I do not like threads and the use we make of them seems really reducible to a few simple primitives.
When I'm done with my thesis! |
Comment author: @aspiwack Enrico: one possible way of representing non-blocking IO is futures. This is in the most abstract and non-definite sense of the word. But I still wonder whether your futures would be able to abstract over the asynchronous system calls you use. The secondary question is, of course, whether that would be desirable. On the efficiency concern, Lwt has proved that cooperative concurrency is, in general, significantly more efficient than threads, despite the higher-order encoding. For reference, they have non-blocking IO too [ http://ocsigen.org/lwt/2.4.3/api/Lwt_unix ]. I must confess I know neither how they work or how they are implemented. |
Comment author: @gares @ Arnaud: A future is like lazy_t but you can create it in two ways: So the non-blockingness is given by the fact that the blocking call One last remark, it is not about efficiency here. What costs is marshalling |
Comment author: @aspiwack So I was suggesting to implement futures using futures… Well, that was kind of silly. I don't actually know if a thread-free version of the worker manager would be better than the current one. I don't really have time to understand Lwt's implementation of non-blocking IO and whether, in particular, it would free the rest of the code from locks. I don't have any experimental result on cooperative interfaces vs threads. It's just empirically faster to use Lwt rather than thread-intensive computation (that said, our case does not qualify as thread-intensive, so who knows). I don't think that's very important, I just mentioned it because you seem to be worried about efficiency of higher-order interfaces yourself. |
Comment author: @gares No, not in my code. I was pointing the finger at monads because I have the impression one piles up closures, allocations. Actually my understanding I have 1 future per proof, something like 5 chained functions each, and say But my guess was that doing so at each "tactic/tactical call" would be |
Comment author: @gasche
Indeed; what comes out of a review when one doesn't understand the high-level stuff is usually the details -- nitpicking. But the details do matter.
I'm not sure what you mean by "You can change it", I'm not going to do the work of evolving this pull request into something "push-ready", and I kind of assumed that you would be in charge of that. Assuming this means "change it by giving me a strong opinion that I will follow because I don't care that much", I suggest you use "iraise".
Yes, that would be better.
After more though, I think the best move is actually to make the parameter non-optional at type (info option) (labelled or not). This means that people using |
Comment author: @ppedrot I have cleaned up this patch a bit, and make it compile properly on the test-suite. Yet, I am currently experiencing two issues.
Apart from this, I am fairly satisfied about this patch. On the few files I have tested, I cannot see any overhead in compilation time. This is no scientific benchmarking, but the patch seems neutral. I did not integrate Gabriel advises yet. I am rather happy of the "iraise" name, but I am not too fond of the compulsory option argument of raise. It looks ugly. Further comments anyone? |
Comment author: @gares w.r.t. 1, this is expected. I expect Matthieu to fix the format description for universe related fields at some point ;-) Is the patch at the usual place? It seems not up do date since it says that the last commit is 8 days old. Did you push? |
Comment author: @gasche
If you make the argument optional, you must be prepared to the fact that people that write code and are unfamiliar with the API will never set it. The only way to make them think after which value they want to pass is to have the type-checker if they don't pass one. I think it's fine if people have to think twice before raising an exception -- they will have to think about where it will be caught, and this will reduce the number of bugs. |
Comment author: @ppedrot
Yep, I rebased the patches before pushing, thus keeping the old commit date. I know this is bad practice when the branch is public, but this patch is not merge-ready yet so I do not care that much. |
Comment author: @ppedrot Strange... I can't reproduce the slow behaviour of ocamldebug... Maybe the program was in a eldritch state? So, apart from the vi file & the renaming of functions, this patch should be OK then. |
Comment author: @gares I'm sympathetic to the proposal of Gabriel, but I'm afraid it is too late now. |
Comment author: @ppedrot We can integrate this feature afterwards anyway. For now, I believe my branch is push-ready. Enrico, do you think you can port the vi checker before I push it, or do I push it quickly and we'll see afterwards? |
Comment author: @gares Matthieu has to do it, the data structures one has to describe are the one of universes... |
Comment author: @gares Sorry, half answer. I had no time to check your patch, but push it. |
Comment author: @gares The compilation of ml files via coq_makefile is broken (no -thread I guess) |
Comment author: @silene I cannot reproduce it, so I assume it has been fixed. If not, please reopen and provide a test case. |
Note: the issue was created automatically with bugzilla2github tool
Original bug ID: BZ#3831
From: @maximedenes
Reported version: 8.5
CC: @aspiwack, @gares, @gasche, @silene, @herbelin, @mattam82, @letouzey
The text was updated successfully, but these errors were encountered: