Mingw64 + threads + system exception raised through longjmp() = crash #7638

Closed
vicuna opened this issue Sep 25, 2017 · 7 comments

commented Sep 25, 2017

Original bug ID: 7638
Reporter: @xavierleroy
Status: resolved (set by @xavierleroy on 2017-09-28T09:44:41Z)
Resolution: fixed
Priority: normal
Severity: major
Platform: Mingw64
OS: Windows 64
OS Version: 10
Version: 4.06.0 +dev/beta1/beta2/rc1
Target version: 4.06.0 +dev/beta1/beta2/rc1
Fixed in version: 4.06.0 +dev/beta1/beta2/rc1
Category: platform support (windows, cross-compilation, etc)
Monitored by: @gasche

Bug description

Consider:

let crashme v =
  ignore (Sys.getenv v)

let _ =
  let th = Thread.create crashme "no such variable" in
  Thread.join th

Compile this program to bytecode using the Mingw64 port of OCaml and the trunk current at the time of this PR, i.e. 4.06.0+dev. On a Windows 10 machine (ocaml-mingw-64-b from Inria's CI pool, to be exact), the program crashes reproducibly.

Running it under a debugger shows a segfault in the call to longjmp() from caml_raise(), corresponding to Sys.getenv raising Not_found.

A similar issue shows up with lib-threads/socketsbuf.ml from the OCaml test suite.

The program works fine when compiled to native code.

This might be an instance of the setjmp/longjmp problem reported here: https://sourceforge.net/p/mingw-w64/bugs/406/

Indeed, the problem goes away if, as suggested in the problem report above, the bytecode interpreter is modified to use __builtin_setjmp and __builtin_longjmp instead of setjmp/longjmp. Note however that those GCC builtins are undocumented.
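To make the workaround concrete, here is a minimal sketch of the substitution, not the actual runtime patch. All names (raise_buf, RAISE_SETJMP, RAISE_LONGJMP, demo_raise, demo_try, demo_primitive, DEMO_NOT_FOUND) are hypothetical, and the preprocessor test used to detect Mingw64 is an assumption. It also assumes the builtins behave as GCC describes them: a five-word buffer, and a second argument to __builtin_longjmp that must be the constant 1.

```c
#include <assert.h>
#include <setjmp.h>
#include <stddef.h>

/* Assumption: GCC targeting 64-bit Windows means Mingw64. Only there do we
   swap in the builtins; every other platform keeps setjmp/longjmp. */
#if defined(__GNUC__) && defined(_WIN64)
typedef void *raise_buf[5];   /* the builtins use a five-word buffer */
#define RAISE_SETJMP(buf)  __builtin_setjmp(buf)
#define RAISE_LONGJMP(buf) __builtin_longjmp((buf), 1)  /* value must be 1 */
#else
typedef jmp_buf raise_buf;
#define RAISE_SETJMP(buf)  setjmp(buf)
#define RAISE_LONGJMP(buf) longjmp((buf), 1)
#endif

enum { DEMO_NOT_FOUND = 1 };  /* hypothetical exception code */

static raise_buf *current_handler = NULL;  /* innermost active handler */
static int pending_exn = 0;                /* exception being propagated */

/* Hypothetical analogue of caml_raise(): jump to the innermost handler.
   Kept out of the function that calls RAISE_SETJMP, because
   __builtin_longjmp may not be used in the same function as
   __builtin_setjmp. */
void demo_raise(int exn)
{
  assert(current_handler != NULL);
  pending_exn = exn;
  RAISE_LONGJMP(*current_handler);
}

/* Hypothetical primitive: fails like Sys.getenv on a missing variable. */
void demo_primitive(int should_fail)
{
  if (should_fail) demo_raise(DEMO_NOT_FOUND);
}

/* Run f(arg) under a fresh handler. Returns 0 if f returned normally,
   or the exception code if f raised. */
int demo_try(void (*f)(int), int arg)
{
  raise_buf buf;
  raise_buf *volatile saved = current_handler;
  int result = 0;

  current_handler = &buf;
  if (RAISE_SETJMP(buf) != 0)
    result = pending_exn;   /* reached via demo_raise */
  else
    f(arg);
  current_handler = saved;  /* restore the enclosing handler on both paths */
  return result;
}
```

With these definitions, demo_try(demo_primitive, 1) returns DEMO_NOT_FOUND and demo_try(demo_primitive, 0) returns 0. Hiding the choice behind two macros keeps the rest of the code identical on all platforms, so only the Mingw64 build depends on the GCC-specific builtins.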

commented Sep 26, 2017

Comment author: @alainfrisch

Xavier: did you try other ports (especially msvc64) and/or older versions of OCaml (4.05)?

commented Sep 26, 2017

Comment author: @xavierleroy

> did you try other ports (especially msvc64) and/or older versions of OCaml (4.05)?

Not yet. A git bisect is in progress. More data points are always welcome.

commented Sep 27, 2017

Comment author: @alainfrisch

My current data points:

  • 4.06, mingw64 port: FAIL as reported, with or without -custom (segfault after 13-15s). OK with ocamlopt (quick termination with "Thread 1 killed on uncaught exception Not_found").

  • 4.06, msvc64 port (using VS2015): OK.

  • 4.06, mingw port (32-bit): OK.

  • 4.05, mingw64 port: FAIL. (So: not a recent regression, in particular not related to the new Unicode stuff.)

Also, replacing the call to Sys.getenv with a direct raise Not_found does not trigger the problem.

commented Sep 27, 2017

Comment author: @xavierleroy

Thanks a lot Alain for the data points.

This is consistent with the hypothesis that Mingw64 has a bug in the way it uses setjmp/longjmp from Microsoft's CRT.

(That MSVC has no problems can be explained in several ways: use of a different CRT, use of the same CRT but in a different manner, special compilation of setjmp and longjmp, etc.)

commented Sep 27, 2017

Comment author: @xavierleroy

That leaves us with the question of finding a workaround.

Ideally, Mingw64 would fix the issue and we wouldn't have anything to do, but I'm afraid this will take time, and in the meantime the Mingw64 port of OCaml is seriously broken.

__builtin_setjmp / __builtin_longjmp could be used if we don't mind the fact that these are undocumented GCC features intended to help with the implementation of setjmp / longjmp, if I understood correctly.

I thought of using "frame-based structured exception handling" (the C++-style exception mechanism that Microsoft added to C) as a replacement for this particular use of setjmp / longjmp in the bytecode runtime system. It would do the job, but it is not implemented by the Mingw64 compiler, only by the MSVC compiler.

commented Sep 27, 2017

Comment author: @alainfrisch

Also related: http://www.agardner.me/golang/windows/cgo/64-bit/setjmp/longjmp/2016/02/29/go-windows-setjmp-x86.html

It seems several people have experienced unexplained problems with setjmp/longjmp under mingw64 and switched to the __builtin_* variants, which solved the problem for them. So I'd say: without more information or ideas for another workaround, let's follow the crowd. Windows' mysterious ways.

(Oh, and sourceforge.net is in "static offline mode" now...)

commented Sep 27, 2017

Comment author: @xavierleroy

Pull request at #1376

@vicuna vicuna closed this Sep 28, 2017

@vicuna vicuna added this to the 4.06.0 milestone Mar 14, 2019

@vicuna vicuna added the bug label Mar 20, 2019
