Parallel Dynlink usage under Cygwin+MinGW is unsafe #13046

Open
jmid opened this issue Mar 22, 2024 · 0 comments

jmid commented Mar 22, 2024

We've observed an issue with Dynlink under the Cygwin and MinGW ports, which may cause segfaults, hangs, etc.
This has been tracked in ocaml-multicore/multicoretests#307. Here's a small reproducer:

libB.ml

let value = 34

repro.ml

(* Load a plugin; a second load of the same module is treated as success *)
let loadfile f =
  try Dynlink.loadfile (Dynlink.adapt_filename f)
  with Dynlink.Error (Dynlink.Module_already_loaded _) -> ()

let dont_crash () =
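  (* Flag used as a start signal so that both loads run at about the same time *)
  let wait = Atomic.make true in
  (* dom1 spins until dom2 is running, then loads the plugin *)
  let dom1 = Domain.spawn (fun () ->
                             while Atomic.get wait do Domain.cpu_relax() done;
                             loadfile "libB.cmxs") in
  (* dom2 releases dom1 and immediately loads the same plugin *)
  let dom2 = Domain.spawn (fun () ->
                             Atomic.set wait false;
                             loadfile "libB.cmxs") in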
  let _ = Domain.join dom1 in
  let _ = Domain.join dom2 in
  ()

let _ =
  for i=1 to 1000 do
    Printf.printf "%i %!" i;
    dont_crash ()
  done

Steps to reproduce:

ocamlopt -g -shared libB.ml -o libB.cmxs
ocamlopt -g -I +dynlink dynlink.cmxa repro.ml -o repro.exe
./repro.exe

Across 10 MinGW runs I observed the following behaviour (sorted). I've observed similar behaviour under Cygwin.

Fatal error: exception Dynlink.Error (Dynlink.Cannot_open_dll Failure("input_value_from_block: bad object"))
Fatal error: exception Dynlink.Error (Dynlink.Cannot_open_dll Failure("input_value_from_block: bad object"))
Fatal error: exception Dynlink.Error (Dynlink.Cannot_open_dll Failure("input_value_from_block: bad object"))
Fatal error: exception Dynlink.Error (Dynlink.Cannot_open_dll Failure("not an OCaml plugin"))
Fatal error: exception Dynlink.Error (Dynlink.Cannot_open_dll Failure("not an OCaml plugin"))
Fatal error: exception Dynlink.Error (Dynlink.Cannot_open_dll Invalid_argument("Dynlink: Missing frametable for LibB"))
Fatal error: exception Dynlink.Error (Dynlink.Cannot_open_dll Invalid_argument("Dynlink: Missing frametable for LibB"))
Fatal error: exception Dynlink.Error (Dynlink.Cannot_open_dll Invalid_argument("Dynlink: Missing frametable for _shared_startup"))
Fatal error: exception Dynlink.Error (Dynlink.Cannot_open_dll Invalid_argument("Dynlink: Missing gc_roots for _shared_startup"))
Segmentation fault

This is an issue on trunk, 5.2.0~alpha1, 5.1.1, ...
I've since learned that this is a known FlexDLL issue: ocaml/flexdll#120
and that #11607 disabled a compiler test under Windows as a result.
I'm therefore opening this issue to help keep track.

(As usual) @dra27 has a draft fix: a first version adds a global lock around FlexDLL's list of loaded units.
I can confirm that it fixes the above reproducer.
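
To illustrate the global-lock idea in OCaml terms, here is a minimal sketch. It is not the FlexDLL fix itself (that lives in FlexDLL's C code), and `load_lock`, `serialized`, `loaded_units` and `fake_load` are hypothetical names introduced only for this example. `fake_load` is a stand-in for a loader with unsynchronized global state; `Mutex.protect` assumes OCaml 5.1 or later.

```ocaml
(* Hypothetical sketch: serialize calls through one global lock.
   Requires OCaml >= 5.1 for Mutex.protect. *)
let load_lock = Mutex.create ()

(* Run [f x] while holding [load_lock]; the lock is released even if [f] raises. *)
let serialized f x = Mutex.protect load_lock (fun () -> f x)

(* Stand-in for a loader with unsynchronized global state, playing the role
   of a list of loaded units. This is NOT FlexDLL code. *)
let loaded_units : string list ref = ref []

let fake_load name =
  if not (List.mem name !loaded_units) then
    loaded_units := name :: !loaded_units

(* Two domains "load" the same unit; the lock makes check-then-insert atomic. *)
let run () =
  let d1 = Domain.spawn (fun () -> serialized fake_load "libB.cmxs") in
  let d2 = Domain.spawn (fun () -> serialized fake_load "libB.cmxs") in
  Domain.join d1;
  Domain.join d2;
  !loaded_units
```

As a user-side stopgap, one could presumably wrap the reproducer's `loadfile` the same way (`serialized loadfile "libB.cmxs"` in both domains), but I have not verified that this avoids the failures; it only serializes calls that go through this wrapper.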

Note: an earlier PR, ocaml/flexdll#112, fixed races on FlexDLL's global error variables (which is orthogonal to the above, AFAICS).
