Segment violation since 2.10.0 #405
I experience a reproducible hang of utop on startup, also with 2.10.0 on OCaml 4.14.0, on macOS, x86_64. However, downgrading to 2.9.2 did not fix it, so it's likely a different issue. Sorry for the noise here. |
Do you have any of these config files: |
I saw it with or without |
I do not have any of those files. The only utop-related file I see in those directories is |
Let's eliminate file-related problems: does the same happen if you rename |
Still happens; I'm afraid it won't be that simple. I upgraded the opam package again, tried without |
Thanks. At least that's ruled out. |
I think I've solved it: I can only reproduce the issue when the opam switch is configured with The output itself was not very informative:
(Spanish for "segmentation violation (core dumped)".) When I submitted this issue it didn't even generate a core dump, I've just now figured out that the default
That would be a SIGSEGV dereferencing a block header in this function from the 4.14.0 runtime source. That code made me remember that my opam switch was configured with Has utop started using naked pointers between 2.9.2 and 2.10? If I understand correctly, naked pointers won't be supported in OCaml 5 at all, so this seems like a regression. Strangely enough, if I configure a switch with
That's a macOS path, isn't it? This is an Ubuntu 22.04.1 running over WSL2 (Windows 10). |
Ah, good debugging, thanks. I think I can take it from here. The main thing that happened between 2.9.2 and 2.10 is a switch of the library we use to handle unicode. I agree that it should be made nnp-clean. (FYI as a tip, you can get English error messages by setting
Hmm, sorry I mixed up the 2 reports. |
OK I managed to repro with |
This is an interesting thing to debug:
|
I reduced that to lambda-term. Dropping the following dune file in a new dir within its sources triggers the segfault (but replacing
|
So... this is getting complicated. I'm starting to think this is actually on OCaml and isn't worth investigating further.
Right, because it's crashing before it's finished loading the executable, on a major collection while initialising the bytecode runtime's
#use "topfind";;
#require "inspect";;
external get_global_data : unit -> Obj.t = "caml_get_global_data";;
get_global_data () |> Inspect.Sexpr.dump;; (* SIGSEGV *)
Those libraries barely have dependencies, so now I think the naked null pointer is already there from the start, but it doesn't cause issues on Since OCaml itself is being largely reworked for 5.x, I think we should just close this issue, assume that 4.14.0's bytecode runtime is unsafe to run with
True, so much software ignores C locales nowadays (for good reasons) that I forget about it! J'ai aussi oublié le français. |
@debugnik did you test if you observe the same segfault with one of the OCaml 5.0.0 beta releases? |
Thanks for the repro, I think it's very much worth trying to fix. On my side I'm trying to make a useful bytecode program that can make this crash. |
@Octachron the "inspect" code sample still segfaults on |
@Octachron I haven't tried out 5.0 yet, as I prefer waiting for stable first releases, but I was just reading your announcement of beta2 so I'll give it a try. gives it a try (5.0 switches build faster, don't they?) No segfault running utop 2.10 normally on 5.0.0~beta2, but the snippet poking into |
I've written a program that tries to print every global value, and I manually noted which indices cause a segfault. Those are instead printed with a simpler algorithm that just displays tag/size:
#use "topfind";;
#require "inspect";;
(* Indices whose full dump segfaults; found by hand. *)
let do_print = function
| 12 | 45 | 46 | 50 | 61 | 62 | 64 | 65 | 71 | 73
| 76 | 81 | 89 | 101 | 117 | 119 | 143 | 154 | 156 | 161 -> false
| _ -> true
external get_global_data : unit -> Obj.t array = "caml_get_global_data";;
let g = get_global_data () in
Array.iteri (fun i d ->
if do_print i then
Format.printf "@[%d %a@.@]" i (Inspect.Sexpr.dump_with_formatter ?context:None) d
else
Format.printf "@[%d skipped %s@.@]" i (Inspect.Value.description d)
) g
(the "skipped" ones are the ones that otherwise cause a segfault) |
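For readers without the inspect library, the "simpler algorithm that just displays tag/size" can be approximated with the standard `Obj` module alone. This is only a sketch: `describe` is a hypothetical name, and it is not claimed to match the output of `Inspect.Value.description`. The point is that it reads the header of a value and never follows a field, so it does not walk into whatever a global slot points at.

```ocaml
(* Hypothetical stand-in for a shallow description: looks only at the
   tag and size of a value and never follows any of its fields. *)
let describe (v : Obj.t) : string =
  if Obj.is_int v then Printf.sprintf "int %d" (Obj.obj v : int)
  else
    let tag = Obj.tag v in
    let kind =
      if tag = Obj.closure_tag then "closure"
      else if tag = Obj.string_tag then "string"
      else if tag = Obj.double_tag then "float"
      else if tag >= Obj.no_scan_tag then "opaque"
      else "block"
    in
    Printf.sprintf "%s tag=%d size=%d" kind tag (Obj.size v)
```

With the array `g` from the program above, `Array.iteri (fun i d -> Printf.printf "%d %s\n" i (describe d)) g` would print one shallow line per global.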
Trying to dump any functional value is a segfault within the no-naked-pointer mode since
# Inspect.Sexpr.dump (fun x -> x);;
Segmentation fault (core dumped) |
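One way to keep a debugging traversal away from closures is to treat them as leaves. The sketch below is an assumption about how such a guard could look, not how the inspect library works: `scannable_fields` is a hypothetical helper that counts the fields it visits, stops at closures, infix headers and no-scan blocks, and uses a depth bound so that cyclic values terminate.

```ocaml
(* Hypothetical guarded walk: counts the fields of ordinary blocks
   reachable from [v], without ever looking inside a closure or a
   no-scan block (strings, floats, custom blocks, float arrays). *)
let rec scannable_fields ~depth (v : Obj.t) : int =
  if depth = 0 || Obj.is_int v then 0
  else
    let tag = Obj.tag v in
    if tag >= Obj.no_scan_tag
       || tag = Obj.closure_tag
       || tag = Obj.infix_tag
    then 0
    else begin
      let size = Obj.size v in
      let total = ref size in
      for i = 0 to size - 1 do
        total := !total + scannable_fields ~depth:(depth - 1) (Obj.field v i)
      done;
      !total
    end
```

For example, `scannable_fields ~depth:4 (Obj.repr (1, (2, 3)))` visits the two fields of the outer pair and the two of the inner one, while a pair holding a function only contributes its own two fields.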
Ah, I see. So yes, that's expected in any case where there's a closure. So that |
Finally, I can reproduce the original issue with some generated code (and no utop/lambda-term):
(* gen.ml *)
let blank = String.make 1000 ' '
let () =
for _ = 1 to 10000 do
Printf.printf "let _ = \"%s\"\n" blank
done
When running This also happens with a single large string with
let blank = String.make 10_000_000 ' '
let () = Printf.printf "let _ = \"%s\"\n" blank
(In contrast, generating more, smaller strings crashes the compiler with a stack overflow - I know that the compiler team is not interested in fixing this.) @Octachron is that something that should get fixed on the ocaml/ocaml side? I can open a report with that information. |
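Both generators above differ only in how many literals they emit and how long each one is, so they can be folded into one function. This is a convenience sketch, not part of the original report: `generate` and its `~count` and `~len` parameters are hypothetical names, and it returns the source text as a string instead of printing it.

```ocaml
(* Hypothetical parameterised generator: builds the source text of
   [count] toplevel bindings, each a string literal of [len] spaces. *)
let generate ~count ~len : string =
  let blank = String.make len ' ' in
  (* Each emitted line is [len + 11] bytes long, newline included. *)
  let buf = Buffer.create (count * (len + 11)) in
  for _ = 1 to count do
    Buffer.add_string buf (Printf.sprintf "let _ = \"%s\"\n" blank)
  done;
  Buffer.contents buf
```

`print_string (generate ~count:10_000 ~len:1000)` produces the same output as gen.ml, and `print_string (generate ~count:1 ~len:10_000_000)` produces the single-large-string variant.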
This is indeed an interesting bug report for 4.14 (to keep track of the issue at the very least). |
Good catch! If |
This has been closed in ocaml/ocaml#11788. I'm closing this since this is an ocaml bug (fixed by disabling nnp, or updating to 4.14.1 or 5.0.0~rc1). Thanks a lot for the debugging help. |
Having installed utop 2.10.0 through opam on a fresh opam switch, trying to run
utop
or dune utop
crashes immediately with a segment violation. Downgrading the package to 2.9.2 fixes the issue. This is with OCaml 4.14.0 and opam 2.1.2 on WSL2 Ubuntu 22.04.1 LTS, x86-64.