This issue falls under the general rule: "Weird-looking generated code is not given first-class support within the compiler distribution, fixes for compiler blowup are not prioritized and will only be merged if they preserve readability/maintainability of the compiler codebase".
Of course we cannot know whether a readable fix exists without understanding the issue, and the idea that there may be an algorithmic bug within ocamldep made me curious, so I wrote this reproduction script:
let rec generate = function
  | 0 -> ()
  | i ->
    let open Printf in
    printf "let _ = Foo.x\n";
    printf "include struct\n";
    generate (i - 1);
    (* close the "include struct" opened above *)
    printf "end\n"

let () = match int_of_string Sys.argv.(1) with
  | n -> generate n
  | exception _ ->
    prerr_endline "please provide an integer as command-line argument";
    exit 1
The command (ocaml generate.ml 24) produces a file that ocamldep.opt takes 3 seconds to process. Computation time grows quickly as the argument increases, but for smaller argument values (below 20) the processing is still instantaneous.
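For reference, here is a small sketch of the shape of the generated file. It assumes every "include struct" is closed by a matching "end", and generate_string is a hypothetical helper (not part of the script above) that builds the text in a Buffer so it can be inspected instead of printed directly.

```ocaml
(* Hypothetical variant of the generator: same text, but returned as a
   string. Assumption: each "include struct" is closed by an "end". *)
let generate_string n =
  let buf = Buffer.create 256 in
  let rec go = function
    | 0 -> ()
    | i ->
      Buffer.add_string buf "let _ = Foo.x\n";
      Buffer.add_string buf "include struct\n";
      go (i - 1);
      Buffer.add_string buf "end\n"
  in
  go n;
  Buffer.contents buf

(* Print the file generated for two levels of nesting. *)
let () = print_string (generate_string 2)
```

For two levels this prints six lines: two "let _ = Foo.x" / "include struct" pairs followed by two "end" lines. Each level contributes exactly two structure items, so the input size is linear in the argument.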
I would like to understand what is so bad about this code (there is only one external dependency, Foo, and the included modules export no names, so no data structure should grow too big), and to use this as an excuse to play with memory profilers.
I have played a bit with spacetime on a similar code generator, and the memory profile and allocation rate are flat. In contrast, for 24 nested includes, add_struct_item somehow ends up being called more than 61M times when there are only 48 structure items in the code.
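To make the gap between 48 items and tens of millions of calls more concrete, here is a toy model. It is not ocamldep's code and makes no claim about the actual cause: the types and functions (item, nested, visit_item, count) are all hypothetical. It only shows how a traversal that walks the body of each include more than once produces a call count that is exponential in the nesting depth, while a single pass stays linear.

```ocaml
(* Toy AST, hypothetical: a structure is a list of items. *)
type item =
  | Let                    (* stands for "let _ = Foo.x" *)
  | Include of item list   (* stands for "include struct ... end" *)

(* Shape produced by n levels of nesting: two items per level. *)
let rec nested n =
  if n = 0 then [] else [ Let; Include (nested (n - 1)) ]

let calls = ref 0

(* Counts one call per visited item, and walks the body of every
   include [passes] times. *)
let rec visit_item passes item =
  incr calls;
  match item with
  | Let -> ()
  | Include body ->
    for _pass = 1 to passes do
      List.iter (visit_item passes) body
    done

(* Total number of visit_item calls for n levels of nesting. *)
let count passes n =
  calls := 0;
  List.iter (visit_item passes) (nested n);
  !calls

let () =
  List.iter
    (fun n ->
       Printf.printf "depth %2d: 1 pass = %d calls, 2 passes = %d calls\n"
         n (count 1 n) (count 2 n))
    [ 4; 8; 16; 20 ]
```

With a single pass the count is 2n (8 calls at depth 4). With two passes it is 2(2^n - 1) (30 calls at depth 4, 2097150 at depth 20), so it roughly doubles with every extra level. That is the kind of growth that would be instantaneous below 20 and painful at 24. The model is not meant to reproduce the 61M figure.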