Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

The current code shape generated for time_zone_data.ml makes the compiler explode #16

Closed
gasche opened this issue Jan 23, 2021 · 18 comments · Fixed by #17
Closed

The current code shape generated for time_zone_data.ml makes the compiler explode #16

gasche opened this issue Jan 23, 2021 · 18 comments · Fixed by #17

Comments

@gasche
Copy link

gasche commented Jan 23, 2021

Hi there,

time_zone_data.ml is a 4Mib generated source file with veeery long list literals. This pattern is known to make the compiler blow up if certain optimization passes are enabled (typically, when compiling with flambda).

Currently the OCaml compiler tries to accept human-written files, but it does not claim to support program-generated code well: it is known that some parts of the compiler will blow up on unnatural inputs. The recommend practice for authors of OCaml code generators is to avoid extremely-long sequences of any language construct (sequences, list literals, array literals, toplevel definitions, etc.), and instead build equivalent programs by concatenating smaller groups of items -- with a reasonable limit baked in the generator.

Could you rewrite the generator for time_zone_data.ml to follow this recommended practice?

(Maybe someday the OCaml compiler will become more robust to unnatural data, but that requires a lot of effort and may in some cases decrease readability and maintainability of the compiler code, so this is not on the radar for now.)

@darrenldl
Copy link
Contributor

@gasche when you said toplevel definitions, is it per module or the total number of top level definitions?

@darrenldl
Copy link
Contributor

darrenldl commented Jan 23, 2021

There are 594 time zones, so afaict, either I build small number of definitions with long sequence of operations on a hashtable (which seems to blow up), or I build a lot of definitions with short sequence of operations (which seem to blow as well).

EDIT: assuming it's about total number of definitions

EDIT2: I can swap to using 594 strings, and deserialize the items during run time, but not sure if that fits within the provided recommendation either

@gasche
Copy link
Author

gasche commented Jan 23, 2021

The code currently looks like this:

let db =
  String_map.empty
  |> String_map.add "Cuba"
      [|
        ((-62167219200L), { is_dst = false; offset = (-19776) });
        ((-2524501832L), { is_dst = false; offset = (-19776) });
        ((-1402813824L), { is_dst = false; offset = (-18000) });
        [...]
     |]
  |> String_map.add "EET"
      [| ... |]
   |> String.map "GMT-0"
      [| ... |]
   |> [...]

Two things may be problematic here:

  • an unnaturally large sequence of function applications x |> f1 |> f2 |> ... |> fn
  • one or several unnaturally large array literals [| ... |]

Both of these can be limited in the code generator without changing the semantics of the program. For the array literal, you can rewrite [| ... ... ... |] into Array.concat [| [| ... |] ; [| ... |]; [| ... |] |] for example (I would use the ceiling of the square root of the total size as the maximum smaller-array size). For the function application, you can rewrite it as List.fold_left (fun m (name, data) -> String_map.add name data m) String.empty [ ... ] for example, using again a cautious-generation technique for [...].

I'm writing this without having seen precise failure logs (just "there is a stack overflow with flambda") and without having tried to reproduce the problem. My recommendation would be to start by trying to reproduce the issue, by compiling your program with flambda, possibly with -O3 if earlier levels do not fail.

@darrenldl
Copy link
Contributor

darrenldl commented Jan 24, 2021

Sorry I should have clarified that the modified code are in dev branch.

I have tried Array.concat but it still yields stack overflow (I only tried setting smaller array size to 100, however).

I'll have a look at the log

EDIT: So setting OCAMLRUNPARAM=b, I see the following

dune build @all
    ocamlopt tzdb-full/.timere_tzdb_full.objs/native/timere_tzdb.{cmx,o} (exit 2)
(cd _build/default && /home/darren/.opam/4.11.1+flambda/bin/ocamlopt.opt -w +a-4-9-29-37-40-42-44-48-50-32-30@8 -g -I tzdb-full/.timere_tzdb_full.objs/byte -I tzdb-full/.timere_tzdb_full.objs/native -I /home/darren/.opam/4.11.1+flambda/lib/containers -I /home/darren/.opam/4.11.1+flambda/lib/containers/monomorphic -I /home/darren/.opam/4.11.1+flambda/lib/seq -I tzdb-module/.timere_tzdb.objs/byte -I tzdb-module/.timere_tzdb.objs/native -intf-suffix .ml -no-alias-deps -opaque -open Timere_tzdb__timere_tzdb_full__ -o tzdb-full/.timere_tzdb_full.objs/native/timere_tzdb.cmx -c -impl tzdb-full/timere_tzdb.ml)
Fatal error: exception Stack overflow
Raised by primitive operation at Flambda_iterators.map_exprs_at_toplevel_of_program.loop.map_constant_set_of_closures in file "middle_end/flambda/flambda_iterators.ml", line 703, characters 37-1023
Called from Flambda_iterators.map_exprs_at_toplevel_of_program.loop in file "middle_end/flambda/flambda_iterators.ml", line 740, characters 25-38
Called from Flambda_iterators.map_exprs_at_toplevel_of_program.loop in file "middle_end/flambda/flambda_iterators.ml", line 740, characters 25-38
Called from Flambda_iterators.map_exprs_at_toplevel_of_program.loop in file "middle_end/flambda/flambda_iterators.ml", line 740, characters 25-38
(many repetitions of the above line)

@darrenldl
Copy link
Contributor

darrenldl commented Jan 24, 2021

Okay I've tried myriad combinations of things and not getting much of anywhere. Hm...

@darrenldl
Copy link
Contributor

darrenldl commented Jan 24, 2021

Okay I've reduced the range of generation, which seems to work on 4.11.1+flambda (not sure where to pass the -O3 flag). Could you confirm if it's now suitable?

https://raw.githubusercontent.com/daypack-dev/timere/main/gen-artifacts/time_zone_data.ml

@gasche
Copy link
Author

gasche commented Jan 24, 2021

Now the code generator uses a hashtable, and to avoid a large sequence of Hashtbl.add db "foo" [| ... |] you split the sequence using several intermediary let () = groups.

I don't know if the change is necessary (I would have assumed that the previous approach could also work, by splitting the calls to String_map.add), but I guess that it's good news that it now avoids any stack-overflow in the compiler. (Have you tried flambda's -O3 option which is more demanding?).

This is a weird problem because we don't really know what is "good enough", but if your code compiles with flambda -O3 on your system, I guess you could consider the issue solved and submit on opam again? If you get a report of another issue on some system, you can always investigate and tweak the limits again.

@darrenldl
Copy link
Contributor

darrenldl commented Jan 24, 2021

Now the code generator uses a hashtable, and to avoid a large sequence of Hashtbl.add db "foo" [| ... |] you split the sequence using several intermediary let () = groups.

I don't know if the change is necessary (I would have assumed that the previous approach could also work, by splitting the calls to String_map.add), but I guess that it's good news that it now avoids any stack-overflow in the compiler. (Have you tried flambda's -O3 option which is more demanding?).

The split approach didn't seem to work immediately, but also could be due to the total size of data was too large (didn't try said approach with the current reduced range of dates).

This is a weird problem because we don't really know what is "good enough", but if your code compiles with flambda -O3 on your system, I guess you could consider the issue solved and submit on opam again? If you get a report of another issue on some system, you can always investigate and tweak the limits again.

Sorry but I still have no clue how to pass -O3 to flambda - could you provide a pointer to how one would go about doing so?

EDIT: passing it via dune file doesn't seem to work, hm...I'll take a look later...

@gasche
Copy link
Author

gasche commented Jan 24, 2021

I think that adding the following as a root dune file should work:

(env
 (dev
  (ocamlopt_flags (:standard -O3))))

(See the dune documentation.)

@gasche
Copy link
Author

gasche commented Jan 24, 2021

The split approach didn't seem to work immediately, but also could be due to the total size of data was too large (didn't try said approach with the current reduced range of dates).

Note: Ideally I think that you should be able to include all the data you want, as long as it is split in reasonably-sized chunks at all "sequence" levels.

@darrenldl
Copy link
Contributor

I think that adding the following as a root dune file should work:

Thanks! I can confirm it can now build on 4.11.1+flambda -O3.

Note: Ideally I think that you should be able to include all the data you want, as long as it is split in reasonably-sized chunks at all "sequence" levels.

I don't think this is possible after tuning the parameters a fair bit.

@darrenldl
Copy link
Contributor

darrenldl commented Jan 24, 2021

I think the current fix suffices for now. I'll submit another PR to opam repository.

Thanks very much for your help and feedback!

@darrenldl
Copy link
Contributor

darrenldl commented Jan 27, 2021

@gasche Just in case it's of use to similar breaks in future for other projects: @Drup experimented with the code generation and concluded that the stack overflow crash on flambda (4.11.1) seemed to be related to number of expressions rather than sizes of literals (if I didn't misunderstand).

@gasche
Copy link
Author

gasche commented Jan 27, 2021

[minor note] I don't understand what changed in the generation strategy by looking at the PR #17, simplified explanations of the generated code shape (as I did above) would help. My impression is that the PR both changes the generation strategy and undoes some data reduction that you performed to help, and I don't know how to get readable diffs from github for the data-generation-strategy change only (trying to look at the output rather than generator code).

@Drup
Copy link
Contributor

Drup commented Jan 27, 2021

After several experiments, I concluded that the sheer number of constant sub-expression is the real problem. Flambda lifts everything to toplevel and has a traversal on toplevel expression that is not tail-rec, and crashes.

I decided upon a fairly radical solution: we build a store of the data in advance (in sexp), which we distribute with the sources. At build time, we load the data, put it into the ocaml data-structure we want in the end, and Marshall that in a new ML file. It works fine, it's more compact and is faster to load.

@gasche
Copy link
Author

gasche commented Jan 27, 2021

It does indeed sound nicer. But how do you locate the data at runtime, is it just in the install data of the library? (This makes the system less relocatable, but I guess that's okay; we don't really support static-linking anyway.) Or are you considering playing with embedding the string in the program?

@Drup
Copy link
Contributor

Drup commented Jan 27, 2021

At build time, we create a module whose only content is let s = "...."(containing the marshalled data), and we link it in the library.

@Drup
Copy link
Contributor

Drup commented Jan 27, 2021

Side benefit, after this whole ordeal (and some elbow grease), the size of the generated file is 2.5Mo instead of 4Mo. I don't think we can make it smaller without getting out succinct data-structures. Unboxed arrays would help a tiny bit though.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants