Skip to content

Explicitly compile the primitives object before linking bytecode executables#12896

Merged
dra27 merged 5 commits into
ocaml:trunkfrom
dra27:compiled-primitives2
Jan 16, 2024
Merged

Explicitly compile the primitives object before linking bytecode executables#12896
dra27 merged 5 commits into
ocaml:trunkfrom
dra27:compiled-primitives2

Conversation

@dra27
Copy link
Copy Markdown
Member

@dra27 dra27 commented Jan 15, 2024

This PR forms a small part of the "Relocatable OCaml" series of patches, but is opened at the moment because it is a solution to a problem affecting the restoration of the MSVC port of OCaml. The PR is virtually impossible to review as a whole - please see after the background for advice on reviewing commit-by-commit.

Background

When compiling a custom bytecode runtime, a C file is generated containing the combined primitives table of the runtime (camlprim.c). Presently, this is passed directly to Ccomp.call_linker and is written in a way which avoids using any of the C headers. This has resulted in increasing amounts of code duplication - the typedef for intnat (already in caml/config.h) has to be inferred and the linker command line in Bytelink has to handle -fdebug-prefix-map, duplicating logic already in Ccomp to handle this. The duplication gets worse with the relocatable compiler patches.

The MSVC port of OCaml at present requires -experimental:c11atomics -std:c11 to be added in order to enable the experimental C11 atomics support which we rely on. While this doesn't affect the compilation of camlprim.c in -custom mode (because of the duplicated typedef), it renders the C file for -output-obj, which does include header files uncompilable. The issue affecting MSVC is that Ccomp.call_linker quite rightly does not pass Config.ocamlc_cflags to the compiler.

Changes

This PR alters ocamlc's link process so that primitives file is explicitly compiled using Ccomp.compile_file. That eliminates the duplicated debug prefix map code and also allows caml/mlvalues.h to be used to get all the required definitions. Ccomp.compile_file obviously does pass Config.ocamlc_cflags, which fixes the problem for the MSVC port.

For the build system, this means when building with -custom that we must ensure that runtime directory has been included with -I so that the headers are available (the build system does not use -output-obj or any of the other modes which already require access to the headers, which is why this has not been necessary before).

Although the statistics for the PR may therefore hide it, the overall effect is to reduce code/logic duplication 🙂

Reviewing

Given that this PR alters some of the C files generated by the bytecode linker, the opportunity is being taken to modernise some of this code a little bit. Since #5146, various C files put \n at the beginning of lines which, while fixing the issue introduced in OCaml 3.12 by the unescaped newline warning, leaves the intent of the code somewhat obscured. Now that #12514 is merged, we can instead take advantage of OCaml 4.02's quoted strings, making both the C code being emitted, and the formatting of the lines, somewhat less obscure. I was partly motivated to put this change in having misread \n markings and initially generated syntactically invalid code while testing this.

The first commit is intended to result in no changes to the C files generated, which means that the very noisy diff can be tested more easily with:

$ echo 'print_endline "hello, world"' > hello.ml

$ runtime/ocamlrun ./ocamlc hello.ml -I runtime -I stdlib -dcamlprimc -custom -o hello
# hello.camlprim.c should be identical to the version generated by trunk

$ runtime/ocamlrun ./ocamlc hello.ml -I runtime -I stdlib -output-obj -o hello.c
# hello.c should be identical to the version generated by trunk

The second commit then applies a few very minor changes to the C files - the C++ guard is consistently moved to the start of the file, and a few extra blank lines are added

Difference in generated C file
--- hello.c
+++ hello.c
@@ -1,18 +1,19 @@
-#define CAML_INTERNALS
-#define CAMLDLLIMPORT
-
 #ifdef __cplusplus
 extern "C" {
 #endif
+
+#define CAML_INTERNALS
+#define CAMLDLLIMPORT
+
 #define caml_get_public_method caml__get_public_method
 #define caml_set_oo_id caml__set_oo_id
 #include <caml/mlvalues.h>
 #include <caml/startup.h>
 #include <caml/sys.h>
 #include <caml/misc.h>
-
 #undef caml_get_public_method
 #undef caml_set_oo_id
+
 static int caml_code[] = {
 0x00000054, 0x000002df, 0x00000000, 0x00000057, 0x000f0001, 0x00000010,
 0x00000013, 0x0000001c, 0x00000025, 0x0000002e, 0x00000037, 0x00000040,
@@ -1828,7 +1829,9 @@
 extern value caml_weak_get_copy(void);
 extern value caml_weak_set(void);
 extern value caml_xdg_defaults(void);
+
 typedef value (*c_primitive)(void);
+
 #if defined __cplusplus
 extern
 #endif
@@ -2270,6 +2273,7 @@
   caml_weak_set,
   caml_xdg_defaults,
   0 };
+
 #if defined __cplusplus
 extern
 #endif

The third commit is then the main part of the change - when looking at the diff in Bytelink.build_custom_runtime, especially for the deleted code, it is worth having Ccomp.compile_file side-by-side (i.e. by inspection one should be able to confirm that the C file is then being compiled in a similar way to how it was being linked before).

The final commit, proposes an alternate mechanism for dealing with the caml_get_public_method and caml_set_oo_id primitives in mlvalues.h to the one which was added in #12509. It is an optional addition, but motivated by the fact that the guarded_primitives mechanism now appears in two places in bytelink.ml.

Copy link
Copy Markdown
Contributor

@lthls lthls left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good to me.
I think it's clearly better to not rely on the linker to compile C files.
I believe that we can also remove Ccomp.linker_is_flexlink now.

Comment thread bytecomp/bytelink.ml
let flag =
[Printf.sprintf "-fdebug-prefix-map=%s=camlprim.c" prim_name]
in
if Ccomp.linker_is_flexlink then
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this was the only use of Ccomp.linker_is_flexlink

Copy link
Copy Markdown
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, indeed it is! Another victory - no longer needing a function to work out if the C compiler pretending to be a linker is in fact a linker pretending to be a C compiler 🤯

MisterDA and others added 5 commits January 16, 2024 14:56
Co-authored-by: David Allsopp <david.allsopp@metastack.com>
Introduces a few convenient blank lines to improve the readability of
camlprim.c and the C file generated for `-output-obj`.
In bytecode, when compiling a custom runtime (when using any of -custom,
-output-obj, -output-complete-obj or -output-complete-exe), the
camlprim.c file is passed directly to the linker. When this was
originally done, the cost was a single typedef for the value type, but
now this process involves considerable code duplication both for the
-fdebug-prefix-map implementation and for the full definition of the
value type.

The primitives file is now explicitly compiled, which means it gets
treated in the same way as a C file passed to ocamlc and in particular
can `#include <caml/mlvalues.h>` to remove the definitions otherwise
duplicated with <caml/config.h>. Using `Ccomp.compile_file` also allows
the duplicate machinery for `-fdebug-prefix-map` to be deleted from
Bytelink.
Instead of the `guarded_primitives` mechanism in bytelink.ml, don't
declare the primitives `caml_get_public_method` and `caml_set_oo_id`
in mlvalues.h when `CAML_INTERNALS_NO_PRIM_DECLARES` is defined so
that it can be included as-is in `camlprim.c` (where those primitives
are declared at a very generic type instead of their real type)
without needing `#define`/`#undef` wrappers.
Since we no longer pass C files to the "linker"!
@dra27 dra27 force-pushed the compiled-primitives2 branch from d119e81 to 5521385 Compare January 16, 2024 15:00
@dra27
Copy link
Copy Markdown
Member Author

dra27 commented Jan 16, 2024

Thanks @lthls! Ccomp.linker_is_flexlink removed, and Changes updated.

@dra27 dra27 merged commit b3b892e into ocaml:trunk Jan 16, 2024
@dra27 dra27 deleted the compiled-primitives2 branch January 16, 2024 15:42
dra27 added a commit to dra27/ocaml that referenced this pull request Feb 1, 2024
Explicitly compile the primitives object before linking bytecode executables

(cherry picked from commit b3b892e)
dra27 added a commit to dra27/ocaml that referenced this pull request Feb 1, 2024
Explicitly compile the primitives object before linking bytecode executables

(cherry picked from commit b3b892e)
@dra27 dra27 mentioned this pull request Feb 1, 2024
2 tasks
dra27 added a commit to dra27/ocaml that referenced this pull request Feb 13, 2024
Explicitly compile the primitives object before linking bytecode executables

(cherry picked from commit b3b892e)
shym added a commit to shym/ocaml that referenced this pull request Jul 10, 2024
The `runtime` directory must be `-I`ncluded only since ocaml#12896 and only
in ocamltest.
When building an OCaml cross compiler, two OCaml compilers are really
involved, where the non-cross compiler is used to build the cross one.
Most cross-compiler projects do that by overriding variables such as
`CAMLOPT` to point to the non-cross compiler during the build of the
cross compilers. In these use cases, adding the explicit `-I runtime`
makes them generate the cross compilers linking in the cross runtime
(which naturally fails) instead of the build/host runtime that the
non-cross compiler would use without `-I runtime`. To re-enable those
use cases, this patch moves the addition only on `ocamltest/%` targets.
shym added a commit to shym/ocaml that referenced this pull request Jul 16, 2024
The `runtime` directory must be `-I`ncluded only since ocaml#12896 and only
in ocamltest.
When building an OCaml cross compiler, two OCaml compilers are really
involved, where the non-cross compiler is used to build the cross one.
Most cross-compiler projects do that by overriding variables such as
`CAMLOPT` to point to the non-cross compiler during the build of the
cross compilers. In these use cases, adding the explicit `-I runtime`
makes them generate the cross compilers linking in the cross runtime
(which naturally fails) instead of the build/host runtime that the
non-cross compiler would use without `-I runtime`. To re-enable those
use cases, this patch moves the addition only on `ocamltest/%` targets.
shym added a commit to shym/ocaml that referenced this pull request Jul 17, 2024
The `runtime` directory must be `-I`ncluded only since ocaml#12896 and only
in ocamltest.
When building an OCaml cross compiler, two OCaml compilers are really
involved, where the non-cross compiler is used to build the cross one.
Most cross-compiler projects do that by overriding variables such as
`CAMLOPT` to point to the non-cross compiler during the build of the
cross compilers. In these use cases, adding the explicit `-I runtime`
makes them generate the cross compilers linking in the cross runtime
(which naturally fails) instead of the build/host runtime that the
non-cross compiler would use without `-I runtime`. To re-enable those
use cases, this patch moves the addition only on `ocamltest/%` targets.
shym added a commit to shym/ocaml that referenced this pull request Jul 19, 2024
The `runtime` directory must be `-I`ncluded only since ocaml#12896 and only
in ocamltest.
When building an OCaml cross compiler, two OCaml compilers are really
involved, where the non-cross compiler is used to build the cross one.
Most cross-compiler projects do that by overriding variables such as
`CAMLOPT` to point to the non-cross compiler during the build of the
cross compilers. In these use cases, adding the explicit `-I runtime`
makes them generate the cross compilers linking in the cross runtime
(which naturally fails) instead of the build/host runtime that the
non-cross compiler would use without `-I runtime`. To re-enable those
use cases, this patch moves the addition only on `ocamltest/%` targets.
shym added a commit to shym/ocaml that referenced this pull request Jul 19, 2024
The `runtime` directory must be `-I`ncluded only since ocaml#12896 and only
in ocamltest.
When building an OCaml cross compiler, two OCaml compilers are really
involved, where the non-cross compiler is used to build the cross one.
Most cross-compiler projects do that by overriding variables such as
`CAMLOPT` to point to the non-cross compiler during the build of the
cross compilers. In these use cases, adding the explicit `-I runtime`
makes them generate the cross compilers linking in the cross runtime
(which naturally fails) instead of the build/host runtime that the
non-cross compiler would use without `-I runtime`. To re-enable those
use cases, this patch moves the addition only on `ocamltest/%` targets.
shym added a commit to shym/ocaml that referenced this pull request Aug 20, 2024
The `runtime` directory must be `-I`ncluded only since ocaml#12896 and only
in ocamltest.
When building an OCaml cross compiler, two OCaml compilers are really
involved, where the non-cross compiler is used to build the cross one.
Most cross-compiler projects do that by overriding variables such as
`CAMLOPT` to point to the non-cross compiler during the build of the
cross compilers. In these use cases, adding the explicit `-I runtime`
makes them generate the cross compilers linking in the cross runtime
(which naturally fails) instead of the build/host runtime that the
non-cross compiler would use without `-I runtime`. To re-enable those
use cases, this patch moves the addition only on `ocamltest/%` targets.
shym added a commit to shym/ocaml that referenced this pull request Aug 20, 2024
The `runtime` directory must be `-I`ncluded only since ocaml#12896 and only
in ocamltest.
When building an OCaml cross compiler, two OCaml compilers are really
involved, where the non-cross compiler is used to build the cross one.
Most cross-compiler projects do that by overriding variables such as
`CAMLOPT` to point to the non-cross compiler during the build of the
cross compilers. In these use cases, adding the explicit `-I runtime`
makes them generate the cross compilers linking in the cross runtime
(which naturally fails) instead of the build/host runtime that the
non-cross compiler would use without `-I runtime`. To re-enable those
use cases, this patch moves the addition only on `ocamltest/%` targets.
shym added a commit to shym/ocaml that referenced this pull request Sep 25, 2024
The `runtime` directory must be `-I`ncluded only since ocaml#12896 and only
in ocamltest.
When building an OCaml cross compiler, two OCaml compilers are really
involved, where the non-cross compiler is used to build the cross one.
Most cross-compiler projects do that by overriding variables such as
`CAMLOPT` to point to the non-cross compiler during the build of the
cross compilers. In these use cases, adding the explicit `-I runtime`
makes them generate the cross compilers linking in the cross runtime
(which naturally fails) instead of the build/host runtime that the
non-cross compiler would use without `-I runtime`. To re-enable those
use cases, this patch moves the addition only on `ocamltest/%` targets.
shym added a commit to shym/ocaml that referenced this pull request Sep 30, 2024
The `runtime` directory must be `-I`ncluded only since ocaml#12896 and only
in ocamltest.
When building an OCaml cross compiler, two OCaml compilers are really
involved, where the non-cross compiler is used to build the cross one.
Most cross-compiler projects do that by overriding variables such as
`CAMLOPT` to point to the non-cross compiler during the build of the
cross compilers. In these use cases, adding the explicit `-I runtime`
makes them generate the cross compilers linking in the cross runtime
(which naturally fails) instead of the build/host runtime that the
non-cross compiler would use without `-I runtime`. To re-enable those
use cases, this patch moves the addition only on `ocamltest/%` targets.
shym added a commit to shym/ocaml that referenced this pull request Oct 2, 2024
The `runtime` directory must be `-I`ncluded only since ocaml#12896 and only
in ocamltest.
When building an OCaml cross compiler, two OCaml compilers are really
involved, where the non-cross compiler is used to build the cross one.
Most cross-compiler projects do that by overriding variables such as
`CAMLOPT` to point to the non-cross compiler during the build of the
cross compilers. In these use cases, adding the explicit `-I runtime`
makes them generate the cross compilers linking in the cross runtime
(which naturally fails) instead of the build/host runtime that the
non-cross compiler would use without `-I runtime`. To re-enable those
use cases, this patch moves the addition only on `ocamltest/%` targets.
shym added a commit to shym/ocaml that referenced this pull request Oct 2, 2024
The `runtime` directory must be `-I`ncluded only since ocaml#12896 and only
in ocamltest.
When building an OCaml cross compiler, two OCaml compilers are really
involved, where the non-cross compiler is used to build the cross one.
Most cross-compiler projects do that by overriding variables such as
`CAMLOPT` to point to the non-cross compiler during the build of the
cross compilers. In these use cases, adding the explicit `-I runtime`
makes them generate the cross compilers linking in the cross runtime
(which naturally fails) instead of the build/host runtime that the
non-cross compiler would use without `-I runtime`. To re-enable those
use cases, this patch moves the addition only on `ocamltest/%` targets.
shym added a commit to shym/ocaml that referenced this pull request Oct 3, 2024
The `runtime` directory must be `-I`ncluded only since ocaml#12896 and only
in ocamltest.
When building an OCaml cross compiler, two OCaml compilers are really
involved, where the non-cross compiler is used to build the cross one.
Most cross-compiler projects do that by overriding variables such as
`CAMLOPT` to point to the non-cross compiler during the build of the
cross compilers. In these use cases, adding the explicit `-I runtime`
makes them generate the cross compilers linking in the cross runtime
(which naturally fails) instead of the build/host runtime that the
non-cross compiler would use without `-I runtime`. To re-enable those
use cases, this patch moves the addition only on `ocamltest/%` targets.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants