Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

bytecode -custom supply missing externs #12791

Merged
merged 1 commit into from Dec 3, 2023

Conversation

shayne-fletcher
Copy link
Contributor

Since ocaml-5.1.0, bytecode -custom executables fail to link if given -cc a C++ compiler.

$ cat > hello_world.ml
let () = print_string "Hello world!\n"
^D
$ ocamlc.opt -o hello-world ./hello_world.ml -custom -cc "clang++"
ld: Undefined symbols:
  _caml_builtin_cprim, referenced from:...
  _caml_names_of_builtin_cprim, referenced form:...
...

This is a consequence of certain symbols in generated code being assigned internal linkage by default. This PR addresses the issue by arranging to qualify these constants as extern. Fixes #12790

@xavierleroy
Copy link
Contributor

I have a problem with this fix: it produces dubious C code. At least according to gcc and to clang, an extern declaration with an initializer is a likely error in C:

● cat foo.c
extern const char * x[2] = { "a", "b" };
● clang -std=c11 -Wall -c foo.c
foo.c:1:21: warning: 'extern' variable has an initializer [-Wextern-initializer]
    1 | extern const char * x[2] = { "a", "b" };
      |                     ^
1 warning generated.
● gcc -std=c11 -Wall -c foo.c
foo.c:1:21: warning: ‘x’ initialized and declared ‘extern’
    1 | extern const char * x[2] = { "a", "b" };
      |                     ^

@shayne-fletcher
Copy link
Contributor Author

i'll refine the fix. let's make extern conditional on #ifdef __cplusplus.

@shayne-fletcher
Copy link
Contributor Author

shayne-fletcher commented Nov 27, 2023

i'll refine the fix. let's make extern conditional on #ifdef __cplusplus.

i've pushed and verified that change. the generated code now reads like,

#if defined __cplusplus
extern const c_primitive caml_builtin_cprim[] = {
#else
const c_primitive caml_builtin_cprim[] = {
#endif

it's taking a long time for the changes to get reflected here. i suspect github is having an incident (downdetector seems to indicate so).

@shayne-fletcher shayne-fletcher force-pushed the missing-extern-fix branch 3 times, most recently from 9445317 to 4586466 Compare November 27, 2023 20:22
@MisterDA
Copy link
Contributor

Two suggestions: you could also put just the extern under the #ifdef __cplusplus:

#if defined __cplusplus
extern
#endif
const c_primitive caml_builtin_cprim[] = {

or separate declaration and definition:

extern const c_primitive caml_builtin_cprim[];
const c_primitive caml_builtin_cprim[] = {

both of these work with C and C++.

@shayne-fletcher
Copy link
Contributor Author

shayne-fletcher commented Nov 27, 2023

... you could also put just the extern under the #ifdef __cplusplus

i agree. i think (1) is ever so slightly less verbose and the best phrasing so far.

#if defined __cplusplus
extern
#endif
const char * const caml_names_of_builtin_cprim[] = {

have pushed that. thanks for having a look @MisterDA !

@MisterDA
Copy link
Contributor

MisterDA commented Nov 27, 2023

I wonder if the symbol should be protected against C++ name mangling with extern "C". Not sure if variable names are mangled.

@shayne-fletcher
Copy link
Contributor Author

shayne-fletcher commented Nov 27, 2023

I wonder if the symbol should be protected against C++ name mangling with extern "C". Not sure if variable names are mangled.

the symbols get C mangling correctly as things are (_caml_builtin_cprim, _caml_names_of_builtin_cprim). this is because in the case of C++, all the definitions of a 'foo.camlprim.c' are made to reside in an extern "C" block.

@Octachron
Copy link
Member

@MisterDA having a look at the change this seems mostly a safe change that could be backported to 5.1, what is your opinion ?

@xavierleroy
Copy link
Contributor

I still think it is wrong to compile C code contained in a .c file as if it were C++ code. Actually, clang++ warns about that:

$ clang++ -c foo.c
clang: warning: treating 'c' input as 'c++' when in C++ mode, this behavior is deprecated [-Wdeprecated]

ocamlc -custom generates a temporary C file that must be compiled by a C compiler.

@MisterDA
Copy link
Contributor

I guess the correct invocation would be c++ -x c, so ocamlc.opt -o hello-world ./hello_world.ml -custom -cc "clang++ -x c".

@shayne-fletcher
Copy link
Contributor Author

It is indeed wrong to compile a C file with a C++ compiler. But... you need to compile a C file, then link C++ objects. That requires two distinct command lines. The fact that currently OCaml uses the linker to compile a C file is the root cause, as it mixes the compilation of one language (C) with the linking of another (C++). One option would be to have two distinct command lines, and compile the .c file in a separate operation. That seems a more invasive change than this diff.

We did try -x c but that causes any object files on the command line to be treated as C files, which causes all kinds of problems.

@shayne-fletcher
Copy link
Contributor Author

We did try -x c but that causes any object files on the command line to be treated as C files, which causes all kinds of problems.

$ cat > hello_world.ml
let () = print_string "Hello world!\n"
^D
$ cat > foo.cpp
void f() {}
^D
$ clang++ -c -o foo.o foo.cpp
$ ocamlc.opt -c -o hello.cmo hello_world.ml
$ ocamlc.opt -o hello hello.cmo foo.o -custom -cc "clang++ -x c"
foo.o:1:1: error: source file is not valid UTF-8
<CF><FA><ED><FE><U+000C><U+0000><U+0000><U+0001><U+0000><U+0000><U+0000><U+0
...

@xavierleroy
Copy link
Contributor

xavierleroy commented Nov 29, 2023

you need to compile a C file, then link C++ objects.

Thanks for pointing this out. For the linking phase we need g++ or clang++, but then they interpret .c source files as C++ files. I agree we're kind of cornered here...

@Octachron
Copy link
Member

@xavierleroy , does your comment count as a reluctant approval for this PR ?

Copy link
Contributor

@xavierleroy xavierleroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with the proposed changes to bytecomp/symtable.ml. Some suggestions below.

Changes Outdated Show resolved Hide resolved
runtime/gen_primsc.sh Outdated Show resolved Hide resolved
Copy link
Contributor

@xavierleroy xavierleroy left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @shayne-fletcher for the updates. I'm fine with the current state of this PR. Now merging!

@xavierleroy xavierleroy merged commit 16969c2 into ocaml:trunk Dec 3, 2023
1 check passed
@shayne-fletcher shayne-fletcher deleted the missing-extern-fix branch December 3, 2023 18:45
@Octachron
Copy link
Member

@xavierleroy , I am considering cherry-picking to 5.1 to avoid a feature hole between OCaml 5.0 and OCaml 5.2. Any objections?

@xavierleroy
Copy link
Contributor

No objections, thanks for asking.

Octachron pushed a commit that referenced this pull request Dec 5, 2023
Fixes: #12790

Co-authored-by: Shayne Fletcher <shaynefletcher@meta.com>
(cherry picked from commit 16969c2)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[ocaml-5.1.0]: Bytecode -custom exes fail to link with cc a C++ compiler
4 participants