[RFC] custom runtimes & reproducible builds #1845
I like the idea. However, why "camlprimc" as a name? I find it kind of confusing... |
Basically, lack of imagination... the name of the temporary file being `"camlprimXYZ.c"`. Suggestions are welcome! |
Xavier Clerc (2018/06/19 05:45 -0700):
… Basically, lack of imagination... the name of the temporary file being `"camlprimXYZ.c"`.
Suggestions are welcome!
-dprims? -dprimtbl? -dprimitives?
|
I can see some use for keeping the temporary file, for debugging purposes, but if you are only interested in reproducible builds, why do you conflate the idea of using a stable name with that of keeping the temporary file around? I suspect that, similarly, you might want a way to use a stable name for the "startup" object file, without necessarily keeping the file itself after compilation. |
-dstabletmps ? |
Would it be risky to always use a name derived from the target? |
@alainfrisch I understand (and partially share) your concerns. When you suggest to always derive the file name from the |
I was thinking indeed about creating the file in the target folder. Alternatively, one could create a temporary directory, and create files with stable names in it -- or does the full path appear in the object file? |
As far as
However, I think it will cause other problems; e.g. if your |
An alternative would be to use the The patch would be something like the following (plus configuration
|
I think we may want both:
The question is: in the cases where the user did not ask us to preserve the file, should we generate it in a stable position and then remove it, or should we generate a random temporary file and stabilize post-facto? I think both options have merits (e.g. in case the compiler fails abruptly at this point, maybe having the file around is actually nice?). |
For most build systems the latter is better. For instance, a command that evaluates the glob |
From both @alainfrisch and @gasche feedback, I understand that As a consequence, I would propose to extend this PR with the |
How portable is the flag? Does it work with slightly older GCC versions, does it really work with Clang, and what about the Windows world? Wouldn't it make sense to check its availability at configure time? I think that we could enable it by default on all systems that support it. |
From what I see, the flag was introduced in gcc 4.3.0.
It does (tested on macOS by @diml).
Indeed, that's what I proposed -- just wanted to gather feedback
I concur; in theory, some people might rely on the fact that each |
(I just noticed that the |
I have updated this PR with the detection and use of |
bytecomp/bytelink.ml
Outdated
@@ -536,8 +536,13 @@ let link_bytecode_as_c ppf tolink outfile =

let build_custom_runtime prim_name exec_name =
let runtime_lib = "-lcamlrun" ^ !Clflags.runtime_variant in
let debug_prefix_map =
if Config.c_has_debug_prefix_map then
We should probably not pass this option if `-dcamlprimc` is passed as well.
I am not sure; it makes the build reproducible even if you change
the output name.
I realized that my definition of "reproducible" was slightly
broken: I understood it as "produces the same contents"
while it is "produces the contents in the same file".
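The option under discussion in this thread can be sketched as a small stand-alone function. This is only an illustration: `debug_prefix_map_args`, its labelled parameters, and the `camlprim.c` replacement name are assumptions, and the sketch follows the reviewer's suggestion of skipping the flag when the generated file is kept.

```ocaml
(* Hypothetical stand-alone version: both booleans are plain parameters here,
   whereas the diff above reads Config.c_has_debug_prefix_map. *)
let debug_prefix_map_args ~has_flag ~keep_file ~prim_name =
  if has_flag && not keep_file then
    (* Map the random temporary name to a fixed one in the debug info.
       Using "camlprim.c" as the replacement is an assumption. *)
    [Printf.sprintf "-fdebug-prefix-map=%s=camlprim.c" prim_name]
  else
    (* Unsupported flag, or a kept file whose name is already stable. *)
    []
```

The resulting list would simply be appended to the C compiler arguments; an empty list leaves the command line unchanged.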
configure
Outdated
@@ -409,6 +409,7 @@ export cc verbose
# Determine the C compiler family (GCC, Clang, etc)

ccfamily=`$cc -E cckind.c | grep '^[a-z]' | tr -s ' ' '-'`
cc_has_debug_prefix_map=`$cc -E cc_has_debug_prefix_map.c | grep '^[a-z]'`
What about just passing `-fdebug-prefix-map` and seeing if the compiler fails or not?
Well, I reused the logic from `config/auto-aux/cckind.c`.
Actually, a third possibility would be to do all the work
in the aforementioned file.
I don't have a strong opinion.
If this is a concern, one could use a different extension (*.tmp) -- or no extension at all -- for the temporary file and pass Also, since only the basename seems to matter, one could create a temporary directory and put the file in it. |
@alainfrisch Another possibility we briefly considered was to |
Apparently not possible with msvc (https://social.msdn.microsoft.com/Forums/vstudio/en-US/c6ee0a6c-2354-4fe7-9723-366f886c3c90/about-compilercl-can-input-a-source-code-to-clexe-from-stdin?forum=vclanguage). But how would you do that without the Unix library, anyway? |
Thanks for the pointer; however, a decade later, the feature
Sorry I was not clear: we would generate a file, and then basically (By the way, doesn't your directory-creation-based solution suffer |
Ok, understood. Is this much better than creating the temporary file in the target folder, with a name derived from the target, and without the .c extension?
Indeed, I did not realize that the stdlib does not allow creating a directory, ... |
I'm a bit confused because I am not sure what problems of the proposed solution(s) @xclerc and @alainfrisch are discussing. My understanding of the current decisions is as follows:
As far as I can tell this describes the current state of the PR which sounds fine to me, modulo the discussion on how to best implement debug-prefix-map support detection in Is the rest of the discussion trying to find other approaches that are more portable to compilers which do not support (It may be useful to review the other temporary files produced by the compiler and see whether the same approach can be used. In particular, are there other C files that could benefit from the debug-prefix-map treatment?) |
Yes, I think this is about finding an alternative, perhaps simpler and more portable solution. Also, if we add the possibility to tell the compiler to keep the file around, it would just amount to not deleting it. This seems simpler and more robust than having two different ways to generate the file name and to compile it depending on whether the file should be kept around for debugging purposes. |
No, it is not; it only allows you to get reproducible builds with |
Btw https://gcc.gnu.org/onlinedocs/gcc/Debugging-Options.html says:
This option seems to be used here to map the full filename, while the text above mentions only mapping directory names. Is that a documented/stable-enough behavior? |
Note also the hint about the more recent -ffile-prefix-map, which would be needed if the temporary C code used |
Note that this is also not completely portable. See #568 for a discussion. |
I will have a look - I will also (in another PR), use |
I did not use
|
… a non-C output.
I did indeed check for temporary files. The native part of the
I have updated this PR accordingly. |
I have changed the detection logic to follow @diml's advice. |
raise x
end else begin
let basename = Filename.chop_extension output_name in
let temps = ref [] in
let c_file =
let c_file, stable_name =
It seems to me that `stable_name` should be an `option`
Possibly, however it is not an option in `build_custom_runtime`
(one can only avoid passing `-fdebug-prefix-map` by keeping
the generated file).
Sorry, I misinterpreted your comment as command-line option
rather than `string option` parameter -- I have pushed a new
version.
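The `string option` reading of the suggestion can be sketched as follows. Both helpers, their labelled parameters, and the `camlobj` names are hypothetical; this illustrates the idea and is not the code that was pushed.

```ocaml
(* Hypothetical helper, for illustration only: pick the C file to generate.
   A temporary file carries the stable name it should appear under in the
   debug info; a file with a user-visible name needs no mapping. *)
let choose_c_file ~keep_file ~basename =
  if keep_file then (basename ^ ".c", None)
  else (Filename.temp_file "camlobj" ".c", Some "camlobj.c")

(* An absent stable name simply yields no extra compiler argument. *)
let prefix_map_args ~c_file ~stable_name =
  match stable_name with
  | Some name -> [Printf.sprintf "-fdebug-prefix-map=%s=%s" c_file name]
  | None -> []
```

With an option type, the "no mapping needed" case is explicit in the signature rather than encoded as an empty string or a separate boolean.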
@alainfrisch regarding portability, I admit that I don't know what other C compilers are doing and whether they are affected at all by this issue. For instance, does the msvc compiler store absolute paths in object files? It seems to me that the proposed solution is non-invasive: it doesn't change the way things are compiled and simply passes the right option to the C compiler to make builds more reproducible, so I'm in favor of merging this PR as is.
We tested it with both gcc and clang, so I tend to think we can rely on the current behavior. |
The current patch looks good to me. @xclerc can you add a changelog entry? |
Entry added. |
When using the `-custom` option of `ocamlc`, a C file containing
the information about primitives is generated. The file name
comes from a call to `Filename.temp_file`, meaning that successive
invocations will generate different file names.
The problem, with respect to reproducible builds, is that the file
name ultimately appears in the produced binary, making the build
non-reproducible.
This PR introduces `-dcamlprimc`, a command-line switch similar
to `ocamlopt`'s `-dstartup`, to keep the generated file (whose name
is then derived from the specified compiler output).