Merge the runtime directories #1852
This PR merges the
This has been achieved in four steps, each of them being implemented
This includes giving better names to the variables listing files.
Also, this first commit makes sure the object files get a name which
Five files were present with the same basename under both
This step leaves
This mainly involves moving all the files from
As a sidenote: commit 3 leaves the repository in a state where
Alain Frisch (2018/06/21 06:30 -0700):
I like the change, but can you make the rationale explicit? What are the expected benefits?
Oh, sometimes I forget the simplest things. Many thanks for your interest and the good question! Let me go from the smallest to the biggest things.
1. No need for symlinks / copies any longer.
2. The build systems in these two directories were very similar, so there is a clear gain in gathering everything in the same directory. For example, the macro that compiles C files now factors out no fewer than 10 make rules and will make future changes much easier than before (e.g. re-introducing the CFLAGS variable, etc.).
3. In the long run: I think there is a project / hope to bring the two runtimes closer to each other. Having them in the same directory will make such a change easier.
4. Should we want to add another runtime, I think it makes more sense to have everything in the same directory than to have to copy / duplicate it, which is cumbersome and difficult to maintain (error-prone when making changes).
Mark Shinwell (2018/06/21 06:55 -0700):
I would suggest that we agree on a common suffix for native and bytecode files in the `runtime/` directory -- there seem to be a couple of different conventions at present.
What do you mean? As far as I can tell, in runtime, those C files that are backend-specific are suffixed with either `_native` or `_bytecode`. What is it that you are referring to?
As an aside, something that I trip up on frequently is that the header files are in a `caml/` subdirectory. Could these just go directly in `runtime/` too? (Or if not, maybe we could rename `caml/` somewhere else -- possibly to `include/`, or even `runtime_include/` at the top level.)
I have no strong opinion so I'll let others speak. Let me just say that:
- If a change is considered desirable, then I'd suggest keeping it for another PR.
- I kind of like the way it's done now because you can `#include <caml/something.h>` both when you are working in the compiler and when you write bindings relying on the installed headers.
@alainfrisch: to complete what I wrote about the rationale, I think
Xavier Leroy (2018/06/21 10:12 -0700):
Re: include files, we need them in a `caml/` subdirectory so that we can write `#include <caml/mlvalues.h>`. `runtime/caml` is a better (easier to find!) place than `byterun/ocaml`,
It was `byterun/caml` and not `byterun/ocaml`, to be fair. In other words, this PR does not change anything re: the path to includes, except for the byterun->runtime renaming.
but I agree with @mshinwell that `include/caml` would be an even better place.
Are you suggesting an `include` folder at the toplevel of the source tree? I'm a bit skeptical, I have to say.
Actually I looked into this a while ago but didn't finish. Let's wait until the fate of this PR is decided before giving `include/caml` another try.
Well, if there is agreement that it's a better place, I can definitely work on the change. But to me, `runtime/caml` actually makes a lot of sense given that the runtime is in C, so I find it more than reasonable that the headers are there. But then it's true there is an `include` folder at the toplevel of the Linux kernel's source tree, if that matters.
Xavier Leroy (2018/06/21 10:18 -0700):
xavierleroy commented on this pull request.
Some not-very-consistent file names.
> @@ -74,12 +74,12 @@ ASPPFLAGS += -DMODEL_$(MODEL)
>  endif
>  NATIVE_C_SOURCES := $(addsuffix .c, \
> - startup_aux startup main fail roots signals signals_asm misc freelist \
> - major_gc minor_gc memory alloc compare ints floats str array io extern \
> - intern hash sys parsing gc_ctrl md5 obj lexing $(UNIX_OR_WIN32) printexc \
> - callback weak compact finalise custom globroots backtrace_prim backtrace \
> - natdynlink debugger meta dynlink clambda_checks spacetime \
> - spacetime_snapshot afl bigarray)
> + startup_aux startup_native main fail_native roots_native signals \
> + signals_asm misc freelist major_gc minor_gc memory alloc compare ints \
> + floats str array io extern intern hash sys parsing gc_ctrl md5 obj \
> + lexing $(UNIX_OR_WIN32) printexc callback weak compact finalise custom \
> + globroots backtrace_prim_native backtrace natdynlink debugger meta \
> + dynlink clambda_checks spacetime_native spacetime_snapshot afl bigarray)
Why `fail_native` but `signals_asm`?
Because I did the change in a conservative way, renaming only those files which were present in the two directories with the same base name and didn't touch the other ones.
It could make sense to have the same `_native` suffix for all files.
Sure! If this is what is desired I can definitely do that. Could that be part of a further PR, though?
> @@ -74,12 +74,12 @@ ASPPFLAGS += -DMODEL_$(MODEL)
>  endif
>  NATIVE_C_SOURCES := $(addsuffix .c, \
> - startup_aux startup main fail roots signals signals_asm misc freelist \
> - major_gc minor_gc memory alloc compare ints floats str array io extern \
> - intern hash sys parsing gc_ctrl md5 obj lexing $(UNIX_OR_WIN32) printexc \
> - callback weak compact finalise custom globroots backtrace_prim backtrace \
> - natdynlink debugger meta dynlink clambda_checks spacetime \
> - spacetime_snapshot afl bigarray)
> + startup_aux startup_native main fail_native roots_native signals \
> + signals_asm misc freelist major_gc minor_gc memory alloc compare ints \
> + floats str array io extern intern hash sys parsing gc_ctrl md5 obj \
> + lexing $(UNIX_OR_WIN32) printexc callback weak compact finalise custom \
> + globroots backtrace_prim_native backtrace natdynlink debugger meta \
> + dynlink clambda_checks spacetime_native spacetime_snapshot afl bigarray)
Re `backtrace_prim_native`: why not just `backtrace_native`?
Because, as you know, the prim is kind of the low-level part of backtrace, so I tried to be conservative here, too. But, if you think it's fine to get rid of the prim I can do that too. Again, my preference would be to do the name adjustments in a separate PR since that one has already been well tested. But if you insist I'll integrate the discussed changes here.
> @@ -86,16 +86,16 @@ PRIMS=\
>  alloc.c array.c compare.c extern.c floats.c gc_ctrl.c hash.c \
>  intern.c interp.c ints.c io.c lexing.c md5.c meta.c obj.c parsing.c \
>  signals.c str.c sys.c callback.c weak.c finalise.c stacks.c \
> - dynlink.c backtrace_prim.c backtrace.c spacetime.c afl.c \
> - bigarray.c
> + dynlink.c backtrace_prim_bytecode.c backtrace.c spacetime_bytecode.c \
> + afl.c bigarray.c
Likewise, `backtrace_bytecode` instead of `backtrace_prim_bytecode`?
See above, to be discussed.
>  BYTECODE_C_SOURCES := $(addsuffix .c, \
> - interp misc stacks fix_code startup_aux startup freelist major_gc \
> - minor_gc memory alloc roots globroots fail signals signals_byt \
> - printexc backtrace_prim backtrace compare ints floats str array io \
> - extern intern hash sys meta parsing gc_ctrl md5 obj lexing callback \
> - debugger weak compact finalise custom dynlink spacetime afl \
> - $(UNIX_OR_WIN32) bigarray main)
> + interp misc stacks fix_code startup_aux startup_bytecode freelist major_gc \
> + minor_gc memory alloc roots_bytecode globroots fail_bytecode signals \
> + signals_byt printexc backtrace_prim_bytecode backtrace compare ints \
> + floats str array io extern intern hash sys meta parsing gc_ctrl md5 obj \
Likewise, why `signals_byt` but `roots_bytecode`?
Same response. Thanks a lot for having reviewed!
Xavier Leroy (2018/06/27 02:47 -0700):
xavierleroy commented on this pull request. Thanks for your work on appropriate names for source files. I finished reading the PR and have a few more comments that I would like to see discussed before merging.
Sure! Thanks for having taken the time to go through this tedious thing!
* The naming of object files `foo_libcamlrund.$(O)`, etc., is quite heavy. Before, we made do with one-letter suffixes `.d.$(O)`. Maybe one letter is no longer enough, but two letters would certainly be.
I have to say I am really not fond of this approach. To be honest, what you call "heavy" I would call "clear". ;-) And I would even add "scalable". :)
* Alternatively, object files could be generated in subdirectories of `runtime/`. Then, it would make sense to use a full name such as `libcamlrund`, but as a directory name: `libcamlrund/foo.$(O)`. For one thing, it would reduce clutter in the `runtime/` directory. We did something similar with the `compilerlibs/` directory at top-level, which contains compiled files only.
That could perhaps work, yes. I would be interested in hearing other opinions, though, because deciding to put the object files in a different directory makes the build rules slightly more complex and there is also the fact that these directories will have to be created somewhere and it's not yet clear to me what would be the best place. More fundamentally, my opinion is that you are trying to solve a real problem, but in a way which is not so nice IMO. I mean: the concern of keeping the source directories clean seems right to me. As for how to achieve this, though, I think being able to build the whole compiler out of source tree would be a much better way to get there. I realise it is not going to happen now, but I do hope it will happen one day and, meanwhile, having `runtime` be "polluted" seems much less problematic for maintenance than the situation as it was before this PR. But again, I'm open to hearing from others on all aspects in general and this one in particular.
* I still dislike the GNU make kung-fu on lines 331 to 365 of `runtime/Makefile`. Spelling out the 9 rules for `.c` to `.$(O)` would be shorter and easier to read.
To read, perhaps, but to change / maintain, it's not that obvious to me. You may think that these rules are not going to change, but I think they will, and that's why I introduced this macro. For example, to conform to the autoconf style, the build system should take into account variables like CFLAGS, CPPFLAGS, etc. When these variables are introduced, if the rules are spelled out, the variables will have to be added to every rule and nothing will guarantee by design that this is done consistently, whereas the macro approach does offer that by-design guarantee.
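To make the trade-off concrete, here is a minimal sketch of the macro approach in GNU make. It is not the code from `runtime/Makefile`: the macro name `COMPILE_C_FILE`, the `_b` / `_bd` / `_n` suffixes, the per-flavour flags and the use of `-o` are all assumptions made for illustration.

```make
# Hypothetical sketch, not the actual runtime/Makefile.
O ?= o
CFLAGS ?= -O2

# $(1): object-name suffix (hypothetical)
# $(2): extra compiler flags for this flavour (hypothetical)
# '$$' delays expansion until the rule generated by $(eval) is run.
define COMPILE_C_FILE
%$(1).$(O): %.c
	$$(CC) -c $$(CFLAGS) $$(CPPFLAGS) $(2) -o $$@ $$<
endef

# One line per flavour; each line generates one pattern rule.
$(eval $(call COMPILE_C_FILE,_b,))
$(eval $(call COMPILE_C_FILE,_bd,-DDEBUG -g))
$(eval $(call COMPILE_C_FILE,_n,-DNATIVE_CODE))
```

With this shape, adding or reordering a variable such as `CPPFLAGS` means editing the recipe once inside the macro, and every generated rule picks it up. Spelling the rules out would repeat the recipe once per flavour instead.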
Xavier Leroy (2018/06/27 06:22 -0700):
I convinced myself that the same rules apply to `name_suffix.$(O)` and to `directory/name.$(O)`.
Just that one of the rules we have now is of the form `%.$(O): %.c`, which could ultimately be shared by the whole compiler's build system (which is what I am actually trying to achieve!), while the rules you propose are specific. Also, @damiendoligez mentioned that not all C compilers may support object files being created in a directory different from the source file.
Well, `make clean` needs another rm pattern.
Wouldn't be a big deal, I think.
> and there is also the fact that these directories will have to be created somewhere and it's not yet clear to me what would be the best place.
Just like `compilerlibs/`: the directories are part of the source tree and don't need to be created during build. Git cannot record empty directories, so just put a `.gitignore` in there so that the directory is not empty.
One other possibility would be to give the directory as a prerequisite for the different `.$(O)` files and to have `make` create it.
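A minimal sketch of that idea, assuming GNU make and a hypothetical `libcamlrund/` output directory. It uses an order-only prerequisite (the part after `|`) rather than a normal one, which is my substitution: a normal prerequisite on a directory would trigger rebuilds whenever the directory's timestamp changes. The `-DDEBUG` flag and `-o` are illustrative assumptions too, and `-o` into another directory is exactly the compiler feature whose portability was questioned above.

```make
# Hypothetical sketch: one output directory per library flavour.
O ?= o
OBJDIR := libcamlrund

# The directory must exist before compiling, but being order-only
# (after the '|') its timestamp never forces a recompilation.
$(OBJDIR)/%.$(O): %.c | $(OBJDIR)
	$(CC) -c $(CFLAGS) $(CPPFLAGS) -DDEBUG -o $@ $<

# make creates the directory on demand.
$(OBJDIR):
	mkdir -p $@
```

With this variant the directory does not need to be tracked in Git, at the cost of one extra rule per output directory and one more pattern for `make clean`.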
> build the whole compiler out of source tree
That is the extreme solution, and an overkill in my opinion.
Well I'd say if several build systems agree on doing things that way it's maybe because there are good reasons to do so. I think this is explained e.g. in the GNU build system recommendations.
I don't mind some derived files lying around; I'm worried about having 9 derived files with super-long names for every .c source file.
So, just to make things more concrete for those who perhaps did not take the time to go through the code, the names are of the form `SOURCE_LIBRARY.o`, where `SOURCE` is the basename of a `.c` source file and `LIBRARY` the basename of the library that source file will belong to.
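As an illustration of that naming scheme, a list of object names could be derived along these lines. This is only a sketch: the shortened source list and the variable name `libcamlrund_OBJECTS` are hypothetical and not taken from the PR.

```make
O ?= o

# Hypothetical, shortened source list.
BYTECODE_C_SOURCES := interp.c misc.c stacks.c

# SOURCE_LIBRARY.$(O): append the library basename to each source basename.
libcamlrund_OBJECTS := \
  $(patsubst %.c,%_libcamlrund.$(O),$(BYTECODE_C_SOURCES))

# With O = o this expands to:
#   interp_libcamlrund.o misc_libcamlrund.o stacks_libcamlrund.o
```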
@xavierleroy: I just pushed a commit that shortens the names of the
Jun 28, 2018
0 of 4 checks passed
FYI: CI fails on OpenBSD 32, tests/output-complete-obj/'test.ml' with 1.1.1 (ocamlc.byte). See https://ci.inria.fr/ocaml/job/main/526/flambda=false,label=ocaml-openbsd-32/console