Add cplugins and add a configure option -fPIC
#668
Building the runtime with -fPIC by default probably incurs a performance penalty (as does building OCaml code with PIC by default on x86-64). I tend to think we shouldn't do it in the longer term, instead favouring a proper cross-compilation solution as is starting to evolve in #620 and #634. In the short term maybe it would be reasonable to make PIC the default, but the penalty should be measured. |
byterun/caml/misc.h
#define CAML_CPLUGINS_CHDIR 6
#define CAML_CPLUGINS_GETENV 7
#define CAML_CPLUGINS_SYSTEM 8
#define CAML_CPLUGINS_READ_DIRECTORY 9
Wouldn't it be wise to have some sort of CAML_CPLUGINS_LAST_PRIM definition giving the highest primitive number? This would allow users to distinguish nonexistent primitives from primitives that they don't support but that exist in the current OCaml version.
Good idea!
Well, it should be a value, not a macro, for a plugin to be able to access it at runtime. Or maybe passed as a parameter to the plugin init function (I am thinking about passing a record to the plugin with a variable number of fields depending on the version of OCaml).
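The suggestion above (pass the highest primitive number to the plugin at runtime rather than as a macro) could look roughly like the following. This is only a sketch under assumptions: `struct cplugin_context_sketch`, its fields, `enum prim_status` and `classify_prim` are hypothetical names invented for illustration and are not part of the patch; only the `CAML_CPLUGINS_*` numbers come from the diff.

```c
#include <assert.h>
#include <stddef.h>

/* Primitive numbers as they appear in the diff under review. */
#define CAML_CPLUGINS_CHDIR 6
#define CAML_CPLUGINS_GETENV 7
#define CAML_CPLUGINS_SYSTEM 8
#define CAML_CPLUGINS_READ_DIRECTORY 9

/* HYPOTHETICAL: a record the runtime could hand to a plugin's init
   function. Newer runtimes could append fields after `last_prim`. */
struct cplugin_context_sketch {
  int api_version; /* hypothetical layout version of this record */
  int last_prim;   /* highest primitive number the running runtime knows */
};

enum prim_status {
  PRIM_UNKNOWN_TO_RUNTIME,  /* the running runtime has no such primitive */
  PRIM_UNHANDLED_BY_PLUGIN, /* exists in the runtime, plugin ignores it */
  PRIM_HANDLED              /* plugin intercepts it */
};

/* Classify a primitive number from the point of view of a plugin that
   only intercepts chdir and getenv. */
enum prim_status classify_prim(const struct cplugin_context_sketch *ctx,
                               int prim)
{
  assert(ctx != NULL);
  if (prim < 0 || prim > ctx->last_prim)
    return PRIM_UNKNOWN_TO_RUNTIME;
  switch (prim) {
  case CAML_CPLUGINS_CHDIR:
  case CAML_CPLUGINS_GETENV:
    return PRIM_HANDLED;
  default:
    return PRIM_UNHANDLED_BY_PLUGIN;
  }
}
```

Because `last_prim` is a runtime value, a plugin compiled against older headers can still tell which primitives the runtime it is loaded into actually provides.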
I did a quick review (not in depth) of the patch and the code seems ok. I'm a bit surprised by the idea of passing plugins by an environment variable. It would seem natural to use a command-line parameter instead -- wasn't the |
@gasche: The idea here is to use C plugins on a set of programs (for example, to watch a complete build without modifying it). For that, the env. variable is the best. I don't think using |
@lefessan Can you comment as to why LD_PRELOAD does not suffice for the interception of C library calls? |
@mshinwell Indeed, using -fPIC probably decreases performance, but the slowdown is normally rather small, most people won't care about it, and people who care can easily disable cplugins to remove the default fPIC. I think the overall benefit of having the full power of dynamic linking is greater for most people (it allows them to bundle OCaml programs as dynamically linked libraries into foreign applications).
@mshinwell |
Since http://caml.inria.fr/mantis/view.php?id=6693 , an fPIC variant of the native runtime is built. Is it easy enough to use it in practice? It would be good to actually measure the overhead of fPIC in the runtime system. If it is really small (and my guess is that it is), it would simplify the life of users and speed up the compilation of OCaml itself to stop supporting the non-fPIC runtime. (One could also make fPIC the default and let people disable it at configure time, or build a non-fPIC variant.)
@alainfrisch
If |
@alainfrisch Even if it's 1 or 2%, I don't think we should be deprecating the support for non-PIC. (Bear in mind that even with all the work on flambda, we're still only getting 10% improvement for some software at the moment, so 2% is fairly significant.) In my view the right answer is to support various combinations properly as per #620. @lefessan Do forgive me, but your previous answer was sphinx-like. Can you go into more detail about the motivation behind this patch? What are the "other reasons"? It isn't clear to me exactly which calls need to go through this mechanism and which do not, and surely that needs to be pinned down if the division is to be sensibly maintained into the future. |
I'm worried by this idea of building all possible variants upfront. Just for the runtime system, we'll have in a few months: debugging, frame-pointers, fPIC, spacetime, afl, multicore. So already 64 versions of asmrun/ to compile/install for every build of OCaml? At some point, there is a tradeoff between the runtime performance and its cost in terms of complexity/performance in the code base, build system, and user experience. (If only because making the developers' experience smoother gives them more time to profile/optimize their code.) I believe that 2% is insignificant for the vast majority of users. I don't see the point in comparing this with the gains obtained by flambda: making flambda better or worse does not make these 2% more or less significant. Anyway, doing some actual benchmarks would be useful to drive the discussion (0.5% is not the same as 4%, for instance).
Again, the solution is probably in OPAM and having different switches, the only problem is to find a nice way to pass options to the configure script of a new OPAM switch. @mshinwell I mean that, for example, |
@alainfrisch You don't necessarily need to build all (or even any) combinations. The aim is to provide a proper framework that enables people to build what they need, whilst at the same time eliminating special cases (as we have at the moment for PIC, gprof, etc). Ideally it would work in some modular way, so if a user finds a requirement for a different set of flags later, just that portion can be built without having to rebuild the whole OCaml system again. It's worth noting that the proposed functionality is pretty close to what GCC has provided for many years in the form of multilib support. |
@lefessan If you're specifically concerned with the OCaml code, have you considered instrumenting the "external" calls themselves? For example, some kind of attribute could be added that indicates the call should be redirected via a wrapper if it is present. It's maybe slightly less fine-grained, but perhaps that doesn't matter, and it might be more straightforward overall. Also, what happens if the OCaml program includes its own C bindings? Should they be instrumented too? (I presume that would need LD_PRELOAD or similar.) |
@mshinwell don't you think one would have to build all combinations at least for continuous integration?
@mshinwell Yes, I considered intercepting externals, but I had the feeling that it would require more work: for example, I had also considered virtualizing the file-system at the OCaml level directly: for example, have a record with all file-system related externals, that would be used by all Pervasives and other modules functions. Then, the user would be able to take the current record, and replace it by its own record. However, again, there are problems with some externals, for example |
@shindere Not for the default I wouldn't have thought. A less frequent check of all of them would seem fine. |
@lefessan Is recording which externals get called sufficient, though (i.e. record Sys.foo rather than the C library calls that "foo" uses)? Again I'm not exactly sure of the target application. I missed a point earlier: an OPAM solution involving configure options is not sufficient. I think I covered that on the other GPR. |
@mshinwell No, it's not just about recording the call (and we don't really care about having the backtrace or just the caller), it's also about intercepting it. For example, we might want to implement a "replay" plugin: in "record" mode, it will monitor what an OCaml program is accessing, saving every file that is opened by the program somewhere, so that in "replay" mode, it will provide the previous files to the |
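To make the interception idea concrete, here is a minimal sketch of a "replay"-style hook. It is not the patch's implementation: it uses `getenv` (one of the primitives listed in the diff) rather than file opening, and `getenv_hook`, `runtime_getenv`, `replay_getenv`, `install_replay_plugin` and the `SKETCH_HOME` variable are all hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* HYPOTHETICAL hook slot: NULL means "no plugin installed". */
typedef const char *(*getenv_hook_fn)(const char *name);
static getenv_hook_fn getenv_hook = NULL;

/* Stand-in for the runtime's own wrapper: every lookup goes through the
   hook when one is installed, and falls back to the C library otherwise. */
const char *runtime_getenv(const char *name)
{
  assert(name != NULL);
  if (getenv_hook != NULL)
    return getenv_hook(name);
  return getenv(name);
}

/* Values a "record" run is assumed to have saved earlier. */
struct recorded_var {
  const char *name;
  const char *value;
};

static const struct recorded_var recording[] = {
  { "SKETCH_HOME", "/recorded/home" },
};

/* "Replay" mode: answer only from the recording, never from the real
   environment, so the program sees the same values as the recorded run. */
static const char *replay_getenv(const char *name)
{
  size_t i;
  for (i = 0; i < sizeof recording / sizeof recording[0]; i++) {
    if (strcmp(recording[i].name, name) == 0)
      return recording[i].value;
  }
  return NULL;
}

/* What a plugin's init function might do when it is loaded. */
void install_replay_plugin(void)
{
  getenv_hook = replay_getenv;
}
```

The point of the indirection is that the program being observed does not change: only the function pointer consulted by the runtime wrapper does.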
I see. I still don't really understand how it's supposed to work with libraries outside the stdlib though. Isn't it the case that many programs that do interesting I/O things will be doing them via some external library that isn't instrumented in the way you propose? |
I'm also trying to understand how this might work (since the overall functionality being proposed is interesting, but the patches seem to be coming in piecemeal). As a concrete example, moby/vpnkit#69 replaces most of the I/O in Docker for Mac/Win with a libuv based implementation. Could these C calls also be intercepted with the proposed patch? |
For now, my own use cases only need the primitives in the patch, but indeed, it might become interesting in the future to provide a way for libraries to extend the current mechanism for their own stubs. We will always find cases that the proposed mechanism cannot handle, but the idea for me is to provide a simple way to handle the majority of simple cases, and then let more complex cases be handled by more complex solutions, such as |
@avsm I am not sure if it would catch all the I/O that |
@lefessan About PIC, you need to do the benchmarking that both @mshinwell and @alainfrisch have requested. |
@damiendoligez I don't really understand why benchmarking is important here. This patch only changes the default; it does not prevent people for whom performance is important from changing their own settings to no-fpic, recovering the same performance as before.
I'm confused. This is about runtime of compiled programs, not compilation time. Each user will have a different sensitivity to runtime performance. In the next release, assuming -fPIC becomes the default, one will need to tell people that they can disable it "if they want". This will be much more user-friendly if this information comes together with some indication of the slowdown to be expected if they don't. But if the slowdown is really 5% or 10% (which I doubt), the discussion about making it the default becomes a bit different from the one if it is below 1%.
Running
Surprisingly, |
I just want to point out that we are mostly at the end of that era, barring a very surprising new technological development. The technology that allowed this speed gain over the last 20 years has matured, and it can no longer give us serious speed improvements. Thinking about speed is now becoming more important. |
@bluddy I was more discussing the "default" settings of OCaml, i.e. what trade-off we should provide by default, for the newcomer, and I think the performance is good enough now, that we could degrade it a little for the benefit of other features, such as the extensibility of the system. This PR does not degrade the performance of OCaml when compiled without I was also discussing yesterday with a time-traveler, who told me something new was coming, and that we shouldn't worry too much for speed improvements in the next 13 years. But I have to keep my mouth shut on it ! |
Yay! I generally agree that PIC is the way to go, btw. It was a nitPIC. |
63719ed to 006bc0c
@lefessan: of course there is a lot of noise in these measurements, because using the PIC runtime system changes code placement, impacting performance in a random manner. My own quick benchmarking (on KB and my other favorite small benchmarks) is similarly noisy, but suggests that the PIC runtime degrades performance by about 2% on average. This is for x86 64 bits. I'd expect more degradation for x86 32 bits and for PowerPC 32 bits, which lack hardware support for PC-relative addressing. |
There is too much rhetoric in this discussion. Trying to stick with facts:
I'd suggest that @lefessan continues his experiments with plugins and virtualization using |
006bc0c to 6346455
-fPIC
6346455 to 6a83bdd
Ok, I removed the |
CAMLextern int caml_read_directory(char * dirname, struct ext_table * contents);

#ifdef CAML_INTERNALS
Is it on purpose that definitions just above (caml_ext_table) are no longer protected by CAML_INTERNALS?
Yes, there is a comment on line 218 explaining that: we intercept caml_read_directory and must still be able to call it.
The current proposal has been reviewed, and there is no longer any objection about |
I'm curious: has there been a discussion somewhere about whether we want this form of plugin interception, or for example another interface to extend the compiler capabilities? There has not been in this PR thread, but maybe in a developer meeting? |
@gasche I am not sure I understand your question, but Mark asked if the same result could be achieved using annotations on externals, and I also suggested the use of an OCaml record to intercept calls to the file-system. However, both approaches would be much more complex to implement, since some C externals are actually performing the work of other externals (for example, |
This plugin functionality doesn't seem particularly useful outside of a very narrow use case involving code that exclusively uses the standard library, as far as I can tell from the answer above to my and @mshinwell's query. Is the interface at least experimental, or are these plugins expected to be supported forever with their current interface?
Add cplugins and add a configure option `-fPIC`
Add the ability to link C plugins into the runtime, together with a virtualization layer on file-system calls so that plugins can monitor or intercept them. Add an option `-fPIC` to `./configure` to build the default runtime with `-fPIC`.