Make (nat)dynlink sound (also fixes MPR#4208, MPR#4229, MPR#4839, MPR#6462, MPR#6957, MPR#6950) #1063
I think that RTLD_DEEPBIND could still be useful for the
This would be a useful rule to support (on platforms with RTLD_DEEPBIND or equivalent) since it allows you to load plugins without having to worry if they conflict with each other -- only whether they conflict with the main program. |
After looking at MPR#4208 it looks like the |
@lpw25 Yes, perhaps the rules could be refined to take into account the private case. However it isn't clear to me whether it's worth spending the time to implement and test that, especially if it only works on a subset of platforms. At the least I think we should probably get the basic fix in first before relaxing it, given how long some of these issues have been open. Other opinions on the benefit of a more precise treatment would be welcome. |
Thanks for getting the ball rolling on this long-standing issue. Maybe @maximedenes and the Coq dev team wish to follow the discussion, as Coq is a heavy user of plugins that don't respect any naming discipline :-) and will be broken by the proposed solution.
@xavierleroy thanks for drawing my attention to this patch. I don't expect so much trouble because nowadays, Coq plugins are packed (as in |
@maximedenes Do you use natdynlink or the bytecode version? If testing, it's probably best to use the native version with this patch at the moment, since the bytecode version may need a few tweaks (though it should approximately work). |
What is the easiest way to get camlp4/camlp5 compiled for ocaml with this patch? It is required for testing Coq plugins. |
@maximedenes, for camlp4 I think the easiest way is to build it by hand from the git repository.
@maximedenes Has there been any progress on testing this patch? |
Oh! I thought I had sent a partial report, sorry. I could only devote a few minutes to it, but I can already say that this patch breaks the compilation of Coq's stdlib. It is surprising, but could be explained if the behavior of some static initializers (code like I can investigate more and report next week. |
@maximedenes Was that when running bytecode or native code? |
I tested a bit more, and the incompatibility I was mentioning is in fact already present in the base commit of this PR (sorry, I should have checked earlier). So not directly related to this PR, but it prevents me from testing properly. I'll try with 4.05 beta 3 and a recent 4.06 to see if rebasing this PR could help. |
@mshinwell I was running native code. The failure is in fact unrelated, so I'll try again with a rebased version of this PR. So far, testing did reveal that we link a camlp5 module twice, but that is easy to fix on Coq's side.
Ok, so after rebasing, I confirm that modulo a small fix, Coq compiles fine with this PR, and plugins are usable. I tested only native code. |
OK, that's good news, thanks. I need to do some more careful work on the bytecode part of this PR, but will try to do that soon, and then it can be considered further. |
I checked various self-loading situations and couldn't come up with a failing one. That doesn't rule out the possibility, but such a failure would certainly not appear by surprise.
This effectively checks the presence of previously loaded modules. A test exercising it might be useful. I can provide it.
I actually noticed a bug in this and the bytecode version needed some more work (as per previous conversation with @chambart ). Whilst looking at that the large amount of duplicated code made me think that we can probably make this more robust with a bit more refactoring and sharing of code, which I'm working on now. I will post here again once it's ready. |
@alainfrisch In the existing |
@chambart Would you be able to take a look at this version? This now passes the test suite although I need to do a pass of review myself. Adding the test case you suggest would be appreciated. This pull request refactors the This patch also removes the deprecated functions in the interface and the I hope that overall this produces a significant increase in robustness and maintainability for this library. I had a long struggle with the various makefiles and the requirement that programs using I would like to tidy the formatting of these files, which is currently rather inconsistent; this can be done as a separate changeset after the main review is done. The diff for this PR is probably best read using patdiff. |
Probably a leftover from a work-around required at that time, when builtin exception global identifiers were recorded in ui_imports_cmx. FTR, this was introduced in: apparently to make Camlp4 compile. This can probably go away now. (On a related note: https://caml.inria.fr/mantis/view.php?id=3829 )
MPR#3829 appears fixed, so I've closed it. We also experimented the other day with various scenarios involving compilation units with the same names as predefined exceptions and couldn't get anything to fail. |
MPR#3829 is actually not fixed, I've reopened and added a slightly modified example. |
This explains why it is wrong for a dynlinked plugin to contain a copy of a module that is part of the main program. What about the case when two dynlinked plugins both contain a copy of the same module, which the main program does not contain? I would expect this to work when I load the plugins via Does this patch support such usage? |
I've noticed that an important piece of this patch appears to be missing; I will fix that and then think further about @stedolan 's comment. (I think the answer is that we could support such uses so long as the platform supports |
I don't understand why |
@dra27 I've fixed |
Merged, thanks everyone! |
Dear OCaml devs, see https://caml.inria.fr/mantis/view.php?id=7876 which could be related to this PR.
Looks like this PR broke the cygwin64 CI (which currently does not support dynlink). See https://ci.inria.fr/ocaml/job/main/flambda=false,label=ocaml-cygwin-64/833/console You can reproduce the bug on other architectures by configuring with AFAICT it's simply that the |
Ack |
@damiendoligez Please see #2197 |
This indeed did break old versions of Coq, including 8.9.1: #8868. |
I would like to offer a small nitpick and advice about backwards-compatibility. This commit removed the deprecated function [Dynlink.init]. It was commented as deprecated since 2008, however there had been no warning at compilation. The result is unexplained breakage of legacy code with no documented evolution path (see ppedrot/ocaml-melt#1). The adverse effect is that although the solution is trivial (remove the call), the non-author user spends a disproportionate amount of time looking for a solution, reading git commits and GitHub PRs, to find the appropriate action. Moreover, it seems that the issue was properly caught by the test As a suggested alternative, it would have been possible to have deprecation warnings in place in 2008 with a suggested evolution path, or, even simpler for everybody, keep providing (cc @tabareau)
This patch forbids dynlinking of compilation units that have already been loaded (either as part of the main program or via a previous dynlink). This appears to be sufficient to solve the problems described in the Mantis issues listed in the title of this pull request.
In bytecode, it was previously possible to break type abstraction, since there is no implementation CRC for bytecode. In native, it was also possible to produce segmentation faults, via various means.
It has previously been suggested that platform-dependent solutions such as RTLD_DEEPBIND on glibc systems might be sufficient to solve these problems. @chambart and I discussed this and it appears not to be the case: for example, suppose there is some module N that generates fresh names. A depends on N in the main program. We dynlink B, which also depends on N. Further, B passes values made in N to A. Whilst under RTLD_DEEPBIND the code of B successfully references its own copy of N, it passes values to A that could break invariants of A's copy of N if they were to flow into there, and they might. It seems safer to just be conservative.
@chambart and I discussed whether this new restriction would be problematic for authors of plugin systems that might end up loading modules more than once. However we believe these problems can be solved by such systems using their own caching layer that can catch the new error value returned (Dynlink.Error(Module_already_loaded ...)).
A further improvement would be to effectively incorporate such caching into the dynlink system itself such that, for example, if a module had already been dynlinked then some handle to that module could be returned. Such a handle would naturally be a corresponding first class module. Some of this work is already in GPR #100, although that does not include the protections introduced by this pull request. @chambart will follow up about that.
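The caching layer mentioned above could look roughly like the following. This is only a minimal sketch, not code from this pull request: the names `outcome`, `loaded_paths` and `load_once` are hypothetical, and the optional `load` argument exists purely so the logic can be exercised without real plugin files. It assumes the program is linked against the dynlink library and that the cache is keyed on the path string exactly as given (no normalisation).

```ocaml
(* Sketch of a plugin-side caching layer; all names here are illustrative.
   Must be linked with the dynlink library. *)

type outcome =
  | Loaded                      (* the file was loaded by this call *)
  | Cached                      (* this path was loaded by an earlier call *)
  | Already_present of string   (* Dynlink reports this unit as already loaded *)

(* Paths that this layer has successfully loaded so far. *)
let loaded_paths : (string, unit) Hashtbl.t = Hashtbl.create 16

(* [load] defaults to [Dynlink.loadfile]; it is a parameter only to make
   the function testable with a stand-in loader. *)
let load_once ?(load = Dynlink.loadfile) path =
  if Hashtbl.mem loaded_paths path then Cached
  else
    match load path with
    | () ->
      Hashtbl.replace loaded_paths path ();
      Loaded
    | exception Dynlink.Error (Dynlink.Module_already_loaded unit_name) ->
      (* The unit is already part of the program or of an earlier plugin;
         report it to the caller instead of failing. *)
      Already_present unit_name
```

A caller might write `load_once (Dynlink.adapt_filename "plugin.cmo")` and treat `Cached` and `Already_present _` as "nothing to do". Any other `Dynlink.Error` is deliberately left to propagate, since only the already-loaded case is safe to ignore.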
It is possible that there remains one problematic situation: a module that tries to load itself whilst it is being initialised. @chambart is checking this now.
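The hazard in the RTLD_DEEPBIND scenario above (A and B each using their own copy of N) can be simulated in plain OCaml without any dynamic linking. The sketch below is an illustration under stated assumptions, not a reproduction of the actual bug: `Make_N`, `N_main`, `N_plugin` and `A.register` are hypothetical names, and the name type is exposed as `int` to model the fact that two dynlinked copies of a unit share the same interface, so their values can flow between copies.

```ocaml
(* Two copies of a fresh-name generator, simulated with a generative functor.
   All names are illustrative. *)

module type N = sig
  type name = int          (* exposed so values can cross between copies *)
  val fresh : unit -> name (* intended invariant: never returns the same name twice *)
end

(* Each application creates an independent counter, like a second copy of
   the unit's global state. *)
module Make_N () : N = struct
  type name = int
  let counter = ref 0
  let fresh () =
    let n = !counter in
    incr counter;
    n
end

module N_main = Make_N ()    (* the copy linked into the main program *)
module N_plugin = Make_N ()  (* the private copy seen by the plugin B *)

(* A relies on N's invariant that every name it receives is unique. *)
module A = struct
  let seen : (N_main.name, string) Hashtbl.t = Hashtbl.create 8

  (* Returns false when the name collides with one already registered. *)
  let register name label =
    if Hashtbl.mem seen name then false
    else begin
      Hashtbl.add seen name label;
      true
    end
end

let a_name = N_main.fresh ()    (* made by the main program's copy *)
let b_name = N_plugin.fresh ()  (* made by the plugin's copy *)

let ok_a = A.register a_name "from A"
let ok_b = A.register b_name "from B"  (* collides: both copies start at 0 *)
```

Here `ok_a` is `true` and `ok_b` is `false`: each copy upholds uniqueness on its own, but the invariant fails as soon as values from both copies meet in A.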