Get rid of bfd optional dependency #9306

Closed
dbuenzli opened this issue Feb 14, 2020 · 6 comments

@dbuenzli
Contributor

dbuenzli commented Feb 14, 2020

In the context of RFC 7 we want to provide the ability for ocamlobjinfo to report about libraries required by archives.

For cmxs files this is currently only possible when the external bfd library is available; it is used by shelling out to a separate objinfo_helper tool. Optional dependencies are, however, a hassle for testing, packaging and end users.

So the idea is to make ocamlobjinfo work on cmxs files by simply using the low-level caml_natdynlink_open function to look up the metadata. A POC can be found here. Note that no OCaml code gets executed by using this function: the cmxs is just dynlinked, and the symbol caml_plugin_header is looked up and unmarshalled.

Looking at the history of how objinfo_helper came to be, it seems there was a desire in 2010 to have ocamlobjinfo as a "pure Caml part and a pure C part". This was triggered by Windows and bootstrapping build problems. The discussion seems to conclude that, while it is possible to have C in tools/, the tools should rather be pure OCaml. I'm not knowledgeable enough about the build system changes and the bootstrapping process to assess whether this is still an issue.

In any case, if ocamlobjinfo has to remain pure OCaml, then we could likely simply use caml_natdynlink_open in objinfo_helper.c to output the marshalled caml_plugin_header for ocamlobjinfo to read and unmarshal, instead of outputting the offset to the identifier by using the bfd library.
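To make the offset-based scheme concrete, here is a minimal sketch of the consumer side: given a file and the offset that a helper reported, it checks that a marshalled OCaml value starts there and extracts its raw bytes. It is written in Python only for illustration. The function name `read_marshalled_at` is hypothetical, and the sketch assumes the value uses the small marshal header (five big-endian 32-bit words starting with the magic number 0x8495A6BE); it does not decode the value, which is what `input_value` would do on the OCaml side.

```python
import struct

# Assumption: the value is stored with OCaml's "small" marshal header,
# i.e. five big-endian 32-bit words: magic, data length, number of
# objects, size on 32-bit hosts, size on 64-bit hosts.
MARSHAL_MAGIC_SMALL = 0x8495A6BE
HEADER_SIZE = 20


def read_marshalled_at(path, offset):
    """Return header + data of the marshalled value found at `offset`.

    Hypothetical helper: raises ValueError if no small-format marshal
    header is found at that position or if the file is truncated.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        header = f.read(HEADER_SIZE)
        if len(header) != HEADER_SIZE:
            raise ValueError("truncated marshal header")
        magic, data_len, _objects, _size32, _size64 = struct.unpack(">5I", header)
        if magic != MARSHAL_MAGIC_SMALL:
            raise ValueError("no marshalled value at offset %d" % offset)
        data = f.read(data_len)
        if len(data) != data_len:
            raise ValueError("truncated marshalled data")
        return header + data
```

The alternative described above would move this work into the helper: it would write these bytes to its standard output, and ocamlobjinfo would unmarshal them directly instead of seeking in the cmxs file.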

Given all this I'm asking:

  1. Does anyone see a problem in getting rid of the bfd dependency using one or the other technique mentioned above?
  2. Have there been enough build system changes so that caml_natdynlink_open can be used directly in ocamlobjinfo without problems, or do we still need these tools to be pure OCaml? (I can simply try, though.)
@dra27
Member

dra27 commented Feb 14, 2020

I haven’t checked the full detail, but if the same thing can be achieved without libbfd, then this would be an improvement for the native Windows ports (especially msvc). I’ve never got a working version of the helper before (though I’ve not tried that hard either).

It would remain desirable for ocamlobjinfo to be a bytecode tool, which means that caml_natdynlink_open would have to move runtimes, but that shouldn’t be a problem.

@xavierleroy
Contributor

> So there is the idea of making ocamlobjinfo work on cmxs files by simply using the low-level caml_natdynlink_open function to look up the metadata. A POC can be found here. It should be noted that no OCaml code gets executed by using this function, the cmxs is just dynlinked and the symbol caml_plugin_header is looked up and unmarshalled.

The dynamic loading of the cmxs can fail if the cmxs needs C functions that are not present in the ocamlobjinfo executable. Also, C initialization code (e.g. C++ constructors for global variables) could be executed.

> Looking at the history on how objinfo_helper came out to be it seems there was the desire in 2010 to have ocamlobjinfo as a "pure Caml part and a pure C part".

There were licensing issues too: the BFD library is GPL, so it's safer not to link it directly with the ocamlobjinfo code.

But, IIRC, the point of using the BFD library in the first place was to avoid loading the cmxs in memory, as this was deemed too fragile.

Don't get me wrong: I'd love to get rid of the dependency on BFD; it's just that the proposed approach seems unsafe to me.

> It would remain desirable for ocamlobjinfo to be a bytecode tool, which means that caml_natdynlink_open would have to move runtimes, but that shouldn’t be a problem.

It would probably not work because cmxs shared libraries expect the native runtime system, so I'm pretty sure you can't load them in a bytecode executable.

@dbuenzli
Contributor Author

> Also, C initialization code (e.g. C++ constructors for global variables) could be executed.

Ok thanks @xavierleroy for your answer. I suspected I was missing something here.

@dra27
Member

dra27 commented Feb 14, 2020

Tangentially to the original problem, the Windows side may be fixable by using a custom version of caml_natdynlink_open (since you can dlopen without running code), so perhaps that code could yet be useful, just not to solve the problem you hoped!

@dbuenzli
Contributor Author

> (since you can dlopen without running code)

You mean on Windows?

But wouldn't that part remain problematic then:

> The dynamic loading of the cmxs can fail if the cmxs needs C functions that are not present in the ocamlobjinfo executable.

E.g. on Unix it seems functions might not be a problem when using RTLD_LAZY; however, there may be references to variables, which according to the docs of RTLD_LAZY do get resolved immediately in any case.

@dra27
Member

dra27 commented Feb 14, 2020

I did mean “on Windows”, yes. You can load without worrying about references too (e.g. in order to access resource blocks), but it doesn’t matter: the units on Windows are compiled with FlexDLL, so the dependencies aren’t known to the OS.
