dunification of MirageOS #1020
This patch adds a way to configure the UNIX target: we define a dune alias (`dune build @default`) and an underlying rule to _generate_ the unikernel.
This module is not mandatory, but it gives us a way to link static libraries with the _unikernel_ in a more predictable way than before. It tries to resolve static libraries and tells the user (through logs) which libraries will be linked.
1a6138a to 3d8efd4
Nice work!
Do you have an example where you need this? This is supposed to be working fine I guess. I'll have a proper review tomorrow.
I like the overall approach. As a quick response, I think that with:
That this should never happen:
If we just compile every target in a well-defined workspace, then dune caching in dune 2.0 can make it as fast, without any dangerous sharing involved.
By facts, we need to put a Into an higher-level, the question is already opened on mirage/ocaml-solo5#66 where:
As @hannesm said, the
This patch writes well-formed flag files (used by `cc` and `ld`) into `pwd`/.mirage.solo5/. `dune` then tracks them through the emitted dune configuration. In particular, `cflags.sexp` replaces the `:standard` expansion variable in dune's `c_flags` field. The compilation of manifest.o uses these flags. This part depends on `pkg-config` to load flags from *.pc files.
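As a rough illustration of the flag files mentioned in this commit message, the OCaml sketch below turns a flag line (such as the output of `pkg-config --cflags`) into the s-expression list that a file like `cflags.sexp` could contain. It is not taken from the patch: `split_flags` and `to_sexp` are hypothetical helper names, the flags are made up, and quoted arguments are deliberately not handled.

```ocaml
(* Split a flag line on spaces and tabs. Shell-style quoting is NOT
   handled; this is a simplification made for the sketch. *)
let split_flags line =
  String.split_on_char ' ' (String.trim line)
  |> List.map (String.split_on_char '\t')
  |> List.concat
  |> List.filter (fun s -> s <> "")

(* Render the flags as one s-expression list, quoting every atom so that
   each flag is read back as a single string. *)
let to_sexp flags =
  "(" ^ String.concat " " (List.map (Printf.sprintf "%S") flags) ^ ")"

let () =
  (* Hypothetical flags, for illustration only. *)
  let flags = split_flags "-I/opt/include  -DNDEBUG -O2\n" in
  print_endline (to_sexp flags)
```

A file with this content can then be pulled into a flags field with dune's `(:include ...)` form, for instance `(c_flags (:include cflags.sexp))`; whether the patch uses exactly this form is an assumption.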
Like the UNIX and Solo5 targets, the Xen target can emit its own dune configuration.
This patch integrates the biggest update of the _dunification_ of MirageOS; almost everything is orchestrated here. `configure_dune` is the main function: it emits the right `dune.build` file according to the target __and__ the libraries described in `config.ml`. It emits the main `(executable ...)` artifact with specific cflags and lflags. Both are delivered by a __post__ process, `configure_post_build_rules`, according to the requested target. cc-to-ocamlopt is used to properly link the static libraries expected by `config.ml` and the -l flags provided by `*.cmxa` artifacts. The executable is described as a target which uses specific variants (`xen` or `freestanding`). A _dunified_ project can take advantage of that to plug in a special implementation according to the requested target. A `dune-workspace` and a `dune-project` are emitted too, to provide a good context for compiling C stubs in particular. `duniverse` can take advantage of that.
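To give an idea of what emitting the main `(executable ...)` stanza can look like, here is a minimal OCaml sketch. It is not the code of `configure_dune`: the function name `executable_stanza`, its labelled arguments, and the example library names and flags are all assumptions made for illustration, and the variants mentioned above are left out.

```ocaml
(* Build the text of a dune `executable` stanza. [name], [libraries] and
   [link_flags] are hypothetical inputs that a configuration step would
   compute from `config.ml` and the requested target. *)
let executable_stanza ~name ~libraries ~link_flags =
  let words = String.concat " " in
  String.concat "\n"
    [ "(executable";
      Printf.sprintf " (name %s)" name;
      Printf.sprintf " (libraries %s)" (words libraries);
      Printf.sprintf " (link_flags (%s)))" (words link_flags) ]

let () =
  (* Illustrative values only. *)
  print_endline
    (executable_stanza ~name:"main"
       ~libraries:[ "lwt"; "mirage-runtime" ]
       ~link_flags:[ "-cclib"; "-lgmp" ])
```

Generating the stanza as plain text keeps the sketch free of dependencies; a real generator would more likely build an s-expression value and pretty-print it.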
Updates Mirage_clean to give a proper way to delete all generated files.
The link process is defined in the `dune.build` file, so the mirage tool no longer needs to take care of it: `dune build` does the linking step.
This code tries to retrieve static libraries (`*.a`) according to the requested target, using `pkg-config`. The configuration step uses this function to emit the right link flags.
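The lookup described in this commit message can be sketched in OCaml as a search for `lib<name>.a` over an ordered list of directories. This is only an illustration under stated assumptions: `resolve_static` is a hypothetical name, the directories are assumed to come from `-L` flags (for example those reported by `pkg-config`), and the paths below are invented.

```ocaml
(* Return the path of the first `lib<name>.a` found in [dirs], searched in
   order. [exists] defaults to [Sys.file_exists]; it is a parameter only so
   that the lookup can be exercised without touching the file system. *)
let resolve_static ?(exists = Sys.file_exists) ~dirs name =
  let candidate dir = Filename.concat dir ("lib" ^ name ^ ".a") in
  List.find_opt exists (List.map candidate dirs)

let () =
  (* Hypothetical layout: only the second directory holds libgmp.a. *)
  let fake_fs = [ "/opt/lib/gmp-freestanding/libgmp.a" ] in
  let exists path = List.mem path fake_fs in
  let dirs = [ "/opt/lib/zarith"; "/opt/lib/gmp-freestanding" ] in
  match resolve_static ~exists ~dirs "gmp" with
  | Some path -> Printf.printf "gmp resolved to %s\n" path
  | None -> print_endline "gmp not found"
```

Reporting the resolved path, as the last lines do, is what makes the choice of library visible to the user instead of leaving it to the linker.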
This patch replaces the call to `ocamlbuild` with a call to `dune` for the requested target (see the alias emitted by the configuration step).
3d8efd4 to 5b2c1bb
I do prefer having explicit files with the flags rather than invocations to
Ok good, from what I know, it seems that this last way can remove all occurrences of
Yes, mostly this is the xen support libraries, and in those cases we adapt them all via files containing CFLAGS or some
I did a new branch From what I did, if we integrate mirage/ocaml-solo5#66, we mostly can delete any use of At this stage and to avoid an overlap with #1018, do we want to use files or
@samoht highlighted some problems about the linking:
About dependencies on
The change is available in this patch: d18c02b. In detail, flags provided by A
Ok, good news, from
@dinosaure: Thanks for working on this. It's a complex change with many moving pieces, and I'm not sure I understand all the changes you are making. I'm going to try and reply to the various points you've raised, but will also ask questions that may have already been answered elsewhere, or seem "obvious" to others, so please bear with me.
If I understand this correctly, you are proposing that: For a package which requires specially ("cross") compiled C libraries and a target, where target is, for example, xen or freestanding (a.k.a. Solo5): Instead of package providing a library Correct?
Without As an alternative, could we instead have_package_-target install its libraries into a shared directory, such as Related to this:
That's because Could the solution I write about further above, using a single, shared target-specific directory for C libraries be made to work for
Three questions:
Again, this seems like almost but not quite the same issue as above? Sorry if I'm repeating myself, but it's important. We need a way to ensure that the toolchain is using
Link step
As @samoht asked, I delayed the resolution of static libraries compiled with specific flags according the target to $ opam install mirage && mirage configure -t hvt Should works. However, A new file appears under the hood In others words, A new flag was added (
A clear limitation is that for any library which uses a C stub and does not get the right information on
Correct at the beginning, not needed now because we use Possible solution about that:
About the current status of Now, with
I think, you describe in some words what is the current problem. layout of libraries with C stubs to be compatible with MirageOS is not documented. At the end, the story is only about how to correctly link an unikernel. But I will describe by through in another comment.
We currently need this tool not only for
I completely agree with you but the diff between now and what is proposed in this PR is to fix it a bit with what we can have from the eco-system.
Status of link step
So I would just like to clarify a non-obvious situation about MirageOS. Currently, most of the problems highlighted by the dunification of MirageOS are about the linking step. This problem is currently solved in several ways:
All of three can give to us information needed to correctly link libraries and C stubs/archives, etc. From what I know, the mostly popular way to get this information is The
And, on top of that, we want to move to the Concerns about what
Me neither.
We should try and find a proper solution to the problem of "Mirage-compatible" packages, and we should do that now rather than later, since as far as I can tell the current state of the "dunification" just increases complexity and technical debt. If we can replace pkg-config entirely and get other benefits such as a standardised way of building Mirage-compatible packages, we should consider such a solution "now" (with e.g. a "flag day" for the MirageOS release that includes it), even if that means breaking compatibility with existing packages. Conversely, if the tools ( Given the complexity involved, the best way to progress on this would be to actually sit down together for a couple of days at a computer, possibly with some Any ideas on how to progress this? @avsm @samoht @hannesm @dinosaure
Attaching today's discussion from Slack, so that we don't lose it.
/cc @mirage/core and @TheLortex
Hi all, this is the first draft of a not-completely-tested dunification of MirageOS. First, this PR does not work, but I think I have reached a point where we need to start talking technically about the dunification of MirageOS and about when we should track minor updates.
The Pull-Request
The PR is the smallest I could make it (and #1017 helped me a lot). Each commit has a description of its goal and updates the project one file at a time, to avoid misleading patches.
This PR does not delete anything, not yet. Some warnings (e.g. 32) were added. The goal is to offer a smooth view of the dunification of MirageOS.
This PR is only about the dunification. Behind this word we can imagine a lot (`duniverse`, `variants` and so on), but the only goal is to use `dune` to compile a unikernel.

This PR is not yet fully tested. Currently, only pasteur was tested: `pasteur` pulls in a large stack, since it uses `irmin.2.0.0`, `git`, `digestif`, and `conduit` with `httpaf` (including `tls` and `nocrypto`, even if we don't support HTTPS, shame on me). Only the UNIX and Solo5 targets are tested; I don't have a deployment process for Xen. Despite all this, `pasteur` is still a good PoC to test and improve the dunification of MirageOS.

This PR relies on `dune.2.0.0`. Several of its features are used but not really required. Small patches on some libraries needed by `pasteur` are needed because: `dune.2.0.0`

This PR should not expect anything else from `dune.2.0.0`.
Issues on the eco-system
MirageOS has a large eco-system in which we can find some unusual packaging policies. When we own these packages, it's fine to update them for the dunification of MirageOS. However, that is not always the case: some MirageOS unikernels depend on libraries outside the scope of the MirageOS organization.

Given all of that, and even with this PR, we still have some issues in our eco-system. The goal of this PR, and of the dunification in general, is not to fix all of these issues. At least we have highlighted them. But please keep in mind that the only goal of this PR is to let the user type `dune build` on their keyboard.

One of the biggest issues in the eco-system was fixed with #1010. I don't want to re-explain the whole story of `mirage-os-shim`; however, as I said, this issue highlighted a weakness in our documentation: how to be MirageOS compatible? And we just discovered that we have many ways.

But another issue still exists in this PR, about `pkg-config`; it was explained and discussed in #1018. `dune` can provide other ways to do some tricks at link time (e.g. variants), but that is not the purpose of this PR. However, we should then start to talk about all of that.

The last issue is about GMP, and we will go into a bit of detail.
Zarith and GMP
`zarith.cmxa` carries a piece of information that `ocamlopt` then uses: `-lgmp`. If you compile an executable or an object with `zarith` and without `-noautolink`, `ocamlopt` passes `-lgmp` to `gcc` at link time (and `gcc` then picks up the `libgmp.a` found on the linker's library search path).

Of course, for targets like Solo5 or Xen, the `libgmp.a` available on your system does not fit the constraints of Xen or Solo5: GMP must be compiled with specific C flags.

This is the purpose of the packages `gmp-freestanding` and `gmp-xen`. My only concern with this approach is the need to rename `libgmp_freestanding.a` and `libgmp_xen.a` to `libgmp.a` and let the compiler choose between these implementations according to the target. As for flags, Solo5 and Xen can provide, in some way, `-L$(opam config var lib)/gmp-{freestanding,xen}` as a dependency.

NOTE: that means that for the compilation of any unikernel, `-L` should always be emitted by `mirage-os-xen` or `ocaml-freestanding`, even if the unikernel does not depend on `gmp`. However, `-lgmp` will be emitted only if we want to link with `zarith.cmxa`.

Versatility of the linking
In our OCaml world, the linking step is currently done by `gcc`. It is very versatile: static libraries are chosen depending on which paths are available on the command line (and in the environment). As a result, it is hard to predict and track which library will be statically linked into our unikernel.

A new tool was added to `mirage`: `cc-to-opt`, or `cc-to-ocamlopt`, which takes C flags and translates `-L` to `-I` and `-l` to `-cclib -l`. It tries to resolve static libraries and to check that they exist in the paths given by `-L`. Then it re-orders the options because of the precedence of `-L` over `-l` (`-LA -la -LB` and `-LA -LB -la` don't behave the same way if `liba.a` is available in both `A` and `B`; this is pretty close to the GMP case, where `libgmp.a` is available in several directories).

In the end, the choice was made to let `ocamlopt` link the object file (for Xen/Solo5/Unix). Then, an `ld` command is run according to the target:

- `unix`: nothing to do, apart from an `ln -sfn`
- `solo5`: `ld` with the Solo5 link script
- `xen`: `ld` with some options

Correctness of C stubs linked
The previous point serves another issue: the flags expected when compiling C sources (like GMP or `digestif`). A unikernel for Solo5 which links with `digestif.c` (compiled for use by a UNIX executable) can work. That means we can silently link with the wrong `libdigestif.a` and the unikernel still works.

The only way to verify that we link with the right C stubs is (from what I know) to compare the assembly code in your unikernel with that of the expected static library.

At this stage, no solution exists. But, again, this PR highlights this issue.
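Coming back to the flag translation described under "Versatility of the linking", the OCaml sketch below shows one way to turn `-L`/`-l` flags into `-I`/`-cclib -l` flags, with all search paths emitted before any library. It is not the implementation of `cc-to-ocamlopt`: the function name `translate` is hypothetical, the existence check of the static libraries is left out, and every flag other than `-L` and `-l` is simply dropped.

```ocaml
(* Translate C linker flags into ocamlopt flags: `-L<dir>` becomes
   `-I <dir>` and `-l<name>` becomes `-cclib -l<name>`. Every search path
   is emitted before any library, so the result no longer depends on how
   `-L` and `-l` were interleaved. Other flags are dropped in this sketch. *)
let translate flags =
  let has_prefix p s =
    String.length s > String.length p
    && String.sub s 0 (String.length p) = p
  in
  let drop2 s = String.sub s 2 (String.length s - 2) in
  let rec split paths libs = function
    | [] -> (List.rev paths, List.rev libs)
    (* Detached forms: `-L dir` and `-l name`. *)
    | "-L" :: dir :: rest -> split (dir :: paths) libs rest
    | "-l" :: name :: rest -> split paths (name :: libs) rest
    (* Attached forms: `-Ldir` and `-lname`. *)
    | f :: rest when has_prefix "-L" f -> split (drop2 f :: paths) libs rest
    | f :: rest when has_prefix "-l" f -> split paths (drop2 f :: libs) rest
    | _ :: rest -> split paths libs rest
  in
  let paths, libs = split [] [] flags in
  List.concat (List.map (fun d -> [ "-I"; d ]) paths)
  @ List.concat (List.map (fun l -> [ "-cclib"; "-l" ^ l ]) libs)

let () =
  (* Both orderings from the example above give the same result. *)
  print_endline (String.concat " " (translate [ "-LA"; "-la"; "-LB" ]));
  print_endline (String.concat " " (translate [ "-LA"; "-LB"; "-la" ]))
```

Both calls print `-I A -I B -cclib -la`, which is the point of the re-ordering: the outcome depends only on the order of the search paths, not on where the libraries appear between them.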
UNIX module and `-dontlink`

In the current state of MirageOS, `ocamlbuild` is used with `-dontlink` to avoid any link with `unix.cmxa`. However, we lose this option if we want to use `dune`.

Losing this option revealed several packages which are linked with `unix.cmxa`, mostly because they want to link with `bigarray.cmxa`, which contains `map_file` (a UNIX syscall). Some packages still have this dependency (like `sexplib`).

This is the purpose of `bigarray-compat`, and we should use it in all of our projects, at least.

Issue on `dune`
`dune` unlocks several features that improve the way we compile a unikernel. Some of them are not used yet, but they exist. The most notable one is, maybe, the variant: `dune` can orchestrate the choice of which OCaml library we should use according to the target (`freestanding` or `xen`).

At this stage, it's possible to use it when the generated `dune` file describes which variant it will use (with the `variant` field).

A proposal for Zarith was made by @TheLortex along these lines. We can talk about that in another issue/PR, but the idea is to have a Zarith library which provides `freestanding` and `xen` variants with the right link to GMP.

Another feature is `forbidden_libraries`, which is not yet used, but it can be. Its purpose is to track who wants to link with a specific library, in our case `unix.cmxa`. This `dune` field can help us a lot to track the compatibility of our libraries with MirageOS.

Compiling a unikernel with `dune` unlocks another way to develop our MirageOS unikernels: we can start to use `duniverse` and have a self-contained workspace with everything we need. In this way, `mirage` still emits an OPAM file, which `duniverse` then uses to download all the libraries needed by our unikernel.

And from that, `dune` can provide a well-defined workspace with the right C flags according to the target, letting the user use C stubs without the hard plumbing needed when we want to package them for OPAM.

`(dirs ...)`
The only problem with `dune` today is that we cannot use `(dirs ...)` to load some specific underlying directories from the `dune.build` file. I don't know why. The current trick is that `mirage configure` rewrites not only `dune.build` but `dune` too, so that the files needed to compile the unikernel are loaded correctly.

Flags needed by the target
Currently, there is no well-defined way to get the flags needed by the target (C flags for C stubs and linking flags for `ld`). The way currently chosen by Solo5 is to use `pkg-config`. The other way, chosen by Xen, is to ship in the distribution of `mirage-os-xen` some files that `dune` then needs.

An explanation is available here: mirage/ocaml-solo5#66
This question is pretty close to what we want to do with `pkg-config` (whether we want to use it or remove it). Currently, we don't have a homogeneous way to get this information, and we should start to discuss and document that.

Conclusion
Let's start to talk about the dunification of MirageOS. Please, I would really like to stay focused on the actual dunification of MirageOS. I know that this PR can unlock several possibilities, but keep in mind that as long as this PR is not merged, all other discussions/issues will be in vain, because the premise is to use `dune` first.

Next week, I will try to provide a CI with this PR to show that this PR works! But we can start to talk about the details of how to dunify MirageOS.
PS: thanks for reading all!