Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Variant selection mechanism for virtual libraries #1900

Merged
merged 1 commit into from Mar 27, 2019

Conversation

Projects
None yet
3 participants
@TheLortex
Copy link
Contributor

commented Mar 4, 2019

The issue that was tracking this is #1859. I've been able to implement the variant selection mechanism along with default implementations in lib.ml, within closure construction.

It adds some syntax in dune files: for libraries, an implementation can specify its variant and a virtual library can specify its default_implementation. For executables, it's possible to give a set of variants to match.

What it does is the following: when the closure is computed, a map from virtual libraries to implementations is also built. Then variants are resolved and after that there's only virtual libraries with default implementations remaining unsolved: here we need to check for potential ambiguity while finding out which default implementation we can safely choose to add in the closure (and it loops until everything is resolved.).

This is not ready for a merge but we can start reviewing code and thinking about nice error messages !

@rgrinberg

This comment has been minimized.

Copy link
Member

commented Mar 6, 2019

Will try to give this a review today.

Show resolved Hide resolved src/dune_file.ml Outdated
@rgrinberg

This comment has been minimized.

Copy link
Member

commented Mar 6, 2019

Looks good so far. Great work. I think that we'll be able to get this in for 1.8. From my brief look so far:

The test suite needs a bit more work. We need to add a test that witnesses the changes to the dune-package format. We also need some tests for the case where variants are loaded from an external package. Also, I regret the pattern of stuffing all the variant tests into a single run.t. It really slows down the test suite, so we should not do it. I propose that we split the test suite into:

  • virtual libraries (basically my old tests)
  • default implementations
  • variants

I will do a more thorough review tomorrow

Show resolved Hide resolved test/blackbox-tests/test-cases/variants/variants-base/lib/dune Outdated
Show resolved Hide resolved src/toplevel.ml Outdated
@rgrinberg

This comment has been minimized.

Copy link
Member

commented Mar 6, 2019

Also: please rebase the PR when you have the time. The documentation and change log need to be updated as well

@rgrinberg rgrinberg force-pushed the TheLortex:impl-variants branch from 0f70b02 to dea1085 Mar 7, 2019

@rgrinberg

This comment has been minimized.

Copy link
Member

commented Mar 7, 2019

Okay, I refactored the tests myself. It's a mess that I caused anyway.

@rgrinberg

This comment has been minimized.

Copy link
Member

commented Mar 7, 2019

I've added some formatting tweaks:

  • Reformatted everything with ocp-indent (please use it as well)
  • Wrapped to 80 chars
  • Remove most spurious uses of let .. and. If there is no recursion, let's just stick to normal let bindings.
Show resolved Hide resolved src/dune_file.mli Outdated

@rgrinberg rgrinberg force-pushed the TheLortex:impl-variants branch from 41c1722 to 80f8ff1 Mar 7, 2019

@TheLortex TheLortex requested review from diml and emillon as code owners Mar 7, 2019

@rgrinberg rgrinberg force-pushed the TheLortex:impl-variants branch 4 times, most recently from 3b54daa to 163ea3c Mar 7, 2019

Show resolved Hide resolved src/lib.ml

and closure_with_overlap_checks db ts ~stack:orig_stack ~linking ~variants =
let name_to_lib name loc =
resolve_dep (Option.value_exn db) name

This comment has been minimized.

Copy link
@rgrinberg

rgrinberg Mar 7, 2019

Member

This can't be right I think. db is None in some cases. Why can't it be None here?

This comment has been minimized.

Copy link
@TheLortex

TheLortex Mar 7, 2019

Author Contributor

I agree this is weird. I assumed the database was accessible when virtual libraries resolution was needed. We just need to be sure that variants is empty when db is None. It's correct for every case, except in lib.ml: resolve_user_written_deps_for_exe, where db is (Option.some_if (not allow_overlaps) t). I don't really know about that allow_overlap feature: when is it set to true ?

@TheLortex TheLortex force-pushed the TheLortex:impl-variants branch 2 times, most recently from b089e6b to 7ac752c Mar 7, 2019

@rgrinberg rgrinberg force-pushed the TheLortex:impl-variants branch 2 times, most recently from bd2425a to 56bdee9 Mar 11, 2019

@@ -227,6 +237,7 @@ module DB : sig
val create
: ?parent:t
-> resolve:(Lib_name.t -> Resolve_result.t)
-> find_implementations:(Lib_name.t -> Lib_info.t list Variant.Map.t)

This comment has been minimized.

Copy link
@rgrinberg

rgrinberg Mar 12, 2019

Member

Could you comment on the necessity of this new field? Why can't we just use all and iterate over all the public libraries this way for example? Is it because we want to avoid resolving them?

This comment has been minimized.

Copy link
@TheLortex

TheLortex Mar 12, 2019

Author Contributor

The goal was to avoid resolving them. Before coming to this solution I wanted to put possible implementations in the Lib.t type but as you mentioned after this introduces a dependency cycle on instantiation.

This comment has been minimized.

Copy link
@rgrinberg

rgrinberg Mar 12, 2019

Member

Yes, indeed that your first attempt would have been ideal. Unfortunately, we will need forward references before it's possible.

Why is resolving the implementations a problem however? Are there situations where we just need the Lib_info.t? We can't really do much with this value apart from resolving it into a proper library.

This comment has been minimized.

Copy link
@TheLortex

TheLortex Mar 12, 2019

Author Contributor

It's not really a problem, I just tried to do a minimal amount of computation when evaluating variants: we only need to resolve the implementations that are actually selected.

@rgrinberg

This comment has been minimized.

Copy link
Member

commented Mar 12, 2019

The issue with these Option.value_exn should really be solved by having a field such as:

; default_implementation : Lib.t Or_exn.t option

in the Lib.t type.

However, this isn't really possible to add in instantiate as it would introduce a dependency cycle when resolving the libraries.

@diml Would it be possible to create some sort of a forward reference in Lib.t to address this issue?

@diml

This comment has been minimized.

Copy link
Member

commented Mar 12, 2019

Ah, that's annoying. A forward reference is ok if it is filled immediately after the Lib.t value is created, otherwise it is going to be hard to reason about when it is safe to reuse these values between runs.

BTW, I've been wondering if instantiate wasn't too complicated. Instead of storing direct Lib.t values in Lib.t, we would just store a new Resolved_lib_name.t type. And the idea would be that going from Resolved_lib_name.t to Lib.t would be very fast so it would be almost like having a Lib.t directly, but on the other hand it would simplify this code.

@rgrinberg rgrinberg force-pushed the TheLortex:impl-variants branch from 56bdee9 to 27f1039 Mar 12, 2019

@rgrinberg

This comment has been minimized.

Copy link
Member

commented Mar 12, 2019

BTW, I've been wondering if instantiate wasn't too complicated. Instead of storing direct Lib.t values in Lib.t, we would just store a new Resolved_lib_name.t type. And the idea would be that going from Resolved_lib_name.t to Lib.t would be very fast so it would be almost like having a Lib.t directly, but on the other hand it would simplify this code.

That might work, but then we have to allow for cycles when resolving the Resolved_lib_name.t, right? Seems like it's the same problem. What's the semantic difference between Resolved_lib_name.t and Lib.t?

@avsm avsm referenced this pull request Mar 12, 2019

Open

Add -noruntime option #2309

@diml

This comment has been minimized.

Copy link
Member

commented Mar 12, 2019

Nope, we wouldn't need to recurse for getting a Resolved_lib_name.t. The only information attached to a Resolved_lib_name.t would be that it exist. We wouldn't know whether it is hidden or not for instance. It's pretty much the same as storing Lib_name.t values in Lib.t, and doing a resolution everytime we want to access one of those, except that it is faster because we already did the job of resolving the name.

@diml

This comment has been minimized.

Copy link
Member

commented Mar 12, 2019

To be clear, the distinction is as follow:

  • Lib_name.t: just a string
  • Resolved_lib_name.t: a unique identification of a library, i.e. a lib name + a path
  • Lib.t: information about a library
@TheLortex

This comment has been minimized.

Copy link
Contributor Author

commented Mar 19, 2019

So where are we now ? Shoud I try to implement that Resolved_lib_name module and refactor the code ?

By the way regarding libraries forcing implementations, I was suggested that it might not be a sane behavior, so we could simply forbid that. What do you think ?

@rgrinberg

This comment has been minimized.

Copy link
Member

commented Mar 20, 2019

So where are we now ? Shoud I try to implement that Resolved_lib_name module and refactor the code ?

I need to think about this change a bit more. @diml would this require storing the Lib.DB.t in the Lib.t? Otherwise, I'm not really sure what a resolved name means if we don't know which database it came from.

By the way regarding libraries forcing implementations, I was suggested that it might not be a sane behavior, so we could simply forbid that. What do you think ?

I think it's fine to forbid it.

@TheLortex Let me have a look at this patch again today. Sorry about the delay, but we really need some simplifications here. The code was already too complex even before this PR.

@diml

This comment has been minimized.

Copy link
Member

commented Mar 20, 2019

Well, resolved library names would be globally unique, so we would have a single db going from resolved library names to Lib.t values. In fact, it could be just a memoized function. It is basically the function that takes a unique way to identify a library and returns its definition by reading dune, dune-package or META files. Breaking things down this way would also help having more fine grained memoization: the library databases would just become string -> resolved named maps, which are much smaller than the current string -> lib maps. And individual library definition would have less dependency in memory.

@rgrinberg rgrinberg force-pushed the TheLortex:impl-variants branch 2 times, most recently from 9375820 to bd08856 Mar 24, 2019

src/lib.ml Outdated
type nonrec t =
| No_implementation
| Implemented_by of t
| Too_many_impl of t list

This comment has been minimized.

Copy link
@rgrinberg

rgrinberg Mar 25, 2019

Member

Why do we need the arguments to the Implemented_by and Too_many_impl constructors? I don't see them being used anywhere.

This comment has been minimized.

Copy link
@TheLortex

TheLortex Mar 25, 2019

Author Contributor

Indeed they are not useful. They were built for potential error messages but we can remove that actually.

@rgrinberg

This comment has been minimized.

Copy link
Member

commented Mar 25, 2019

I've squashed the commits into one as it makes rebasing simpler. @TheLortex I've removed a few cases where libraries were compared by name. This isn't correct in dune and names aren't unique. In fact, I agree with Jeremie's current idea of adding an intermediate step for name resolution. So even though there are still some problematic parts in the code that rely on concrete library names too much, I think this is good enough for now. I will look into implementing Jeremie's proposal in a subsequent PR.

@rgrinberg rgrinberg force-pushed the TheLortex:impl-variants branch from bebc548 to 789f100 Mar 25, 2019

@rgrinberg

This comment has been minimized.

Copy link
Member

commented Mar 26, 2019

@TheLortex Have another look at my code. I've tried to simplify some things.

For your Vlib_status type:

module Vlib_status = struct
  type nonrec t =
    | No_implementation
    | Implemented_by of t
    | Too_many_impl of t list
end

I'm wondering if it overlaps, a little too much with our:

      type vlib_status =
        | No_impl of Dep_stack.t
        | Impl of lib * Dep_stack.t
      type t = vlib_status Map.t

So instead of a Vlib_status.t Map.t, we could use vlib_status Map.t Or_exn.t.

Yes, our error handling is a little less precise, but at least it will be consistent everywhere. Moreover, cases like this are confusing to me:

        Map.foldi !virtual_status ~init:(Ok ([], []))
          ~f:(fun lib status acc ->
            match status with
            | Implemented_by _
            | Too_many_impl _ -> acc

Shouldn't we be treating Too_many_impl _ as an error there? It probably doesn't matter, but using the type that I've proposed should simplify some things I think.

Using Lib_info.t for the dependency is something that I wish we'd avoid. We already have a Dep.Stack.t type. This is just a minor point and doesn't prevent the PR from being merged, but is there a reason why you did it this way?

@rgrinberg

This comment has been minimized.

Copy link
Member

commented Mar 26, 2019

@TheLortex Could you add a comment documenting vlib_default_parent by the way? I think we should have a comment for every single one of these refs that we use while traversing the closure.

@TheLortex

This comment has been minimized.

Copy link
Contributor Author

commented Mar 26, 2019

Thank you for your work on this ! I took a look, it's indeed more readable than the first version.
Moreover, if we forbid libraries to depend (and thus force) implementations, we can actually remove the default implementation dependency cycle check.

Shouldn't we be treating Too_many_impl _ as an error there? It probably doesn't matter, but using the type that I've proposed should simplify some things I think.

The error is handled later, in the Virtual_libs module it throws Double_implementation so I didn't want to make a duplicate. I guess we could use vlib_status instead.

Using Lib_info.t for the dependency is something that I wish we'd avoid.

Could you add a comment documenting vlib_default_parent by the way?

If we forbid libraries to depend on implementations we can actually remove the whole resolve_default_libraries function.

@rgrinberg I'll update the PR accordingly

@rgrinberg

This comment has been minimized.

Copy link
Member

commented Mar 26, 2019

I'm not sure we should forbid to depend on implementations. The reason being is that it makes it hard to turn existing libs into implementations.

Also non dune users can only consume implementations so I think we have to live with the check for now.

@rgrinberg rgrinberg force-pushed the TheLortex:impl-variants branch from 1b8c2ae to 9f21d34 Mar 26, 2019

@rgrinberg

This comment has been minimized.

Copy link
Member

commented Mar 26, 2019

@TheLortex The last commit replaces the custom map with Vlib.Table.Partial. Things almost seem to work except for the resolution_priority issue. Could you explain that test a little more?

@TheLortex

This comment has been minimized.

Copy link
Contributor Author

commented Mar 26, 2019

Yes you're right. Let's keep the check then.

resolution_priority was a case where the three kinds of implementation selection were used:

  • for direct it's an user defined dependency to direct.ocaml
  • for variant it's selected by using variant c
  • for test it uses a default implementation

It also checks that an user defined dependency overrides variant selection. Thus even if the c variant is given, direct.c should not be selected because direct is already implemented by direct.ocaml.

This allows to override variant selection in case of ambiguity.

@rgrinberg rgrinberg force-pushed the TheLortex:impl-variants branch from ad97f1e to 92e42d8 Mar 26, 2019

@rgrinberg

This comment has been minimized.

Copy link
Member

commented Mar 26, 2019

I've removed the Vlib_status stuff in favor of just keeping unimplemented libraries in an abstract type. I think this is good enough to merge now.

@rgrinberg rgrinberg force-pushed the TheLortex:impl-variants branch from 92e42d8 to 888062d Mar 26, 2019

Introduce Variants/Default implemeantions
Variants are a mechanism for selecting virtual library implementations en masse.
This PR includes the following changes:

* Add a [variant] field to implementations. This tags a virtual library
  implementation with a variant.

* Add [variants] field to executables and toplevels. This allows us to select
  implementations en masse when linking.

Default implementation is a feature to select a library to be a default
implementation when no implementation is selected by the user. This feature
allows to smoothly port existing libraries to be virtual.

Signed-off-by: Lucas Pluvinage <lucas.pluvinage@gmail.com>
Signed-off-by: Rudi Grinberg <rudi.grinberg@gmail.com>

@rgrinberg rgrinberg force-pushed the TheLortex:impl-variants branch from 888062d to 6439ed5 Mar 27, 2019

@rgrinberg rgrinberg merged commit 23f6db7 into ocaml:master Mar 27, 2019

1 of 3 checks passed

continuous-integration/appveyor/pr Waiting for AppVeyor build to complete
Details
continuous-integration/travis-ci/pr The Travis CI build is in progress
Details
DCO All commits are signed off!
Details

rgrinberg added a commit to rgrinberg/opam-repository that referenced this pull request Apr 9, 2019

[new release] dune (1.9.0)
CHANGES:

- Warn when generated `.merlin` does not reflect the preprocessing
  specification. This occurs when multiple stanzas in the same directory use
  different preprocessing specifications. This warning can now be disabled with
  `allow_approx_merlin` (ocaml/dune#1947, fix ocaml/dune#1946, @rgrinberg)

- Watch mode: display "Success" in green and "Had errors" in red (ocaml/dune#1956,
  @emillon)

- Fix glob dependencies on installed directories (ocaml/dune#1965, @rgrinberg)

- Add support for library variants and default implementations. (ocaml/dune#1900,
  @TheLortex)

- Add experimental `$ dune init` command. This command is used to create or
  update project boilerplate. (ocaml/dune#1448, fixes ocaml/dune#159, @shonfeder)

- Experimental Coq support (fix ocaml/dune#1466, @ejgallego)

- Install .cmi files of private modules in a `.private` directory (ocaml/dune#1983, fix
  ocaml/dune#1973 @rgrinberg)

- Fix `dune subst` attempting to substitute on directories. (ocaml/dune#2000, fix ocaml/dune#1997,
  @rgrinberg)

- Do not list private modules in the generated index. (ocaml/dune#2009, fix ocaml/dune#2008,
  @rgrinberg)

- Warn instead of failing if an opam file fails to parse. This opam file can
  still be used to define scope. (ocaml/dune#2023, @rgrinberg)

- Do not crash if unable to read a directory when traversing to find root
  (ocaml/dune#2024, @rgrinberg)

- Do not exit dune if some source directories are unreadable. Instead, warn the
  user that such directories need to be ignored (ocaml/dune#2004, fix ocaml/dune#310, @rgrinberg)

- Fix nested `(binaries ..)` fields in the `env` stanza. Previously, parent
  `binaries` fields would be ignored, but instead they should be combined.
  (ocaml/dune#2029, @rgrinberg)

- Allow "." in `c_names` and `cxx_names` (ocaml/dune#2036, fix ocaml/dune#2033, @rgrinberg)

rgrinberg added a commit to rgrinberg/opam-repository that referenced this pull request Apr 9, 2019

[new release] dune (1.9.0)
CHANGES:

- Warn when generated `.merlin` does not reflect the preprocessing
  specification. This occurs when multiple stanzas in the same directory use
  different preprocessing specifications. This warning can now be disabled with
  `allow_approx_merlin` (ocaml/dune#1947, fix ocaml/dune#1946, @rgrinberg)

- Watch mode: display "Success" in green and "Had errors" in red (ocaml/dune#1956,
  @emillon)

- Fix glob dependencies on installed directories (ocaml/dune#1965, @rgrinberg)

- Add support for library variants and default implementations. (ocaml/dune#1900,
  @TheLortex)

- Add experimental `$ dune init` command. This command is used to create or
  update project boilerplate. (ocaml/dune#1448, fixes ocaml/dune#159, @shonfeder)

- Experimental Coq support (fix ocaml/dune#1466, @ejgallego)

- Install .cmi files of private modules in a `.private` directory (ocaml/dune#1983, fix
  ocaml/dune#1973 @rgrinberg)

- Fix `dune subst` attempting to substitute on directories. (ocaml/dune#2000, fix ocaml/dune#1997,
  @rgrinberg)

- Do not list private modules in the generated index. (ocaml/dune#2009, fix ocaml/dune#2008,
  @rgrinberg)

- Warn instead of failing if an opam file fails to parse. This opam file can
  still be used to define scope. (ocaml/dune#2023, @rgrinberg)

- Do not crash if unable to read a directory when traversing to find root
  (ocaml/dune#2024, @rgrinberg)

- Do not exit dune if some source directories are unreadable. Instead, warn the
  user that such directories need to be ignored (ocaml/dune#2004, fix ocaml/dune#310, @rgrinberg)

- Fix nested `(binaries ..)` fields in the `env` stanza. Previously, parent
  `binaries` fields would be ignored, but instead they should be combined.
  (ocaml/dune#2029, @rgrinberg)

- Allow "." in `c_names` and `cxx_names` (ocaml/dune#2036, fix ocaml/dune#2033, @rgrinberg)

- Format rules: if a dune file uses OCaml syntax, do not format it.
  (ocaml/dune#2014, fix ocaml/dune#2012, @emillon)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.