Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Produce a proposal for composable workspaces #52

Open
acmcarther opened this issue Jun 11, 2018 · 13 comments
Open

Produce a proposal for composable workspaces #52

acmcarther opened this issue Jun 11, 2018 · 13 comments

Comments

@acmcarther
Copy link
Member

I'm not really sure how widely I'd like this scoped but the reality is that there are workspace-level projects that will depend on something like cargo-raze to synthesize a list of dependencies for use external to the workspaces themselves. These use cases will need a real answer to the composition problem.

Specific examples:
bazelbuild/rules_rust#92
pubref/rules_protobuf#226

In part, this will involve deferring generation of exact rust_library decls behind some sort of middle layer. This hits an issue intrinsic to the problem space though -- there isn't a perfect place in the Bazel build lifecycle for the kind of dependency resolution that Cargo likes to do. The hardest problems that I see are that some logic can be performed pre-loading phase (ala cargo-vendor), but some ought to happen once the target platform is known. Cargo's implementation of target platform handling requires parsing a target specification grammar which I was disinclined to duplicate (see CfgExpr in Cargo for details).

I discussed some of this in brief @ rust-lang/rfcs#2136 (comment) and throughout that issue.

@acmcarther
Copy link
Member Author

Yanking people who have some degree of stake in the issue here: @mfarrugi @damienmg. Feel free to respond later / ignore / add response emoji as appropriate.

@acmcarther
Copy link
Member Author

Prior art:
https://github.com/acmcarther/cargo-raze-examples/blob/8dba8be7d0ef1e71a4258a5d43c6808f884a7c7f/bazel/complicated_cargo_library/Cargo.bzl

This is a variant of the older cargo-raze logic that defers rule generation to later in the bazel pipeline. Instead of generating BUILD files, it generates bzl files that contain structs loaded with details about each crate.

@damienmg
Copy link
Member

@acmcarther My view on that is that cargo raze should emit flat list of dependency that use a custom crate_repository that would download the crate and generate the correct build file with dependency.

The bad thing with that solution is the duplicate external dependency line in the crates.bzl file that would contains multiple configuration but that is fine because bazel doesn't download them unless it needs them.

Finally you actually want to output all the configuration you can think of in the BUILD file so that bazel can do cross compilation if necessary (the host platform with remote execution can actually be different that the local platform).

Side-note about external deps on Bazel: the only thing missing to have everything running as part of the Bazel build would be recursive workspace.

@mfarrugi
Copy link
Collaborator

(I'm not confident in my understanding here, so take this with a grain of salt)

The current stopgap solution to this seems to be having canonical workspace names for crates, like https://github.com/bazelbuild/rules_go/blob/07ef9e307a8f0095e5b10a70e1d291eace65eb61/tests/integration/popular_repos/popular_repos.bzl#L21

This doesn't appear to let us handle version resolution / conflicts at all.. but lets us generate rust_library targets directly. We could create @crate_name_${major_version} and that gets us pretty far. It feels like #8 could be finagled to work with something like this as well.

@damienmg
Copy link
Member

FWIW #50 is already introducing the use of native.existing_rules to avoid a package being loaded twice.

Raze convention raze__<crate>__<major>_<minor>_<patch> seens to work fine and the crates.bzl file is creating convenience aliases.

The issues I see:

@acmcarther
Copy link
Member Author

acmcarther commented Jun 13, 2018

@damienmg:

There is a particular problem I'm trying to highlight in this issue that's separate from the physical crates being pulled down. Maybe a table helps.

Stage Current Cargo-Raze Hypothetical Composable Cargo-Raze Cargo
Declaration Cargo.toml Cargo.tomls? Cargo.tomls
Version Resolution BUILD Something.bzl Cargo.lock
Resolved Features + Optional Deps BUILD Late genn'd BUILD? in-process of cargo

Legend (simplified example)
Declaration: mio = "^0.6.4"
Version Resolution: "mio @ ['0.6.4', '0.5.1'], fuchsia-zircon @ ['0.3.3']"
Resolved Features + Optional Deps: "mio['0.6.4'] { deps: ["fuchsia-zircon['0.3.3']"], features: ['with-deprecated'] }"


What we have now is basically equivalent to trying to compose bundles of lock files plus Cargo in-process state together. It can be done, but it's brittle and any two "bazel level dependencies" that share common crates will not play nice together. Of particular note, if you want workspace level players to interact together, we'll need to be thinking about workspace-level interaction at what I called Declaration above, or something between Declaration and Version Resolution.

I think this can be done in a user friendly way, but the trick is puzzling it into Bazel's lifecycle. Something I'm thinking about and tracking here.


All of that said, we might already all be on the same page here and I'm just being stubborn. LMK if that's the case.

@acmcarther
Copy link
Member Author

acmcarther commented Jun 13, 2018

As an extension of that:

The historical reason why "cargo_crate" hasn't existed is because it requires non-local state to be stored somewhere (in the form some kind of "registry"). As far as Cargo has defined them -- or at least how I understand Cargo to define them -- Crates can't exist in a vacuum. Their dependencies can only be resolved in some larger context where the search space of possible dependency versions can be narrowed (though the trivial case is that you just want a crate and that crate has no siblings; this allows you to pick deps but the root crate isnt guaranteed to interoperate with other cargoy-crates later on in any fashion). A sort of "second pass" later happens to determine which dependencies are actually real and which feature flags need to be set.

EDIT: For grammar and punctuation.

EDIT: As a non-emaily addendum: When I say "registry" I don't mean "crates.io". I mean the full list of workspace (and depended-upon-workspace) wanted crates must be known at some early point in the build process, and that information needs to live in a blob that I call a "registry".

@damienmg
Copy link
Member

I think we are on the same page. I basically want to move to your second column now by having a "cargo_crate" repository that raze would generate and that simply download the crate and create the corresponding build file with the dependency it has constructed. I have a prototype almost ready for doing that.

We could actually do all the version resolution in the same way by shelling out to cargo but that's a bit ugly and would integrate badly with Bazel because we would lose a lot of caching (downloading all the deps for all the configurations, whatever we need to use). There is a proposal for separating those two phases that with recursive workspace would make that whole integration nice but there are not there and doesn't seem close either.

@damienmg
Copy link
Member

By the way, I drafted a prototype to address bazelbuild/rules_rust#2 at bazelbuild/rules_rust#100 :)

@acmcarther
Copy link
Member Author

acmcarther commented Jun 21, 2018

The discussion here is fragmentary and I've shotgunned the same statement across several issues. I'll reiterate it here.

Any cargo_crate build rule that functions outside of the most simple use case is going to need a lot of params and it'll need to delegate a lot of work to the person or system figuring out what those params are.

By itself, cargo_crate can't do enough work to pull its own weight for an automatic caller such as cargo_raze because there are too many BUILD-related attrs that can't be determined in isolation and skylark doesn't appear to provide a way for independent rule declarations to coordinate to facilitate determining those things internally. It also isn't useful for someone manually editing their WORKSPACE because they can't hope to know things like "which features do my upstream crates expect me to have" or "which target-specific flags do I need to use".

I'm open to being shown to be wrong here and willing to drop what I use on the floor and use something better but I hope I've made my apprehensions about the corner cases clear at this point.


That aside: Two proposals for #52 (supporting cross WORKSPACE dependencies):

I'm thinking that support is a kind of "all or nothing" thing. Specifically:

  1. "All": Provide a toml-like declaration that WORKSPACEs can export.

Perform cargo-like resolution for the tree of WORKSPACEs (potentially using synthetic Cargo workspaces), culminating in a final lock which can then be processed by a raze-like system. This approach requires adding a manner for workspaces to address crates that they expect to exist ala Cargo.toml.

EDIT: The synthetic workspace approach is limited by lack of support for nested cargo workspaces, see rust-lang/cargo#5042

  1. "Nothing": Have "dependency" workspaces take an arg to their repository_rule (that all WORKSPACEs expose by convention anyway) that specifies paths to dependencies and generates a workspace proxy that the workspace can use to access its deps. This could also be declared in the dependency workspace's WORKSPACE file, effectively allowing the workspace to function in isolation or as a dep. Example:

WORKSPACE

# When used as a root workspace alias to local raze
  native.new_local_workspace(
    name = "something_unique__cargo_deps",
    path = "third_party/cargo-deps"
    BUILD = "alias(name="libc", actual="@//third_party/cargo:libc")"

dependencies.bzl

# When used as a dependency use provided paths
def load_repositories(cargo_deps):
  native.new_local_workspace(
    name = "something_unique__cargo_deps",
    path = "third_party/cargo-deps"
    BUILD = """
      alias(name="libc", actual="{0}"),
""".format(cargo_deps["libc"])

some.BUILD

rust_library(
  name = "something_i_wrote",
  srcs = "lib.rs"
  deps = [
    "@something_unique__cargo_deps//:libc"
  ]

Notably, neither approach is completely functional until recursive workspaces are fully supported as any WORKSPACEs intended to be used as a dependency would not be able to specify dependencies of its own. Root cause: repository rules cannot invoke load -- which is necessary to pull downstream functions up after the first jump.

@damienmg
Copy link
Member

Alex, I might be wrong but I see the feature functionality as pretty similar to Bazel constraints system. We could emit BUILD file that have those feature and express which dep+feature which target+feature as Bazel constraints. In my current view, cargo raze would emit the deps list as cargo_crate rules that would just download cargo files and create the good BUILD file. Ideally the cargo_crate rule is actually able to

The fuzziness still remains about how it should behave if you override dependencies in your Cargo.toml, mostly because I am myself unaware on how Cargo do when you depend on such dependency.

Sorry I wasn't able to join over VC to discuss more, I'll reschedule next week (will be around 10pm MUC Time).

@johnedmonds
Copy link
Contributor

I found my way here from the two rust protobuf PRs. Is there any update on this?

@acmcarther
Copy link
Member Author

@damienmg and I met some time in June to discuss this, but it was mostly just synchronizing on the idiosyncrasies of Cargo and we did not come to any decision on how to approach this problem.

It seems likely that in the short term we'll punt on a "good" solution to this problem. Instead, I suspect what will happen is that we'll provide a way for the required dependency to be provided by the user while prepackaging a fallback impl with an already resolved dependency tree. Users would be expected to override the dependency-of-interest if they ever bring it into their workspace separately to avoid the confusing "multiple independent dependency resolutions" issue described several comments up.

My understanding is that all of this is blocked on cargo-raze generating outputs for multiple platforms simultaneously though. This is moving forward now and can be followed in the cargo-raze PR where some discussion is ongoing

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants