Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docs: give advice on project structure to avoid unnecessary 'load' dependencies #12835

Open
meteorcloudy opened this issue Jan 15, 2021 · 16 comments
Assignees
Labels
P2 We'll consider working on this in future. (Assignee optional) team-Documentation Documentation improvements that cannot be directly linked to other team labels team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website type: feature request

Comments

@meteorcloudy
Copy link
Member

meteorcloudy commented Jan 15, 2021

Bazel only downloads external dependencies that are needed for building your target? That's not exactly true.

Problem

Bazel fetches unneeded dependencies due to load statements.

Reproduce

Considering the following example:

Repo A is your main repo, which depends on some library in repo B. Repo B depends on a test rule in repo C for testing the its library.

In A/BUILD:

cc_binary(
    name = "bin",
    srcs = ["main.cc"],
    deps = ["@B//:lib"]
)

In B/BUILD:

load("@C-dev//:def.bzl", "my_cc_test")

cc_library(
    name = "lib",
    srcs = ["lib.cc"],
)

my_cc_test(
    name = "lib_test",
    srcs = ["lib_test.cc"],
    deps = [
        ":lib",
        "@C-dev//:test",
    ],
)

Although C is only a dev dependency of B, in order to build //:bin A has to declare C as a dependency in WORKSPACE and Bazel will fetch C at build time.

The root cause is load("@C-dev//:def.bzl", "my_cc_test") being in the same package as @B//:lib, in the loading phase, Bazel has to fetch C to correctly parse the BUILD file.

A full repo case is here: https://github.com/meteorcloudy/my_tests/tree/master/dev_deps_test

@meteorcloudy meteorcloudy added P2 We'll consider working on this in future. (Assignee optional) type: feature request labels Jan 15, 2021
@lberki
Copy link
Contributor

lberki commented Jan 15, 2021

One does need to fully evaluate the BUILD file to interpret even @cc/:lib: it's possible that my_cc_test() is a macro that creates a rule called lib.cc.

@meteorcloudy
Copy link
Member Author

That's a good point. In that case, C is indeed a valid dependency. But I'm wondering how (or if) we could break the dependency when @cc//:lib doesn't need my_cc_test(). This is obviously not an easy task.

@lberki
Copy link
Contributor

lberki commented Jan 15, 2021

Not that it makes me happy, but that's what the state of affairs is :( and this is not the only thing that makes partial evaluation of packages difficult; see Package.beforeBuild() and .afterBuild() for the full laundry list.

@Wyverald
Copy link
Member

I think Lukács means that, in order to figure out whether @cc//:lib even needs my_cc_test(name="lib_test"), we need to evaluate my_cc_test(name="lib_test"). So essentially we almost always need to evaluate everything in the same BUILD file.

That is, of course, assuming we don't do something more sophisticated / more strictly enforced, like @comius suggested before -- enforce some sort of structure to the labels of outputs and targets generated by macros. Not sure how feasible that would be.

@meteorcloudy
Copy link
Member Author

I think Lukács means that, in order to figure out whether @cc//:lib even needs my_cc_test(name="lib_test"), we need to evaluate my_cc_test(name="lib_test"). So essentially we almost always need to evaluate everything in the same BUILD file.

Oh, I see. That just makes this problem even harder than I thought...

@alandonovan
Copy link
Contributor

alandonovan commented Jan 15, 2021

Another way to describe the status quo:

"Load statements create dependencies. If you depend on a package, you depend on all the .bzl files it transitively loads. Rule-set developers are encouraged not to add heavyweight .bzl dependencies (for example, to other repositories) to the public packages of their repositories if they are needed only during rule-set development. Such dependencies should be relegated to subpackages that are not typically loaded by ordinary users."

It's worth noting that, if a rule-set developer is wedded to the existing structure, they can move the tests into a tests/BUILD subpackage and declare aliases (or any other label-based indirect dependency) to those tests in the original package.

ADDED: This approach is similar to that used when testing an ordinary program in Go or Java: tests do not live in the same unit as the production code, they live in a separate unit that depends on the code under test, and which may have additional dependencies (possibly with some relaxation of the usual visibility rules).

@laurentlb
Copy link
Contributor

Possible duplicate: #8466

(I agree with Alan that the package should be split)

@alandonovan
Copy link
Contributor

alandonovan commented Jan 15, 2021

Does anyone think this is an actual bug that requires a behavior change? If not, let's make this a documentation issue.

@meteorcloudy
Copy link
Member Author

meteorcloudy commented Jan 15, 2021

@alandonovan @laurentlb
Thanks for adding the context!

I don't think this is a bug (I added the feature request label), but it's an issue that Bazel users would easily fall into. Better documentation could definitely help.

I'm also thinking about how could we improve the situation with bzlmod. Maybe we can have optional dependencies. Users can specify some dependencies are only required for testing or enabling some specific feature (eg. the gazelle plugin in skylib). Then the module authors have to ensure the targets they normally expose to users still builds without even declaring those dependencies. This would force them to separate the targets into different packages.

@lberki
Copy link
Contributor

lberki commented Jan 15, 2021

You are saying "force them to separate targets into different packages" as if that was a bad thing. It looks like the desired behavior is implementable without extra features in Bazel, so why add those extra features?

@alandonovan alandonovan changed the title load statements force Bazel to download unneeded external dependencies docs: give advice on project structure to avoid unnecessary 'load' dependencies Jan 15, 2021
@alandonovan
Copy link
Contributor

I think Lukacs' point (despite a small misunderstanding) is that we already have a solution that doesn't add more concepts to the design, and therefore it is preferable.

@comius
Copy link
Contributor

comius commented Jan 15, 2021

You are saying "force them to separate targets into different packages" as if that was a bad thing. It looks like the desired behavior is implementable without extra features in Bazel, so why add those extra features?

Well, shuffling the effort to the user is probably not a good thing. The user might be unfamiliar with Bazel and would not know when and why to split the BUILD files. Additionally there might be lack of motivation to fix things: The user might not have any problem with own builds, however still cause problems downstream.

Splitting BUILD files seems just a solution to make labels injective, as seen by Bazel. Solution for a problem that we're causing with current labeling schema.

@meteorcloudy
Copy link
Member Author

You are saying "force them to separate targets into different packages" as if that was a bad thing.

I certainly doesn't mean that's a bad thing. Instead of expecting users to read every documentation and best practice then follow them, a way to enforce this is more helpful. Hope that's clear in the context.

It looks like the desired behavior is implementable without extra features in Bazel, so why add those extra features?

Yeah, I meant if changing the behaviour of load statements is not favored, we can still improve the situation with bzlmod.

@brandjon
Copy link
Member

Was linked here from a discussion on modifying load semantics. I see this thread seems to be more about documentation so I'll remove the starlark label.

@meteorcloudy
Copy link
Member Author

meteorcloudy commented Jan 18, 2021

It's worth noting that, if a rule-set developer is wedded to the existing structure, they can move the tests into a tests/BUILD subpackage and declare aliases (or any other label-based indirect dependency) to those tests in the original package.

Another problem of splitting the binary and test into different packages is that it's against conventions in many languages:

Go: "Put your tests alongside your code in the same directory in a file called file_test.go where "file" is the name of the source code file you're testing. This is convention and I've found it to be best in my own experience."

@alandonovan
Copy link
Contributor

alandonovan commented Jan 19, 2021

Another problem of splitting the binary and test into different packages is that it's against conventions in many languages...

It's worth noting that the number of targets loaded by Blaze is quite significantly increased by the colocation of tests with production code. Many tests, at least in google3, have enormous data dependencies, enumerated by 'glob', that are entirely irrelevant when the BUILD package is being loaded only for a single cc_library. Furthermore, colocation also increases spurious reloads and automated test re-runs. Moving all tests into a subdirectory or parallel tree (as in Java) would be a major optimization.

Instead of expecting users to read every documentation and best practice then follow them, a way to enforce this is more helpful.

One could easily inform users and enforce good habits with a linting tool that detects when a file loads from another repo but uses the loaded symbols only for tests. A language change is not necessary.

@jin jin added the team-Documentation Documentation improvements that cannot be directly linked to other team labels label Jan 22, 2021
@sgowroji sgowroji added the team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website label Nov 22, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P2 We'll consider working on this in future. (Assignee optional) team-Documentation Documentation improvements that cannot be directly linked to other team labels team-OSS Issues for the Bazel OSS team: installation, release processBazel packaging, website type: feature request
Projects
None yet
Development

No branches or pull requests

9 participants