Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Treat failure to resolve a toolchain as if it were an incompatible target for skipping in ... expansion. #12419

Open
aiuto opened this issue Nov 4, 2020 · 17 comments
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Configurability Issues for Configurability team type: feature request

Comments

@aiuto
Copy link
Contributor

aiuto commented Nov 4, 2020

Incompatible target skipping will skip targets that can not be built for the specified destination platform.
There is also a need to be able to skip targets which can not be built on the execution platform. For example, because a test might depend on an OS native tool which makes no sense on any other platform, or a compiler is only licensed to a few users in an organization.

It would be convenient if failure to resolve a toolchain (usually because one was not registered) as a skippable target.

TBD: Bring in more context from the mail thread which sprouted this idea.

cc: @gregestren @AustinSchuh @philsc @katre

@aiuto aiuto added P2 We'll consider working on this in future. (Assignee optional) team-Configurability Issues for Configurability team labels Nov 4, 2020
@philsc
Copy link
Contributor

philsc commented Nov 12, 2020

I'm admittedly more of a toolchains newbie than I'd like to be. I'll do some reading to better understand what it means to have missing toolchain etc.

At a high level I can definitely see that being a desirable feature for the reasons you listed.

Quoting @aiuto from the e-mail chain:

Proposal:
For ... resolution, failure to find a toolchain would be skip just
like target_compatible_with. You would, however, get the warning
that the toolchain was not found. If you try to build/test a specific
target, however, it still fails with an error.

Pro:

  • easy to understand.
  • easier for the rule author to get the toolchain specs correct
  • no overloading target_compatible_with with failure to have the
    tools at the executable side.

Con:

  • if you had tests which required their own unique toolchain you
    may skip the tests on some platforms.

@aiuto
Copy link
Contributor Author

aiuto commented Nov 12, 2020

The ' easier for the rule author to get the toolchain specs correct' requires some explanation.

In bazelbuild/rules_pkg#254, I am wrapping a linux only tool (rpmbuild) in a toolchain.
In order to get ... skipping to work on platforms where rpmbuild can not be found, I had to create a no-op toolchain which resolved, but contained an attribute saying it was invalid. I exposed the validity as a constraint which we could select() on in a target_compatible_with clause.

It is not a huge amount of work, but you have to see the "trick" to doing it that way, To me, that means it is too hard.

@AustinSchuh
Copy link
Contributor

There's something to the idea to explore for sure. Here is another use cases that exists today:

  • Bazel can't guarentee that the android SDK/NDK is installed when it is being built. It has workspace entries which the user un-comments to enable the toolchains. If the user doesn't un-comment those toolchains, there is a build failure.
    -> This proposal would skip the android targets if there is no compiler available for android.

Honestly, that sounds pretty helpful. Likely that should have the same set of constraints that Phil added to incompatible target skipping. If the user requests it through //..., no error and print skipped. If the user explicitly requests an android target in that case, error out and complain.

@aiuto
Copy link
Contributor Author

aiuto commented Nov 12, 2020

That's a perfect example. We see this all the time with mobile apps and a shared code base.

  • the iOS developers don't have any android SDKs installed
  • the Android developers don't have xcode installed
  • they both want to be able to say bazel test ... and see the tests for their platform.

@philsc
Copy link
Contributor

philsc commented Nov 23, 2020

While not helpful for resolving this issue, I did mention this issue in the Filtering incompatible targets in Starlark proposal. My naive thought is that skipping targets with a missing toolchain can also be expressed via a Starlark-accessible provider. That becomes relevant in that proposal.

@gregestren
Copy link
Contributor

Would this work as a current workaround?

  1. Define a default "empty" toolchain that's compatible with everything.
  2. For rule implementations using that toolchain:
if toolchain="empty toolchain":
   return [IncompatiblePlatformProvider()]

@philsc
Copy link
Contributor

philsc commented Jan 5, 2021

@gregestren , the IncompatiblePlatformProvider class is currently not exposed to Starlark so that approach unfortunately doesn't work at the moment. However, I think that's a really neat, simple idea!

@github-actions
Copy link

Thank you for contributing to the Bazel repository! This issue has been marked as stale since it has not had any activity in the last 2+ years. It will be closed in the next 14 days unless any other activity occurs or one of the following labels is added: "not stale", "awaiting-bazeler". Please reach out to the triage team (@bazelbuild/triage) if you think this issue is still relevant or you are interested in getting the issue resolved.

@github-actions github-actions bot added the stale Issues or PRs that are stale (no activity for 30 days) label Apr 29, 2023
@aiuto
Copy link
Contributor Author

aiuto commented Apr 29, 2023

Not really stale. We should probably get this effect somehow.

@sgowroji sgowroji removed the stale Issues or PRs that are stale (no activity for 30 days) label Apr 29, 2023
@philsc
Copy link
Contributor

philsc commented May 3, 2023

@gregestren , do you have any cool ideas on a good way to accomplish this? I assumed it couldn't be too bad. I think I almost got there with this branch: master...philsc:bazel:unreviewed/philsc/fix-12419

But the test encounters this error:

$ bazel test -c opt //src/test/shell/integration:target_compatible_with_test --test_output=streamed --test_filter=test_incompatible_with_missing_toolchain
...
FATAL: bazel crashed due to an internal error. Printing stack trace:
java.lang.IllegalStateException: Unexpected exception: dep Dependency{label=//target_skipping:compiler_flag, configuration=1882c65bed561e30d863f6ecf00a6c3357d15ac54eca64e00aa2a71ff790ba77, aspects=AspectCollection{[]}, transitionKeys=[], executionPlatformLabel=null} had null value, even though there were no values missing in the initial fetch. That means it had an unexpected exception type (not ConfiguredValueCreationException)
        at com.google.devtools.build.lib.bugreport.BugReport.sendBugReport(BugReport.java:183)
        at com.google.devtools.build.lib.bugreport.BugReport.logUnexpected(BugReport.java:154)
        at com.google.devtools.build.lib.skyframe.PrerequisiteProducer.resolveConfiguredTargetDependencies(PrerequisiteProducer.java:949)
        at com.google.devtools.build.lib.skyframe.PrerequisiteProducer.computeDependencies(PrerequisiteProducer.java:740)
        at com.google.devtools.build.lib.skyframe.PrerequisiteProducer.evaluate(PrerequisiteProducer.java:349)
        at com.google.devtools.build.lib.skyframe.ConfiguredTargetFunction.compute(ConfiguredTargetFunction.java:203)
        at com.google.devtools.build.skyframe.ParallelEvaluator.bubbleErrorUp(ParallelEvaluator.java:422)
        at com.google.devtools.build.skyframe.ParallelEvaluator.waitForCompletionAndConstructResult(ParallelEvaluator.java:211)
        at com.google.devtools.build.skyframe.ParallelEvaluator.doMutatingEvaluation(ParallelEvaluator.java:177)
        at com.google.devtools.build.skyframe.ParallelEvaluator.eval(ParallelEvaluator.java:672)
        at com.google.devtools.build.skyframe.InMemoryMemoizingEvaluator.evaluate(InMemoryMemoizingEvaluator.java:177)
        at com.google.devtools.build.lib.skyframe.SkyframeExecutor.configureTargets(SkyframeExecutor.java:2276)
        at com.google.devtools.build.lib.skyframe.SkyframeBuildView.configureTargets(SkyframeBuildView.java:343)
        at com.google.devtools.build.lib.analysis.BuildView.update(BuildView.java:440)
        at com.google.devtools.build.lib.buildtool.AnalysisPhaseRunner.runAnalysisPhase(AnalysisPhaseRunner.java:242)
        at com.google.devtools.build.lib.buildtool.AnalysisPhaseRunner.execute(AnalysisPhaseRunner.java:140)
        at com.google.devtools.build.lib.buildtool.BuildTool.buildTargets(BuildTool.java:182)
        at com.google.devtools.build.lib.buildtool.BuildTool.processRequest(BuildTool.java:529)
        at com.google.devtools.build.lib.buildtool.BuildTool.processRequest(BuildTool.java:497)
        at com.google.devtools.build.lib.runtime.commands.BuildCommand.exec(BuildCommand.java:103)
        at com.google.devtools.build.lib.runtime.BlazeCommandDispatcher.execExclusively(BlazeCommandDispatcher.java:625)
        at com.google.devtools.build.lib.runtime.BlazeCommandDispatcher.exec(BlazeCommandDispatcher.java:240)
        at com.google.devtools.build.lib.server.GrpcServerImpl.executeCommand(GrpcServerImpl.java:550)
        at com.google.devtools.build.lib.server.GrpcServerImpl.lambda$run$1(GrpcServerImpl.java:614)
        at io.grpc.Context$1.run(Context.java:566)
        at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(Unknown Source)
        at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(Unknown Source)
        at java.base/java.lang.Thread.run(Unknown Source)
------------------------------------------------------------------------
test_incompatible_with_missing_toolchain FAILED: Bazel build failed unexpectedly..

Something is unhappy about me trying to propagate the ToolchainException up higher. It's looking for a ConfiguredValueCreationException instead.

Anyway, if you have any ideas, please let me know :)

@gregestren
Copy link
Contributor

I'll need to refresh my memory on this. I'll schedule time for me and @katre and @aiuto to review the issue again.

@aiuto aiuto added P3 We're not considering working on this, but happy to review a PR. (No assignee) and removed P2 We'll consider working on this in future. (Assignee optional) labels May 5, 2023
@katre
Copy link
Member

katre commented May 5, 2023

We chatted about this: the underlying idea that targets which fail toolchain resolution should be marked incompatible instead of failing the entire build makes sense. It's unclear to me now if that should always be true (I'd really prefer to not add Yet Another Flag), but it makes sense as a baseline.

@philsc Your change looks reasonable at a glance. You need to either wrap the ToolchainException in a ConfiguredValueCreationException (see an example) or handle it directly and not re-throw anything (possibly by returning the incompatible configured target).

I haven't done a deep review of the code, let me know when you have a PR ready.

@philsc
Copy link
Contributor

philsc commented May 16, 2023

Sounds good. Thanks @katre . I'm going to try a few things.

@gregestren
Copy link
Contributor

Debugging more, I'm still unsure what's triggering that error. Two followup thoughts:

  1. Note the expressed failing target is //target_skipping:compiler_flag. What happens if you modify the test to build that directly?
  2. Does the error still occur if you catch the ToolchainException as your code already is and comment out the code that returns the incompatible provider? i.e. catch the exception and re-throw it just as the pre-existing code does, would that still trigger this error?

@fmeum
Copy link
Collaborator

fmeum commented Jun 27, 2023

Since the planned behavior is another situation in which an error (toolchain resolution failed) is effectively silenced (target is skipped), I would prefer to have this configurable.

@katre I formulated a concept for a similar setting that doesn't involve yet another flag here: #18707 (comment). I would be interested in hearing your thoughts on this.

@philsc
Copy link
Contributor

philsc commented Jul 3, 2023

I believe that whatever solution manifests for #18707 should solve the same concern for this, correct @fmeum ?

@fmeum
Copy link
Collaborator

fmeum commented Jul 3, 2023

@philsc Yes, I think so. The would ideally also be configurable on a per-repo level.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
P3 We're not considering working on this, but happy to review a PR. (No assignee) team-Configurability Issues for Configurability team type: feature request
Projects
None yet
Development

No branches or pull requests

8 participants