Unclear documentation on how workers treat tool inputs and non-tool inputs #10498

endobson · 2019-12-29T21:57:34Z

ATTENTION! Please read and follow:

if this is a question about how to build / test / query / deploy using Bazel, or a discussion starter, send it to bazel-discuss@googlegroups.com

if this is a bug or feature request, fill the form below as best as you can.

Description of the problem / feature request:

With a custom worker, each action can reuse an existing worker process or start a new one. It seems that the current logic doesn't allow specifying what inputs are needed by the worker before it is able to accept requests and what is needed only for handling individual requests. Thus while if code that is used by the worker at startup is changed it means that the old worker processes cannot be safely reused, bazel currently assumes that worker processes can be reused.

Inputs vs tools is not a sufficient distinction here as that is currently for host vs target configuration and a rule may not need the worker restarted if a tool is changed.

This leads to bugs that are able to be fixed by a bazel clean, (which seems to shutdown old workers).

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

I don't have one. You can try taking, endobson/minimal-racket@3ad8f44 and mucking with the tools to make it fail to compile and then revert the changes. You should still see issues after reverting the changes, but before running bazel clean.

What operating system are you running Bazel on?

Mac OS

What's the output of `bazel info release`?

release 2.0.0

What's the output of `git remote get-url origin ; git rev-parse master ; git rev-parse HEAD` ?

This question really needs to be reworded as only applying if bazel is a git version, and that you want this run in the development version of the bazel directory.

Have you found anything relevant by searching the web?

Yes that the documentation for this feature (workers) is non-existent and has been so for years, and fixing this has been a P1 for over 6 months. #7997

The text was updated successfully, but these errors were encountered:

rupertks · 2020-01-02T16:52:12Z

@jmmv

jmmv · 2020-01-09T16:15:10Z

Could you provide some example with specific kinds of inputs you are talking about? After a quick look, I cannot understand the problem you are describing. Thanks.

endobson · 2020-01-09T16:43:14Z

I linked the code that is for running the workers, endobson/minimal-racket@3ad8f44. The worker executable reads some of the inputs before it starts accepting requests. These are the libraries it uses in its implementation, as the language runtime I'm using doesn't produce a single native executable. If these change then the worker should produce different results, but that will only happen if the worker process gets restarted, which it doesn't (as that is the point of persistent workers).

This means that using an existing worker versus using bazel clean to shut down the workers produce different results.

endobson · 2020-01-09T16:56:14Z

Maybe here is a better question: Some time when rerunning an action with a persistent worker the worker will need to be restarted due to the worker's code changing. Which inputs correspond to that versus just running the action on the existing persistent worker process?

jmmv · 2020-01-09T18:10:41Z

Ah, I see what you mean now. So you are talking about the dependencies of the worker program itself. How do these change during the build though? I'm surprised that's possible.

endobson · 2020-01-09T18:29:08Z

They don't change during regular operation. But when first writing the worker program and debugging issues with it, such changes were very common.

jmmv · 2020-01-09T18:45:18Z

Ah, I see!

But... it's pretty much impossible for Bazel to know what the dependencies of an arbitrary binary are, and having a way to manually specify them seems pretty error prone. We could monitor for changes to the binary itself (and we probably should if we don't already), but that doesn't help if dynamic libraries change for example.

Do you have a specific proposal other than the simplistic suggestion above?

endobson · 2020-01-09T18:49:37Z

I want documentation telling me how this is supposed to work, and fairly big warnings on how this can cause inconsistent builds, mechanisms to detect it/avoid it.

For example would having all the dependencies as part of the runfiles of the target used as 'executable' have solved this? Thats doable for me, and seems like a fairly easy to explain rule.

meisterT · 2020-08-14T11:24:02Z

cc @larsrc-google

larsrc-google · 2020-08-14T14:14:44Z

Documentation for persistent workers is now available at https://docs.bazel.build/versions/master/persistent-workers.html, though it doesn't address this particular problem.

I don't think adding the dependencies as runfiles would help at the moment. The tools are by and large treated as black boxes - they could be executables created by a third party that doesn't want to give you their source. They are not expected to be Bazel targets. However, if the WorkerKey changes, new workers are created. So if you can somehow update the arguments or environment on recompile, that would get picked up automatically.

Alternatively, when working on the workers themselves, use the --worker_quit_after_build flag.

larsrc-google · 2021-04-08T11:29:02Z

Reading through this again, it seems to be a bigger issue than just workers. If I understand correctly, your action runs a binary that loads some dynamic library, but this library is not declared as part of the runfiles (or some of its further dependencies aren't declared). But that would mean that your build, regardless of use of workers, is not properly hermetic, since a change that Bazel is not informed of can change what the result is.

endobson · 2021-04-08T17:09:41Z

All of my tools are intended to be hermetic and I use bazel to generate them. The underlying racket vm/runtime is an external repository in my WORKSPACE and I'm using toolchains to find them for the rules (https://github.com/endobson/minimal-racket/blob/master/WORKSPACE).

What I think you are referring to as the dynamic library is the bazel-tools/racket-compiler library that I use here: https://github.com/endobson/minimal-racket/blob/3ad8f443f517b1ece4a1e7bbd1b8ee2b37b7919f/racket.bzl#L93

That library is intended to be declared as a dependency of the action here: https://github.com/endobson/minimal-racket/blob/3ad8f443f517b1ece4a1e7bbd1b8ee2b37b7919f/racket.bzl#L93 through the toolchain https://github.com/endobson/minimal-racket/blob/3c95378d2c5379225fd8a233988773d97c9c4fc4/BUILD#L29.

IIRC when originally filing this issue I observed that changes to (https://github.com/endobson/minimal-racket/blob/3ad8f443f517b1ece4a1e7bbd1b8ee2b37b7919f/build_rules/racket-compiler.rkt) would make bazel rebuild my resulting code if I wasn't using workers but would not do the same if I was.

larsrc-google · 2021-04-08T17:45:58Z

After chatting, we found that the four lines from https://github.com/endobson/minimal-racket/blob/3ad8f443f517b1ece4a1e7bbd1b8ee2b37b7919f/racket.bzl#L108 should go into tools instead of inputs. @endobson will try this out.

endobson · 2021-04-09T03:42:44Z

Haven't yet got to seeing if tools versus inputs will solve my particular problem, but I'm fairly confident we correctly identified the issue. But I spent some more time thinking about it and I think the original point that the distinction between deps and tools is not sufficient for workers solely given their definitions. But it might be made sufficient with documentation on how workers treats them.

I'm going to say that the environment/input that the worker is executed in/on is split into two parts, the components that are to be reused/cached across requests (e.g. worker startup flag, interpreted code) and those that will be different between requests (e.g. filename to compile, file contents of file to compiles). These may take the form of flags or files in the filesystem. It is important for correctness that the worker to be restarted every time one of its reusable inputs changes/for it not to treat the inputs that bazel thinks are changeable as reusable. Thus the specification of what bazel thinks is which is very important for a tool author to know. And I cannot find that in the public documentation:

https://docs.bazel.build/versions/master/persistent-workers.html#implementing-persistent-workers Says that the worker key is based on the startup flags and mnemonic but doesn't mention files.
https://docs.bazel.build/versions/master/creating-workers.html#work-action-requirements Doesn't have much to say on the topic.

The internal implementation seems to use only the tools in computing the hash:

bazel/src/main/java/com/google/devtools/build/lib/worker/WorkerFilesHash.java

Line 60 in f6e1074

List<ActionInput> tools =

This seems like a very reasonable decision because tools correspond to those built for the execution platform where as inputs correspond to things built for the target platform. But the difference between those fields is the platform for which they are built, and I think that if persistent workers are going to use tools for as part of the worker key then documenting that is the right solution. That doesn't let a worker mark target platform dependencies as ones where it wants to be rebuilt, but I cannot find a good use case for that and it can be handled by the worker restarting itself internally when the hash changes on a new WorkRequest.

irengrig added team-Local-Exec Issues and PRs for the Execution (Local) team untriaged labels Jan 2, 2020

jmmv added the more data needed label Jan 9, 2020

jmmv added P3 We're not considering working on this, but happy to review a PR. (No assignee) type: documentation (cleanup) and removed more data needed untriaged labels Jan 16, 2020

larsrc-google self-assigned this Apr 9, 2021

keertk added the team-Documentation Documentation improvements that cannot be directly linked to other team labels label Jan 10, 2023

larsrc-google changed the title ~~Workers allow inconsistent builds~~ Unclear documentation on how workers treat tool inputs and non-tool inputs Apr 26, 2023

larsrc-google assigned wilwell and unassigned larsrc-google Jul 4, 2023

crt-31 mentioned this issue Jul 11, 2023

call close() on compiler in scalacWorker after each job bazelbuild/rules_scala#1504

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Unclear documentation on how workers treat tool inputs and non-tool inputs #10498

Unclear documentation on how workers treat tool inputs and non-tool inputs #10498

endobson commented Dec 29, 2019

rupertks commented Jan 2, 2020

jmmv commented Jan 9, 2020

endobson commented Jan 9, 2020

endobson commented Jan 9, 2020

jmmv commented Jan 9, 2020

endobson commented Jan 9, 2020

jmmv commented Jan 9, 2020

endobson commented Jan 9, 2020

meisterT commented Aug 14, 2020

larsrc-google commented Aug 14, 2020

larsrc-google commented Apr 8, 2021

endobson commented Apr 8, 2021

larsrc-google commented Apr 8, 2021

endobson commented Apr 9, 2021

Unclear documentation on how workers treat tool inputs and non-tool inputs #10498

Unclear documentation on how workers treat tool inputs and non-tool inputs #10498

Comments

endobson commented Dec 29, 2019

Description of the problem / feature request:

Bugs: what's the simplest, easiest way to reproduce this bug? Please provide a minimal example if possible.

What operating system are you running Bazel on?

What's the output of bazel info release?

What's the output of git remote get-url origin ; git rev-parse master ; git rev-parse HEAD ?

Have you found anything relevant by searching the web?

rupertks commented Jan 2, 2020

jmmv commented Jan 9, 2020

endobson commented Jan 9, 2020

endobson commented Jan 9, 2020

jmmv commented Jan 9, 2020

endobson commented Jan 9, 2020

jmmv commented Jan 9, 2020

endobson commented Jan 9, 2020

meisterT commented Aug 14, 2020

larsrc-google commented Aug 14, 2020

larsrc-google commented Apr 8, 2021

endobson commented Apr 8, 2021

larsrc-google commented Apr 8, 2021

endobson commented Apr 9, 2021

What's the output of `bazel info release`?

What's the output of `git remote get-url origin ; git rev-parse master ; git rev-parse HEAD` ?