From d1fd6ab3ccca8272a42c1d023ad76a846f392142 Mon Sep 17 00:00:00 2001 From: Googler Date: Mon, 1 Aug 2022 02:31:53 -0700 Subject: [PATCH] Clarify that the JSON worker protocol uses camel-case and update the examples to match. PiperOrigin-RevId: 464490891 Change-Id: I579ce4cf11fe9a85b644fe5e747b7c6dd45fcf3a --- site/en/docs/creating-workers.md | 187 ++++++++++++++++------------- site/en/docs/persistent-workers.md | 24 ++-- 2 files changed, 118 insertions(+), 93 deletions(-) diff --git a/site/en/docs/creating-workers.md b/site/en/docs/creating-workers.md index 59414091d0b38d..9ad25593bf929f 100644 --- a/site/en/docs/creating-workers.md +++ b/site/en/docs/creating-workers.md @@ -3,32 +3,35 @@ Book: /_book.yaml # Creating Persistent Workers -[Persistent workers](/docs/persistent-workers) can make your build faster. -If you have repeated actions in your build that have a high startup cost or -would benefit from cross-action caching, you may want to implement your own -persistent worker to perform these actions. +[Persistent workers](/docs/persistent-workers) can make your build faster. If +you have repeated actions in your build that have a high startup cost or would +benefit from cross-action caching, you may want to implement your own persistent +worker to perform these actions. The Bazel server communicates with the worker using `stdin`/`stdout`. It supports the use of protocol buffers or JSON strings. The worker implementation has two parts: -* The [worker](#making-worker). -* The [rule that uses the worker](#rule-uses-worker). +* The [worker](#making-worker). +* The [rule that uses the worker](#rule-uses-worker). ## Making the worker {:#making-worker} A persistent worker upholds a few requirements: -* It reads [WorkRequests](https://github.com/bazelbuild/bazel/blob/6d1b9725b1e201ca3f25d8ec2a730a20aab62c6e/src/main/protobuf/worker_protocol.proto#L35){: .external} - from its `stdin`. -* It writes [WorkResponses](https://github.com/bazelbuild/bazel/blob/6d1b9725b1e201ca3f25d8ec2a730a20aab62c6e/src/main/protobuf/worker_protocol.proto#L49){: .external} - (and only `WorkResponse`s) to its `stdout`. -* It accepts the `--persistent_worker` flag. The wrapper must recognize the - `--persistent_worker` command-line flag and only make itself persistent if - that flag is passed, otherwise it must do a one-shot compilation and exit. +* It reads + [WorkRequests](https://github.com/bazelbuild/bazel/blob/6d1b9725b1e201ca3f25d8ec2a730a20aab62c6e/src/main/protobuf/worker_protocol.proto#L35){: .external} + from its `stdin`. +* It writes + [WorkResponses](https://github.com/bazelbuild/bazel/blob/6d1b9725b1e201ca3f25d8ec2a730a20aab62c6e/src/main/protobuf/worker_protocol.proto#L49){: .external} + (and only `WorkResponse`s) to its `stdout`. +* It accepts the `--persistent_worker` flag. The wrapper must recognize the + `--persistent_worker` command-line flag and only make itself persistent if + that flag is passed, otherwise it must do a one-shot compilation and exit. -If your program upholds these requirements, it can be used as a persistent worker! +If your program upholds these requirements, it can be used as a persistent +worker! ### Work requests {:#work-requests} @@ -37,41 +40,46 @@ path-digest pairs representing the inputs the worker can access (this isn’t enforced, but you can use this info for caching), and a request id, which is 0 for singleplex workers. +NOTE: While the protocol buffer specification uses "snake case" (`request_id`), +the JSON protocol uses "camel case" (`requestId`). This document uses camel case +in the JSON examples, but snake case when talking about the field regardless of +protocol. + ```json { - "args" : ["--some_argument"], + "arguments" : ["--some_argument"], "inputs" : [ - { "/path/to/my/file/1" : "fdk3e2ml23d"}, - { "/path/to/my/file/2" : "1fwqd4qdd" } + { "path": "/path/to/my/file/1", "digest": "fdk3e2ml23d"}, + { "path": "/path/to/my/file/2", "digest": "1fwqd4qdd" } ], - "request_id" : 12 + "requestId" : 12 } ``` The optional `verbosity` field can be used to request extra debugging output from the worker. It is entirely up to the worker what and how to output. Higher values indicate more verbose output. Passing the `--worker_verbose` flag to -Bazel sets the `verbosity` field to 10, but smaller or larger values can be -used manually for different amounts of output. +Bazel sets the `verbosity` field to 10, but smaller or larger values can be used +manually for different amounts of output. The optional `sandbox_dir` field is used only by workers that support [multiplex sandboxing](/docs/multiplex-worker). ### Work responses {:#work-responses} -A `WorkResponse` contains a request id, a zero or nonzero exit -code, and an output string that describes any errors encountered in processing -or executing the request. The `output` field contains a short -description; complete logs may be written to the worker's `stderr`. Because -workers may only write `WorkResponses` to `stdout`, it's common for the worker -to redirect the `stdout` of any tools it uses to `stderr`. +A `WorkResponse` contains a request id, a zero or nonzero exit code, and an +output string that describes any errors encountered in processing or executing +the request. The `output` field contains a short description; complete logs may +be written to the worker's `stderr`. Because workers may only write +`WorkResponses` to `stdout`, it's common for the worker to redirect the `stdout` +of any tools it uses to `stderr`. ```json { - "exit_code" : 1, + "exitCode" : 1, "output" : "Action failed with the following message:\nCould not find input file \"/path/to/my/file/1\"", - "request_id" : 12 + "requestId" : 12 } ``` @@ -82,29 +90,29 @@ id, so the request id must be specified if it is nonzero. This is a valid ```json { - "request_id" : 12, + "requestId" : 12, } ``` A `request_id` of 0 indicates a "singleplex" request, used when this request cannot be processed in parallel with other requests. The server guarantees that a given worker receives requests with either only `request_id` 0 or only -`request_id` greater than zero. Singleplex requests are sent in serial, for example if the -server doesn't send another request until it has received a response (except -for cancel requests, see below). +`request_id` greater than zero. Singleplex requests are sent in serial, for +example if the server doesn't send another request until it has received a +response (except for cancel requests, see below). **Notes** -* Each protocol buffer is preceded by its length in `varint` format (see -[`MessageLite.writeDelimitedTo()`](https://developers.google.com/protocol-buffers/docs/reference/java/com/google/protobuf/MessageLite.html#writeDelimitedTo-java.io.OutputStream-){: .external}. -* JSON requests and responses are not preceded by a size indicator. -* JSON requests uphold the same structure as the protobuf, but use standard - JSON. -* In order to maintain the same backward and forward compatibility - properties as protobuf, JSON workers must tolerate unknown fields in - these messages, and use the protobuf defaults for missing values. -* Bazel stores requests as protobufs and converts them to JSON using -[protobuf's JSON format](https://cs.opensource.google/protobuf/protobuf/+/master:java/util/src/main/java/com/google/protobuf/util/JsonFormat.java) +* Each protocol buffer is preceded by its length in `varint` format (see + [`MessageLite.writeDelimitedTo()`](https://developers.google.com/protocol-buffers/docs/reference/java/com/google/protobuf/MessageLite.html#writeDelimitedTo-java.io.OutputStream-){: .external}. +* JSON requests and responses are not preceded by a size indicator. +* JSON requests uphold the same structure as the protobuf, but use standard + JSON and use camel case for all field names. +* In order to maintain the same backward and forward compatibility properties + as protobuf, JSON workers must tolerate unknown fields in these messages, + and use the protobuf defaults for missing values. +* Bazel stores requests as protobufs and converts them to JSON using + [protobuf's JSON format](https://cs.opensource.google/protobuf/protobuf/+/master:java/util/src/main/java/com/google/protobuf/util/JsonFormat.java) ### Cancellation {:#cancellation} @@ -118,31 +126,33 @@ cancellation, add `supports-worker-cancellation: 1` to the A **cancel request** is a `WorkRequest` with the `cancel` field set (and similarly a **cancel response** is a `WorkResponse` with the `was_cancelled` field set). The only other field that must be in a cancel request or cancel -response is `request_id`, indicating which -request to cancel. The `request_id` field will be 0 for singleplex workers -or the non-0 `request_id` of a previously sent `WorkRequest` for multiplex -workers. The server may send cancel requests for requests that the worker has -already responded to, in which case the cancel request must be ignored. - -Each non-cancel `WorkRequest` message must be answered exactly once, whether -or not it was cancelled. Once the server has sent a cancel request, the worker -may respond with a `WorkResponse` with the `request_id` set -and the `was_cancelled` field set to true. Sending a regular `WorkResponse` -is also accepted, but the `output` and `exit_code` fields will be ignored. - -Once a response has been sent for a `WorkRequest`, the worker must not touch -the files in its working directory. The server is free to clean up the files, +response is `request_id`, indicating which request to cancel. The `request_id` +field will be 0 for singleplex workers or the non-0 `request_id` of a previously +sent `WorkRequest` for multiplex workers. The server may send cancel requests +for requests that the worker has already responded to, in which case the cancel +request must be ignored. + +Each non-cancel `WorkRequest` message must be answered exactly once, whether or +not it was cancelled. Once the server has sent a cancel request, the worker may +respond with a `WorkResponse` with the `request_id` set and the `was_cancelled` +field set to true. Sending a regular `WorkResponse` is also accepted, but the +`output` and `exit_code` fields will be ignored. + +Once a response has been sent for a `WorkRequest`, the worker must not touch the +files in its working directory. The server is free to clean up the files, including temporary files. ## Making the rule that uses the worker {:#rule-uses-worker} You'll also need to create a rule that generates actions to be performed by the -worker. Making a Starlark rule that uses a worker is just like [creating any other rule](https://github.com/bazelbuild/examples/tree/master/rules){: .external}. +worker. Making a Starlark rule that uses a worker is just like +[creating any other rule](https://github.com/bazelbuild/examples/tree/master/rules){: .external}. In addition, the rule needs to contain a reference to the worker itself, and there are some requirements for the actions it produces. ### Referring to the worker + The rule that uses the worker needs to contain a field that refers to the worker itself, so you'll need to create an instance of a `\*\_binary` rule to define your worker. If your worker is called `MyWorker.Java`, this might be the @@ -179,32 +189,32 @@ as tool during the build). The rule that uses the worker creates actions for the worker to perform. These actions have a couple of requirements. -* The _"arguments"_ field. This takes a list of strings, all but the last - of which are arguments passed to the worker upon startup. The last element in - the "arguments" list is a `flag-file` (@-preceded) argument. Workers read - the arguments from the specified flagfile on a per-WorkRequest basis. Your - rule can write non-startup arguments for the worker to this flagfile. +* The *"arguments"* field. This takes a list of strings, all but the last of + which are arguments passed to the worker upon startup. The last element in + the "arguments" list is a `flag-file` (@-preceded) argument. Workers read + the arguments from the specified flagfile on a per-WorkRequest basis. Your + rule can write non-startup arguments for the worker to this flagfile. -* The _"execution-requirements"_ field, which takes a dictionary containing - `"supports-workers" : "1"`, `"supports-multiplex-workers" : "1"`, or both. +* The *"execution-requirements"* field, which takes a dictionary containing + `"supports-workers" : "1"`, `"supports-multiplex-workers" : "1"`, or both. - The "arguments" and "execution-requirements" fields are required for all - actions sent to workers. Additionally, actions that should be executed by - JSON workers need to include `"requires-worker-protocol" : "json"` in the - execution requirements field. `"requires-worker-protocol" : "proto"` is also - a valid execution requirement, though it’s not required for proto workers, - since they are the default. + The "arguments" and "execution-requirements" fields are required for all + actions sent to workers. Additionally, actions that should be executed by + JSON workers need to include `"requires-worker-protocol" : "json"` in the + execution requirements field. `"requires-worker-protocol" : "proto"` is also + a valid execution requirement, though it’s not required for proto workers, + since they are the default. - You can also set a `worker-key-mnemonic` in the execution requirements. This - may be useful if you're reusing the executable for multiple action types and - want to distinguish actions by this worker. + You can also set a `worker-key-mnemonic` in the execution requirements. This + may be useful if you're reusing the executable for multiple action types and + want to distinguish actions by this worker. -* Temporary files generated in the course of the action should be saved to the - worker's directory. This enables sandboxing. +* Temporary files generated in the course of the action should be saved to the + worker's directory. This enables sandboxing. -Note: To pass an argument starting with a literal `@`, start the argument -with `@@` instead. If an argument is also an external repository label, it will -not be considered a flagfile argument. +Note: To pass an argument starting with a literal `@`, start the argument with +`@@` instead. If an argument is also an external repository label, it will not +be considered a flagfile argument. Assuming a rule definition with "worker" attribute described above, in addition to a "srcs" attribute representing the inputs, an "output" attribute @@ -224,19 +234,26 @@ ctx.actions.run( ) ``` -For another example, see [Implementing persistent workers](/docs/persistent-workers#implementation). +For another example, see +[Implementing persistent workers](/docs/persistent-workers#implementation). ## Examples {:#examples} -The Bazel code base uses [Java compiler workers](https://github.com/bazelbuild/bazel/blob/a4251eab6988d6cf4f5e35681fbe2c1b0abe48ef/src/java_tools/buildjar/java/com/google/devtools/build/buildjar/BazelJavaBuilder.java){: .external}, -in addition to an [example JSON worker](https://github.com/bazelbuild/bazel/blob/c65f768fec9889bbf1ee934c61d0dc061ea54ca2/src/test/java/com/google/devtools/build/lib/worker/ExampleWorker.java){: .external} that is used in our integration tests. +The Bazel code base uses +[Java compiler workers](https://github.com/bazelbuild/bazel/blob/a4251eab6988d6cf4f5e35681fbe2c1b0abe48ef/src/java_tools/buildjar/java/com/google/devtools/build/buildjar/BazelJavaBuilder.java){: .external}, +in addition to an +[example JSON worker](https://github.com/bazelbuild/bazel/blob/c65f768fec9889bbf1ee934c61d0dc061ea54ca2/src/test/java/com/google/devtools/build/lib/worker/ExampleWorker.java){: .external} +that is used in our integration tests. -You can use their [scaffolding](https://github.com/bazelbuild/bazel/blob/a4251eab6988d6cf4f5e35681fbe2c1b0abe48ef/src/main/java/com/google/devtools/build/lib/worker/WorkRequestHandler.java){: .external} to make any Java-based tool into a worker by passing in the correct -callback. +You can use their +[scaffolding](https://github.com/bazelbuild/bazel/blob/a4251eab6988d6cf4f5e35681fbe2c1b0abe48ef/src/main/java/com/google/devtools/build/lib/worker/WorkRequestHandler.java){: .external} +to make any Java-based tool into a worker by passing in the correct callback. For an example of a rule that uses a worker, take a look at Bazel's [worker integration test](https://github.com/bazelbuild/bazel/blob/22b4dbcaf05756d506de346728db3846da56b775/src/test/shell/integration/bazel_worker_test.sh#L106){: .external}. External contributors have implemented workers in a variety of languages; take a -look at [Polyglot implementations of Bazel persistent workers](https://github.com/Ubehebe/bazel-worker-examples){: .external}. -You can [find many more examples on GitHub](https://github.com/search?q=bazel+workrequest&type=Code){: .external}! +look at +[Polyglot implementations of Bazel persistent workers](https://github.com/Ubehebe/bazel-worker-examples){: .external}. +You can +[find many more examples on GitHub](https://github.com/search?q=bazel+workrequest&type=Code){: .external}! diff --git a/site/en/docs/persistent-workers.md b/site/en/docs/persistent-workers.md index 7a6120b0bfab22..0df98f1c8e4bfe 100644 --- a/site/en/docs/persistent-workers.md +++ b/site/en/docs/persistent-workers.md @@ -158,7 +158,7 @@ For Android builds, see details at the ## Implementing persistent workers {:#implementation} -See the [creating persistent workers](/docs/creating-workers) page for +See the [creating persistent workers](/docs/creating-workers) page for more information on how to make a worker. This example shows a Starlark configuration for a worker that uses JSON: @@ -184,12 +184,19 @@ With this definition, the first use of this action would start with executing the command line `/bin/some_compiler -max_mem=4G --persistent_worker`. A request to compile `Foo.java` would then look like: -```prototext -arguments: [ "-g", "-source", "1.5", "Foo.java" ] -inputs: [ - {path: "symlinkfarm/input1" digest: "d49a..." }, - {path: "symlinkfarm/input2", digest: "093d..."}, -] +NOTE: While the protocol buffer specification uses "snake case" (`request_id`), +the JSON protocol uses "camel case" (`requestId`). In this document, we will use +camel case in the JSON examples, but snake case when talking about the field +regardless of protocol. + +```json +{ + "arguments": [ "-g", "-source", "1.5", "Foo.java" ] + "inputs": [ + { "path": "symlinkfarm/input1", "digest": "d49a..." }, + { "path": "symlinkfarm/input2", "digest": "093d..." }, + ], +} ``` The worker receives this on `stdin` in newline-delimited JSON format (because @@ -262,6 +269,7 @@ For more information on persistent workers, see: {: .external} * [Front End Development with Bazel: Angular/TypeScript and Persistent Workers w/ Asana](https://www.youtube.com/watch?v=0pgERydGyqo) {: .external} -* [Bazel strategies explained](https://jmmv.dev/2019/12/bazel-strategies.html) {: .external} +* [Bazel strategies explained](https://jmmv.dev/2019/12/bazel-strategies.html) + {: .external} * [Informative worker strategy discussion on the bazel-discuss mailing list](https://groups.google.com/forum/#!msg/bazel-discuss/oAEnuhYOPm8/ol7hf4KWJgAJ) {: .external}