Start Date: 2019-02-12
RFC PR: (leave this empty)
Yarn Issue: (leave this empty)

Summary

This RFC proposes a new optional environment variable, recognized by Yarn, that lets users specify the paths of one or more structured log files to which Yarn writes its JSON-formatted output. This allows automation systems such as continuous integration servers to monitor Yarn's structured output without affecting the usual human-readable, unstructured output.

Motivation

Automation systems often want to make use of structured output of the kind generated by the --json flag. For example, a continuous integration server can use the structured output to detect and classify errors for reporting diagnostics in the user interface, or for monitoring and system metrics.

At the same time, human users of Yarn benefit from Yarn's beautiful and user-friendly console output and from being able to read that kind of output in unstructured logs.

This RFC proposes to allow both by specifying a colon-separated (:) list of filesystem paths in a YARN_LOGS environment variable; Yarn appends its JSON output to each listed file. The structured logs option has no effect on the choice of output format for the normal console, allowing humans to retain control over the format of Yarn's normal output.

Detailed design

The YARN_LOGS environment variable, if present, can contain any number of colon-separated (:) filesystem paths. (Empty entries are ignored.) As Yarn produces output, it appends the JSON-formatted structured log output to each of these files, in addition to whatever output it would normally write to the standard output streams.
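The behavior described above can be sketched roughly as follows. This is only an illustration under stated assumptions, not Yarn's implementation: the names `parseLogPaths`, `writeStructuredEvent`, and `AppendLine` are hypothetical, the event shape is illustrative, and an in-memory sink stands in for the filesystem so the sketch does not depend on any particular I/O API.

```typescript
// Hypothetical sink type: appends one line of text to the file at `path`.
// In Node.js this could be (path, line) => fs.appendFileSync(path, line).
type AppendLine = (path: string, line: string) => void;

// Split a YARN_LOGS value on ":" and drop empty entries.
function parseLogPaths(value: string | undefined): string[] {
  if (value === undefined) return [];
  return value.split(":").filter((entry) => entry.length > 0);
}

// Append one JSON-serialized event, as a single line, to every configured log.
function writeStructuredEvent(paths: string[], event: unknown, append: AppendLine): void {
  const line = JSON.stringify(event) + "\n";
  for (const path of paths) append(path, line);
}

// Example with an in-memory sink standing in for the filesystem.
const written = new Map<string, string>();
const paths = parseLogPaths("/tmp/a.json::/tmp/b.json");
writeStructuredEvent(paths, { type: "info", data: "example" }, (path, line) => {
  written.set(path, (written.get(path) ?? "") + line);
});
```

When the variable is unset, `parseLogPaths` returns an empty list and the write loop does nothing, so the normal console output path is untouched.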

How We Teach This

This is a relatively advanced use case, so most users will never need to encounter this feature, and it shouldn't carry any cognitive burden for them.

It can be added to the env vars page of the Yarn documentation web site.

Drawbacks

There could potentially be some performance impact in the extra I/O, but I imagine it would be pay-as-you-go, since it should only involve some very inexpensive additional environment variable checks for the empty case.

This relies on some amount of benign collaboration for multiple distinct users not to stomp on each other's use of the environment variable.

I'm not sure if there are any implementation challenges with this approach.

Alternatives

For the use case I ran into, I've had trouble coming up with alternatives: at my company we have a CI server that we would like to be able to automatically detect network issues on, but we want to allow our developers to use Yarn however they like in their CI scripts.

Unresolved questions

What environment variable name to use? I went with the simple YARN_LOGS, but I'm open to suggestions.
What about additional unstructured logs? We could offer two separate environment variables, one for structured logs and one for unstructured. The motivating use case of automation is driven by structured logs, so I didn't have a use case in mind for generating multiple unstructured logs. But I'm certainly open to it.
Is colon-separation inconvenient for scripting? It's a little annoying to check for the empty case in shell scripts. But we could simply ignore empty entries in the list, so scripts could unconditionally write YARN_LOGS=${YARN_LOGS}:/path/to/my/log.
Should Windows use semicolon separators? It's more idiomatic to use semicolons (;) as path separators in Windows. However, I figured it was better to have the same behavior across operating systems. Is that the right call?
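
To make the scripting question concrete, here is a small sketch of why ignoring empty entries lets scripts append to the list unconditionally. The helper names `extendLogList` and `splitIgnoringEmpty` are hypothetical and exist only for this illustration; `extendLogList` mimics the shell idiom in which an unset variable expands to the empty string.

```typescript
// Mirrors the shell idiom VAR=${VAR}:/path, where an unset VAR expands to "".
function extendLogList(current: string | undefined, extra: string): string {
  return `${current ?? ""}:${extra}`;
}

// Splits on ":" and drops empty entries, as the proposal suggests.
function splitIgnoringEmpty(value: string): string[] {
  return value.split(":").filter((entry) => entry !== "");
}

// Variable previously unset: the leading empty entry is discarded.
const fromUnset = splitIgnoringEmpty(extendLogList(undefined, "/tmp/ci.json"));

// Variable previously set: the new path is added after the existing one.
const fromSet = splitIgnoringEmpty(extendLogList("/var/log/a.json", "/tmp/ci.json"));

console.log(fromUnset); // one entry: "/tmp/ci.json"
console.log(fromSet);   // two entries: "/var/log/a.json", "/tmp/ci.json"
```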
