Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Skip capture of TRACE workunits by default #13483

Merged
merged 7 commits into from
Feb 28, 2022

Conversation

stuhood
Copy link
Sponsor Member

@stuhood stuhood commented Nov 4, 2021

Currently, workunits are captured and stored (in memory, but also via any StreamingWorkunitHandler callbacks), regardless of their level. But in many cases, this results in tens of thousands of workunits, only a fraction of which will ever be consumed.

For some runs, the ratio between TRACE workunits and all other workunits (DEBUG, etc) is 16 to 1... meaning that disabling TRACE by default results in significantly less data being captured (and less CPU usage as well).

In order to continue to support capturing finer grained details, the WorkunitStore is adjusted to capture only what is required by:

  1. the --dynamic-ui, if it is enabled
  2. a new --streaming-workunits-level option
  3. the log --level option
  4. the --log-levels-by-target values

Microbenchmarks show about a 5% improvement (for hyperfine -w 2 './pants test --force --no-pytest-timeouts src/python/pants/engine/internals/engine_benchmarks_test.py'), and an 11-15% reduction in memory usage for ./pants dependencies --transitive ::.

[ci skip-build-wheels]

@stuhood
Copy link
Sponsor Member Author

stuhood commented Nov 4, 2021

Commits are useful to review independently.

Copy link
Contributor

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, thanks!

I'm trying to understand how this interacts with --log-levels-by-target? It'd be helpful to confirm that can still result in workunits being created at TRACE even if everything else is DEBUG - probably worth a unit test. Also might be worth updating the help message.

If you set --streaming-workunits-level=INFO, I don't see code to filter out DEBUG? The max_level would be set to at least DEBUG because of the dynamic_ui, so those workunits will be uploaded. We need to filter them out though before sending to streaming workunit handler.

src/python/pants/option/global_options.py Outdated Show resolved Hide resolved
src/python/pants/option/global_options.py Outdated Show resolved Hide resolved
src/rust/engine/workunit_store/src/lib.rs Show resolved Hide resolved
@asherf
Copy link
Member

asherf commented Nov 4, 2021

@stuhood please hold off on landing this until we get a chance to see if this causes some issues with the toolchain backend stuff that needs to be addressed. (I didn't get a chance to look at this yet)

@stuhood
Copy link
Sponsor Member Author

stuhood commented Feb 19, 2022

If you set --streaming-workunits-level=INFO, I don't see code to filter out DEBUG? The max_level would be set to at least DEBUG because of the dynamic_ui, so those workunits will be uploaded. We need to filter them out though before sending to streaming workunit handler.

This happens via StreamingWorkunitHandler.max_workunit_verbosity, which does the filtering lazily during poll.

@stuhood stuhood force-pushed the stuhood/configurable-workunit-level branch from 8dac301 to 5afa07f Compare February 19, 2022 22:32
pants.ci.toml Outdated Show resolved Hide resolved
stuhood added a commit that referenced this pull request Feb 21, 2022
In order to unblock optionally recording workunits without losing counter accuracy (in #13483), move to recording counters globally via the `StreamingWorkunitContext`.

[ci skip-build-wheels]
alonsodomin pushed a commit to alonsodomin/pants that referenced this pull request Feb 25, 2022
…#14541)

In order to unblock optionally recording workunits without losing counter accuracy (in pantsbuild#13483), move to recording counters globally via the `StreamingWorkunitContext`.

[ci skip-build-wheels]
… default (`DEBUG`) than before (`TRACE`).

[ci skip-rust]
[ci skip-build-wheels]
[ci skip-rust]
[ci skip-build-wheels]
# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
…ults in skipping the construction of `TRACE` workunits.

[ci skip-build-wheels]
[ci skip-build-wheels]

[ci skip-rust]
@stuhood
Copy link
Sponsor Member Author

stuhood commented Feb 28, 2022

Anecdotally: this reduces memory usage for local runs of ./pants dependencies --transitive :: by 11-15%.

@stuhood stuhood force-pushed the stuhood/configurable-workunit-level branch from 5afa07f to 0adcef7 Compare February 28, 2022 18:09
[ci skip-rust]

[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
@stuhood stuhood force-pushed the stuhood/configurable-workunit-level branch from 0adcef7 to 8adecea Compare February 28, 2022 18:16
Copy link
Member

@asherf asherf left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@stuhood stuhood merged commit 30640b2 into pantsbuild:main Feb 28, 2022
@stuhood stuhood deleted the stuhood/configurable-workunit-level branch February 28, 2022 19:38
stuhood added a commit to stuhood/pants that referenced this pull request Mar 3, 2022
…being filtered out, which caused their (blocked) parents to render.
stuhood added a commit that referenced this pull request Mar 3, 2022
…iltering. (#14681)

#13483 resulted in `acquire_command_runner_slot` workunits being filtered out at capture time, which caused their (blocked) parents to render: in particular, `"Scheduling: $process"` workunits.
stuhood added a commit that referenced this pull request Mar 29, 2022
…e themselves (#14934)

#13483 broke the rendering of `EngineAwareReturnType` implementations which relied on starting a workunit at `Level::TRACE`, and then escalating its level to something visible (usually `INFO` or greater) when it completed. `check` outputs for the JVM were strongly affected (see #14867), since they relied on the fact that `FallibleClasspathEntry` escalates to `ERROR` to render compile errors.

To resolve this, we roll back a portion of #13483. Rather than not recording the workunit at all, we instead record it in the `WorkunitStore` as "disabled", which is signaled by a workunit not having any `WorkunitMetadata`. This has some of the efficiency benefits of #13483 (because we continue to skip heap allocating the metadata's fields), but allows a workunit to escalate itself from disabled to enabled as it completes by specifying a non-disabled level in `RunningWorkunit::update_metadata`.

The recording of "disabled" workunits is additionally necessary for #14680, because otherwise workunits which were not actually recorded would break the tracking of multiple parents: when adding a new parent to a workunit, you need an existing `SpanId` corresponding to the work that you are adding a parent to (or else you might accidentally depend on a parent arbitrarily far up the stack).

Fixes #14867.
stuhood added a commit to stuhood/pants that referenced this pull request Mar 29, 2022
…e themselves (pantsbuild#14934)

pantsbuild#13483 broke the rendering of `EngineAwareReturnType` implementations which relied on starting a workunit at `Level::TRACE`, and then escalating its level to something visible (usually `INFO` or greater) when it completed. `check` outputs for the JVM were strongly affected (see pantsbuild#14867), since they relied on the fact that `FallibleClasspathEntry` escalates to `ERROR` to render compile errors.

To resolve this, we roll back a portion of pantsbuild#13483. Rather than not recording the workunit at all, we instead record it in the `WorkunitStore` as "disabled", which is signaled by a workunit not having any `WorkunitMetadata`. This has some of the efficiency benefits of pantsbuild#13483 (because we continue to skip heap allocating the metadata's fields), but allows a workunit to escalate itself from disabled to enabled as it completes by specifying a non-disabled level in `RunningWorkunit::update_metadata`.

The recording of "disabled" workunits is additionally necessary for pantsbuild#14680, because otherwise workunits which were not actually recorded would break the tracking of multiple parents: when adding a new parent to a workunit, you need an existing `SpanId` corresponding to the work that you are adding a parent to (or else you might accidentally depend on a parent arbitrarily far up the stack).

Fixes pantsbuild#14867.
# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
stuhood added a commit that referenced this pull request Mar 29, 2022
…e themselves (cherrypick of #14854, #14856, #14934) (#14942)

Fix missing `check` output by allowing disabled workunits to re-enable themselves (#14934)

#13483 broke the rendering of `EngineAwareReturnType` implementations which relied on starting a workunit at `Level::TRACE`, and then escalating its level to something visible (usually `INFO` or greater) when it completed. `check` outputs for the JVM were strongly affected (see #14867), since they relied on the fact that `FallibleClasspathEntry` escalates to `ERROR` to render compile errors.

To resolve this, we roll back a portion of #13483. Rather than not recording the workunit at all, we instead record it in the `WorkunitStore` as "disabled", which is signaled by a workunit not having any `WorkunitMetadata`. This has some of the efficiency benefits of #13483 (because we continue to skip heap allocating the metadata's fields), but allows a workunit to escalate itself from disabled to enabled as it completes by specifying a non-disabled level in `RunningWorkunit::update_metadata`.

The recording of "disabled" workunits is additionally necessary for #14680, because otherwise workunits which were not actually recorded would break the tracking of multiple parents: when adding a new parent to a workunit, you need an existing `SpanId` corresponding to the work that you are adding a parent to (or else you might accidentally depend on a parent arbitrarily far up the stack).

Fixes #14867.

[ci skip-build-wheels]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants