Mark workunits blocked, and skip rendering completed workunits #12369

stuhood · 2021-07-16T22:12:11Z

Our "graph" of execution is a DAG, while the workunits (used to visualize and record traces) form a tree. Because they form a tree, workunits do not always report that they are blocked on work that they didn't start, but which some other node started (common when lots of @rules are waiting on another single @rule which only one of them started).

To fix this, we make the `blocked` property of a workunit an atomic mutable, and skip rendering the parents of blocked leaves. We use the `blocked` flag both for `Tasks` (which wait directly for memoized Nodes, and so frequently block in this way), and in `BoundedCommandRunner`, which temporarily blocks the workunit while we're acquiring the semaphore. Additionally, we skip rendering or walking through `Completed` workunits, which can happen in the case of speculation if a parent workunit completes before a child.
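As a rough illustration of the rendering rules described here, the following sketch treats a workunit as renderable only if it is running and has no child that is still live (running or blocked). The `State` enum, the `rendered` function, and the "only leaves render" simplification are assumptions made for this example and are not taken from the PR's actual `heavy_hitters` code.

```rust
use std::collections::{BTreeMap, HashSet};

type SpanId = u64;

// Hypothetical, simplified workunit state for illustration only.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum State {
    Running,
    Blocked,
    Completed,
}

pub struct Workunit {
    pub parent: Option<SpanId>,
    pub state: State,
}

/// Returns the ids that this sketch would render, in ascending order:
/// running workunits with no child that is still running or blocked.
pub fn rendered(workunits: &BTreeMap<SpanId, Workunit>) -> Vec<SpanId> {
    // Ids that have at least one child which has not completed.
    let live_parents: HashSet<SpanId> = workunits
        .values()
        .filter(|w| w.state != State::Completed)
        .filter_map(|w| w.parent)
        .collect();
    workunits
        .iter()
        .filter(|(id, w)| w.state == State::Running && !live_parents.contains(*id))
        .map(|(id, _)| *id)
        .collect()
}

pub fn demo() -> Vec<SpanId> {
    let mut ws = BTreeMap::new();
    // 1 is a running parent whose only child (2) is blocked: neither renders.
    ws.insert(1, Workunit { parent: None, state: State::Running });
    ws.insert(2, Workunit { parent: Some(1), state: State::Blocked });
    // 3 completed before its child (4): 3 is skipped, 4 still renders.
    ws.insert(3, Workunit { parent: None, state: State::Completed });
    ws.insert(4, Workunit { parent: Some(3), state: State::Running });
    rendered(&ws)
}
```

In this toy model, the parent of a blocked leaf is suppressed simply because it is not itself a leaf, and a completed parent never hides or replaces a child that is still running.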

In order to toggle the `blocked` property on workunits, we expose the current `RunningWorkunit` in two new places: the `CommandRunner` and `WrappedNode`. In both cases, this is to allow the generic code to consume the workunit created by their callers and mark it blocked (for `Task` and `BoundedCommandRunner`).
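The shape of that API change might look roughly like the sketch below: the caller's workunit handle is passed into the runner, and a bounding wrapper marks it blocked only while it waits for a slot. Everything here is a simplified assumption: the real `CommandRunner` is async, the trait signature, `LocalRunner`, `blocking`, and `times_blocked` are hypothetical, and a `Mutex<()>` stands in for a semaphore with a single permit.

```rust
use std::sync::atomic::{AtomicBool, AtomicUsize, Ordering};
use std::sync::Mutex;

// Hypothetical handle to the caller's running workunit.
pub struct RunningWorkunit {
    blocked: AtomicBool,
    times_blocked: AtomicUsize,
}

impl RunningWorkunit {
    pub fn new() -> Self {
        RunningWorkunit {
            blocked: AtomicBool::new(false),
            times_blocked: AtomicUsize::new(0),
        }
    }

    pub fn is_blocked(&self) -> bool {
        self.blocked.load(Ordering::Relaxed)
    }

    pub fn times_blocked(&self) -> usize {
        self.times_blocked.load(Ordering::Relaxed)
    }

    /// Marks the workunit blocked while `wait` runs, then clears the flag.
    pub fn blocking<T>(&self, wait: impl FnOnce() -> T) -> T {
        self.blocked.store(true, Ordering::Relaxed);
        self.times_blocked.fetch_add(1, Ordering::Relaxed);
        let result = wait();
        self.blocked.store(false, Ordering::Relaxed);
        result
    }
}

// Hypothetical synchronous stand-in for the runner interface.
pub trait CommandRunner {
    fn run(&self, workunit: &RunningWorkunit, command: &str) -> String;
}

pub struct LocalRunner;

impl CommandRunner for LocalRunner {
    fn run(&self, workunit: &RunningWorkunit, command: &str) -> String {
        format!("ran {} (blocked={})", command, workunit.is_blocked())
    }
}

// Wrapper that admits one command at a time.
pub struct BoundedRunner<R> {
    inner: R,
    slot: Mutex<()>,
}

impl<R: CommandRunner> CommandRunner for BoundedRunner<R> {
    fn run(&self, workunit: &RunningWorkunit, command: &str) -> String {
        // The caller's workunit is blocked only while waiting for the slot.
        let _slot = workunit.blocking(|| self.slot.lock().unwrap());
        self.inner.run(workunit, command)
    }
}

pub fn demo() -> (String, bool, usize) {
    let runner = BoundedRunner { inner: LocalRunner, slot: Mutex::new(()) };
    let workunit = RunningWorkunit::new();
    let output = runner.run(&workunit, "echo hi");
    (output, workunit.is_blocked(), workunit.times_blocked())
}
```

The wrapper never creates a workunit of its own in this sketch; it only toggles the one it was handed, which is the property that matters for the design described above.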

Fixes #12349.

stuhood · 2021-07-17T23:30:00Z

src/rust/engine/workunit_store/src/lib.rs

+///
+/// Workunits form a tree of running, blocked, and completed work, with parent ids propagated via
+/// thread-local state.
+///
+/// While running (the Started state), a copy of a Workunit is generally kept on the stack by the
+/// `in_workunit!` macro, while another copy of the same Workunit is recorded in the WorkunitStore.
+/// Most of the fields of the Workunit are immutable, but an atomic "blocked" flag can be set to
+/// temporarily mark the running Workunit as being in a blocked state.
+///
+/// When the `in_workunit!` macro exits, the Workunit on the stack is completed by storing any
+/// local mutated values as the final value of the Workunit.
+///
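One way to picture the "two copies, one atomic flag" arrangement from this doc comment is a struct whose `blocked` field is an `Arc<AtomicBool>`, so that the copy on the stack and the copy in the store observe the same value. This is a minimal sketch under that assumption; the field layout, method names, and `demo_shared_flag` are hypothetical and not the real `Workunit` definition.

```rust
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

// Hypothetical, simplified stand-in for a workunit.
#[derive(Clone, Debug)]
pub struct Workunit {
    pub name: String,
    // Shared by every clone, so a toggle on one copy is visible on all.
    blocked: Arc<AtomicBool>,
}

impl Workunit {
    pub fn new(name: &str) -> Self {
        Workunit {
            name: name.to_owned(),
            blocked: Arc::new(AtomicBool::new(false)),
        }
    }

    pub fn set_blocked(&self, blocked: bool) {
        self.blocked.store(blocked, Ordering::Relaxed);
    }

    pub fn is_blocked(&self) -> bool {
        self.blocked.load(Ordering::Relaxed)
    }
}

/// Returns what the stored copy sees (while blocked, after unblocking).
pub fn demo_shared_flag() -> (bool, bool) {
    let on_stack = Workunit::new("task");
    // The store keeps its own copy of the same workunit.
    let in_store = on_stack.clone();
    on_stack.set_blocked(true);
    let during = in_store.is_blocked();
    on_stack.set_blocked(false);
    (during, in_store.is_blocked())
}
```

Because only the flag is shared, the remaining fields can stay immutable in both copies, and the rendering code can read the store's copy without taking a lock.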


Eric-Arellano

Thanks Stu! Should this be picked to 2.6?

Eric-Arellano · 2021-07-19T20:00:09Z

src/rust/engine/workunit_store/src/lib.rs

@@ -508,13 +525,19 @@ pub struct HeavyHittersInnerStore {
 fn first_matched_parent(
  workunit_records: &HashMap<SpanId, Workunit>,
  mut span_id: Option<SpanId>,
+  is_terminal: impl Fn(&Workunit) -> bool,


What does terminal mean here? IMO it's ambiguous whether it means something like "final/complete" or "the CLI/terminal".
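For readers following the hunk, here is a minimal sketch of how an upward walk could use such a predicate, assuming "terminal" means a workunit at which the walk stops (for example, a completed one). Only the first three parameters appear in the hunk above; the function body, the `is_visible` parameter, the return type, and the simplified `Workunit` fields are hypothetical.

```rust
use std::collections::HashMap;

type SpanId = u64;

// Hypothetical, simplified workunit record.
pub struct Workunit {
    pub parent_id: Option<SpanId>,
    pub visible: bool,
    pub completed: bool,
}

/// Walks from `span_id` towards the root and returns the first workunit
/// accepted by `is_visible`, giving up if a terminal workunit is reached.
pub fn first_matched_parent(
    workunit_records: &HashMap<SpanId, Workunit>,
    mut span_id: Option<SpanId>,
    is_terminal: impl Fn(&Workunit) -> bool,
    is_visible: impl Fn(&Workunit) -> bool,
) -> Option<SpanId> {
    while let Some(current) = span_id {
        let workunit = workunit_records.get(&current)?;
        if is_terminal(workunit) {
            // Do not render or walk through a terminal workunit.
            return None;
        }
        if is_visible(workunit) {
            return Some(current);
        }
        span_id = workunit.parent_id;
    }
    None
}

pub fn demo() -> (Option<SpanId>, Option<SpanId>) {
    let mut records = HashMap::new();
    records.insert(1, Workunit { parent_id: None, visible: true, completed: false });
    records.insert(2, Workunit { parent_id: Some(1), visible: false, completed: false });
    records.insert(3, Workunit { parent_id: None, visible: true, completed: true });
    records.insert(4, Workunit { parent_id: Some(3), visible: false, completed: false });
    let find = |start: SpanId| {
        first_matched_parent(&records, Some(start), |w| w.completed, |w| w.visible)
    };
    // 2 resolves to its visible parent 1; 4 stops at its completed parent 3.
    (find(2), find(4))
}
```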

Eric-Arellano

Thanks for fixing the logging of cache hits!

#12369 adjusted the workunit graph to have the `BoundedCommandRunner` mark (what it thought was) its parent workunit as blocked while waiting to acquire a slot on the semaphore. But when #12748 fixed rendering of parent workunits, we experienced a regression in rendering with remote caching enabled: "Scheduling: ..." workunits were rendered when a process was blocked.

#12369 contained a bug: the workunit being marked blocked by the `BoundedCommandRunner` was not always its direct parent. In particular, under remote caching the workunit being marked blocked was in fact its grandparent. Marking that workunit blocked had no effect, because its child (the parent of the semaphore acquisition) would still cause it to render.

To fix that, we move back to directly creating a workunit for `BoundedCommandRunner` semaphore acquisition, rather than marking the inbound workunit blocked. This also has the benefit of recording how long processes waited to acquire slots.

This bug is to some degree an indictment of explicitly passing workunits to improve clarity... but on the other hand, it also seems to more strongly encourage operating on workunits that you have created, and which are living on your stack.
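The follow-up approach can be sketched as wrapping the semaphore acquisition in its own workunit, so that the wait shows up as a separate, timed entry instead of a flag on some ancestor. This is a synchronous toy model: `Semaphore`, `WorkunitStore`, `in_workunit` (a plain method here, not the real `in_workunit!` macro), `run_bounded`, and the workunit name `acquire_command_runner_slot` are all assumptions made for illustration.

```rust
use std::sync::{Condvar, Mutex};
use std::time::{Duration, Instant};

// Minimal counting semaphore built from std primitives.
pub struct Semaphore {
    permits: Mutex<usize>,
    available: Condvar,
}

impl Semaphore {
    pub fn new(permits: usize) -> Self {
        Semaphore { permits: Mutex::new(permits), available: Condvar::new() }
    }

    pub fn acquire(&self) {
        let mut permits = self.permits.lock().unwrap();
        while *permits == 0 {
            permits = self.available.wait(permits).unwrap();
        }
        *permits -= 1;
    }

    pub fn release(&self) {
        *self.permits.lock().unwrap() += 1;
        self.available.notify_one();
    }
}

// Hypothetical record of a finished workunit.
pub struct CompletedWorkunit {
    pub name: String,
    pub duration: Duration,
}

#[derive(Default)]
pub struct WorkunitStore {
    completed: Mutex<Vec<CompletedWorkunit>>,
}

impl WorkunitStore {
    /// Runs `f` inside a workunit that the callee itself creates and times.
    pub fn in_workunit<T>(&self, name: &str, f: impl FnOnce() -> T) -> T {
        let start = Instant::now();
        let result = f();
        self.completed.lock().unwrap().push(CompletedWorkunit {
            name: name.to_owned(),
            duration: start.elapsed(),
        });
        result
    }

    pub fn completed_names(&self) -> Vec<String> {
        self.completed.lock().unwrap().iter().map(|w| w.name.clone()).collect()
    }
}

/// Waits for a slot inside a dedicated workunit, then runs the process.
/// A panic in `process` would leak the slot; a real version needs a guard.
pub fn run_bounded<T>(
    store: &WorkunitStore,
    semaphore: &Semaphore,
    process: impl FnOnce() -> T,
) -> T {
    store.in_workunit("acquire_command_runner_slot", || semaphore.acquire());
    let result = process();
    semaphore.release();
    result
}

pub fn demo() -> (u32, Vec<String>) {
    let store = WorkunitStore::default();
    let semaphore = Semaphore::new(1);
    let result = run_bounded(&store, &semaphore, || 42);
    (result, store.completed_names())
}
```

Since the wait is recorded by a workunit that the bounded runner created on its own stack, it no longer matters how many layers (such as a remote cache) sit between it and the caller, and the recorded `duration` gives the time spent waiting for a slot.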

stuhood added 5 commits July 16, 2021 15:03

Add a basic test coverage of heavy_hitters and `render_straggling_workunits`.

d30ba64

[ci skip-build-wheels]

Make blocked an atomic property of a running workunit, rather than a static property.

c8e4b70

[ci skip-build-wheels]

Expose the RunningWorkunit for WrappedNode to its implementations.

4db5dd4

[ci skip-build-wheels]

Remove extraneous workunits in remote cache fetching, and add a separate workunit for eager fetching.

2fddd6b

[ci skip-build-wheels]

Mark Task workunits blocked.

6414cad

[ci skip-build-wheels]

stuhood force-pushed the stuhood/blocked-workunits branch from d23f0eb to bbcc489 Compare July 17, 2021 01:28

stuhood added 3 commits July 17, 2021 15:59

Completed workunits are never visible. This avoids rendering ghost workunits in case of spawned work (like cache writes) where a child is still running below a completed parent.

23f66b1

[ci skip-build-wheels]

Expose a RunningWorkunit to the CommandRunner so that the BoundedCommandRunner can mark its parent workunit blocked.

3eff423

[ci skip-build-wheels]

Render the MultiPlatformExecuteProcess node's workunit rather than the BoundedCommandRunner's.

9ecd8ac

[ci skip-build-wheels]

stuhood force-pushed the stuhood/blocked-workunits branch from bbcc489 to 9ecd8ac Compare July 17, 2021 23:26

stuhood requested review from Eric-Arellano, tdyas and jsirois July 17, 2021 23:29

stuhood marked this pull request as ready for review July 17, 2021 23:29

stuhood commented Jul 17, 2021

tdyas reviewed Jul 18, 2021

Review feedback

88cc7b9

[ci skip-build-wheels]

tdyas approved these changes Jul 19, 2021

Eric-Arellano approved these changes Jul 19, 2021

stuhood added the needs-cherrypick label Jul 19, 2021

stuhood added this to the 2.5.x milestone Jul 19, 2021

stuhood added 2 commits July 19, 2021 15:08

Move back to rendering a leaf workunit below the CommandRunner, which fixes `Starting` messages.

bb1a915

[ci skip-build-wheels]

Move cache hit messages onto cache lookup workunits at debug.

051062d

[ci skip-build-wheels]

Eric-Arellano approved these changes Jul 19, 2021

stuhood merged commit debdf4f into pantsbuild:main Jul 19, 2021

stuhood deleted the stuhood/blocked-workunits branch July 19, 2021 23:10

stuhood removed the needs-cherrypick label Jul 20, 2021

stuhood mentioned this pull request Sep 21, 2021

Fix spurious "Scheduling: ..." workunits with remote caching #12973

Merged

stuhood mentioned this pull request Sep 21, 2021

Fix spurious "Scheduling: ..." workunits with remote caching (cherrypick of #12973) #12975

Merged

stuhood mentioned this pull request Mar 3, 2022

Workunits should form a DAG rather than a tree #14680

Open
