Two targets can swap positions with pantsd #7583

Merged
merged 15 commits into from Apr 24, 2019

3 participants
@illicitonion
Contributor

commented Apr 17, 2019

Before this PR, nothing would remove the edges of a dirty node, so if
two nodes swapped positions in the graph (e.g. if a dependency between
two targets inverted), a cycle would be detected.

With this PR, if we detect a cycle but also detect that there may be dirty
edges in play, we fully clear that node (including removing its edges),
which will cause it to be re-triggered from scratch.
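
The behaviour described above can be sketched on a toy graph. This is a minimal illustration only, not the engine's code: `ToyGraph`, `mark_dirty`, `path` and `add_dependency` are hypothetical names, nodes are plain integers, and a real implementation would re-run the cleared node rather than just dropping its edges.

```rust
use std::collections::{HashMap, HashSet};

/// Toy dependency graph (hypothetical): `edges[a]` holds the nodes `a` depends on.
#[derive(Default)]
pub struct ToyGraph {
    edges: HashMap<u32, HashSet<u32>>,
    dirty: HashSet<u32>,
}

impl ToyGraph {
    /// Marks a node dirty. Its edges are deliberately left in place.
    pub fn mark_dirty(&mut self, node: u32) {
        self.dirty.insert(node);
    }

    /// Returns some path from `from` to `to`, if one exists (depth-first search).
    fn path(&self, from: u32, to: u32) -> Option<Vec<u32>> {
        let mut stack = vec![vec![from]];
        let mut seen = HashSet::new();
        while let Some(path) = stack.pop() {
            let last = *path.last().unwrap();
            if last == to {
                return Some(path);
            }
            if !seen.insert(last) {
                continue;
            }
            for &next in self.edges.get(&last).into_iter().flatten() {
                let mut extended = path.clone();
                extended.push(next);
                stack.push(extended);
            }
        }
        None
    }

    /// Adds `src -> dst`, or returns Err if that would close a genuine cycle.
    pub fn add_dependency(&mut self, src: u32, dst: u32) -> Result<(), String> {
        // A path dst -> ... -> src means the new edge would close a cycle.
        while let Some(path) = self.path(dst, src) {
            // A dirty node on the path may hold stale edges: clear it and retry.
            let stale = path.iter().copied().find(|n| self.dirty.contains(n));
            match stale {
                Some(node) => {
                    self.edges.remove(&node);
                    self.dirty.remove(&node);
                }
                None => return Err(format!("cycle: {:?}", path)),
            }
        }
        self.edges.entry(src).or_default().insert(dst);
        Ok(())
    }
}
```

With this sketch, inverting a dependency fails while the old edge belongs to a clean node, and succeeds once that node has been marked dirty, because the stale edge is dropped before the cycle check is repeated.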

This is specifically in place to handle the cycle scenario: the dirty
bit and dependency Generations are still the primary mechanism for
handling re-use of old versions.

There's an ugliness here that we still don't remove obsolete edges, so
if Generation 2 of a node has differing dependencies from Generation 1,
the dependency from Generation 1 will still dirty Generation 2. We may
want to consider solving that separately as/when it becomes a
significant issue, or we may want to re-work this PR to do something
like that... This PR happens to cover a part of that problem, but only
where it causes definitive problems (a fake cycle) rather than also
where it causes performance problems.

There's probably a slightly more principled solution here along the
lines of:

  • Rather than using () as an edge weight in the graph, use the
    Generation of the dependee Node as an edge weight.
  • When doing cycle detection, compare the edge weight against the
    generation of the node, and ignore obsolete edges.

...but I would want to think about that a lot more before doing it.
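
A rough sketch of the generation-stamped edge idea follows. It rests on assumptions the description does not confirm: each edge is stamped with the generation of the node that recorded it (the edge's source), and an edge is obsolete when its stamp differs from that node's current generation. `GenGraph`, `bump`, `live_deps` and `reaches` are hypothetical names.

```rust
use std::collections::{HashMap, HashSet};

pub type NodeId = u32;
pub type Generation = u32;

/// Hypothetical graph whose edge weight is a Generation rather than ().
#[derive(Default)]
pub struct GenGraph {
    generation: HashMap<NodeId, Generation>,
    // src -> (dst -> generation of src when the edge was recorded)
    edges: HashMap<NodeId, HashMap<NodeId, Generation>>,
}

impl GenGraph {
    fn current(&self, node: NodeId) -> Generation {
        self.generation.get(&node).copied().unwrap_or(0)
    }

    /// Starts a new generation of `node`; edges from older generations become obsolete.
    pub fn bump(&mut self, node: NodeId) {
        *self.generation.entry(node).or_insert(0) += 1;
    }

    /// Dependencies of `node` whose stamp matches its current generation.
    fn live_deps(&self, node: NodeId) -> Vec<NodeId> {
        let current = self.current(node);
        self.edges
            .get(&node)
            .into_iter()
            .flatten()
            .filter(|&(_, g)| *g == current)
            .map(|(dst, _)| *dst)
            .collect()
    }

    /// Reachability that ignores obsolete edges.
    fn reaches(&self, from: NodeId, to: NodeId) -> bool {
        let mut stack = vec![from];
        let mut seen = HashSet::new();
        while let Some(node) = stack.pop() {
            if node == to {
                return true;
            }
            if seen.insert(node) {
                stack.extend(self.live_deps(node));
            }
        }
        false
    }

    /// Adds `src -> dst` stamped with the current generation of `src`.
    pub fn add_dependency(&mut self, src: NodeId, dst: NodeId) -> Result<(), String> {
        if self.reaches(dst, src) {
            return Err(format!("{} -> {} would create a cycle", src, dst));
        }
        let stamp = self.current(src);
        self.edges.entry(src).or_default().insert(dst, stamp);
        Ok(())
    }
}
```

In this version no edge is ever deleted; cycle detection simply skips edges whose stamp is out of date, which is the trade-off the two bullets describe.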

@illicitonion illicitonion requested review from stuhood, blorente and ity Apr 17, 2019

@stuhood
Member

left a comment

Thanks a ton for looking at this!

@@ -360,6 +381,7 @@ impl<N: Node> Entry<N> {
} else {
None
},
dirty, // TODO: Should this also cover uncacheable?

@stuhood

stuhood Apr 17, 2019

Member

It feels like it is definitely related, yea. If this dirty value is associated with the previous_ value(s), then in the case where we've said: "you should definitely not trust/reuse the previous value", we should also not trust its edges.

But see the comments on #6598... it's pretty likely that the implementation of cacheability should switch to an implementation that changes the identity of the node each time (possibly by changing parameter identities)... and that would make this less relevant I think.

@cosmicexplorer

cosmicexplorer Apr 17, 2019

Contributor

I think the node identity-based uncacheability approach seems like a good idea; we could link to that issue here.

@illicitonion

illicitonion Apr 18, 2019

Author Contributor

This now dirties in both cases, but I agree that reworking this in the future would be nice.

@cosmicexplorer
Contributor

left a comment

This is a decidedly nontrivial issue and I'm very glad we have a handle on why it happens and how to fix it!

I've noted multiple places that I believe would strongly benefit from copious use of one-off enums. We can probably merge this PR first and then follow up with later enum changes to avoid blocking the fix.

When doing cycle detection, compare the edge weight against the
generation of the node, and ignore obsolete edges.
but I would want to think about that a lot more before doing it...

Is there additional complexity to implementing this beyond "we now have to compare generations", or is there a concern this would introduce difficult-to-debug errors?

@@ -62,10 +62,25 @@ pub(crate) enum EntryState<N: Node> {
// The previous_result value is _not_ a valid value for this Entry: rather, it is preserved in
// order to compute the generation value for this Node by comparing it to the new result the next
// time the Node runs.
//
// A note on dirty as was_dirty:

@cosmicexplorer

cosmicexplorer Apr 17, 2019

Contributor
Suggested change
- // A note on dirty as was_dirty:
+ // A note on dirty versus was_dirty:

@illicitonion

illicitonion Apr 18, 2019

Author Contributor

Replaced with different docs on the new enum

@@ -541,7 +565,7 @@ impl<N: Node> Entry<N> {
///
/// Clears the state of this Node, forcing it to be recomputed.
///
- pub(crate) fn clear(&mut self) {
+ pub(crate) fn clear(&mut self, graph_still_contains_edges: bool) {

@cosmicexplorer

cosmicexplorer Apr 17, 2019

Contributor

This could also be converted into its own one-off enum instead of a bool.
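
One way the one-off enum suggestion could look is sketched below. `EdgesInGraph` and `ToyEntry` are hypothetical, and the body of `clear` is only a stand-in for whatever the real method does with the flag; the point is that call sites name the case instead of passing a bare `true` or `false`.

```rust
/// Hypothetical replacement for the `graph_still_contains_edges: bool` parameter.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum EdgesInGraph {
    StillPresent,
    AlreadyRemoved,
}

/// Stand-in for an entry; the fields are illustrative only.
#[derive(Debug, Default)]
pub struct ToyEntry {
    pub running: bool,
    pub stale_edges_possible: bool,
}

impl ToyEntry {
    /// Clears the entry; the enum states whether its edges are still in the graph.
    pub fn clear(&mut self, edges: EdgesInGraph) {
        self.running = false;
        match edges {
            // Edges remain, so whatever is kept must not be trusted as-is.
            EdgesInGraph::StillPresent => self.stale_edges_possible = true,
            // Edges are gone, so nothing stale can be observed through them.
            EdgesInGraph::AlreadyRemoved => {}
        }
    }
}
```

A call then reads `entry.clear(EdgesInGraph::StillPresent)`, and adding a third case later produces a compile error at every `match` that does not handle it.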

@@ -514,17 +541,21 @@ impl<N: Node> Graph<N> {
// TODO: doing cycle detection under the lock... unfortunate, but probably unavoidable
// without a much more complicated algorithm.

@cosmicexplorer

cosmicexplorer Apr 17, 2019

Contributor

Unrelated: I would be interested in any thoughts on how to estimate the speedup we might get from incremental cycle detection (possibly just by using a profiler?) instead of holding the lock.

@stuhood
Member

left a comment

Also, would it be possible to include the unit tests from master...twitter:stuhood/dirty-cycle-detection here? Can also mark this one as fixing #7404.

@stuhood

Member

commented Apr 18, 2019

Also also:

There's an ugliness here that we still don't remove obsolete edges...

It would be good to incorporate some of the PR description into a TODO somewhere in the code. Definitely fine with leaving "non-problematic" edges in place for now and revisiting it in the (distant) future!

@stuhood
Member

left a comment

Thanks!

@@ -428,6 +471,9 @@ impl<N: Node> Entry<N> {
"Not completing node {:?} because it was invalidated before completing.",
self.node
);
if let Some(previous_result) = previous_result.as_mut() {

@stuhood

stuhood Apr 18, 2019

Member

Nit: Possible that the EntryResult enum could gain a "NotPresent" variant to incorporate the None case? Possibly not worth it.

@illicitonion

illicitonion Apr 23, 2019

Author Contributor

Will leave for now, can add in the future if needed.

@@ -567,6 +616,12 @@ impl<N: Node> Entry<N> {

trace!("Clearing node {:?}", self.node);

if graph_still_contains_edges {
if let Some(previous_result) = previous_result.as_mut() {

@stuhood

stuhood Apr 18, 2019

Member

Yea, a lot of these would be eliminated by a "NotPresent" variant.
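
For illustration, a "NotPresent" variant might remove the `if let Some(..)` wrappers roughly as follows. `ToyEntryResult` and its `Clean`/`Dirty` variants are assumptions made for this sketch, not the definition of the real `EntryResult`.

```rust
/// Hypothetical result enum with the absent case folded in as a variant.
#[derive(Debug, PartialEq)]
pub enum ToyEntryResult<T> {
    Clean(T),
    Dirty(T),
    NotPresent,
}

impl<T> ToyEntryResult<T> {
    /// Marks a present value as dirty; a no-op when nothing is present.
    pub fn dirty(&mut self) {
        let taken = std::mem::replace(self, ToyEntryResult::NotPresent);
        *self = match taken {
            ToyEntryResult::Clean(v) | ToyEntryResult::Dirty(v) => ToyEntryResult::Dirty(v),
            ToyEntryResult::NotPresent => ToyEntryResult::NotPresent,
        };
    }
}
```

Callers can then write `previous_result.dirty()` unconditionally instead of first unwrapping an `Option`.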


illicitonion added some commits Mar 19, 2019

Two targets can swap positions with pantsd
Use Bellman-Ford not Dijkstra
So that we can report whole paths which would cause the cycle.

This will slow down cycle detection a little; if it becomes a problem,
we could do a Dijkstra run, and only if we detect a cycle (which is the
rare case), do the Bellman-Ford. But we've also been talking about
trying to do incremental cycle detection, so I'm not going to worry too
much about this unless it starts posing a noticeable problem.
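
To illustrate why a predecessor-tracking search allows the whole path to be reported, here is a small hand-rolled Bellman-Ford over unit-weight edges using only the standard library. It is a sketch, not the code from this commit: `shortest_path` and `cycle_if_added` are hypothetical names, and nodes are assumed to be indices in `0..node_count`.

```rust
/// Bellman-Ford with unit edge weights, recording each node's predecessor so
/// that the path from `source` to `target` can be rebuilt afterwards.
pub fn shortest_path(
    node_count: usize,
    edges: &[(usize, usize)],
    source: usize,
    target: usize,
) -> Option<Vec<usize>> {
    let mut dist: Vec<Option<usize>> = vec![None; node_count];
    let mut pred: Vec<Option<usize>> = vec![None; node_count];
    dist[source] = Some(0);

    // Relax every edge at most node_count - 1 times, stopping early once stable.
    for _ in 1..node_count {
        let mut changed = false;
        for &(u, v) in edges {
            if let Some(du) = dist[u] {
                if dist[v].map_or(true, |dv| du + 1 < dv) {
                    dist[v] = Some(du + 1);
                    pred[v] = Some(u);
                    changed = true;
                }
            }
        }
        if !changed {
            break;
        }
    }

    // Unreachable target: no path to report.
    dist[target]?;

    // Walk predecessors back from the target, then reverse.
    let mut path = vec![target];
    let mut current = target;
    while let Some(p) = pred[current] {
        path.push(p);
        current = p;
    }
    path.reverse();
    Some(path)
}

/// If adding `src -> dst` would close a cycle, returns that cycle as a node list.
pub fn cycle_if_added(
    node_count: usize,
    edges: &[(usize, usize)],
    src: usize,
    dst: usize,
) -> Option<Vec<usize>> {
    let mut path = shortest_path(node_count, edges, dst, src)?;
    path.push(dst);
    Some(path)
}
```

A search that returns only distances can say that `src` is reachable from `dst`; the predecessor table is what makes it possible to print every node on the offending path.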

@illicitonion illicitonion force-pushed the twitter:dwagnerhall/pantsd-cycle2 branch from f40b8fd to 283bfe4 Apr 23, 2019

illicitonion added some commits Apr 23, 2019

@illicitonion illicitonion force-pushed the twitter:dwagnerhall/pantsd-cycle2 branch from 88206e7 to 4bf0c74 Apr 24, 2019

@illicitonion illicitonion merged commit 9c121f1 into pantsbuild:master Apr 24, 2019

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed

@illicitonion illicitonion deleted the twitter:dwagnerhall/pantsd-cycle2 branch Apr 24, 2019

illicitonion added a commit that referenced this pull request Apr 24, 2019

illicitonion added a commit to twitter/pants that referenced this pull request Apr 24, 2019

Two targets can swap positions with pantsd (pantsbuild#7583)

illicitonion added a commit that referenced this pull request Apr 24, 2019

Two targets can swap positions with pantsd (#7583) (#7617)

cosmicexplorer added a commit to cosmicexplorer/pants that referenced this pull request Apr 29, 2019

cosmicexplorer added a commit that referenced this pull request Apr 29, 2019
