New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fail for deleted-but-depended-on targets in changed #5636

Merged
merged 3 commits into from Mar 28, 2018

Conversation

Projects
None yet
4 participants
@stuhood
Copy link
Member

stuhood commented Mar 28, 2018

Problem

#5579 broke detection of deleted targets.

Solution

As described in #382 (mega classic!), it might be possible to more deeply integrate change detection into the v2 engine itself to compute a delta on the graph. But for now we defer a deeper solution and simply ensure that we fail for deleted targets by transitively expanding targets. Adds a test to cover the behaviour.

Result

Due to fully hydrating targets, this represents a linear performance regression from #5579: the runtime of --changed-include-dependees=transitive for a small set of roots used to be slightly lower than the runtime for ./pants list ::, because the operation that occurred "on the entire graph" was computing HydratedStructs, rather than computing TransitiveHydratedTargets.

The impact for exactly that step is constant, and fairly high:

# before:
  DEBUG] Root Select(Collection(dependencies=(DescendantAddresses(directory=u''),)), =Collection.of(Struct)) completed.
  DEBUG] computed 1 nodes in 1.709688 seconds. there are 8858 total nodes.

# after:
  DEBUG] Root Select(Collection(dependencies=(DescendantAddresses(directory=u''),)), =TransitiveHydratedTargets) completed.
  DEBUG] computed 1 nodes in 2.989497 seconds. there are 15916 total nodes.

... but the impact on overall runtime is dependent on the count of targets that are transitively affected, because for all affected targets, we're going to need to compute transitive roots anyway. So for the example change from #5579 which affects 567 targets:

time ./pants --changed-diffspec=22ca0604b1c6ce8de019214b821e922aac66b026^..22ca0604b1c6ce8de019214b821e922aac66b026 --changed-include-dependees=transitive list | wc -l
# before:
  real	0m4.877s
  user	0m4.081s
  sys	0m1.068s

# after
  real	0m5.294s
  user	0m4.487s
  sys	0m1.142s

For a change impacting only 14 targets the difference is slightly more pronounced:

$ time ./pants --changed-diffspec=f35e1e6fb1cdf45fcb5080cfe567bdbae8060125^..f35e1e6fb1cdf45fcb5080cfe567bdbae8060125 --changed-include-dependees=transitive list | wc -l
# before:
  real	0m4.279s
  user	0m3.376s
  sys	0m1.011s

# after:
  real	0m4.954s
  user	0m4.284s
  sys	0m1.120s

@stuhood stuhood requested review from benjyw and kwlzn Mar 28, 2018

@kwlzn

kwlzn approved these changes Mar 28, 2018

Copy link
Member

kwlzn left a comment

lgtm. is perf impacted?

@wisechengyi

This comment has been minimized.

Copy link
Contributor

wisechengyi commented Mar 28, 2018

Need 🍒 pick to 1.5.1?

@stuhood

This comment has been minimized.

Copy link
Member

stuhood commented Mar 28, 2018

lgtm. is perf impacted?

@kwlzn : Expanded the description. Long story short: yes, but not in the exponential way that #5579 was intended to fix.

@stuhood stuhood merged commit 739d74b into pantsbuild:master Mar 28, 2018

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@stuhood stuhood deleted the twitter:stuhood/detect-deleted-targets branch Mar 28, 2018

@stuhood

This comment has been minimized.

Copy link
Member

stuhood commented Mar 28, 2018

Need 🍒 pick to 1.5.1?

@wisechengyi : Yea, would probably be good. It's possible that other folks were relying on the eager failure in this case for the same reason.

@stuhood stuhood added this to the 1.5.x milestone Mar 28, 2018

@baroquebobcat

This comment has been minimized.

Copy link
Contributor

baroquebobcat commented Mar 28, 2018

Nice

wisechengyi added a commit to wisechengyi/pants that referenced this pull request Mar 31, 2018

Fail for deleted-but-depended-on targets in changed (pantsbuild#5636)
As described in pantsbuild#382 (mega classic!), it might be possible to more deeply integrate change detection into the v2 engine itself to compute a delta on the graph. But for now we defer a deeper solution and simply ensure that we fail for deleted targets by transitively expanding targets. Adds a test to cover the behaviour.

Due to fully hydrating targets, this represents a linear performance regression from pantsbuild#5579: the runtime of `--changed-include-dependees=transitive` for a small set of roots used to be slightly lower than the runtime for `./pants list ::`, because the operation that occurred "on the entire graph" was computing `HydratedStructs`, rather than computing `TransitiveHydratedTargets`.

The impact for exactly that step is constant, and fairly high:
```
  DEBUG] Root Select(Collection(dependencies=(DescendantAddresses(directory=u''),)), =Collection.of(Struct)) completed.
  DEBUG] computed 1 nodes in 1.709688 seconds. there are 8858 total nodes.

  DEBUG] Root Select(Collection(dependencies=(DescendantAddresses(directory=u''),)), =TransitiveHydratedTargets) completed.
  DEBUG] computed 1 nodes in 2.989497 seconds. there are 15916 total nodes.
```
... but the impact on overall runtime is dependent on the count of targets that are transitively affected, because for all affected targets, we're going to need to compute transitive roots anyway. So for the example change from pantsbuild#5579 which affects 567 targets:
```
time ./pants --changed-diffspec=22ca0604b1c6ce8de019214b821e922aac66b026^..22ca060 --changed-include-dependees=transitive list | wc -l
  real	0m4.877s
  user	0m4.081s
  sys	0m1.068s

  real	0m5.294s
  user	0m4.487s
  sys	0m1.142s
```
For a change impacting only 14 targets the difference is slightly more pronounced:
```
$ time ./pants --changed-diffspec=f35e1e6fb1cdf45fcb5080cfe567bdbae8060125^..f35e1e6 --changed-include-dependees=transitive list | wc -l
  real	0m4.279s
  user	0m3.376s
  sys	0m1.011s

  real	0m4.954s
  user	0m4.284s
  sys	0m1.120s
```

stuhood added a commit that referenced this pull request Nov 20, 2018

Move file owners computation into the engine and make lighter (#6790)
### Problem

#5579 added an optimization to avoid hydrating targets, of which a significant portion was later rolled back by #5636. #5636 reasoned that it was necessary to fully construct the graph in order to detect when targets have been deleted.

The solution used in #5636 though, was heavy-handed. By requesting `TransitiveHydratedTargets`, we fully constructed the graph of dependencies and then later _also_ constructed the dependent graph (in `DependentGraph`). Moreover, fully hydrating the targets meant unnecessarily fully expanding their source globs.

### Solution

Create a `find_owners` rule that consumes an `OwnersRequest` to compute the owning targets of some sources (and, optionally, their transitive dependents). Moving this logic into an `@rule` allows us to directly consume the symbol_table_contraint, without a wrapping collection (as in #5579).

Additionally, switch back to requesting target adaptor subclasses while building the `DependentGraph`. The graph-validation that use to occur via requesting `TransitiveHydratedTargets` now occurs as an explicit step during construction of a `DependentGraph`. A TODO there explains an existing weakness of change-detection in the engine.

### Result

```
./pants --changed-diffspec='49cc12e5cccc6164a2c4aa83c23300ad7254836f^..49cc12e' --changed-include-dependees=direct --glob-expansion-failure=ignore list
```
is 30 percent faster in the pantsbuild pants repo, primarily due to not expanding source globs.

stuhood added a commit that referenced this pull request Nov 20, 2018

Move file owners computation into the engine and make lighter (#6790)
The solution used in #5636 though, was heavy-handed. By requesting `TransitiveHydratedTargets`, we fully constructed the graph of dependencies and then later _also_ constructed the dependent graph (in `DependentGraph`). Moreover, fully hydrating the targets meant unnecessarily fully expanding their source globs.

Create a `find_owners` rule that consumes an `OwnersRequest` to compute the owning targets of some sources (and, optionally, their transitive dependents). Moving this logic into an `@rule` allows us to directly consume the symbol_table_contraint, without a wrapping collection (as in #5579).

Additionally, switch back to requesting target adaptor subclasses while building the `DependentGraph`. The graph-validation that use to occur via requesting `TransitiveHydratedTargets` now occurs as an explicit step during construction of a `DependentGraph`. A TODO there explains an existing weakness of change-detection in the engine.

```
./pants --changed-diffspec='49cc12e5cccc6164a2c4aa83c23300ad7254836f^..49cc12e' --changed-include-dependees=direct --glob-expansion-failure=ignore list
```
is 30 percent faster in the pantsbuild pants repo, primarily due to not expanding source globs.

wisechengyi added a commit that referenced this pull request Nov 20, 2018

Move file owners computation into the engine and make lighter (#6790)
### Problem

#5579 added an optimization to avoid hydrating targets, of which a significant portion was later rolled back by #5636. #5636 reasoned that it was necessary to fully construct the graph in order to detect when targets have been deleted.

The solution used in #5636 though, was heavy-handed. By requesting `TransitiveHydratedTargets`, we fully constructed the graph of dependencies and then later _also_ constructed the dependent graph (in `DependentGraph`). Moreover, fully hydrating the targets meant unnecessarily fully expanding their source globs.

### Solution

Create a `find_owners` rule that consumes an `OwnersRequest` to compute the owning targets of some sources (and, optionally, their transitive dependents). Moving this logic into an `@rule` allows us to directly consume the symbol_table_contraint, without a wrapping collection (as in #5579).

Additionally, switch back to requesting target adaptor subclasses while building the `DependentGraph`. The graph-validation that use to occur via requesting `TransitiveHydratedTargets` now occurs as an explicit step during construction of a `DependentGraph`. A TODO there explains an existing weakness of change-detection in the engine.

### Result

```
./pants --changed-diffspec='49cc12e5cccc6164a2c4aa83c23300ad7254836f^..49cc12e' --changed-include-dependees=direct --glob-expansion-failure=ignore list
```
is 30 percent faster in the pantsbuild pants repo, primarily due to not expanding source globs.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment