Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inlining of PathGlob expansion causes excess invalidation with the daemon #4394

Closed
stuhood opened this issue Mar 27, 2017 · 6 comments · Fixed by #6059
Closed

Inlining of PathGlob expansion causes excess invalidation with the daemon #4394

stuhood opened this issue Mar 27, 2017 · 6 comments · Fixed by #6059
Assignees
Milestone

Comments

@stuhood
Copy link
Sponsor Member

stuhood commented Mar 27, 2017

Inlining of PathGlobsExpansion causes Snapshots to be more fragile than necessary. In particular, because VFS::expand recursively walks from the root down to the relevant directories without memoizing PathGlobsExpansions, any change in the root invalidates all Snapshots from a changed path up to the root of the repo (but not the directory listings and stats).

This significantly affects warm performance in the presence of the daemon.

We should experiment with memoizing (portions of?) PathGlobsExpansion in the Graph (similar to what we do for ReadLink/Scandir), and see how much it affects cold/warm performance.

NB: As a wildcard: it's possible that fixing #4558 would be a better approach than increasing memoization would be.


The relevant codepath centers around

///
/// Recursively expands PathGlobs into PathStats while applying excludes.
///
/// TODO: Eventually, it would be nice to be able to apply excludes as we go, to
/// avoid walking into directories that aren't relevant.
///
fn expand(&self, path_globs: PathGlobs) -> BoxFuture<Vec<PathStat>, E> {
. When we call that method from the engine (here:
// Recursively expand PathGlobs into PathStats while tracking their dependencies.
context
.expand(path_globs)
.then(move |path_stats_res| match path_stats_res {
Ok(path_stats) => {
// Declare dependencies on the relevant Stats, and then create a Snapshot.
let stats = future::join_all(
path_stats
.iter()
.map(
|path_stat| context.get(Stat(path_stat.path().to_owned())), // for recording only
)
.collect::<Vec<_>>(),
);
// And then create a Snapshot.
stats
.and_then(move |_| {
context
.core
.snapshots
.create(&context.core.vfs, path_stats)
.map_err(move |e| throw(&format!("Snapshot failed: {}", e)))
})
.to_boxed()
}
Err(e) => err(throw(&format!("PathGlobs expansion failed: {:?}", e))),
})
.to_boxed()
), the VFS in question is
impl VFS<Failure> for Context {
, which has the effect of memoizing/recording ReadLink and Scandir calls in the Graph (which later allows for filesystem invalidation).

The effect of memoizing ReadLink and Scandir (but none of the rest of the other steps of a PathGlobsExpansion) is that a Snapshot Node in the Graph is flat: the Snapshot depends directly on the syscalls, and to recompute it the entire VFS::expand call reruns.


When we fix this, it would be good to get a simple benchmark in place on the rust side. But it's also important to directly measure the performance of:

  • cold ./pants list ::
  • warm ./pants list :: after touching one file at the root of the repo
@stuhood stuhood added the pantsd label Mar 27, 2017
@stuhood stuhood changed the title Inlining of PathGlob expansion causes excess invalidation Inlining of PathGlob expansion causes excess invalidation with the daemon Mar 27, 2017
@stuhood stuhood removed the bug label Apr 7, 2017
@stuhood
Copy link
Sponsor Member Author

stuhood commented May 1, 2017

This is primarily annoying when changing files in the root of the repo, which is relatively rare. And since it doesn't effect correctness, going to defer to 1.4.0.

@stuhood
Copy link
Sponsor Member Author

stuhood commented Oct 27, 2017

@jsirois : One thing to be cautious of: @illicitonion will be changing this code a bit in order to integrate it with #4585. The memoization that is done in this codepath will continue to be necessary to provide good incremental performance in the context of the engine though, so that part will not change.

@illicitonion : Worth being aware of this ticket, as it explains a bit of why the fs module does what it does.

@stuhood stuhood assigned illicitonion and unassigned jsirois Nov 14, 2017
@stuhood
Copy link
Sponsor Member Author

stuhood commented Nov 29, 2017

This will be resolved by #5106.

@kwlzn
Copy link
Member

kwlzn commented Mar 2, 2018

assuming fixed by #5106 (not sure why github didn't auto-close based on the annotation) - thanks @illicitonion !

@kwlzn kwlzn closed this as completed Mar 2, 2018
@kwlzn
Copy link
Member

kwlzn commented Mar 2, 2018

err, guess that was because #5106 was 'closed' vs 'merged' and superceded by #5496.

@stuhood
Copy link
Sponsor Member Author

stuhood commented Jun 16, 2018

So, it looks like this is still an issue, in that touching a file in the root of the repository will invalidate the listing there, which will invalidate all Snapshots, because they depend on the filesystem operations. Basically everything else in the graph depends on a Snapshot, so the result is that although only one new IO happens, we end up re-constructing the Snapshots (which requires some round trips to the database) and all rule logic re-runs.

Fixing #4558 seems like the least contorted way to resolve this, so will re-open this until we resolve #4558. cc @jsirois

stuhood pushed a commit that referenced this issue Jul 6, 2018
…nged (#6059)

### Problem

As described in #4558, we currently completely delete `Node`s from the `Graph` when their inputs have changed.

One concrete case where this is problematic is that all `Snapshots` in the graph end up with a dependency on the `scandir` outputs of all of their parent directories, because we need to expand symlinks recursively from the root when consuming a `Path` (in order to see whether any path component on the way down is a symlink). This means that changes anywhere above a `Snapshot` invalidate that `Snapshot`, and changes at the root of the repo invalidate _all_ `Snapshots` (although 99% of the syscalls they depend on are not invalidated, having no dependencies of their own).

But this case is just one of many cases affected by the current implementation: there are many other times where we re-compute more than we should due to the current `Node` invalidation strategy.

### Solution

Implement node "dirtying", as described on #4558.

There are a few components to this work:

* In addition to being `Entry::clear`ed (which will force a `Node` to re-run), a `Node` may be `Entry::dirty`ed. A "dirty" `Node` is eligible to be "cleaned" if its dependencies have not changed since it was dirtied.
* Each `Node` records a `Generation` value that acts as proxy for "my output has changed". The `Node` locally maintains this value, and when a Node re-runs for any reason (either due to being `dirtied` or `cleared`), it compares its new output value to its old output value to determine whether to increment the `Generation`.
* Each `Node` records the `Generation` values of the dependencies that it used to run, at the point when it runs. When a dirtied `Node` is deciding whether to re-run, it compares the previous generation values of its dependencies to their current dependency values: if they are equal, then the `Node` can be "cleaned": ie, its previous value can be used without re-running it.

This patch also expands the testing of `Graph` to differentiate dirtying a `Node` from clearing it, and confirms that the correct `Nodes` re-run in each of those cases.

### Result

Cleaning all `Nodes` involved in `./pants list ::` after touching `pants.ini` completes 6 times faster than recomputing them from scratch (56 seconds vs 336 seconds in our repository). More gains are likely possible by implementing the performance improvement(s) described on #6013. 

Fixes #4558 and fixes #4394.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants