Nhoward/rust graph validation pt i #4227

Conversation

baroquebobcat
Contributor

This is the first step towards adapting the static noop culling I worked on last year to the rust code.

This first patch is working towards replacing the graph validator written in python with one in rust. At this point it covers quite a lot of behavior, but it isn't working quite correctly yet.

I'm putting it up in the current state to get some other eyes on it while I iterate.

I'm hoping to have this initial patch wrapped up tomorrow, modulo bugs.

Things to look at:

  • I replaced the root selector fn dict with a set of subjects and a method. Since all of the subjects now use the same selector, the dict didn't make sense to keep.
  • rule_graph.rs is where the meat of the port is. It is still a bit messy, with commented-out Python in places. I'm planning on removing the rest of it and leaving TODOs where I want to pick up the thread in the next patch.
  • I've also added a few comments at the extension points for the next phase, but I can remove them or clean them up if they are a problem.

Sponsor Member

@stuhood stuhood left a comment

Thanks Nick!

@@ -149,6 +149,8 @@
ExecutionStat execution_execute(RawScheduler*);
RawNodes* execution_roots(RawScheduler*);

void run_validator(RawScheduler*, TypeId*, uint64_t);
Sponsor Member

Naming nit... the methods are mostly namespaced. So maybe validator_run.

Contributor Author

handled.

@@ -213,6 +216,7 @@ impl Graph {
let src = self.entry_for_id(src_id);
dsts.into_iter()
.filter(|dst_id| !(src.dependencies.contains(dst_id) || src.cyclic_dependencies.contains(dst_id)))
// RGTODO add filter that removes deps if they will noop based on their rule edges
Sponsor Member

I think that this will obviously be part of phase 2/3, but it's not clear that this is where it would happen, so the todo is probably misplaced.

Contributor Author

removed.


// I think I ought to be able to replace the below with a set of structs keyed by EntryType.
// My first couple attempts failed.
#[derive(Eq, Hash, PartialEq, Clone, Debug)]
Sponsor Member

Since this struct is probably 1 byte wide, just marking it Copy rather than Clone would be perfectly reasonable, and allow you to avoid explicitly calling clone in a bunch of places.

Sponsor Member

(...and ignore this if you end up going the "one big enum" route.)
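A minimal sketch of the Copy suggestion above, assuming a small fieldless enum. `EntryKind`, its variants, and both helper functions are hypothetical stand-ins, not the types in this PR.

```rust
// Hypothetical stand-in for the small marker enum under review; the variant
// names are illustrative only.
#[derive(Clone, Copy, Debug, Eq, Hash, PartialEq)]
pub enum EntryKind {
    Root,
    Inner,
    Unreachable,
}

// Takes the enum by value. Because `EntryKind` is `Copy`, the caller can keep
// using its own value afterwards without writing `.clone()`.
pub fn is_unreachable(kind: EntryKind) -> bool {
    kind == EntryKind::Unreachable
}

// Passes the same value twice. With a Clone-only type the first call would
// move `kind`, and the second call would need an explicit `.clone()`.
pub fn count_unreachable(kind: EntryKind) -> usize {
    let first = is_unreachable(kind);
    let second = is_unreachable(kind);
    first as usize + second as usize
}
```

If the type later grows heap-allocated fields, `Copy` can no longer be derived, which is one reason the suggestion is tied to the type staying small.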

entry_type: EntryType::Unreachable,
subject_type: None,
value: None,
rule: Some(rule.clone()), // TODO decide on clone vs lifetimes
Sponsor Member

I'd stick with cloning for now. Can revisit later if it's an issue, but it seems unlikely to be.
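For comparison, here is a rough sketch of the two options named in the TODO. The `Task`, `OwnedEntry`, and `BorrowedEntry` types are simplified and hypothetical; they only illustrate the trade-off and are not the structs in rule_graph.rs.

```rust
// Simplified, hypothetical rule type.
#[derive(Clone, Debug, PartialEq)]
pub struct Task {
    pub name: String,
}

// Cloning variant: the entry owns its rule, so it has no lifetime parameter
// and can be stored or returned freely, at the cost of one clone per entry.
pub struct OwnedEntry {
    pub rule: Option<Task>,
}

// Borrowing variant: no clone, but every holder of the entry must now carry
// the lifetime of the task registry it points into.
pub struct BorrowedEntry<'a> {
    pub rule: Option<&'a Task>,
}

pub fn owned_unreachable(rule: &Task) -> OwnedEntry {
    OwnedEntry { rule: Some(rule.clone()) }
}

pub fn borrowed_unreachable<'a>(rule: &'a Task) -> BorrowedEntry<'a> {
    BorrowedEntry { rule: Some(rule) }
}
```

The borrowed form tends to spread `'a` through every signature that touches an entry, which is the usual argument for starting with clones and revisiting only if profiling shows a cost.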

selector: None,
reason: None,
}
}
Sponsor Member

missing whitespace

@@ -8,9 +8,13 @@ use selectors::{Selector, Select, SelectDependencies, SelectLiteral, SelectProje
* Registry of tasks able to produce each type, along with a few fundamental python
* types that the engine must be aware of.
*/
#[derive(Clone)]
Sponsor Member

Can you open a github issue about removing Clone here? It's a bit error prone, because the tasks get passed around everywhere currently, and cloning them accidentally (probably in a parent struct) would get costly fast.

Contributor Author

Done. #4236

self.all_rules().iter().map(|t| t.product).collect()
}

pub fn is_singleton_task(&self, sought_task: &Task) -> bool {
Sponsor Member

Worth adding a Singleton/Intrinsic/Normal enum on Task?

Contributor Author

hm. that would certainly simplify these. I think I'd want to split that out as a separate change though. Sound reasonable?
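One possible shape for the enum discussed above, as a sketch only. `TaskKind`, the `kind` field, and this pared-down `Task` are assumptions for illustration; the real Task struct carries more fields.

```rust
// Hypothetical classification of a task, following the three names floated
// in the review comment.
#[derive(Clone, Copy, Debug, Eq, Hash, PartialEq)]
pub enum TaskKind {
    Singleton,
    Intrinsic,
    Normal,
}

// Pared-down, hypothetical Task: only the fields needed for the sketch.
#[derive(Clone, Debug, Eq, Hash, PartialEq)]
pub struct Task {
    pub product: u64,
    pub kind: TaskKind,
}

impl Task {
    // With the kind stored on the task, these checks no longer need to
    // search the registry's maps for the task.
    pub fn is_singleton(&self) -> bool {
        self.kind == TaskKind::Singleton
    }

    pub fn is_intrinsic(&self) -> bool {
        self.kind == TaskKind::Intrinsic
    }
}
```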

}

pub fn all_rules(&self) -> Vec<Task> {
let mut result: Vec<Task> = Vec::new();
Sponsor Member

Another way to accomplish this method would be:

let rules: Vec<_> = self.tasks.values().chain(self.singletons.values()).chain(self.singletons.values()).collect();

Contributor Author

neat. I had to add a flat_map, but that's much more concise without losing readability.
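A sketch of how the chain plus flat_map version might look, assuming each map holds a Vec of tasks per key (which is why flat_map is needed). The `Tasks` layout here is hypothetical, and the third map, `intrinsics`, is an assumption; the snippet above names only `tasks` and `singletons`.

```rust
use std::collections::HashMap;

// Pared-down, hypothetical task type.
#[derive(Clone, Debug, Eq, Hash, PartialEq)]
pub struct Task {
    pub product: u64,
}

// Hypothetical registry layout: each map stores several tasks per key.
pub struct Tasks {
    pub intrinsics: HashMap<u64, Vec<Task>>,
    pub singletons: HashMap<u64, Vec<Task>>,
    pub tasks: HashMap<u64, Vec<Task>>,
}

impl Tasks {
    // Chains the value iterators of all three maps, then flattens each
    // Vec<Task> so the result is one flat list of task references.
    pub fn all_rules(&self) -> Vec<&Task> {
        self.singletons
            .values()
            .chain(self.intrinsics.values())
            .chain(self.tasks.values())
            .flat_map(|tasks| tasks.iter())
            .collect()
    }
}
```

Returning references avoids cloning every task; a caller that needs owned values can add `.cloned()` before `collect()`.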

let rules_in_graph: HashSet<Task> = full_dependency_edges.keys().map(|f| f.rule().clone()).collect();
let unfulfillable_discovered_during_construction: HashSet<Task> = full_unfulfillable_rules.keys().map(|f| f.rule().clone()).collect();
let declared_rules = self.tasks.all_rules();
let unreachable_rules: HashSet<&Task> = declared_rules.iter()
Sponsor Member

Note that when the type annotation is used to guide inference for collect, it isn't necessary to actually fill in its parameters:

let unreachable_rules: HashSet<_> = declared_rules.iter()...

Contributor Author

ah, so it's like the diamond operator in Java. Also neat. Is doing it that way preferred over the turbofish? i.e. let foo = ...collect::<Vec<_>>()
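The two spellings side by side, as a small sketch. Both function names and the use of `String` as the rule type are hypothetical; the point is only that the element type is inferred in each form.

```rust
use std::collections::HashSet;

// Form 1: annotate the binding and let `_` stand for the element type.
pub fn unreachable_annotated<'a>(
    declared: &'a [String],
    in_graph: &HashSet<String>,
) -> HashSet<&'a String> {
    let unreachable: HashSet<_> = declared
        .iter()
        .filter(|rule| !in_graph.contains(*rule))
        .collect();
    unreachable
}

// Form 2: the turbofish on `collect`, which needs no binding at all and so
// fits in the middle of a longer expression.
pub fn unreachable_turbofish<'a>(
    declared: &'a [String],
    in_graph: &HashSet<String>,
) -> HashSet<&'a String> {
    declared
        .iter()
        .filter(|rule| !in_graph.contains(*rule))
        .collect::<HashSet<_>>()
}
```

The two forms compile to the same thing, so the choice is stylistic: the annotation reads well on a `let`, while the turbofish is handy when the collected value is used immediately.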

.filter(|g| !root_rule_dependency_edges.contains_key(g))
.map(|g| g.clone());
for unseen_rule in unseen_dep_rules {
rules_to_traverse.push_back(unseen_rule);
Sponsor Member

Can use extend here, which consumes an iterator.

rules_to_traverse.extend(unseen_dep_rules);

Contributor Author

done
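A self-contained sketch of the extend pattern suggested above, using plain integers in place of rule entries. The function `enqueue_unseen` and its parameters are hypothetical.

```rust
use std::collections::{HashSet, VecDeque};

// Appends every dependency that has not been seen yet to the back of the
// traversal queue. `extend` consumes the filtered iterator directly, so no
// explicit `for` loop with `push_back` is needed.
pub fn enqueue_unseen(queue: &mut VecDeque<u32>, seen: &HashSet<u32>, deps: &[u32]) {
    let unseen = deps.iter().filter(|dep| !seen.contains(*dep)).cloned();
    queue.extend(unseen);
}
```

`VecDeque::extend` appends at the back, so the ordering matches the original `push_back` loop.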

Sponsor Member

@stuhood stuhood left a comment

I think my only blocker for landing this would be cleaning up the commented code a bit.

Tweaking the datamodel a bit would likely lay important groundwork too, but I won't block on it.

Sponsor Member

@stuhood stuhood left a comment

Thanks Nick!

@@ -1,3 +1,5 @@
// Copyright 2016 Pants project contributors (see CONTRIBUTORS.md).
Sponsor Member

If you're going to add these, might as well go with the current year.

Contributor Author

roger

@@ -59,6 +59,9 @@ impl Selector {
)
}

/*
* The product type this selector will ultimately produce.
Sponsor Member

I know we're not really following rust style at all, but the indentation is funny here.

Contributor Author

yeah. I'll reformat to be at least Javaish

Contributor Author

Hm. After thinking a bit, I looked at the Rust book again. It seems like line comments are preferred, and doc comments are a form of line comment. I'm changing it to line comments.
https://doc.rust-lang.org/beta/book/comments.html
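For illustration, this is roughly what the line-comment style looks like when applied to a method like the one in the diff. The `Select` struct and its `u32` product field are hypothetical simplifications.

```rust
/// A hypothetical, pared-down selector. Lines starting with `///` are doc
/// comments: rustdoc attaches them to the item that follows.
pub struct Select {
    pub product: u32,
}

impl Select {
    /// The product type this selector will ultimately produce.
    pub fn product(&self) -> u32 {
        // A plain `//` line comment, for notes that should not appear in
        // the generated documentation.
        self.product
    }
}
```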

// I think I ought to be able to replace the below with a set of structs keyed by EntryType.
// My first couple attempts failed.
#[derive(Eq, Hash, PartialEq, Clone, Debug)]
enum EntryType {
Sponsor Member

Much better! It looks like it is now just Entry?

Contributor Author

Good call.


},
&Selector::Task(ref select) =>{
// TODO, not sure what task is in this context exactly
Sponsor Member

It's for symmetry: every Node type has a Selector currently. But probably possible to remove.

-- also removed the Clone on Tasks since an upstream change caused it to not be viable
@@ -8,9 +8,14 @@ use selectors::{Selector, Select, SelectDependencies, SelectLiteral, SelectProje
* Registry of tasks able to produce each type, along with a few fundamental python
* types that the engine must be aware of.
*/
// TODO remove Clone https://github.com/pantsbuild/pants/issues/4236
#[derive(Clone)]
Contributor Author

With the merge from master, I was forced to handle this. So it's gone. Closing the issue.

- fix year
- s/EntryType/Entry/
- change comments
- clarify panics in a few places
@baroquebobcat baroquebobcat merged commit 62f2b10 into pantsbuild:master Feb 6, 2017
lenucksi pushed a commit to lenucksi/pants that referenced this pull request Apr 25, 2017
…ld#4227)

This is the first step towards adapting the static noop culling I worked on last year to the Rust code base.

This first patch is working towards replacing the graph validator written in python with one in rust. At this point it covers quite a lot of behavior, but it isn't working quite correctly yet.

This patch ports the majority of the rule graph construction, including:
- replacing the root selector fn dict with a set of subjects and a method. Since all of the subjects now use the same selector, it didn't make sense to keep.
- running the rust validator, but ignoring the results.
- adding copyright headers.

rule_graph.rs is where the meat of the port is. There are still a number of unported sections. I'll be removing the commented python and replacing it with rust in subsequent pull requests.

The remaining portions are
- adding an enum on tasks for their task type
- removing unfulfillable entries from the graph
- handling SelectDependencies / SelectProjection selectors
- handling snapshots
- propagating the results back to the python side