[engine] Rules as decorators #4369

Merged
merged 22 commits into from Mar 29, 2017

Conversation

Projects
None yet
4 participants
@stuhood
Member

stuhood commented Mar 23, 2017

Problem

The "rule triple" / "task triple" syntax generally requires ~30 minutes worth of explanation for every new user. In particular, the triples look like they are decoupled from individual functions, when they are generally defined 1-to-1.

Solution

Switch the majority of rule declarations to a new @rule decorator which creates and annotates the function with a TaskRule. The decorator is usable in all cases where constraints are known statically... in about three cases (where constraints are computed dynamically: ie, types from the SymbolTable) it is necessary to explicitly create a TaskRule.

Additionally, removed SelectLiteral in favor of usage of SingletonRules in all cases. When a SingletonRule is declared, callers can directly use Select to request the value for the singleton. Just like SelectLiteral, these continue to be an escape hatch for values which are not constructed from the ground up using files on disk (see #3941).

Result

Fixes #4289; "task triples" are gone: long live the @rule decorator!

build-support/bin/isort.sh
-then
- ./pants -q --target-spec-file=${to_sort} fmt.isort -- ${isort_args[@]}
-fi
+./pants -q --changed-parent=HEAD fmt.isort -- ${isort_args[@]}

This comment has been minimized.

@baroquebobcat

baroquebobcat Mar 24, 2017

Contributor

Thanks so much for this, but could you break it out into a separate change?

@baroquebobcat

baroquebobcat Mar 24, 2017

Contributor

Thanks so much for this, but could you break it out into a separate change?

@kwlzn

kwlzn approved these changes Mar 24, 2017

lgtm!

thinking out loud about top level vocabulary, I wonder if “directive” is more descriptive for what a triple represents than “rule”. may be too entrenched to change w/o being a PITA at this point tho - but any chance to improve clarity for future task developers seems like it might be worth it at this point.

@@ -226,6 +244,7 @@ def addresses_from_address_families(address_families, spec):
return Addresses(addresses)
+@rule(BuildDirs, [Select(AddressMapper), Select(Snapshot)])
def filter_build_dirs(address_mapper, snapshot):

This comment has been minimized.

@kwlzn

kwlzn Mar 24, 2017

Member

how would you feel about a more semantically literal naming scheme for the decorator form?

I think it might make the whole thing a bit clearer to end users - e.g.:

@produces(BuildDirs, given=[Select(AddressMapper), Select(Snapshot)])
def filter_build_dirs(address_mapper, snapshot):
  ...
@kwlzn

kwlzn Mar 24, 2017

Member

how would you feel about a more semantically literal naming scheme for the decorator form?

I think it might make the whole thing a bit clearer to end users - e.g.:

@produces(BuildDirs, given=[Select(AddressMapper), Select(Snapshot)])
def filter_build_dirs(address_mapper, snapshot):
  ...

This comment has been minimized.

@stuhood

stuhood Mar 28, 2017

Member

I've thought about this a bit more, and I think I do like "rule" here, due at least in part to the similarity to prolog rules. And I think that people will think of the entire function and its annotation as the rule, so making the name more imperative/literal seems likely to cause confusion.

@stuhood

stuhood Mar 28, 2017

Member

I've thought about this a bit more, and I think I do like "rule" here, due at least in part to the similarity to prolog rules. And I think that people will think of the entire function and its annotation as the rule, so making the name more imperative/literal seems likely to cause confusion.

+ TaskRule(
+ HydratedTarget,
+ [Select(symbol_table_constraint),
+ SelectDependencies(HydratedField,

This comment has been minimized.

@kwlzn

kwlzn Mar 24, 2017

Member

is there a reason some of these can't use the decorator form?

@kwlzn

kwlzn Mar 24, 2017

Member

is there a reason some of these can't use the decorator form?

This comment has been minimized.

@stuhood

stuhood Mar 24, 2017

Member

Yea, mentioned in the description: in cases where constraints are dynamic rather than static. The symbol table isn't known until the Scheduler is constructed.

@stuhood

stuhood Mar 24, 2017

Member

Yea, mentioned in the description: in cases where constraints are dynamic rather than static. The symbol table isn't known until the Scheduler is constructed.

src/python/pants/engine/rules.py
- Rule):
- """A Rule that runs a task function when all of its input selectors are satisfied."""
+ # Validate selectors.
+ if not isinstance(input_selectors, (list, tuple)):

This comment has been minimized.

@kwlzn

kwlzn Mar 24, 2017

Member

also wonder if it would make sense to require that this be a tuple in all cases? afaict, there's no benefit to making these list by default other than that [] sticks out more in a call signature than (). whatever we do here in the common case is likely to get cargo culted forward.

maybe this could be a named type/tuple type alias? which could then be used in lieu of a kwarg in the decorator, e.g.

@produces(Product, WithInputs(SelectThing(), SelectThing()))

or something along those lines.

@kwlzn

kwlzn Mar 24, 2017

Member

also wonder if it would make sense to require that this be a tuple in all cases? afaict, there's no benefit to making these list by default other than that [] sticks out more in a call signature than (). whatever we do here in the common case is likely to get cargo culted forward.

maybe this could be a named type/tuple type alias? which could then be used in lieu of a kwarg in the decorator, e.g.

@produces(Product, WithInputs(SelectThing(), SelectThing()))

or something along those lines.

This comment has been minimized.

@stuhood

stuhood Mar 28, 2017

Member

List literals are less annoying for single item tuples... so if anything, I think I'd rather require a list here for consistency.

The input selectors are always going to be 1-to-1 with the arguments, so I don't think it is valuable to put them in a named tuple. And I think they are always going to be a required arg, so leaving them positional rather than kwarg'd seems reasonable.

@stuhood

stuhood Mar 28, 2017

Member

List literals are less annoying for single item tuples... so if anything, I think I'd rather require a list here for consistency.

The input selectors are always going to be 1-to-1 with the arguments, so I don't think it is valuable to put them in a named tuple. And I think they are always going to be a required arg, so leaving them positional rather than kwarg'd seems reasonable.

@JieGhost

I like this change!

@baroquebobcat

I like this because it puts more of product dispatch into one place. So instead of SelectLiteral, which binds a singleton to a particular rule, now any rule can depend on the singleton more easily.

It also removes the ability to declare intrinsics from python and simplies how intrinsics work, which makes sense to me.

Now intrinsics are dispatched just by product, which makes them easier to reason about. I still think we should ultimately have an explicit notion of intrinsic task, because it'll make manipulating them easier and ensure that their method of dispatch is consistent. But, I don't think it makes sense to do at this point.

src/python/pants/engine/scheduler.py
+ self._to_ids_buf(selector.field_types))
+ elif selector_type is SelectProjection:
+ if len(selector.fields) != 1:
+ raise ValueError("TODO: remove support for projecting multiple fields at once.")

This comment has been minimized.

@baroquebobcat

baroquebobcat Mar 24, 2017

Contributor

Maybe: Projecting multiple fields at once is not supported. I think this should fail when constructing the selector, so that the error is more actionable.

@baroquebobcat

baroquebobcat Mar 24, 2017

Contributor

Maybe: Projecting multiple fields at once is not supported. I think this should fail when constructing the selector, so that the error is more actionable.

This comment has been minimized.

@stuhood

stuhood Mar 28, 2017

Member

Sure... can fix that while I'm in here.

@stuhood

stuhood Mar 28, 2017

Member

Sure... can fix that while I'm in here.

src/python/pants/engine/rules.py
-class RuleIndex(datatype('RuleIndex', ['tasks', 'intrinsics', 'singletons'])):
+class RuleIndex(datatype('RuleIndex', ['rules'])):
"""Holds an index of tasks and intrinsics used to instantiate Nodes."""

This comment has been minimized.

@baroquebobcat

baroquebobcat Mar 24, 2017

Contributor

nit: s/and intrinsics//

@baroquebobcat

baroquebobcat Mar 24, 2017

Contributor

nit: s/and intrinsics//

src/python/pants/engine/scheduler.py
+ def _register_singleton(self, output_constraint, rule):
+ """Register the given SingletonRule.
+
+ Singleton tasks are those that are the default for a particular type(product).

This comment has been minimized.

@baroquebobcat

baroquebobcat Mar 24, 2017

Contributor

I thought that singletons are the only allowed value for that type since we don't allow merges from multiple producers. So, maybe "Singleton tasks are the only task that can produce a product for a particular type(product)"?

@baroquebobcat

baroquebobcat Mar 24, 2017

Contributor

I thought that singletons are the only allowed value for that type since we don't allow merges from multiple producers. So, maybe "Singleton tasks are the only task that can produce a product for a particular type(product)"?

This comment has been minimized.

@stuhood

stuhood Mar 28, 2017

Member

Yea, makes sense. I'll see about unifying singletons and tasks in the Tasks map.

@stuhood

stuhood Mar 28, 2017

Member

Yea, makes sense. I'll see about unifying singletons and tasks in the Tasks map.

@@ -837,8 +820,10 @@ fn rhs_for_select(tasks: &Tasks, subject_type: TypeId, select: &Select) -> Entri
if externs::satisfied_by_type(&select.product, &subject_type) {
// NB a matching subject is always picked first
vec![Entry::new_subject_is_product(subject_type)]
+ } else if let Some(&(ref key, _)) = tasks.gen_singleton(&select.product) {

This comment has been minimized.

@baroquebobcat

baroquebobcat Mar 24, 2017

Contributor

I didn't realize til now, but this needs to have the new intrinsic, ie snapshot / filecontents, dispatch here too to accurately model how the engine works. I think it's ok for now without.

@baroquebobcat

baroquebobcat Mar 24, 2017

Contributor

I didn't realize til now, but this needs to have the new intrinsic, ie snapshot / filecontents, dispatch here too to accurately model how the engine works. I think it's ok for now without.

This comment has been minimized.

@stuhood

stuhood Mar 25, 2017

Member

@baroquebobcat : The reason it works now is that we install dummy copies of those tasks from the python side (the *_noop tasks in fs.py).

@stuhood

stuhood Mar 25, 2017

Member

@baroquebobcat : The reason it works now is that we install dummy copies of those tasks from the python side (the *_noop tasks in fs.py).

This comment has been minimized.

@baroquebobcat

baroquebobcat Mar 27, 2017

Contributor

Yep. My issue is that those noop declarations and their associated dispatch paths represent the same piece of knowledge, but are not tied together.
When we clear up the todo around the dispatch, I think we should make sure to tie them together somehow.

@baroquebobcat

baroquebobcat Mar 27, 2017

Contributor

Yep. My issue is that those noop declarations and their associated dispatch paths represent the same piece of knowledge, but are not tied together.
When we clear up the todo around the dispatch, I think we should make sure to tie them together somehow.

src/rust/engine/src/tasks.rs
@@ -16,14 +17,12 @@ pub struct Task {
}
/**
- * Registry of tasks able to produce each type, along with a few fundamental python
- * types that the engine must be aware of.
+ * Registry of Tasks able to produce each type, and Singletons which are the default/only

This comment has been minimized.

@baroquebobcat

baroquebobcat Mar 24, 2017

Contributor

I think this should just say 'only provider' since singletons dispatch before tasks.

@baroquebobcat

baroquebobcat Mar 24, 2017

Contributor

I think this should just say 'only provider' since singletons dispatch before tasks.

+ "More than one singleton rule was installed for the product {:?}: {:?} vs {:?}",
+ product,
+ existing_value,
+ value,

This comment has been minimized.

@baroquebobcat

baroquebobcat Mar 24, 2017

Contributor

I suspect that these Debug output's won't be very helpful to test against--assuming they use the derived Debug output.

@baroquebobcat

baroquebobcat Mar 24, 2017

Contributor

I suspect that these Debug output's won't be very helpful to test against--assuming they use the derived Debug output.

This comment has been minimized.

@stuhood

stuhood Mar 28, 2017

Member

True! But I think you're fixing that in your other review.

@stuhood

stuhood Mar 28, 2017

Member

True! But I think you're fixing that in your other review.

@stuhood stuhood merged commit 9f356a3 into pantsbuild:master Mar 29, 2017

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details

@stuhood stuhood deleted the twitter:stuhood/rules-as-decorators branch Mar 29, 2017

lenucksi added a commit to lenucksi/pants that referenced this pull request Apr 25, 2017

[engine] Rules as decorators (#4369)
### Problem

The "rule triple" / "task triple" syntax generally requires ~30 minutes worth of explanation for every new user. In particular, the triples _look_ like they are decoupled from individual functions, when they are generally defined 1-to-1.

### Solution

Switch the majority of rule declarations to a new `@rule` decorator which creates and annotates the function with a `TaskRule`. The decorator is usable in all cases where constraints are known statically... in about three cases (where constraints are computed dynamically: ie, types from the `SymbolTable`) it is necessary to explicitly create a `TaskRule`.

Additionally, removed `SelectLiteral` in favor of usage of `SingletonRule`s in all cases. When a `SingletonRule` is declared, callers can directly use `Select` to request the value for the singleton. Just like `SelectLiteral`, these continue to be an escape hatch for values which are not constructed from the ground up using files on disk (see #3941).

### Result

Fixes #4289; "task triples" are gone: long live the `@rule` decorator!
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment