Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

@union / UnionRule for letting the engine figure out paths to products not known in advance #7116

Merged
merged 19 commits into from Mar 6, 2019

Conversation

Projects
None yet
3 participants
@cosmicexplorer
Copy link
Contributor

commented Jan 20, 2019

Problem

Resolves #4535.

It's currently difficult for rules to tell the engine "give me an instance of y from this x according to the registered rules in the rule graph" if x is not a specific type known in advance (see #4535 for use cases for this). The alternative is that upstream rules have to use conditional logic outside of the engine, which is very difficult to write in a way that combines with arbitrary rulesets from downstream plugins, and code written to get around this can be error-prone. This is sad because the rule graph is generated by combining rules in pants core as well as in plugins, so all the necessary information is there, we just can't make use of it.

This PR introduces the @union decorator and UnionRule type to allow for static resolution of "union" types in yield Get(Product, UnionType, subject) statements as per this comment in #4535 and below.

Solution

Python

  • Add a subject_declared_type field to pants.engine.selectors.Get instead of inferring it from the subject type (the 2-argument form of Get is still supported).
  • Introduce Get.create_statically_from_rule_graph() classmethod to make it clear when the Get subject is populated or not.
  • Introduce @union and UnionRule to describe union types which can be requested with a yield Get(Product, UnionType, subject) in rule bodies (as per this comment on #4535).
    • Note that @union classes are not registered in def rules(): ... -- this distinction seems to make sense as union classes are never instantiated.
  • Create a really simple union_rules dict field in RuleIndex which registers @union_rule() decorators as a map from union base type -> [union members].
  • Propagate the union_rules dict to the scheduler, and when adding Get edges (in _register_task() in scheduler.py), check if the subject_declared_type is a union base, and if so add edges to all union members.
  • Create a HydrateableField union base which is used to resolve (for now) either SourcesField or BundlesField in a yield Get().

Result

Users can now dynamically add union members in plugins and backends to be processed by upstream rules using yield Get(...) which don't know anything about them, and with a static universe of known union members which the engine uses uses to type-check the subject of a yield Get(...) at rule execution time.

@cosmicexplorer cosmicexplorer force-pushed the cosmicexplorer:union-rules-idk branch from c57bfc7 to 166beac Jan 20, 2019

@cosmicexplorer cosmicexplorer requested a review from stuhood Jan 20, 2019

@cosmicexplorer cosmicexplorer changed the title [WIP] @union_rule (or something along those lines) @union_rule (or something along those lines) Jan 20, 2019

@cosmicexplorer cosmicexplorer requested a review from illicitonion Jan 20, 2019

@cosmicexplorer

This comment has been minimized.

Copy link
Contributor Author

commented Jan 21, 2019

Fixed the unit test errors by moving Get typechecking to scheduler.py in 1dd0377 because we do that during rule execution now, and also added testing for Get.extract_constraints().

./pants --v2 --no-v1 list :: fails (see travis) which I will fix momentarily.

@cosmicexplorer cosmicexplorer force-pushed the cosmicexplorer:union-rules-idk branch from 1dd0377 to 78e5c31 Jan 21, 2019

@cosmicexplorer

This comment has been minimized.

Copy link
Contributor Author

commented Jan 21, 2019

./pants --v2 --no-v1 list :: fails (see travis) which I will fix momentarily.

Fixed and added some argument checking in 78e5c31 to overcome the otherwise-ambiguous reasoning about what means what in Get() with 2 vs 3 args.

@stuhood
Copy link
Member

left a comment

Thanks! This looks neat. I think it's possible to make it simpler, and it would be good to see some actual usage within this PR.

@@ -57,6 +60,65 @@ def transitive_coroutine_rule(c):
yield D(b)


@union

This comment has been minimized.

Copy link
@stuhood

stuhood Jan 28, 2019

Member

Even seeing it sketched out, it's not quite clear what real world usage will look like. Would you mind updating the BundleField/SourceField usecase

class HydratedField(datatype(['name', 'value'])):
"""A wrapper for a fully constructed replacement kwarg for a HydratedTarget."""
@rule(HydratedTarget, [Select(TargetAdaptorContainer)])
def hydrate_target(target_adaptor_container):
target_adaptor = target_adaptor_container.value
"""Construct a HydratedTarget from a TargetAdaptor and hydrated versions of its adapted fields."""
# Hydrate the fields of the adaptor and re-construct it.
hydrated_fields = yield [(Get(HydratedField, BundlesField, fa)
if type(fa) is BundlesField
else Get(HydratedField, SourcesField, fa))
for fa in target_adaptor.field_adaptors]
kwargs = target_adaptor.kwargs()
for field in hydrated_fields:
kwargs[field.name] = field.value
yield HydratedTarget(target_adaptor.address,
TargetAdaptor(**kwargs),
tuple(target_adaptor.dependencies))
here, to see how that informs the rest of the PR?

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Mar 3, 2019

Author Contributor

Doing this part first after rebasing!

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Mar 3, 2019

Author Contributor

Did this and it seemed to mysteriously work? It essentially just let us remove the if in this comprehension you've highlighted.

This comment has been minimized.

Copy link
@stuhood

stuhood Mar 3, 2019

Member

Looks great!

One possibility (not sure whether it is an improvement though...) would be to do a function call rather than a decorator for union members.

That might look like:

@union
class HydrateableField(object): pass

class SourcesField(...): ...

def rules():
  return [
      union_rule(HydrateableField, SourcesField)
    ]

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Mar 3, 2019

Author Contributor

There ended up being a ridiculous amount of confusion when trying to support use of @union_rule as a decorator and as a function call, so I instead converted it to a datatype UnionRule like we do with e.g. RootRule, without any decorators, which makes much more sense anyway. I just didn't want to drop the possibility of it being a decorator too early.

@@ -706,9 +716,26 @@ def func(constraint):
return Function(self.context.to_key(constraint))
def tc(constraint):
return TypeConstraint(self.context.to_key(constraint))
def flatten_type_map_to_constraints(type_constraints_map):

This comment has been minimized.

Copy link
@stuhood

stuhood Jan 28, 2019

Member

I think that part of the elegance of the implementation you suggested is that it shouldn't be necessary for the native engine to know about the union (yet... we could decide to do that later)...

If at RuleIndex (or perhaps Scheduler?) creation time you:

# expand a Get for a Union type...
Get(TypeX, Union)

# ...into Gets for all of its members:
Get(TypeX, UnionMember1), Get(TypeX, UnionMember2), .. #etc

... then there is no need for the engine itself to know about the existence of the union.

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Mar 3, 2019

Author Contributor

This is really good.

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Mar 3, 2019

Author Contributor

I'm actually thinking now that we don't need to do any expansion, but rather just check all Gets for union types (I believe we can/should do this at RuleIndex creation time) are valid according to the registered union rules (not sure if this was your intended meaning). This gives us the guarantee you've described below of knowing that Gets are valid statically, which allows the engine to just generate a Get from the runtime type as usual. And that should allow us to remove any knowledge of unions from the engine.

This comment has been minimized.

Copy link
@stuhood

stuhood Mar 3, 2019

Member

I'm actually thinking now that we don't need to do any expansion, but rather just check all Gets for union types are valid according to the registered union rules

Not quite: I think all I was referring to was replacing each union-Get with all union-member-Gets. Assuming you do that, the RuleGraph handles all validation for you.

I believe we can/should do this at RuleIndex creation time

I don't know if we can assume that a RuleIndex always represents a closed world (right now). But we do know that we have a closed world at Scheduler creation time... so at Scheduler creation time you could rewrite the contents of the RuleIndex to replace a union-Get with union-member-Gets?

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Mar 3, 2019

Author Contributor

After diving into the implementation of removing all the union knowledge from the engine, I think that sounds correct (since that's where we're already linking up the Gets for unions in this PR anyway), and actually may already be done.

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Mar 3, 2019

Author Contributor

It was! It seems like all the engine work was purely to provide a more specific error message on failing to resolve a union somehow. I did also convert a couple panic!s into throw()s, which enabled the technique you describe in another comment (just making a Get and letting the engine error out if there's no path).

interns.insert(response.values.unwrap_one()),
)))
let product = TypeConstraint(interns.insert(response.products.unwrap_one()));
let subject_declared_type =

This comment has been minimized.

Copy link
@stuhood

stuhood Jan 28, 2019

Member

Hm. So I guess this is because at runtime (rather than at @rule parse time), we'll see a Get with a declared type that might be a union, and then we might want to know the members of the union in order to validate that the given value is a member of that union?

It feels like not actually checking the declared type here, and just using exactly the type of the subject value would be fine. The result would still be the same as if there had been multiple literal Gets in the body of the @rule... and by this point we would already have validated that those are correct in the rule graph.

AFAICT, that would mean this method would not need to change at all?

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Mar 3, 2019

Author Contributor

If I'm parsing your comment correctly, after expanding the Gets to include all the registered constituent union members as you described in another comment, this method would indeed not need to change.

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Mar 3, 2019

Author Contributor

I have implemented this, all I had to do was make gen_get() return an error instead of panicking!

@cosmicexplorer cosmicexplorer force-pushed the cosmicexplorer:union-rules-idk branch from ac37919 to c145847 Mar 3, 2019

@cosmicexplorer cosmicexplorer changed the title @union_rule (or something along those lines) @union / UnionRule for letting the engine figure out paths to products not known in advance Mar 3, 2019

@stuhood

stuhood approved these changes Mar 3, 2019

Copy link
Member

left a comment

Thanks! This looks really great.


def add_type_transition_rule(union_rule):
"""
NB: This does not require that union bases be supplied to `def rules():`! not sure if that's

This comment has been minimized.

Copy link
@stuhood

stuhood Mar 3, 2019

Member

Totally fine with that, for the same reason you put in the description.

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Mar 3, 2019

Author Contributor

Have moved this to a comment noting that this is because the union type is never instantiated!

@contextmanager
def _assert_execution_error(self, expected_msg):
# TODO(#7303): use self.assertRaisesWithMessageContaining()!
with self.assertRaises(ExecutionError) as cm:

This comment has been minimized.

Copy link
@stuhood

stuhood Mar 3, 2019

Member

Could this be self.assertRaisesRegexp?

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Mar 3, 2019

Author Contributor

It might be, but I'm realizing now that this probably would be difficult to convert to self.assertRaisesWithMessageContaining() anyway because of the way it preprocesses the exception message. That could be converted to a regexp match (which I tried to do at one point), but I think stripping the locations from the traceback is much more robust and was just going to remove this comment, actually.

hydrated_fields = yield [(Get(HydratedField, BundlesField, fa)
if type(fa) is BundlesField
else Get(HydratedField, SourcesField, fa))
hydrated_fields = yield [Get(HydratedField, HydrateableField, fa)

This comment has been minimized.

Copy link
@stuhood

stuhood Mar 3, 2019

Member

Awesome.

@cosmicexplorer

This comment has been minimized.

Copy link
Contributor Author

commented Mar 3, 2019

Closed #7117 since the engine doesn't know about union rules anymore!

cosmicexplorer added some commits Jan 20, 2019

test the Get typechecking which we now do during rule execution
also test Get.extract_constraints()!

cosmicexplorer added some commits Mar 3, 2019

@cosmicexplorer cosmicexplorer force-pushed the cosmicexplorer:union-rules-idk branch from a247b85 to 93e2f4e Mar 3, 2019

@illicitonion
Copy link
Contributor

left a comment

Woo!

elif hasattr(entry, '__call__'):
rule = getattr(entry, 'rule', None)
if rule is None:
raise TypeError("Expected callable {} to be decorated with @rule.".format(entry))
add_rule(rule)
else:
# TODO: update this message!

This comment has been minimized.

Copy link
@illicitonion

illicitonion Mar 5, 2019

Contributor

Either do this, or flesh out the TODO explaining what update should happen.

This comment has been minimized.

Copy link
@cosmicexplorer

cosmicexplorer Mar 5, 2019

Author Contributor

Oops! Will do!

cosmicexplorer added some commits Mar 5, 2019

@stuhood stuhood merged commit e8ccd90 into pantsbuild:master Mar 6, 2019

1 check passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.