Allow `@goal_rule`s to request `Targets` and demo with `./v2 binary run` #9345

Eric-Arellano · 2020-03-19T17:31:58Z

This adds rules to allow rules to request TargetWithOrigin, Targets, and TargetsWithOrigins, then hooks it up to ./v2 binary and ./v2 run.

Result: `./v2 binary run` works with custom target types

So long as those target types have the field PythonBinarySources and EntryPoint registered, we can now run ./v2 binary and ./v2 run on the targets.

Result: better error messages for invalid target types:

Before:

▶ ./v2 --no-print-exception-stacktrace run src/python/pants/util:strutil

ERROR: Type PythonTargetAdaptor is not a member of the BinaryTarget @union

After:

▶ ./v2 --no-print-exception-stacktrace run src/python/pants/util:strutil

ERROR: The run and binary goals only work with the following target types: ['python_binary']

You used src/python/pants/util:strutil with target type 'python_library'.

This improvement is made possible because of the Target API for two reasons:

Field-driven API. Rules now only say what fields they need and ignore all others.
We can access all registered target types by requesting the singleton RegisteredTargetTypes.

These two reasons allow us to iterate over every registered BinaryImplementation to determine which target types they work with, i.e. which target types have their required fields. We then flatten this into a single list of valid targets.

This error message works with custom target types, too!

▶ ./v2 --no-print-exception-stacktrace run src/python/pants/util:strutil

ERROR: The run and binary goals only work with the following target types: ['custom_python_binary', 'python_binary']

You used src/python/pants/util:strutil with target type 'python_library'.

Implementation: filtering targets in `@goal_rules`

An @goal_rule will typically request SourcesSnapshots, Addresses, AddressesWithOrigins, Targets, or TargetsWithOrigins. It hasn't yet filtered out which targets it actually cares about.

The downstream rules, meanwhile will already express which exact fields they need to operate through their rule signature, e.g. python_create_binary.py saying it needs PythonBinaryImplementation (previously PythonBinaryFields). The @goal_rule must filter out the relevant targets then pass those fields down to the rule.

--

Here, we use the same pattern used in fmt.py, lint.py, and test.py to do that filtering.

It no longer makes sense to have a union of target types, like UnionRule(BinaryTarget, PythonBinary). We no longer care about target types, but rather only the required fields for the rule to operate.

Instead, we use UnionRule(BinaryImplementation, PythonBinaryImplementation), where PythonBinaryImplementation is a subclass of BinaryImplementation that has the methods is_valid_target() and create(). For any given target, this allows us to check "are there any registered binary implementations that work with this target?" If so, we now have a standardized way of extracting the relevant Fields from that target by calling implementation.create(target) and using a union rule to call the appropriate downstream rule.

This is the equivalent of test.py having UnionRule(TestRunner, PytestRunner), lint.py having UnionRule(Linter, BanditLinter), and fmt.py having UnionRule(Formatter, IsortFormatter).

Eric-Arellano · 2020-03-19T17:33:40Z

src/python/pants/backend/python/rules/python_create_binary.py

-# TODO: consider replacing this with sugar like `SelectFields(EntryPoint, PythonBinarySources)` so
-#  that the rule would request that instead of this dataclass. Note that this syntax must support
-#  both optional_fields (see the below TODO) and opt-out `SentinelField`s
-#  (see https://github.com/pantsbuild/pants/pull/9316#issuecomment-600152573).


This TODO is superseded by a new TODO in binary.py. The abstraction has now moved one layer up.

Eric-Arellano · 2020-03-19T17:37:19Z

src/python/pants/backend/python/rules/python_create_binary.py

 @dataclass(frozen=True)
-class PythonBinaryFields:
+class PythonBinaryImplementation(BinaryImplementation):


Keep reading the PR, especially binary.py, for why I found BinaryImplementation to be much more natural than BinaryFields.

As mentioned in the PR description, another reason I think BinaryImplementation is better than BinaryFields is that it fits better with test.py, fmt.py, and lint.py, which use TestRunner, Formatter, and Linter, respectively, for this same construct.

Both this form and the prior seem like more boilerplate than needed. Could non Optional fields be used as the signal to an external tester / creator? Given a @dataclass, the meaning of Optional and a Target instance, it seems you could have one piece of code that knows if that @dataclass can be created from the Target and can then create an instance of the @dataclass if so.

Given a @DataClass, the meaning of Optional and a Target instance, it seems you could have one piece of code that knows if that @DataClass can be created from the Target and can then create an instance of the @DataClass if so.

I don't follow. Do you mean factoring this out into some abstraction that removes most the boilerplate here? Or do you mean a more fundamental redesign?

If the former, almost certainly, yes, we can reduce boilerplate by refactoring this. But, I'm intentionally punting on boilerplate reduction until we finish porting all of V2 Python to the Target API so that we better understand all the usages we need to handle.

OK. The change from PythonBinaryFields / PythonBinaryImplementation just looks like a non-obvious win and the refactor looks like the more obvious win just reading this PR. Happy to ride out the thrash to get more data.

The change from PythonBinaryFields / PythonBinaryImplementation just looks like a non-obvious win

My bad with a confusing PR comment at the start of this thread. There are two things going on with this PR in regards to PythonBinaryFields:

Rename PythonBinaryFields to PythonBinaryImplementation. This is solely a name change to reflect what ended up being far more natural in real-world usage, e.g. what sounded better in the error message in the PR description.

This is what the original comment in this thread was referring to.

Have PythonBinaryImplementation subclass BinaryImplementation, whereas previously it was a stand-alone dataclass.

This removes some boilerplate. We no longer need to define is_valid_target() and valid_target_types() because the superclass gives us those.

This all still works with MyPy. We could probably remove even more boilerplate by putting more logic into the superclass, e.g. create(), but MyPy wouldn't be able to know what each field is, anymore.

I think #1 is stable. I don't think we'll want to change the name again. (But, we can, need be).

#2 is very fluid. I expect for there to be much churn the next week as we encounter new real-world use cases.

Yeah - I think #2 can be completely ditched given a @dataclass, the meaning of Optional and a Target instance. Time will tell.

jsirois · 2020-03-19T18:05:01Z

src/python/pants/rules/core/binary.py

+            f"type {repr(target.alias)}). All registered binary implementations:\n\n"
+            f"{all_implementations}."
+        )
+    if len(valid_binary_implementations) > 1:


Does this pre-empt some less well displayed failure from somewhere else in the pipeline? If not it seems like an artificial restrction. It seems completely reasonable to have a python_binary in a BUILD file and 1 rule that uses a PythonPEXBinary to extract fields needed to build a PEX from that target and another rule that uses a PythonIPEXBinary to extract fields needed to build an iPEX from that target and have both rules run and produce two binaries for the python_binary input target.

This invariant was previously enforced because we had UnionRule(BinaryTarget, PythonBinaryTargetAdaptor) and it's impossible for a TargetAdaptor to be more than one specific TargetAdaptor type. It either is a PythonBinaryTargetAdaptor or it's a MyCustomPythonBinaryTargetAdaptor; it can't be both.

We had a rule that went from PythonBinaryTargetAdaptor -> CreatedBinary, so we knew when calling await Get[CreatedBinary](BinaryTarget, target_adaptor) that we could, by definition, only have one single rule that would go from PythonBinaryTargetAdaptor -> CreatedBinary.

Here, we're opening up the flood gates that now it's possible for multiple binary implementations for any given target.

--

Sometimes, we like that there may be multiple implementations. For example, with lint.py, we depend on that to be able to run, say, 4 linters on the exact same target:

pants/src/python/pants/rules/core/lint.py

Lines 100 to 105 in 72c4b93

results = await MultiGet(

Get[LintResult](Linter, linter((adaptor_with_origin,)))

for adaptor_with_origin in adaptors_with_origins

for linter in linters

if linter.is_valid_target(adaptor_with_origin)

)

In contrast, with ./v2 test --debug, we must only have one target, so we have this validation with --debug, but not normally:

pants/src/python/pants/rules/core/test.py

Lines 205 to 222 in 72c4b93

if options.values.debug:

target_with_origin = targets_with_origins.expect_single()

adaptor_with_origin = TargetAdaptorWithOrigin.create(

target_with_origin.target.adaptor, target_with_origin.origin

)

address = adaptor_with_origin.adaptor.address

valid_test_runners = [

test_runner

for test_runner in test_runners

if test_runner.is_valid_target(adaptor_with_origin)

]

if not valid_test_runners:

raise ValueError(f"No valid test runner for {address}.")

if len(valid_test_runners) > 1:

raise ValueError(

f"Multiple possible test runners for {address} "

f"({', '.join(test_runner.__name__ for test_runner in valid_test_runners)})."

)

--

Here, maybe we are okay with multiple binary implementations? I'm not sure if we'd want to run both Pex and IPex. Maybe, we'd instead want you to do something like ./v2 binary --implementation='ipex' to disambiguate (which is the current prototype for ./v2 repl). I'm not sure. (cc @gshuflin)

For this PR, I think we should punt on it and stick with the prior semantics. But rest assured that it is indeed possible to work with multiple binary implementations for a single target.

OK - sg for now.

It will be interesting to see whether the "filter targets and implementations, and limit to N" code that goals tend to need to write can be made generic (or perhaps even declarative). But walking the line of premature generalization is always challenging.

benjyw · 2020-03-19T18:26:33Z

The messaging improvement is great! Even greater would be something like:

The `run` goal requires one of the following target types: python_binary.
You provided `src/python/pants/util:strutil` which is of type python_library.

Just thinking about the end user here...

Eric-Arellano · 2020-03-19T18:28:22Z

Even greater would be something like:

I love it! I'll include that as part of this PR.

Eric-Arellano · 2020-03-19T19:44:13Z

src/python/pants/rules/core/binary.py

+        # TODO: this is a leaky abstraction that the error message knows this rule is being used
+        #  by `run` and `binary`. How should this be handled? A custom goal author could depend on
+        #  this error message too and the error would now be lying.


I don't know what to do about this TODO. We could rewrite this error message to remove the leak but still be meaningful, e.g. "Cannot create a binary for this target type...".

Although, that's not as helpful or direct as "The run goal only works with ...". It's an implementation detail that ./v2 run creates a binary to be able to then run that binary.

We could work around this by adding a new rule parameter with the error message so that call sites can customize the behavior. But, then ./v2 run couldn't leverage the work of ./v2 binary and vice versa.

Alternatively, we could somehow factor this coordinator_of_binaries rule out so that each call site can choose an error message that's meaningful to it? I'm not sure how trivial this would be.

--

Generally, I'd bias towards optimizing for the better error message and figuring out whatever gymnastics we need to support that great error message.

Could the active goal name or even the full parsed argv be installed as a root rule so that rules wanting to know about that data could just request it? If our invalidation no-opping cuts the mustard, this should be performant.

Yes, I believe at a minimum that we could have a singleton for all invoked goals. I don't think we could ask for the "active" goal because that requires accessing mutable global state.

I feel like it would be reasonably foolproof to add the context in run_goal_rules above any raised exception (when --no-print-exception-stacktrace)... it'd be on two lines, but:

Exception: While executing the `run` goal: # rendered by run_goal_rule This implementation can only be used with the following target types: # rendered by a rule

Passing down context is fine, but then you have to rely on rule implementers to actually use it.

I think I like that! It's probably only worth rendering that message when multiple goals were given, though. If you only run ./v2 run, that context wouldn't really help for anything.

I think that it would probably still help in that case. It adds context that differentiates this from pants "just being broken" in some way, before getting to the goal.

stuhood

Thanks!

stuhood · 2020-03-20T17:14:25Z

src/python/pants/rules/core/binary.py

+        # TODO: this is a leaky abstraction that the error message knows this rule is being used
+        #  by `run` and `binary`. How should this be handled? A custom goal author could depend on
+        #  this error message too and the error would now be lying.


I feel like it would be reasonably foolproof to add the context in run_goal_rules above any raised exception (when --no-print-exception-stacktrace)... it'd be on two lines, but:

Exception: While executing the `run` goal: # rendered by run_goal_rule This implementation can only be used with the following target types: # rendered by a rule

Passing down context is fine, but then you have to rely on rule implementers to actually use it.

stuhood · 2020-03-20T17:21:42Z

src/python/pants/rules/core/binary.py

+            f"type {repr(target.alias)}). All registered binary implementations:\n\n"
+            f"{all_implementations}."
+        )
+    if len(valid_binary_implementations) > 1:


It will be interesting to see whether the "filter targets and implementations, and limit to N" code that goals tend to need to write can be made generic (or perhaps even declarative). But walking the line of premature generalization is always challenging.

Eric-Arellano added 4 commits March 19, 2020 10:18

Add TargetWithOrigin, Targets, and TargetsWithOrigins

da81b2a

Rewrite binary and run to exclusively use the Target API

eea7ae3

Make the error message 10x better

0be5260

Fixup some naming

5b9d80f

Eric-Arellano requested review from stuhood, jsirois, benjyw, illicitonion, cosmicexplorer and gshuflin March 19, 2020 17:31

Eric-Arellano commented Mar 19, 2020

View reviewed changes

jsirois approved these changes Mar 19, 2020

View reviewed changes

Improve the error messages even more

954677e

Eric-Arellano commented Mar 19, 2020

View reviewed changes

Eric-Arellano added 3 commits March 19, 2020 12:48

Merge branch 'master' of github.com:pantsbuild/pants into targets

955f9cb

One more BUILD file fix :/

8bc0192

Merge branch 'master' of github.com:pantsbuild/pants into targets

2722102

Eric-Arellano merged commit 8a19737 into pantsbuild:master Mar 20, 2020

Eric-Arellano deleted the targets branch March 20, 2020 06:39

stuhood reviewed Mar 20, 2020

View reviewed changes

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Allow `@goal_rule`s to request `Targets` and demo with `./v2 binary run` #9345

Allow `@goal_rule`s to request `Targets` and demo with `./v2 binary run` #9345

Eric-Arellano commented Mar 19, 2020 •

edited

Eric-Arellano Mar 19, 2020

Eric-Arellano Mar 19, 2020

jsirois Mar 19, 2020 •

edited

Eric-Arellano Mar 19, 2020

jsirois Mar 19, 2020 •

edited

Eric-Arellano Mar 19, 2020

jsirois Mar 19, 2020

jsirois Mar 19, 2020

Eric-Arellano Mar 19, 2020

jsirois Mar 19, 2020

stuhood Mar 20, 2020

benjyw commented Mar 19, 2020

Eric-Arellano commented Mar 19, 2020

Eric-Arellano Mar 19, 2020

jsirois Mar 19, 2020 •

edited

Eric-Arellano Mar 19, 2020

stuhood Mar 20, 2020 •

edited

Eric-Arellano Mar 20, 2020

stuhood Mar 21, 2020

stuhood left a comment

stuhood Mar 20, 2020 •

edited

stuhood Mar 20, 2020

	results = await MultiGet(
	Get[LintResult](Linter, linter((adaptor_with_origin,)))
	for adaptor_with_origin in adaptors_with_origins
	for linter in linters
	if linter.is_valid_target(adaptor_with_origin)
	)

	if options.values.debug:
	target_with_origin = targets_with_origins.expect_single()
	adaptor_with_origin = TargetAdaptorWithOrigin.create(
	target_with_origin.target.adaptor, target_with_origin.origin
	)
	address = adaptor_with_origin.adaptor.address
	valid_test_runners = [
	test_runner
	for test_runner in test_runners
	if test_runner.is_valid_target(adaptor_with_origin)
	]
	if not valid_test_runners:
	raise ValueError(f"No valid test runner for {address}.")
	if len(valid_test_runners) > 1:
	raise ValueError(
	f"Multiple possible test runners for {address} "
	f"({', '.join(test_runner.__name__ for test_runner in valid_test_runners)})."
	)

Allow @goal_rules to request Targets and demo with ./v2 binary run #9345

Allow @goal_rules to request Targets and demo with ./v2 binary run #9345

Conversation

Eric-Arellano commented Mar 19, 2020 • edited

Result: ./v2 binary run works with custom target types

Result: better error messages for invalid target types:

Implementation: filtering targets in @goal_rules

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jsirois Mar 19, 2020 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

jsirois Mar 19, 2020 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

benjyw commented Mar 19, 2020

Eric-Arellano commented Mar 19, 2020

Choose a reason for hiding this comment

jsirois Mar 19, 2020 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

stuhood Mar 20, 2020 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

stuhood left a comment

Choose a reason for hiding this comment

stuhood Mar 20, 2020 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Allow `@goal_rule`s to request `Targets` and demo with `./v2 binary run` #9345

Allow `@goal_rule`s to request `Targets` and demo with `./v2 binary run` #9345

Eric-Arellano commented Mar 19, 2020 •

edited

Result: `./v2 binary run` works with custom target types

Implementation: filtering targets in `@goal_rules`

jsirois Mar 19, 2020 •

edited

jsirois Mar 19, 2020 •

edited

jsirois Mar 19, 2020 •

edited

stuhood Mar 20, 2020 •

edited

stuhood Mar 20, 2020 •

edited