Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: get rid of UnexpandedTargets #13086

Closed
Eric-Arellano opened this issue Oct 2, 2021 · 3 comments
Closed

Proposal: get rid of UnexpandedTargets #13086

Eric-Arellano opened this issue Oct 2, 2021 · 3 comments

Comments

@Eric-Arellano
Copy link
Contributor

(Relates to the proposal for address-less target generators. This proposal makes that one even more compelling, but they are independent.)

Status quo

#10511 started treating target generators (aka "BUILD targets") as aliases in most situations. We replace the target generator with all of its generated targets, including here:

  1. CLI specs, ./pants test project:tests is shorthand for ./pants test project:tests#f1_test.ext project:tests#f2_test.ext etc
  2. dependencies field, depending on the target generator results in depending on everything it generates

But we don't always do this replacement, notably with project introspection (#12952) (#12951). ./pants list :: and ./pants peek :: will not output generated targets, which is limiting and violates the conceptual space that what Pants really cares about are atom targets like python_source and python_test (target generators are only for boilerplate reduction).

It also continues to be confusing to rule authors whether to use UnexpandedTargets vs. Targets, such as me accidentally breaking our Go code today by using Targets because it resulted in depending on any _go_external_package pulling in a dep on every one of them generated from the same go_mod target.

Proposal

Get rid of UnexpandedTargets and always have Targets. Remove the replacement/alias mechanism.

Impact to dependencies

For most contexts, there will be no impact. Target generators will still depend on their generated targets, so when you depend on a target generator, you transitively depend on everything you generate.

This proposal only impacts contexts that solely look at direct dependencies, which are currently:

  1. ./pants dependencies --no-transitive. That's a good change, the output becomes more predictable.
  2. Pylint. Would break explicit deps on python_library and python_tests targets - you'd have to switch to deps on the generated files.
    • To fix this, we could have Pylint start using transitive targets - or add special casing to expand deps on python_library, python_tests, and target .

If we have other cases like Pylint that only need direct deps, we can likewise choose to use transitive targets or add special casing to expand some targets.

Impact to CLI specs

Build-oriented goals like test, lint, and check will no longer operate when using a literal address spec on a target generator, like project:lib. Instead, they'll warn that the target generator cannot be operated on, and suggest other target types, e.g. use python_source instead of python_sources.

This is mitigated by our other CLI specs—you can use dir:, dir::, and filesystem specs to glob over all the generated targets. (Directory specs would make this even better #12286)

Project introspection goals will behave much more consistently:

$ ./pants list project:lib
project:lib
$ ./pants list project:
project:lib
project:lib#f1.ext
project:lib#f2.ext

./pants peek will give back the target you asked for, rather than replacing it with the generators. This is crucial because target generators might have different fields than the generated target, and users should be able to programmatically query the generator.

@stuhood
Copy link
Sponsor Member

stuhood commented Oct 3, 2021

Thanks a lot for writing this up: it's an important issue.

For most contexts, there will be no impact. Target generators will still depend on their generated targets, so when you depend on a target generator, you transitively depend on everything you generate.

This proposal only impacts contexts that solely look at direct dependencies, which are currently:

  1. ...
  2. ...

If we have other cases like Pylint that only need direct deps, we can likewise choose to use transitive targets or add special casing to expand some targets.

This also impacts compiled languages: both Go and Java will only want to look at direct dependencies (for Scala it will likely be a toggle to use direct/transitive). Also, we had discussed the possibility that MyPy might be able to operate on only direct dependencies: I'm curious whether it can.

As you say: @rules can implement their own alias expansion: #13087 does for CoarsenedTargets, for example (by walking through targets with no sources). But this was an annoyingly common bit of boilerplate in v1: you would see "if isinstanceof(target, Alias): .." and recursive expansion peppered around: it's a nice thing to have in the core, rather than scattered everywhere.

In general, I think that "Targets with aliases expanded" and "Targets without aliases expanded" are two reasonable states... there are cases that will want to operate on aliases, and that seems fine. Perhaps this would be less objectionable as a boolean flag on Specs, DependenciesRequest, CoarsenedTargets (i.e. expand_aliases, with a default of True) which declared whether you wanted to be able to observe aliases?

Impact to CLI specs

Build-oriented goals like test, lint, and check will no longer operate when using a literal address spec on a target generator, like project:lib. Instead, they'll warn that the target generator cannot be operated on, and suggest other target types, e.g. use python_source instead of python_sources.

This is mitigated by our other CLI specs—you can use dir:, dir::, and filesystem specs to glob over all the generated targets. (Directory specs would make this even better #12286)

Removing the ability to actually specify an alias on the CLI seems unnecessary. The vast majority of the time rule authors would want alias expansion (as demonstrated by ~all goals other than graph introspection currently using it): removing expansion seems like it disadvantages these goals in favor of graph introspection... when, really, it seems like we should be able to satisfy both.

Project introspection goals will behave much more consistently:

$ ./pants list project:lib
project:lib
$ ./pants list project:
project:lib
project:lib#f1.ext
project:lib#f2.ext

./pants peek will give back the target you asked for, rather than replacing it with the generators. This is crucial because target generators might have different fields than the generated target, and users should be able to programmatically query the generator.

This part makes sense, but as far as I can tell it has nothing to do with removing UnexpandedTargets... this is just changing the behavior of Specs expansion to include both the alias and the targets behind it. And that seems like a very logical thing to do, independent of the rest of the proposal.

I continue to wonder whether only adjusting the Specs expansion is sufficient: basically, graph introspection goals would continue to request UnexpandedTargets (and/or use a boolean flag if we think that makes sense) to preserve the aliases in the result, BUT we would change the semantics of SiblingAddresses and DescendantAddresses to always include BOTH the alias/generator and the generated targets (which makes sense: they exist within the directory). The effect would be:

  1. await Get(UnexpandedTargets, Specs(SiblingAddresses(..))): new: includes BOTH the generator and the generated, as in your example
  2. await Get(Targets, Specs(SiblingAddresses(..)): as now, includes only the generated targets, since the aliases will have been expanded and deduped with the generated
  3. await Get(UnexpandedTargets, Specs(SingleAddress(..)): as now, only the alias (as in your example)

In short, I think that it's fairly likely that just adjusting globby Specs to expand aliases (and actually match everything that lives in those locations) is sufficient to resolve this.

But I'm curious to better understand the Go case you encountered, and whether there is a useful way to resolve it. In #13087 at least, I think that I realized that having a flag on the request to include/exclude aliases (i.e. expand_aliases=True) works more generally across Targets/CoarsenedTargets/etc than having a different return type does.

@Eric-Arellano
Copy link
Contributor Author

This also impacts compiled languages: both Go and Java will only want to look at direct dependencies (for Scala it will likely be a toggle to use direct/transitive).

Looks like direct dependencies also impact archive(files=) and relocated_files(files=) targets, which only get the literal addresses specified. archive is using Get(Targets, UnparsedAddressInputs), which replaces the files target generator with the generated file targets. So, even after an upcoming PR to split generator target vs. atom target, the right thing happens. We do want alias replacement there.

I'm starting to warm up to your feedback that we may indeed want to keep alias expansion and that callers determine which semantics are used - albeit in a revamped way.

Also, we had discussed the possibility that MyPy might be able to operate on only direct dependencies: I'm curious whether it can.

Hm, I don't recall that? MyPy must have the transitive closure in the chroot to work, so I don't think it's possible.

@Eric-Arellano
Copy link
Contributor Author

Okay, yeah. Closing this particular proposal. It raises some important discussion like Stu's suggestion about Specs, but this proposal of removing the alias mechanism wholesale is likely a bad idea. It's very plausible we'll want to reform it, but not remove the notion of aliases entirely.

For me, the main reason I'm convinced is the handling of direct dependencies.

Eric-Arellano added a commit that referenced this issue Oct 9, 2021
… target (#13190)

Part of #12954 and the proposal at https://docs.google.com/document/d/1HpJn2jTWf5sKob6zhe4SqHZ7KWBlX4a6OJOwnuw-IHo/edit.

End functionality is the same, but we're now safe in followups to expose `files` vs. `file` targets, where the former has an `overrides` field.

This brings to light an issue explored in #13086: when looking at direct dependencies, should you replace the target generator with its generated targets? Here, we need to for the `archive(files=`) and `relocated_files(files_targets=)` fields to work when you put an address for a `files` target rather than a `file` target. As the author of those two targets, I did not realize this nuance, even though we had file targets in Pants 2.0+ and this dynamic still existed. This is subtle and awkward, but I think valuable to bring to light something we need to figure out vs. hiding it.  

[ci skip-rust]
[ci skip-build-wheels]
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants