
[internal] Python module mapping considers resolves #14034

Merged
merged 1 commit into pantsbuild:main from dependency-inference on Jan 7, 2022

Conversation

Eric-Arellano (Contributor):

Part of #13621.

For third-party dependencies, we now associate each python_requirement with the resolves it is compatible with, rather than having a single global "universe". Soon, first-party targets like python_source will only be able to infer deps from resolves they themselves use.

This does not yet change user behavior. We still need to add compatible_resolves to the python_source target and then refactor our PythonModuleOwners rule to pass in this information.

[ci skip-rust]
[ci skip-build-wheels]

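In data terms, the change looks roughly like this. A minimal sketch assuming a dict-shaped mapping; the names are illustrative, not the actual Pants internals:

# Before: one global "universe" of third-party providers.
# {module: provider_addresses}
old_mapping = {"colors": ("3rdparty/python:ansicolors",)}

# After: providers are grouped by the resolves their python_requirement
# target is marked compatible with.
# {resolve: {module: provider_addresses}}
new_mapping = {
    "a": {"colors": ("3rdparty/python:ansicolors",)},
    "b": {"colors": ("3rdparty/python:ansicolors",)},
}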
Comment on lines +209 to +214
If `resolves` is None, will not consider resolves, i.e. any `python_requirement` can be
consumed. Otherwise, providers can only come from `python_requirements` marked compatible
with those resolves.
"""
if resolves is None:
resolves = list(self.keys())
Eric-Arellano (Contributor, Author):
We'll continue to set resolves=None if --no-python-enable-resolves (the default).
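For context, a minimal sketch of how such a lookup could be structured, based on the snippet above; the class shape and the addresses are assumptions, not the exact Pants code:

class ThirdPartyPythonModuleMapping(dict):
    # Assumed shape: {resolve: {module: tuple_of_provider_addresses}}
    def providers_for_module(self, module, resolves=None):
        # resolves=None means "do not consider resolves": search every
        # resolve, i.e. the old single-universe behavior.
        if resolves is None:
            resolves = list(self.keys())
        return tuple(
            provider
            for resolve in resolves
            for provider in self.get(resolve, {}).get(module, ())
        )

mapping = ThirdPartyPythonModuleMapping(
    a={"colors": ("//:colors",)},
    b={"colors": ("//:colors",), "requests": ("//:requests",)},
)
assert mapping.providers_for_module("requests", resolves=["a"]) == ()
assert mapping.providers_for_module("requests", resolves=None) == ("//:requests",)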

stuhood (Member) left a comment:

Thanks!

@@ -276,7 +321,7 @@ async def map_module_to_address(
     third_party_mapping: ThirdPartyPythonModuleMapping,
 ) -> PythonModuleOwners:
     providers = [
-        *third_party_mapping.providers_for_module(module.module),
+        *third_party_mapping.providers_for_module(module.module, resolves=None),
stuhood (Member):

For the next change: rather than using all of the resolves from compatible_resolves for a target, this will probably need to be exactly one resolve chosen by the caller: that's related to the "choose how many permutations of parameters to use" problem of #13882... but in the meantime, I think that we'll have to choose arbitrarily (first in the list or something).
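A sketch of that interim "choose arbitrarily" idea; the helper name is hypothetical:

def choose_resolve(compatible_resolves):
    # Arbitrary but deterministic: take the first resolve in sorted order.
    return sorted(compatible_resolves)[0]

# A target with compatible_resolves=['a', 'b'] would, for now, only have
# its third-party deps matched against resolve 'a'.
assert choose_resolve(["a", "b"]) == "a"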

Eric-Arellano (Contributor, Author):

Interesting, I'm still trying to internalize what that means. Should we update JVM to do that?

stuhood (Member):

> Interesting, I'm still trying to internalize what that means. Should we update JVM to do that?

Me too, heh. And yeah, JVM should be updated as well once we have a better idea of what it means via #13882.

Eric-Arellano (Contributor, Author):

Oh, okay, I think an example will help me to understand this. Is this right?

# BUILD
python_requirement(
  name='colors',
  requirements=['ansicolors'],
  compatible_resolves=['a', 'b'],
)

python_requirement(
  name='tensorflow-v1',
  requirements=['tensorflow==1.0'],
  compatible_resolves=['a'],
)

python_requirement(
  name='tensorflow-v2',
  requirements=['tensorflow==2.0'],
  compatible_resolves=['b'],
)

Meaning resolves are:

  • a: [ansicolors, tensorflow-v1]
  • b: [ansicolors, tensorflow-v2]

We have f.py, where the TensorFlow import works with both 1.0 and 2.0, so we can mark the file as compatible with both resolves:

# f.py
import colors
from tensorflow import some_table_api

# BUILD
python_source(
  name="f",
  source="f.py",
  compatible_resolves=["a", "b"],
)

The dependency on ansicolors is fine; it's the same requirement in both resolves. But TensorFlow is different across the two: if we matched against every compatible resolve, there would still be ambiguity, even though the caller will be able to disambiguate by choosing either "a" or "b".
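A sketch of that ambiguity, reusing the hypothetical per-resolve mapping shape from above (the addresses are made up):

mapping_by_resolve = {
    "a": {"tensorflow": ("//:tensorflow-v1",)},
    "b": {"tensorflow": ("//:tensorflow-v2",)},
}

# f.py is compatible with both resolves; matching against every one of
# them finds two distinct owners for the same import.
owners = {
    address
    for resolve in ("a", "b")
    for address in mapping_by_resolve[resolve]["tensorflow"]
}
assert owners == {"//:tensorflow-v1", "//:tensorflow-v2"}
# Ambiguous -- yet scoping the lookup to a single resolve would give
# exactly one owner.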

stuhood (Member):

> But TensorFlow is different across the two: if we matched against every compatible resolve, there would still be ambiguity, even though the caller will be able to disambiguate by choosing either "a" or "b".

I believe that how this will need to work is that the resolve to use comes in as something baked into the Address of targets, such that f@compatible_resolve=a is a different target from f@compatible_resolve=b. How many of those there will be, and whether there are named collections of multiplexed arguments is an open question. At that point, we'd uniquely compute deps per target.

At a fundamental level, it's very similar to how you might do this with macros/target-generators... but my hope is that making the engine aware of the permutations makes it easier to configure the multiplexing and decide how many permutations to build locally vs in CI, etc.
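A toy model of that idea; the @compatible_resolve address syntax did not exist yet, so this just simulates one declared target expanding into per-resolve variants:

# Hypothetical: one declared target expands into one concrete target per
# compatible resolve, each with its own address.
declared = {"name": "f", "compatible_resolves": ["a", "b"]}

variants = [
    f"{declared['name']}@compatible_resolve={resolve}"
    for resolve in declared["compatible_resolves"]
]
assert variants == ["f@compatible_resolve=a", "f@compatible_resolve=b"]
# Two distinct targets, so deps can be computed uniquely for each.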

Eric-Arellano (Contributor, Author):

> such that f@compatible_resolve=a is a different target from f@compatible_resolve=b.

That makes me happy from a rule author perspective. That's easier to work with.

Although, this does introduce a problem with dependency inference for first-party targets, doesn't it? If there are two variants of f, then for any import of f it will be ambiguous which variant to use... unless! We update first-party dependency inference so you can only infer a dep on a target with your resolve? So if we have f1@{a,b} and f2@{a,b}, then f2@a infers a dep on f1@a and f2@b infers a dep on f1@b?
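A sketch of that resolve-scoped first-party inference; the (name, resolve) tuples stand in for the hypothetical parametrized addresses:

first_party_targets = [
    ("f1", "a"), ("f1", "b"),
    ("f2", "a"), ("f2", "b"),
]

def infer_dep(consumer, imported_name):
    # Only consider owners that share the consumer's resolve, so the two
    # variants of f1 never collide for any given consumer.
    _, resolve = consumer
    owners = [
        target for target in first_party_targets
        if target[0] == imported_name and target[1] == resolve
    ]
    assert len(owners) == 1, "still ambiguous"
    return owners[0]

assert infer_dep(("f2", "a"), "f1") == ("f1", "a")
assert infer_dep(("f2", "b"), "f1") == ("f1", "b")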

Eric-Arellano merged commit ac246b4 into pantsbuild:main on Jan 7, 2022
Eric-Arellano deleted the dependency-inference branch on January 7, 2022 at 00:20
Eric-Arellano added a commit that referenced this pull request Jan 27, 2022
Follow up to #14034. For `python_test` and `pex_binary`, we now infer dependencies on `python_requirement` targets from the same `resolve` (if `[python].enable_resolves = true`). This implements the feature sketched out at #13621.

Follow ups will apply this restriction to `python_source`, `python_aws_lambda`, and `python_google_cloud_function`. I have no idea how this should work with `python_distribution` though...

Note that you can only set one `resolve`, even though `python_source` for now has the `compatible_resolves` field to let you have >1. See discussion at #14034 (comment). 

[ci skip-rust]
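For illustration, the restriction might look like this in a BUILD file; the resolve name and targets here are made up:

# BUILD
python_requirement(
    name="colors",
    requirements=["ansicolors"],
    compatible_resolves=["a"],
)

python_test(
    name="tests",
    source="tests.py",
    # Deps on python_requirement targets are now only inferred from
    # this target's own resolve.
    resolve="a",
)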