Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change complete_platforms to use file paths rather than target addresses #15234

Open
Eric-Arellano opened this issue Apr 22, 2022 · 7 comments
Open
Assignees
Labels
backend: Python Python backend-related issues

Comments

@Eric-Arellano
Copy link
Contributor

A user was confused when using complete_platforms thinking they could use a file path.

I don't see the benefit of requiring a file target? It seems to me like unnecessary boilerplate, and requiring users to interact with some thing they often find confusing (targets).

If we change this to a file path, users can always still wrap that file in a file target for other uses.

@jsirois
Copy link
Contributor

jsirois commented Apr 22, 2022

Ignorance of "how we do things" and no push back in PR that added complete_platform support to pex_binary from you or Benjy. The pattern then just rolled forward to lambdas. I'd be happy to have less targets as you well know, so hack away!

@Eric-Arellano
Copy link
Contributor Author

Ignorance of "how we do things" and no push back in PR that added complete_platform support to pex_binary from you or Benjy.

Sounds good, no worries at all. My bad for not reviewing outside of a very cursory review a few days after you merged.

@Eric-Arellano Eric-Arellano changed the title Should complete_platforms be file paths rather than target addresses? Change complete_platforms to use file paths rather than target addresses Apr 22, 2022
@Eric-Arellano
Copy link
Contributor Author

Some thoughts on how to do this. The relevant rule:

@rule
async def digest_complete_platforms(
complete_platforms: PexCompletePlatformsField,
) -> CompletePlatforms:
original_file_targets = await Get(
Targets,
UnparsedAddressInputs,
complete_platforms.to_unparsed_address_inputs(),
)
original_files_sources = await MultiGet(
Get(
HydratedSources,
HydrateSourcesRequest(tgt.get(SourcesField), for_sources_types=(FileSourceField,)),
)
for tgt in original_file_targets
)
snapshot = await Get(
Snapshot, MergeDigests(sources.snapshot.digest for sources in original_files_sources)
)
return CompletePlatforms.from_snapshot(snapshot)

Naively, we can try to evaluate the list as both Get(Digest, PathGlobs) and target addresses...we simply resolve it all and merge the result with MergeDigests. To do that, we can set skip_invalid_addresses=True on the complete_platforms.to_unparsed_address_inputs() line, and then use the default of GlobMatchErrorBehavior.ignore for the PathGlobs.

The issue with that naive approach is we won't eagerly error on invalid entries. Pants tries hard to eagerly validate data so that we can raise likely issues in your code, e.g. typos.

So, the best idea I have so far is that we evaluate each element in the list one-at-a-time. Try to parse it as PathGlobs; if that fails, use await Get(MaybeAddress, AddressInput); and if both failed, then error. (That is less performant to evaluate one-at-a-time rather than batching, but realistically it shouldn't matter much and we can even parallelize it using MultiGet)

--

We need to decide if we want to deprecate using file targets. There are two downsides:

a) we break existing users, which @jriddy and @jsirois have been encouraging us to be more cautious #16547
b) we lose this nifty feature http_source, which lets you have a file target download a file. I could see that being useful to download platform files

So I bias towards keeping both file and also literal file paths. What do you think?

Either way, help will need to be updated.

@jsirois
Copy link
Contributor

jsirois commented Aug 17, 2022

Keeping file targets to support URL fetching will probably be useful, I think part of the complete platforms PEP will be a "database" of these out at some set of URLs. That said, the fact that a target is needed for URL fetching is itself archaic. Certainly ^https?:// is unambiguous and we could do better.

@huonw
Copy link
Contributor

huonw commented Oct 4, 2023

There are two downsides

I think there's an additional one: using a target means the platform can be codegen-ed, e.g. an adhoc_tool or shell_command that runs pex3 interpreter inspect --markers --tags within a docker image or similar.

(In this sort of case, it's likely often possible to run the build in the docker image directly with an environments (especially once they're stable), but running cross-builds natively can be much faster, so complete platforms are still useful to minimise much emulation is required. And, also if one is trying to build a multi-platform PEX.)

@huonw huonw added the backend: Python Python backend-related issues label Oct 4, 2023
@benjyw
Copy link
Sponsor Contributor

benjyw commented Oct 9, 2023

I don't think this is quite that simple - the PexCompletePlatformsField is a SpecialCasedDependencies field. There is code in various places that expects those to contain addresses.

@benjyw
Copy link
Sponsor Contributor

benjyw commented Oct 9, 2023

Maybe this would be better solved with a macro or something.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backend: Python Python backend-related issues
Projects
None yet
Development

No branches or pull requests

4 participants