Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Allow specifying external tool url template #11013

Conversation

thamenato
Copy link
Member

@thamenato thamenato commented Oct 22, 2020

Problem

An org's proxy would not allow external access to Github's URL when pants tries to download the pex binary. They can only download from their blessed host.

Solution

Allow the user to pass a new URL, using template language, that contains the binary

Result

The user can set a new url by using the property url_template and if the binary has a different version based on platform one can also set that using url_platform_mapping dictionary.

[ci skip-rust]
[ci skip-build-wheels]

# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
Copy link
Contributor

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Thales!

Other reviewers, see https://pantsbuild.slack.com/archives/C046T6T9U/p1603294293445900 for more context:

So… I can’t access any external network (aka internet 😕 ) from my CI

Thales's org requires hosting the pex binary themselves and downloading from their URL.

src/python/pants/core/util_rules/external_tool.py Outdated Show resolved Hide resolved
src/python/pants/core/util_rules/external_tool.py Outdated Show resolved Hide resolved
default=cls.default_url_platform_mapping,
advanced=True,
help=(
"A dictionary mapping platforms to strings to be used when generating the URL "
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Other reviewers, this is the best that Thales and I could come up with, but please feel free to tweak. It's dense.

"""
platform = self.url_platform_mapping[plat.value] if self.url_platform_mapping else ""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unclear to me how eagerly we should validate the --url-platform-mapping, e.g. ensure it aligns with the Platform enum.

Copy link
Sponsor Contributor

@benjyw benjyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is great stuff! I might suggest a slight refactoring: I would leave the base class as-is, and have a subclass called TemplatedExternalTool or something like that, which implements generate_url() via this mechanism.

That way, subclasses that need logic that can't be easily expressed using this templating can still do so, by overriding generate_url() directly.

@tdyas
Copy link
Contributor

tdyas commented Oct 22, 2020

An alternative would be a single option that maps platform to a URL template (and then substitute version). That would reduce the number of options and no need to map platform names to a different name and then substitute. Thoughts?

@Eric-Arellano
Copy link
Contributor

Eric-Arellano commented Oct 22, 2020

subclasses that need logic that can't be easily expressed using this templating can still do so, by overriding generate_url() directly.

Hm, can you think of an example that needs more flexibility? My concern is that this might mean for that tool, an org like Thale's could never use the tool. They need the ability to specify the entire URL purely through the options system. We don't want to make them implement a Python plugin etc.

An alternative would be a single option that maps platform to a URL template (and then substitute version).

The challenge is that each platform mapping differs depending on the tool. Some use apple-darwin in the URL, some use osx, some use macOS, etc.

Edit: nvm, I misread your suggestion. I think that would work, with the downside that if platforms don't behave differently, you need to duplicate the URL twice.

@tdyas
Copy link
Contributor

tdyas commented Oct 22, 2020

The challenge is that each platform mapping differs depending on the tool. Some use apple-darwin in the URL, some use osx, some use macOS, etc.

Edit: nvm, I misread your suggestion. I think that would work, with the downside that if platforms don't behave differently, you need to duplicate the URL twice.

Or support a "default" key in the mapping that is used if a platform-specific key is not present.

@Eric-Arellano
Copy link
Contributor

Or support a "default" key in the mapping that is used if a platform-specific key is not present.

So you can either do:

{"default": "https://.../{version}/foo.zip"}

Or

{"darwin": "https://.../{version}/apple-darwin_foo.zip", "linux": "https://.../{version}/unknown-linux_foo.zip"}

That might make sense. Although, I don't love needing to duplicate the rest of the URL, as often it's only a small portion that changes. Right now, we only have two platforms, but we know from #10620 that that is not robust and we need more (as soon as ARM macOS laptops come out...)

@Eric-Arellano
Copy link
Contributor

Ah, Thales, you'll need to update the gRPC subsystem too: https://github.com/pantsbuild/pants/blob/master/src/python/pants/backend/codegen/protobuf/python/grpc_python_plugin.py. Looks like CI is failing because of that.

@Eric-Arellano Eric-Arellano mentioned this pull request Oct 22, 2020
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
@thamenato
Copy link
Member Author

thamenato commented Oct 22, 2020

Although, I don't love needing to duplicate the rest of the URL, as often it's only a small portion that changes.

@Eric-Arellano this is super hacky but...

_url_template = "http://.../{version}/{platform}/tool.zip"
_default_url_mapping = {
   "darwin": _url_template.format(version="{version}", platform="macos"),
   "linux": _url_template.format(version="{version}", platform="linux"),
}

that actually works.. lol

EDIT: Oh, I just thought about users adding it to the pants.toml.. which needs to be string... nvm!

@coveralls
Copy link

coveralls commented Oct 22, 2020

Coverage Status

Coverage remained the same at 0.0% when pulling 4a1d890 on thamenato:feature/url-template-to-external-tools into 8da1680 on pantsbuild:master.

@benjyw
Copy link
Sponsor Contributor

benjyw commented Oct 22, 2020

I can imagine needing to do more specific inspection than just "is this macos vs. linux". E.g., using network proximity to make decisions. Even if we don't use that internally, an external user might need it. In fact someone might already be using it and we just broke the API...

I would rather do the less present-breaking, more future-friendly thing.

@Eric-Arellano
Copy link
Contributor

Hm, if we do that, then the tool will not work for any user who needs to download the tool through their own blessed host. Again, they need to be able to change the URL purely through the options system.

How do you feel about updating this in the future if we do find the plugin author has that advanced requirement? We can identify either if we can tweak this templating mini-DSL, or allow them to ignore it and not support custom URLs. I'm concerned that this would otherwise be premature abstraction - we're speculating that someone needs that additional power.

In the meantime, plugin authors can override the private method _generate_url() to be whatever they want. There is an escape valve, only, private, because we want people to bias towards allowing for changing the URL via options.

@thamenato
Copy link
Member Author

If the worry is about backwards compatibility I think that adding back the generate_url() that simply calls the private _generate_url() and add a deprecation Warning would help users to keep using their current version and have time to adjust.

Now for the TemplatedExternalTools I sort of don't see the purpose considering that I can easily change the url template and mappings directly from options at this moment.

I'm biased about this but if we go this new class route I would have to re-write the Pex tool (I'm assuming that would work) using that new TemplatedExternalTools in order for pants to work on my CI, which from a user perspective kinda feels like it should just be already there available for an easy change.

@benjyw
Copy link
Sponsor Contributor

benjyw commented Oct 22, 2020

Well, I definitely don't want anyone to override _generate_url(), or any private method, ever. That is a nightmare...

I'm not sure what you mean by having to rewrite something to use TemplatedExternalTools? You would just subclass TemplatedExternalTools instead of ExternalTools, to get this template-based implementation. You already have to rewrite the subclasses to provide the templates, so making them extend a different class doesn't seem like an onerous change?

I see the point about this possibly being premature, BUT - the issue with deeming this a premature abstraction is the cost of introducing it if we needed to. How would we do that exactly? This change bakes the templating behavior into the base class, which is the interface to this functionality, and leaves no straightforward possibility for extension or differing implementation. So this is prematurely anti-abstracting...

I'm suggesting that ExternalTool is ~an interface, and TemplatedExternalTools would be its (currently only) implementation. If this were Scala and we had literally implemented it as a trait, this wouldn't be controversial. Python, unfortunately, muddies the waters... But note that generate_url() is an @abstractmethod, which helps imply this.

@Eric-Arellano
Copy link
Contributor

Hm, I could buy that. A lingering concern of mine is how to document both the interface and this new subclass. How would you feel about documenting? Would we prefer one vs the other? I don't want to robustly document both and make the user decide which they want.

@thamenato
Copy link
Member Author

I'm suggesting that ExternalTool is ~an interface, and TemplatedExternalTools would be its (currently only) implementation.

Just to be clear that I understood what you meant - As a Plugin developer I can either choose the current TemplatedExternalTools implementation (and if needed tweak the functions I need) OR I can come up with my own MyClassExternalTools based on ExternalTools and implement the functionalities.

ExternalTools wouldn't be used directly anymore.

@Eric-Arellano
Copy link
Contributor

As a Plugin developer I can either choose the current TemplatedExternalTools implementation (and if needed tweak the functions I need) OR I can come up with my own MyClassExternalTools based on ExternalTools and implement the functionalities.

Yeah, I think it would look like this:

  • Keep ExternalTool identical to what it was before this PR, including generate_url() having @abstractmethod.
  • Add TemplatedExternalTool (or another name), which subclasses ExternalTool and:
    • Implements register_options() to add the --version-template and --version-platform-mapping options.
    • Uses default_version_template and default_version_platform_mapping like you have it.
    • Implements generate_url like your _generate_url() does now. Mark this with @final from typing_extensions, because subclasses probably shouldn't mess with the implementation.
  • The 4 call sites you added will now subclass TemplatedExternalTool, and will otherwise keep the changes you made in this PR.

A benefit of this approach is that we don't break anything for plugin authors already using ExternalTool. You can ignore this whole --version-template feature. In reality, most plugin authors are writing plugins for their own codebases, so they don't need to worry about a customizable URL - they can just hardcode the value they want. This mechanism is only useful if you want your plugin to be consumed externally, in which case, you can use TemplatedExternalTool.

I think this is a good change.

@benjyw
Copy link
Sponsor Contributor

benjyw commented Oct 22, 2020

I'm fine with only documenting TemplatedExternalTools for now.

Also, @thamenato has just discovered that our use of @abstractmethod is bogus, because we're not extending abc.ABC or setting metaclass=ABCMeta

# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
# Rust tests and lints will be skipped. Delete if not intended.
[ci skip-rust]

# Building wheels and fs_util will be skipped. Delete if not intended.
[ci skip-build-wheels]
Copy link
Contributor

@Eric-Arellano Eric-Arellano left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for iterating on this, Thales!

@Eric-Arellano Eric-Arellano merged commit 28d5036 into pantsbuild:master Oct 22, 2020
Eric-Arellano pushed a commit to Eric-Arellano/pants that referenced this pull request Oct 22, 2020
…ild#11013)

### Problem

An org's proxy would not allow external access to Github's URL when `pants` tries to download the `pex` binary. They can only download from their blessed host.

### Solution

Allow the user to pass a new URL, using template language, that contains the binary 

### Result

The user can set a new url by using the property `url_template` and if the binary has a different version based on platform one can also set that using `url_platform_mapping` dictionary.

[ci skip-rust]
[ci skip-build-wheels]
Eric-Arellano added a commit that referenced this pull request Oct 22, 2020
Copy link
Sponsor Contributor

@benjyw benjyw left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants