Fix webhook API to also support URLs #54889

Merged: 2 commits, Nov 12, 2017

Member

@lavalamp lavalamp commented Oct 31, 2017

ref: kubernetes/enhancements#492

The dynamic admission webhook now supports a URL in addition to a service reference, to accommodate out-of-cluster webhooks.

@k8s-ci-robot k8s-ci-robot added do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. size/M Denotes a PR that changes 30-99 lines, ignoring generated files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. labels Oct 31, 2017
@lavalamp
Member Author

Also @liggitt

// webhook, for example, a cluster identifier.
//
// Required.
URL string
Contributor

I like this change.

Member Author

Unfortunately I'm hearing from @bowei that there is no safe standard DNS form that we could parse? Bowei, please say I misunderstood :( :(

Member

I'm unsure what is meant by that. I'd expect to fully specify the URL, including the scheme: scheme://host[:port][/path]. What is ambiguous about that?

Member

Do we need to distinguish between a service exported by the cluster vs a server with a DNS name reachable on the network?

What Daniel is referring to is that we do not want to bake the current DNS schema into the system in case the schema changes. This would be a particularly subtle place to have to maintain if the schema ever changes.

An alternative is to have a different scheme for identifying cluster-only services, something like: k8s-service://.

Member Author

@lavalamp lavalamp Nov 1, 2017

@liggitt, kube-apiserver must be able to tell whether the DNS name refers to something in the cluster or out of the cluster, because it can't assume the presence of DNS resolution for in-cluster things (or, currently, routes, though that will change in a few quarters).

Member Author

I've pushed a commit that keeps both. I don't like it very much. Why not maintain the Service field, and make the external hostname field of that work correctly?

Member

@lavalamp @deads2k

I'm in favor of only a service reference, and opposed to a URL. As we can see from this discussion, URLs obscure information that needs to be first-class. In particular, if something generates or munges the Service name, it needs to know how to populate this field. Everywhere else in the API we use object references or names.

So, option 3.

Member

For self-hosted webhooks, I can live with service references, resolved as they are today.

For non-cluster-hosted webhooks, it doesn't make sense to require the indirection of using a service reference pointing to a service which itself points to an external hostname. That still requires the external hostname to be resolvable, and requires all consumers of admission webhook config (even ones in pods that could have used DNS for in- and out-of-cluster resolution) to have custom dialers and read permission on the service API objects.

> @erictune
> Suggestion:
> at most one of Service ServiceReference or URL string may be set.
> if URL string is used, normal DNS resolution is used, then no magic happens, and <name>.<namespace>.svc. is not expected to work.
> This lets the user choose which is the lesser evil: layering violation or non-portable configuration.

That seems reasonable to me (and it's not a layer violation to use DNS to resolve off-cluster webhook hostnames)
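
The rule quoted above can be sketched as a small validation function. This is only an illustration: `ClientConfig`, `ServiceReference`, and `validateClientConfig` are simplified, hypothetical stand-ins rather than the PR's actual types, and the sketch enforces "exactly one of" (the wording the PR's test expects) rather than "at most one of".

```go
package main

import (
	"errors"
	"fmt"
)

// ServiceReference and ClientConfig are hypothetical, simplified stand-ins
// for the API types under review; the real types carry more fields.
type ServiceReference struct {
	Namespace string
	Name      string
}

type ClientConfig struct {
	URL     *string
	Service *ServiceReference
}

// validateClientConfig enforces that exactly one of URL or Service is set.
func validateClientConfig(cc ClientConfig) error {
	switch {
	case cc.URL == nil && cc.Service == nil:
		return errors.New("exactly one of url or service is required; neither was set")
	case cc.URL != nil && cc.Service != nil:
		return errors.New("exactly one of url or service is required; both were set")
	}
	return nil
}

func main() {
	u := "https://hooks.example.com/validate"
	fmt.Println(validateClientConfig(ClientConfig{URL: &u})) // valid: only url is set
	fmt.Println(validateClientConfig(ClientConfig{}))        // invalid: neither is set
}
```

Using pointer fields lets the validator distinguish "unset" from "set to an empty value", which is why the sketch compares against nil.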

Member

It seems like I never submitted my reply to eric's comment above, but Jordan and Eric's approach seems to be reasonable.

Member

What I care most about is that we keep service/object references for self-hosted webhooks.

// render your config non-portable; apiservers will route to the
// service correctly.
//
// If the scheme is present, it must be "https://".
Member

Why isn't http:// permissible?

Member Author

If it were, then the CABundle field would have to be optional, but it isn't.

I don't mind changing this--was just keeping the change limited.

@deads2k
Contributor

deads2k commented Nov 2, 2017

I think using name.namespace.svc as the trigger for your dialer treating it as an internal name is your best option. All in-cluster DNS will resolve that and having the kube-apiserver be consistent with in-cluster DNS seems like a very good idea.

You were concerned about .svc becoming a TLD, but even if that happens people still have a fairly simple way to specify that using the DNS standard trailing dot to indicate a complete name (external.svc. is fully qualified, name.namespace.svc is not fully qualified). In addition, that change would be consistent for things using in-cluster DNS and things not using it.
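
To make the trailing-dot distinction concrete, here is a rough sketch of how a dialer might classify a host under this proposal. The helper `inClusterService` is hypothetical and is not taken from the PR; it assumes the in-cluster form is exactly three labels ending in `svc`.

```go
package main

import (
	"fmt"
	"strings"
)

// inClusterService reports whether host has the form name.namespace.svc
// with no trailing dot. A trailing dot marks the name as fully qualified,
// so "external.svc." is left to ordinary DNS resolution.
// Hypothetical helper illustrating the proposal, not code from the PR.
func inClusterService(host string) (name, namespace string, ok bool) {
	if strings.HasSuffix(host, ".") {
		return "", "", false // fully qualified: not an in-cluster shorthand
	}
	parts := strings.Split(host, ".")
	if len(parts) != 3 || parts[2] != "svc" || parts[0] == "" || parts[1] == "" {
		return "", "", false
	}
	return parts[0], parts[1], true
}

func main() {
	for _, h := range []string{"hook.default.svc", "external.svc.", "hooks.example.com"} {
		name, ns, ok := inClusterService(h)
		fmt.Printf("%q in-cluster=%v name=%q namespace=%q\n", h, ok, name, ns)
	}
}
```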

I would not use option 1 because of the extra DNS changes.

I'm not a fan of allowing multiple options in the API type. It seems like an unnecessary layer. Expecting the kube-apiserver to be able to use DNS names directly and resolve in-cluster names seems reasonable and doable.

@deads2k
Contributor

deads2k commented Nov 3, 2017

> That seems reasonable to me (and it's not a layer violation to use DNS to resolve off-cluster webhook hostnames)

Not very pretty, but it at least makes it possible to use externally hosted hooks.

@lavalamp
Member Author

lavalamp commented Nov 3, 2017

Thank you everyone.

That extension apiservers may not have permission to read arbitrary services seems like a good reason to include the URL. Or at least, I'm not in favor of forcing the system administrator to set up RBAC rules to make sure every apiserver can read every webhook's service. And I agree that relying on external DNS is not a layer violation.

So, I now believe that both are necessary, sadly.

@bgrant0607
Member

I'm fine with keeping both.

@lavalamp
Member Author

lavalamp commented Nov 4, 2017

To be my own devil's advocate, it's worth noting we'll already have to make sure extension apiservers can read all namespaces.

@liggitt
Member

liggitt commented Nov 4, 2017

> To be my own devil's advocate, it's worth noting we'll already have to make sure extension apiservers can read all namespaces.

That was always the case in order to correctly allow/prevent creation of namespaced resources. I'd really avoid requiring visibility into unrelated namespace content like services.

@thockin
Member

thockin commented Nov 4, 2017

On mobile, apologies for mistakes.

I am very not fond of ad hoc syntax definitions, and that's what using "foo.bar.svc" is. It is not a real host name, but it could become one. It is not an in-cluster FQDN. It is just a syntax that happens to lean a bit on DNS, as defined today.

DNS will change eventually (though it will be a long transition). This is one more place to adapt or else you carry your own syntax.

If you need a URL, why not k8s-svc://service.namespace ?

@bowei
Member

bowei commented Nov 4, 2017

@thockin -- is there a place where we can make something like k8s-service:// (or kubernetes-services://) official? Seems the "proper" way to go...

@lavalamp
Member Author

lavalamp commented Nov 4, 2017

Yeah, the custom scheme mechanism doesn't work because there's no programmatic way for extension apiservers to turn that into a proper string they can DNS resolve. So both it is :(

@bgrant0607
Member

@bowei A string with a custom format is not the right way to go. The syntax would be less discoverable, it would be different from all other references, defaulting from context would be harder, ...

@k8s-ci-robot k8s-ci-robot added size/L Denotes a PR that changes 100-499 lines, ignoring generated files. and removed size/M Denotes a PR that changes 30-99 lines, ignoring generated files. labels Nov 9, 2017
@k8s-github-robot k8s-github-robot added the kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API label Nov 9, 2017
@lavalamp lavalamp changed the title from "RFC: Remove service reference and combine with URLPath" to "Fix webhook API to also support URLs" Nov 9, 2017
// webhook, for example, a cluster identifier.
//
// +optional
URL *string `json:"url,omitempty"`
Contributor

internal types don't get tags

ClientConfig: admissionregistration.AdmissionHookClientConfig{},
},
}),
expectedError: `exactly one of`,
Contributor

get slightly more specific so you include the fields too.

Member Author

I intended to copy/paste from the failing test to put in the thing to check, but then the test unexpectedly passed and I didn't...

Member Author

I include each field in at least one of the test cases now. I don't like making change-detector tests, so I don't think it needs to be universal.

// `path` is an optional URL path which will be sent in any request to
// this service.
// +optional
Path *string `json:"path,omitempty"`
Contributor

missing proto tag

Member Author

I thought the proto generator was supposed to do that. I guess make generated_files didn't actually run it (WHY?), which explains some other things.

// webhook, for example, a cluster identifier.
//
// +optional
URL *string `json:"url,omitempty"`
Contributor

prototag

}

// TODO: cache these instead of constructing one each time
- restConfig, err := a.authInfoResolver.ClientConfigFor(serverName)
+ restConfig, err := a.authInfoResolver.ClientConfigFor(u.Host)
Contributor

Doesn't host include port? I wonder what the clientConfigFor code does in that case.

Member Author

I am not sure if I should split here or let authInfoResolver worry about it-- what if people run two different things on two ports on the same host? (parameter name is "server" which is pretty ambiguous as to whether it should include a port)

Member Author

Previously we mandated 443, so at least this doesn't make it worse.
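
For reference on the question of whether the host includes the port: in Go's net/url package, `u.Host` carries the port when one is present, while `u.Hostname()` and `u.Port()` split the two. A minimal sketch follows; the helper name `hostParts` and the example URLs are made up for illustration.

```go
package main

import (
	"fmt"
	"net/url"
)

// hostParts returns u.Host (host plus optional port), the bare host name,
// and the port for a raw URL. Hypothetical helper for illustration only.
func hostParts(raw string) (hostPort, host, port string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", "", err
	}
	return u.Host, u.Hostname(), u.Port(), nil
}

func main() {
	for _, raw := range []string{
		"https://hooks.example.com:8443/validate",
		"https://hooks.example.com/validate",
	} {
		hostPort, host, port, err := hostParts(raw)
		if err != nil {
			panic(err)
		}
		fmt.Printf("Host=%q Hostname=%q Port=%q\n", hostPort, host, port)
	}
}
```

Whether the resolver should receive the combined form or only the host name is the open question in this thread; the sketch only shows what each accessor returns.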

@k8s-ci-robot k8s-ci-robot added release-note Denotes a PR that will be considered when it comes time to generate release notes. and removed do-not-merge/release-note-label-needed Indicates that a PR should not merge because it's missing one of the release note labels. labels Nov 11, 2017
@k8s-github-robot k8s-github-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Nov 11, 2017
@lavalamp
Member Author

/retest pull-kubernetes-bazel-test

@lavalamp
Member Author

/retest

Member

@caesarxuchao caesarxuchao left a comment

A nit. Otherwise lgtm.

//
// If the webhook is running within the cluster, then you should use `service`.
//
// If there is only
Member

nit: premature new line

@caesarxuchao
Member

/lgtm

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Nov 12, 2017
@k8s-github-robot

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: caesarxuchao, lavalamp

Associated issue: 492

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these OWNERS Files:

You can indicate your approval by writing /approve in a comment
You can cancel your approval by writing /approve cancel in a comment

@caesarxuchao
Member

/retest

// webhook, for example, a cluster identifier.
//
// +optional
URL *string `json:"url,omitempty" protobuf:"bytes,3,opt,name=url"`
Member

Why not require scheme? Making it optional means url.Parse can't be used on the value as specified, right? (I think Parse treats a schemeless host as a relative path)

Member Author

validation does require the scheme, I forgot to fix the comment.

Member

Looks like it allows "". Should just require https
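
The concern about schemeless values can be checked directly against net/url. The sketch below shows that a value without a scheme parses with an empty `Scheme` and `Host`, and that the whole string ends up in `Path`. The helper name `parseParts` and the example values are assumptions made for illustration.

```go
package main

import (
	"fmt"
	"net/url"
)

// parseParts returns the scheme, host, and path that url.Parse extracts.
// Hypothetical helper for illustration only.
func parseParts(raw string) (scheme, host, path string, err error) {
	u, err := url.Parse(raw)
	if err != nil {
		return "", "", "", err
	}
	return u.Scheme, u.Host, u.Path, nil
}

func main() {
	for _, raw := range []string{
		"https://hooks.example.com/validate", // fully specified
		"hooks.example.com/validate",         // schemeless: treated as a relative path
	} {
		scheme, host, path, err := parseParts(raw)
		if err != nil {
			panic(err)
		}
		fmt.Printf("%q -> scheme=%q host=%q path=%q\n", raw, scheme, host, path)
	}
}
```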

@@ -395,47 +400,71 @@ func toStatusErr(name string, result *metav1.Status) *apierrors.StatusError {

func (a *GenericAdmissionWebhook) hookClient(h *v1alpha1.Webhook) (*rest.RESTClient, error) {
cacheKey, err := json.Marshal(h.ClientConfig)
if err != nil {
Member

If an error occurs, we shouldn't use the cacheKey to look up a client

Member

(Or store a client)

Member Author

this was a rebase problem, oops.
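
A minimal sketch of the behavior requested in this thread: if building the cache key fails, return before touching the cache, so nothing is looked up or stored under a bad key. The `clientCache` type, its string-valued entries, and the `build` callback are hypothetical placeholders, not the PR's actual client cache.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// clientCache is a hypothetical stand-in for a client cache; the string
// value is a placeholder for a real client object.
type clientCache struct {
	entries map[string]string
}

// clientFor returns a cached client for config, building one on a miss.
// If the cache key cannot be built, it returns early without reading
// from or writing to the cache.
func (c *clientCache) clientFor(config interface{}, build func() (string, error)) (string, error) {
	key, err := json.Marshal(config)
	if err != nil {
		return "", fmt.Errorf("building cache key: %w", err)
	}
	if client, ok := c.entries[string(key)]; ok {
		return client, nil
	}
	client, err := build()
	if err != nil {
		return "", err
	}
	c.entries[string(key)] = client
	return client, nil
}

func main() {
	cache := &clientCache{entries: map[string]string{}}
	build := func() (string, error) { return "client", nil }

	_, err := cache.clientFor(map[string]string{"url": "https://hooks.example.com"}, build)
	fmt.Println("valid config error:", err)

	// Channels cannot be marshaled to JSON, so this forces the error path.
	_, err = cache.clientFor(make(chan int), build)
	fmt.Println("bad config error:", err)
}
```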

return nil, &ErrCallingWebhook{WebhookName: h.Name, Reason: ErrNeedServiceOrURL}
}

u, err := url.Parse(*h.ClientConfig.URL)
Member

Does this do the right thing with schemeless urls?

cfg := rest.CopyConfig(restConfig)
cfg.Host = u.Host
cfg.APIPath = u.Path
// TODO: test if this is needed: cfg.TLSClientConfig.ServerName = u.Host
Member

Not needed if host and serverName are the same

}
if len(u.Host) == 0 {
allErrors = append(allErrors, field.Required(fldPath.Child("url"), "host must be provided"+form))
}
Member

Make sure there's no query or user info?
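
Pulling the review comments together, a tightened check might look roughly like the sketch below: require the https scheme, require a host, and reject user info and query parameters. The function `validateWebhookURL` and its plain-string error messages are assumptions made for illustration; the PR's validation reports errors through field.ErrorList, and the actual follow-up may differ.

```go
package main

import (
	"fmt"
	"net/url"
)

// validateWebhookURL returns a list of problems found in raw.
// Hypothetical sketch of the checks discussed in review.
func validateWebhookURL(raw string) []string {
	u, err := url.Parse(raw)
	if err != nil {
		return []string{"url must be parseable: " + err.Error()}
	}
	var problems []string
	if u.Scheme != "https" {
		problems = append(problems, `scheme must be "https"`)
	}
	if u.Host == "" {
		problems = append(problems, "host must be provided")
	}
	if u.User != nil {
		problems = append(problems, "user information is not permitted")
	}
	if u.RawQuery != "" {
		problems = append(problems, "query parameters are not permitted")
	}
	return problems
}

func main() {
	for _, raw := range []string{
		"https://hooks.example.com/validate",
		"http://hooks.example.com/validate",
		"https://user:pw@hooks.example.com/validate?x=1",
	} {
		fmt.Printf("%q -> %v\n", raw, validateWebhookURL(raw))
	}
}
```

Collecting every problem instead of stopping at the first one mirrors the allErrors pattern in the snippet above, where each failed check is appended.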

@liggitt
Member

liggitt commented Nov 12, 2017

/hold
Just a couple comments on the handling of an error building the cacheKey, and whether handling of a schemeless host is correct/necessary

@k8s-ci-robot k8s-ci-robot added the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 12, 2017
@lavalamp lavalamp removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Nov 12, 2017
@lavalamp
Member Author

chatted w/ @liggitt, going to fix comments in a followup. Don't want to block the merge train.

@liggitt
Member

liggitt commented Nov 12, 2017

Talked offline. Will take a follow up on the scheme and cacheKey issues

@lavalamp
Member Author

/retest

@lavalamp
Member Author

Followup is #55534

@lavalamp
Member Author

/retest

@caesarxuchao
Member

/retest

@lavalamp
Member Author

I know the e2e isn't due to this PR because #55534 passes...

@k8s-github-robot

/test all [submit-queue is verifying that this PR is safe to merge]

@k8s-github-robot

Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here.

@k8s-github-robot k8s-github-robot merged commit e938190 into kubernetes:master Nov 12, 2017
k8s-github-robot pushed a commit that referenced this pull request Nov 12, 2017
Automatic merge from submit-queue. If you want to cherry-pick this change to another branch, please follow the instructions here: https://github.com/kubernetes/community/blob/master/contributors/devel/cherry-picks.md

Tighten webhook client config validation

ref kubernetes/enhancements#492

Fix up some nits left from #54889.

```release-note
NONE
```
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/api-change Categorizes issue or PR as related to adding, removing, or otherwise changing an API lgtm "Looks good to me", indicates that a PR is ready to be merged. release-note Denotes a PR that will be considered when it comes time to generate release notes. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
10 participants