
Network Proxy KEP. #872

Merged · 4 commits · Apr 9, 2019
Conversation

Member

@cheftako cheftako commented Mar 5, 2019

Adding KEP based on https://goo.gl/qiARUK.
Plan to add support for a network proxy between Kubernetes API Server and cluster/nodes.
Supports the removal of the deprecated SSH Tunnel system.
Adds support for certain security features.

@k8s-ci-robot k8s-ci-robot added do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/kep Categorizes KEP tracking issues and PRs modifying the KEP directory sig/api-machinery Categorizes an issue or PR as relevant to SIG API Machinery. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. labels Mar 5, 2019
Member

@justinsb justinsb left a comment

Looks great - nit picks but I think the design is sound

Each connectivity proxy allows secure connections to one or more cluster networks.
Any network addressed by a connectivity proxy must be flat.
Currently the only mechanism for handling overlapping IP ranges in Kubernetes is the Proxy.
As we are past the proxy there would need to be a non-Kubernetes mechanism
Member

This was not clear to me. I think you mean "the Cluster Network must have a mechanism to reach the proxy, and the KAS must have a mechanism to reach the proxy. This would be configured as part of cluster creation, not by Kubernetes itself" (or something like that?)

in a “legacy” mode or as a fallback. It also allows for very simple clusters
which do not need the concept of network segregation to be run
without the overhead (either in extra hop latency or configuration complexity)
of using the connectivity proxy.
Member

Would this be the implicit configuration if proxies were not configured? (Less secure, but I'm guessing so for compatibility)

Member Author

I think so yes.

- **Admission Webhooks**
Admission webhooks can either be destined for a service or a URL.
If they are destined for a service then the service rules apply.
If they are destined for a URL then we will use the ‘master’ NetworkContext.
Member

Can I not address a service by URL?

Member Author

I have never tried addressing a K8s service via the URL. If someone wants to send to a K8s service, they should use the service configuration. Given that the proxy used in these two examples will be different, they need to use the correct configuration.
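For illustration, a minimal Go sketch of how the connectivity service might hand the KAS a dialer based on the NetworkContext; the type and field names here are hypothetical, not the KEP's final API:

import (
	"context"
	"net"
)

// NetworkContext names the egress network a request is destined for,
// e.g. "master", "cluster" or "etcd". (Hypothetical shape.)
type NetworkContext struct {
	Network string
}

// ConnectivityService holds one proxy-backed dialer per configured network.
type ConnectivityService struct {
	proxies map[string]func(ctx context.Context, network, addr string) (net.Conn, error)
}

// DialerFor returns the dialer the KAS should use for a given NetworkContext,
// falling back to a direct dial when no proxy is configured ("legacy" mode).
func (s *ConnectivityService) DialerFor(egress NetworkContext) func(ctx context.Context, network, addr string) (net.Conn, error) {
	if dial, ok := s.proxies[egress.Network]; ok {
		return dial
	}
	return (&net.Dialer{}).DialContext
}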

Added details including prototype grpc spec.
Added NetworkContext definition.
Added detailed explanations of use cases.
Includes feedback from Lavalamp, Justinsb and Anfernee.
More feedback from lavalamp and justinsb.
@cheftako cheftako changed the title [WIP] Network Proxy KEP. Network Proxy KEP. Mar 8, 2019
@k8s-ci-robot k8s-ci-robot removed the do-not-merge/work-in-progress Indicates that a PR should not merge because it is a work in progress. label Mar 8, 2019
// side of the tunnel. The format is inspired by golang's net interface
// https://golang.org/pkg/net/#Dial
message DialRequest {
// network representing a named network. "Tcp" is the only supported
Member

Do you plan to support more than just TCP?

Member

Right, maybe UDP; not the highest priority at this point in time though.

Member

The API server doesn't do anything with UDP today. For a while now, we've discussed breaking the proxy functionality out of the API server. I would be pretty hesitant to add more functionality at this point.

Member

Exactly. To be clear, we don't have a plan to support UDP; the field is there just for the sake of extensibility. It's okay with me to remove it and assume all traffic is TCP, but if there is a need in the future, there will be a proto change.

Member

Adding a field later that defaults to TCP is a backwards-compatible change.

// the number is kept unchanged across the proxies, and copied to DialResponse
// from the other side of the tunnel. The number cannot be reused across
// different DialRequests.
bytes random = 3;
Member

How is this used? Why is the grpc stream id not sufficient?

Member

@anfernee anfernee Mar 11, 2019

This is to multiplex more than one connection through the same gRPC stream. Once a gRPC stream is established, packets can flow back and forth between both ends of the stream. There could also be multiple dial requests sent out at the same time; random is used to match a response to its request.

Member

I guessed that, but why is it useful to multiplex connections over a single grpc stream? Grpc streams are already multiplexed over a single tcp connection.

Member

I see what you mean. Good question. Actually there is another option to avoid this, but it would be 2 method calls, mostly because the gRPC stream doesn't allow you to specify a destination, so you have to Dial and then Connect. It's similar (though not exactly) to sockets, where you call socket to get an fd and then connect to get another one for streaming.

service ProxyService {
   // DialRequest and DialResponse are the same as above, but without random.
   rpc Dial(DialRequest) returns (DialResponse) {}

   rpc Connect(stream Packet) returns (stream Packet) {}
}

This potentially moves the stream control out, and we probably need another method Close as well. The problem with this approach is that it's hard to define the order of multiple data and control flows in different serving threads; e.g., when a close request and data arrive at the same time, it's hard to tell which came first, and the right behavior is hard to define. Within a single stream it's all sequential, so it's just a matter of whichever arrives first.
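To make the matching concrete, a rough Go sketch of multiplexed dialing over one shared stream, keyed by the random field; the types below are local stand-ins rather than the generated proto stubs:

import (
	"bytes"
	"crypto/rand"
)

// Stand-in types; the real ones would come from the KEP's generated proto code.
type DialRequest struct {
	Network string
	Address string
	Random  []byte
}

type DialResponse struct {
	Random []byte // echoed unchanged from the request; plus connection identifiers, errors, ...
}

type tunnelStream interface {
	SendDial(*DialRequest) error
	RecvDial() (*DialResponse, error)
}

// dial sends a DialRequest over the shared stream and waits for the response
// carrying the same random bytes. Responses for other in-flight dials are
// skipped here; a real client would dispatch them to their own waiters.
func dial(s tunnelStream, addr string) (*DialResponse, error) {
	random := make([]byte, 16)
	if _, err := rand.Read(random); err != nil {
		return nil, err
	}
	if err := s.SendDial(&DialRequest{Network: "tcp", Address: addr, Random: random}); err != nil {
		return nil, err
	}
	for {
		rsp, err := s.RecvDial()
		if err != nil {
			return nil, err
		}
		if bytes.Equal(rsp.Random, random) {
			return rsp, nil
		}
	}
}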

Member

The problem with this approach is that it's hard to define the order of multiple data and control flows in different serving threads; e.g., when a close request and data arrive at the same time, it's hard to tell which came first, and the right behavior is hard to define. Within a single stream it's all sequential, so it's just a matter of whichever arrives first.

Assuming packets arrive reliably and in order, send the dial parameters in a message extension or header and you have yourself a drop-in replacement for net.Dial, with all the hard stuff handled by the lower layers :)

service ProxyService {
// Proxy connects to a remote address by stream id, and establish
// a bi-directional stream of packet exchange.
rpc Proxy(stream Packet) returns (stream Packet) {}
Member

@mikedanese mikedanese Mar 10, 2019

Why wouldn't a simpler interface work? e.g.

rpc Proxy(stream Bytes) returns (stream Bytes) {}

With parameters like IP passed in headers or a request extension?

Member

Maybe it could. I thought of the stream as a reliable or unreliable transport, like an IP network, and then thought of a packet as a TCP/UDP packet, where the functionality is self-contained, not relying on anything else.

Member

@mikedanese mikedanese Mar 20, 2019

It depends on whether you ever want to support more than the https://godoc.org/net#Conn interface. If net.Conn (which supports UDP connections) is sufficient, then this is a massive simplification.

functionality is self-contained, not relying on anything else.

With the downside that we have to reimplement stream reassembly, queue discipline/priority, etc... all the things that are already handled by http/2 and TCP.
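To illustrate the point about letting HTTP/2 and TCP do the heavy lifting, here is a rough Go sketch of wrapping a bidirectional byte stream as a connection; Bytes and byteStream are local stand-ins, not the KEP's generated stubs, and the remaining net.Conn methods (Close, deadlines, addresses) are omitted:

// Bytes is a stand-in for a simple proto message carrying raw data.
type Bytes struct {
	Data []byte
}

// byteStream is a stand-in for the client side of a bidirectional gRPC stream.
type byteStream interface {
	Send(*Bytes) error
	Recv() (*Bytes, error)
}

// streamConn adapts the stream to an io.ReadWriter; ordering, reassembly and
// flow control come from HTTP/2 and TCP underneath.
type streamConn struct {
	stream  byteStream
	readBuf []byte
}

func (c *streamConn) Read(p []byte) (int, error) {
	if len(c.readBuf) == 0 {
		msg, err := c.stream.Recv()
		if err != nil {
			return 0, err
		}
		c.readBuf = msg.Data
	}
	n := copy(p, c.readBuf)
	c.readBuf = c.readBuf[n:]
	return n, nil
}

func (c *streamConn) Write(p []byte) (int, error) {
	if err := c.stream.Send(&Bytes{Data: p}); err != nil {
		return 0, err
	}
	return len(p), nil
}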

@mikedanese
Member

Why would we go this route vs implementing it outside of the API server with tunneling software and a TAP device?

apiVersion: connectivityservice.k8s.io/v1alpha1
kind: ConnectivityServiceConfiguration
connectionService:
  name: Master
Member

Thoughts on s/Master/ControlPlane?

Member

@andrewsykim andrewsykim Mar 11, 2019

(realizing now that the name of the resource is arbitrary)

Member Author

This connectionService and the etcd connectionService may not be as arbitrary as the rest.

@cheftako
Member Author

@mikedanese I do not believe this can be implemented completely outside of the KAS. The network context is fundamental to what is happening here and would require we export that data to hand it off. We know that two requests for the same IP should be routed to different machines based on the network context. The current design is trying to acknowledge this requirement while minimizing how much needs to change inside of the KAS.

@dims
Member

dims commented Mar 13, 2019

@cheftako sorry for a dumb question - how does the kubelet reach the api server (during kubelet startup)? interested both in the current SSH based scenario as well as after this KEP is implemented

@cheftako
Member Author

@dims Not a dumb question at all, though somewhat orthogonal. I'm assuming here that you are referring to cluster-side initiated requests. (Could be Kubelet, a controller/webhook etc running on a managed node, ...) For all these cases the KAS has a publicly advertised IP address (this may be to a LB in front of a set of HA KASs or the address of a single KAS). The cluster-side components should be making HTTPS requests and providing some sort of authentication/authorization materials to the KAS. None of the cluster-side initiated requests go over the SSH Tunnel today. This KEP does not cover switching any of these requests to go over the Network Proxy.

It would theoretically be possible to switch cluster-side initiated requests to go over the Proxy Tunnel. That is not in scope of this KEP. It also has a problem of how you initiate that tunnel (again not in scope of this KEP).

@dims
Member

dims commented Mar 13, 2019

Ack thanks @cheftako we could add that info in the "out of scope" section :)

connection:
  type: grpc
  url: grpc://1.2.3.4:5678
  caBundle: file1.ca

In apiregistration/v1's APIService and other webhooks, this is a PEM encoded CA bundle.
Maybe it's better to keep it the same?

Member Author

This was always intended to be a PEM-encoded CA bundle.


One motivation for network proxy is providing a mechanism to secure
https://groups.google.com/d/msg/kubernetes-security-announce/tyd-MVR-tY4/tyREP9-qAwAJ.
This, in conjunction with a firewall or other network isolation, fixes the security concern.

What about authentication / authorization? How does the proxy validate that it's receiving traffic from the K8s API server? Maybe the same mechanism as in /pull/658 can be used here?

Member Author

Authentication added, modeled after the aggregator flags on the api-server. Authorization is left as an exercise for the proxy writer.
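As an illustration of the aggregator-style client-certificate approach, a minimal Go sketch of the KAS dialing the proxy's gRPC port with mutual TLS; the function name, flag-style arguments and file paths are placeholders, not the KEP's prescribed wiring:

import (
	"context"
	"crypto/tls"
	"crypto/x509"
	"fmt"
	"os"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials"
)

// dialProxy opens the KAS -> connectivity proxy gRPC connection with mutual TLS.
// certFile/keyFile hold the KAS's client credentials; caFile verifies the proxy.
func dialProxy(ctx context.Context, addr, certFile, keyFile, caFile string) (*grpc.ClientConn, error) {
	cert, err := tls.LoadX509KeyPair(certFile, keyFile)
	if err != nil {
		return nil, err
	}
	caPEM, err := os.ReadFile(caFile)
	if err != nil {
		return nil, err
	}
	pool := x509.NewCertPool()
	if !pool.AppendCertsFromPEM(caPEM) {
		return nil, fmt.Errorf("no certificates found in %s", caFile)
	}
	creds := credentials.NewTLS(&tls.Config{
		Certificates: []tls.Certificate{cert}, // client cert presented to the proxy
		RootCAs:      pool,                    // CA used to verify the proxy's serving cert
	})
	return grpc.DialContext(ctx, addr, grpc.WithTransportCredentials(creds))
}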

Member

@tallclair tallclair left a comment

I'm still a little confused about how the proxy <-> cluster (node) network connection is established.

It listens on a port for gRPC connections from the KAS.
This port would be for forwarding traffic to the appropriate cluster.
It should have an administrative port speaking https.
The administrative port serves the liveness probe and metrics.
Member

Note that liveness probes don't currently support mutual TLS.

Member Author

Good point. My thought was that as this is a third port (the master side being the first and the node/cluster/agent side being the second, both of which should have mTLS), its restrictions could be different. I was also thinking that it would probably only be listening on localhost. If you feel that it still needs mTLS I can separate the probe ports from the rest of the admin ports.
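A minimal Go sketch of such an administrative port, assuming it binds only to localhost; the port number and handler paths are illustrative:

import (
	"io"
	"net/http"
	"net/http/pprof"
)

// serveAdmin exposes the proxy's administrative endpoints on localhost only, so
// the liveness probe and debug handlers can stay plain HTTP while the KAS- and
// agent-facing ports require mTLS.
func serveAdmin() error {
	mux := http.NewServeMux()
	mux.HandleFunc("/healthz", func(w http.ResponseWriter, r *http.Request) {
		w.WriteHeader(http.StatusOK)
		io.WriteString(w, "ok")
	})
	// pprof debug handlers, registered on this mux instead of http.DefaultServeMux.
	mux.HandleFunc("/debug/pprof/", pprof.Index)
	mux.HandleFunc("/debug/pprof/profile", pprof.Profile)
	// A /metrics handler (e.g. Prometheus) would be registered here as well.
	return http.ListenAndServe("127.0.0.1:8092", mux)
}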

## Proposal

We will run a connectivity proxy inside the master network.
The connectivity proxy exposes a gRPC interface to the KAS.
Member

I don't know much about network proxies, but is there not a solid opensource solution that already exists? Envoy comes to mind as something that might be able to perform the required functions (https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/tcp_proxy)

Member

I actually looked at this. It supports tunneled tcp over h2 connection pools, which is essentially what we would be doing, but it doesn't support reverse tunnels yet. It's probably something that could be added though.

- **Admission Webhooks**
Admission webhooks can be destined for a service or a URL.
If destined for a service then the service rules apply (send to 'cluster').
If destined for a URL then we will use the ‘master’ NetworkContext.
Member

That seems like a security risk if users shouldn't have access to the master network context.

Member Author

Access to the master network via a URL on an admission webhook is an already existing behavior today, even when SSH Tunnels have been configured.
Several features/customers rely on this behavior to access cloud-provider-specific webhooks running on some form of WebhookManager.
So as it stands it is no worse than today, and forwarding this traffic to the node network would break those features.
In addition, the cloud providers can add filters to the master proxy to prevent access to things like etcd.
Lastly, the traffic being sent is not arbitrary; the requests are admission requests formed by the API Server itself. We should not rely on the requests being well-formed admission requests, but it does provide a last line of defense if something does get by the filter.

Aggregated API Servers can be destined for a service or a URL.
If destined for a service then the service rules apply.
If destined for a URL then we will use the ‘master’ NetworkContext.
- **Authentication, Authorization and Audit Webhooks**
Member

Dynamic audit webhooks should be treated similarly to dynamic admission controllers.


// Payload defines a TCP payload.
message Payload {
bytes data = 1;
Member

we don't have any provision for errors?

Member Author

Working through this item. We expect to have more details on this later.

## Proposal

We will run a connectivity proxy inside the master network.
The connectivity proxy exposes a gRPC interface to the KAS.

It's my understanding that the current SSH tunnel implementation just replaces the http.Transport.Dial function member. Would it be possible to have something that uses a standard HTTP/HTTPS proxy by replacing the http.Transport.Proxy function member and not define a custom gRPC service? I think using an existing standard would be preferred and make existing proxies usable.

If there's a technical reason why HTTP/HTTPS proxies are insufficient, can we document why in the "Alternatives Considered" section?
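For comparison, a rough Go sketch of the two wiring points mentioned here, assuming a standard HTTP/HTTPS proxy is available; newTransports and its arguments are illustrative, not existing Kubernetes code:

import (
	"context"
	"crypto/tls"
	"net"
	"net/http"
	"net/url"
)

// newTransports shows both approaches side by side: the SSH-tunnel style swaps
// in a custom dial function, while a standard HTTP/HTTPS proxy only needs
// Transport.Proxy to be set.
func newTransports(
	tunnelDial func(ctx context.Context, network, addr string) (net.Conn, error),
	proxyURL *url.URL,
	tlsCfg *tls.Config,
) (viaTunnel, viaProxy *http.Transport) {
	viaTunnel = &http.Transport{
		DialContext:     tunnelDial, // e.g. the tunnel or connectivity-service dialer
		TLSClientConfig: tlsCfg,
	}
	viaProxy = &http.Transport{
		Proxy:           http.ProxyURL(proxyURL), // standard HTTP CONNECT / HTTPS proxy
		TLSClientConfig: tlsCfg,
	}
	return viaTunnel, viaProxy
}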

Member

I had the same question... I think it was discussed in the original doc (https://docs.google.com/document/d/1djkroHhbhdZATdtZ3UBRxCR0TG92kKCrTmHib86soE4/edit#heading=h.quk4manl97c) but I agree it should be hoisted into this doc


+1 and hopefully support for socks proxies too. Have struggled with kubectl's lack of socks support for a while.


@liggitt That only seems to address SOCKS5 proxies. Do the same limitations apply to HTTPS CONNECT proxies? Do we need the general connectivity of SOCKS5? I was under the impression that all connections were HTTP 2 and then hijacked if more flexibility is needed.

Member

@cheftako can confirm, but my understanding was that wanting to run mutual auth (e.g. mTLS)--both to the proxy and through the proxy--ruled out everything he looked at.

@kfox1111 this KEP will not affect kubectl at all, it is only about apiserver.

Member

HTTP connect proxy is an interesting option. It seems to work, but will be non-extensible. It does not really have a mechanism for propagating errors/information about the stream and is much less transparent/debuggable (the RFC itself is somewhat underwhelming in specification).

I would try to avoid doing very funky things with network namespaces and interfaces. These implementations end up being boxed in by what can be done with Linux and quite obscure config-wise. They can also be very hard to debug.

Member Author

Looking into using the HTTP CONNECT proxy. Definitely a simpler solution. A little concerned about A) its ability to multiplex: will prototype and see whether we have a scaling issue or blow up the number of connections; increasing reliance on admission webhooks makes this more of a concern. And B) it is not clear how varying failure/error cases, such as EOF and proxied connection closes, will work on the tunnel. Will add some info on this.
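For reference, a bare-bones Go sketch of what the HTTP CONNECT option involves on the dialing side; dialViaConnect and its arguments are illustrative, not part of the KEP:

import (
	"bufio"
	"context"
	"fmt"
	"net"
	"net/http"
)

// dialViaConnect tunnels one TCP connection through an HTTP CONNECT proxy:
// dial the proxy, issue CONNECT, and on a 200 response the same connection
// carries the raw proxied bytes. Error propagation is limited to the CONNECT
// status line and EOF on the tunneled connection, which is the concern above.
func dialViaConnect(ctx context.Context, proxyAddr, targetAddr string) (net.Conn, error) {
	var d net.Dialer
	conn, err := d.DialContext(ctx, "tcp", proxyAddr)
	if err != nil {
		return nil, err
	}
	req := fmt.Sprintf("CONNECT %s HTTP/1.1\r\nHost: %s\r\n\r\n", targetAddr, targetAddr)
	if _, err := conn.Write([]byte(req)); err != nil {
		conn.Close()
		return nil, err
	}
	br := bufio.NewReader(conn)
	resp, err := http.ReadResponse(br, nil)
	if err != nil {
		conn.Close()
		return nil, err
	}
	resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		conn.Close()
		return nil, fmt.Errorf("proxy refused CONNECT: %s", resp.Status)
	}
	// NB: a real client must also account for any bytes already buffered in br.
	return conn, nil
}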

Member

I would try to avoid doing very funky things with network namespaces and interfaces. These implementations end up being boxed in by what can be done with Linux and quite obscure config-wise. They can also be very hard to debug.

If Bowei feels this way too, can we lay this idea (of using multiple interfaces in a specially configured network namespace) to rest as a rejected alternative?

Contributor

These implementations end up being boxed in by what can be done with Linux and quite obscure config-wise.

My impression was that a TUN / TAP interface can do anything at all — it's just a userspace program that pushes and pulls packets from the corresponding virtual interface.

They can also be very had to debug.

No argument there :)


I think it's important to minimize the size of the change in Kubernetes necessary to support this use case. Whether that's by using an existing HTTP proxy or by using Linux network interfaces doesn't make a big difference.

Looking at the organizational affiliations of everyone participating on the KEP PR, it seems clear that this is, like SSH tunnels, a feature where everyone but Google is going to be choosing the default option. Minimizing the burden on upstream Kubernetes makes sure that we won't be reading another KEP in a few years time with a justification like

They complicate KAS code and only one cloud provider implemented them. After a year of deprecation time, they will be removed in an upcoming release.


So, let's back up a little bit to some higher-level questions. What is going to be the common case?
A kubeadm-deployed cluster? The proxy really needs to be implemented on all clusters to solve CVE-2018-1002105?

So, deployment of the proxy, whatever the default is, needs to be easy to deploy as part of kubeadm, and work in the 1- and 3-controller configurations that it's trying to support?

How would the proxy be restricted just to the external network and not any internal networks?


Delete the SSH Tunnel/Node Dialer code from Kube APIServer.
Enable admins to fix https://groups.google.com/d/msg/kubernetes-security-announce/tyd-MVR-tY4/tyREP9-qAwAJ.
Allow isolation of the Control network from the Cluster network.

I don't want to pile on here, but could this also allow the deprecation of https://github.com/kubernetes/kubernetes/blob/d1031b18fc83146dd8418c17df289d09d10d29e5/staging/src/k8s.io/apimachinery/pkg/util/net/http.go#L273-L309

Or is this used by more than the API server?

Member Author

Agree that this is a further benefit for doing this work.

- **Connectivity service** A component built into the KAS which provides a golang dialer for outgoing connection requests.
The dialer provided depends on NetworkContext information.
- **Connectivity proxy** The proxy that runs in the master network.
It has a secure channel established to the cluster network.
Contributor

Following the same line of thought of nodes spanning multiple non-interconnected networks: does it have a secure channel established to the cluster network, or to multiple cluster networks?

Member Author

Please see above. In addition, for cases like this I think we are likely to want to add some level of metadata on the request (e.g. node name or failure domain/zone). So getting to the right network could come either from a more fine-grained ConnectivityServiceConfiguration or from the proxy having enough metadata to make the call. Again, though, I think this is more part of beta than alpha.

NetworkContext could be extended to contain more contextual information.
This would allow smarter routing based on the k8s object KAS is processing
or which user/tenant tries to initiate the request, etc.

Contributor

How are we going to do load balancing api-server -> connectivity proxy?
Option 1: the url becomes a list; users can run the proxy per master node, or two, or just one. That means the api-server needs to maintain round-robin state of the last connection.
Option 2: Force the user to run an internal load balancer.

I vote option 1.

Contributor

Never mind; below you are asserting the proxy runs on the same node as the api-server.

The liveness probe prevents a partially broken cluster
where the KAS cannot connect to the cluster. This port also serves
pprof debug commands and monitoring data for the proxy.

Contributor

Early on, the discussion was to allow the api-server to serve requests even if the proxy service is unavailable. How will this work if an api-server instance is healthy yet the connectivity proxy isn't? We will get weird behavior where some api-server calls succeed and some fail, depending on which instance picked up the request and the type of that particular request. I believe we either find a way to load balance api-server --> connectivity proxy, or we tie the health of the api-server to the proxy by design (this will introduce a catch-22 situation where all nodes being unhealthy takes the proxies down, and accordingly the api-server).

Member Author

Good point. A few thoughts. 1) The proxy service should have a liveness check, which means the kubelet should be getting it to recover. 2) The proxy service is significantly simpler in scope, responsibility and code than the API Server, so it should be easier to make it more resilient. 3) In an HA environment we should absolutely consider having the API Server's readiness probe depend on the proxy and thus control which API servers get traffic. 4) The design deliberately suggests configuration options rather than mandating them; a provider is welcome to load-balance api-server to connectivity proxy.

Member

I agree it makes sense to operate the proxy as an HA LB service. I don't think the default setup needs to do that.

It's important to note that if this proxy is down, the cluster is only partially broken: only master -> node calls stop working, which is not a large fraction of requests. So I don't actually agree that the apiserver's readiness probe should depend on the proxy's readiness; that would turn a minor problem into a cluster-breaking problem.

Member

So I don't actually agree that the apiserver's readiness probe should depend on the proxy's readiness; that would turn a minor problem into a cluster-breaking problem.

+1, that sounds like a recipe for cascading failure

Contributor

Thanks, I agree; let us decouple the health of the api-server from the proxy.

The HA scenario is quite interesting. I would love to see the details around the proposed nodes->proxy connectivity as requested in #872 (comment)

This will add an extra process hop to our main scaling axis.
We will scale test the impact and publish the results. As a precaution
we will add an extra network configuration 'etcd' separate from ‘master’.
Etcd requests can be configured separately from the rest of 'master'.
Contributor

What is the use case for that?

Member Author

I see two use cases. 1) By dictating what is supposed to be etcd-destined traffic, you can configure that proxy as the only one with access to etcd. This can be another layer of protection for etcd from things like our proxy sub-resources. 2) Etcd is a primary dimension of scalability. This allows us to do things like bypass the proxy specifically for etcd traffic if you are running close to the edge.

- **ImagePolicyWebhook**
The image policy webhook uses a kube config file to determine destination.
Given that, we use the ‘master’ NetworkContext.
- **KMS GRPC Service**
Contributor

:-) just a kudos on remembering this one, thanks!

Member Author

Thanks 😃

Updated proxy kep.
Further details.
Factored in naming and formatting changes.

Changes suggested by dcbw, mvladev, dims, andrewsykim, mikedanese,
justinsb, anfernee
Added HTTP Connect option.
Also fixed network name.
Fixed approver/reviewer lines.
Elaborated on some/all nodes for cluster network definition.
Fixed spelling mistake.
@deads2k
Contributor

deads2k commented Apr 5, 2019

I understand why we want the kube-apiserver to be intelligent enough to choose different proxies for different destinations, and the description of how to choose in the kube-apiserver and the subdivisions provided lgtm.

I do not know enough to comment on the actual proxy and networking configuration being proposed here. I encourage those reviewers to choose a solution that has off-the-shelf, opensource implementations.

@kfox1111

kfox1111 commented Apr 5, 2019

I encourage those reviewers to choose a solution that has off-the-shelf, opensource implementations.

+1. And I'd also suggest that whatever default is chosen should be very easy to deploy and maintain; kubeadm should be able to easily plop down the proxy and manage it.

@bowei
Member

bowei commented Apr 8, 2019

@cheftako @anfernee you have my approval for the proposal regarding general (don't block on me for getting the other approvals)

@lavalamp
Member

lavalamp commented Apr 8, 2019

Looks like we have sign-off from @deads2k and @bowei for their respective parts of the system. Perhaps we can merge this and clear up any other feedback when we're ready to set the status to implementable?

/lgtm
/approve

I'll keep it open for just a bit longer just in case someone wants to argue with that assessment :)
/hold

@k8s-ci-robot k8s-ci-robot added lgtm "Looks good to me", indicates that a PR is ready to be merged. do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. approved Indicates a PR has been approved by an approver from all required OWNERS files. labels Apr 8, 2019
@andrewsykim
Member

/lgtm

@k8s-ci-robot
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andrewsykim, cheftako, lavalamp

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@lavalamp
Member

lavalamp commented Apr 9, 2019

/hold cancel

@k8s-ci-robot k8s-ci-robot removed the do-not-merge/hold Indicates that a PR should not merge because someone has issued a /hold command. label Apr 9, 2019
@k8s-ci-robot k8s-ci-robot merged commit 8425d7b into kubernetes:master Apr 9, 2019
@mcrute
Contributor

mcrute commented Apr 24, 2019

Thanks @cheftako! We're really excited at AWS to see this start to take shape and look forward to helping push this forward to implementation.
