K8s can't live without a default route #123120

Closed

uablrek opened this issue Feb 4, 2024 · 26 comments
Labels: area/kube-proxy, kind/bug, kind/documentation, sig/network, triage/accepted

Comments

uablrek (Contributor) commented Feb 4, 2024

What happened?

Derived from projectcalico/calico#8481

I use a virtual cluster with router VMs. When I start without any router VM, no default route is set up on the K8s nodes. This makes load balancing to services fail, at least with proxy-mode=iptables/nftables, and just about all CNI plugins fail. In short, the cluster is dead.

With proxy-mode=ipvs, service routing works, but there are more subtle problems, e.g. Calico doesn't start. I haven't investigated further.

What did you expect to happen?

Well, to me it seems OK to require a default route, but the problem must be documented somewhere where cluster admins will see it.

I can't really see any use case where a default route is not set. Maybe when K8s is only used for SW management, or for security reasons.

How can we reproduce it (as minimally and precisely as possible)?

In a test cluster:

  1. curl -k https://kluster-svc-address
  2. ip ro delete default
  3. curl -k https://kluster-svc-address

For instance on KinD:

root@default-control-plane:/# curl -k https://10.96.0.1
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "forbidden: User \"system:anonymous\" cannot get path \"/\"",
  "reason": "Forbidden",
  "details": {},
  "code": 403
}root@default-control-plane:/# ip ro del default
root@default-control-plane:/# curl -k https://10.96.0.1
curl: (7) Couldn't connect to server
root@default-control-plane:/# 
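If you run this on a node you want to keep, a minimal sketch for restoring the route afterwards (the gateway and device are whatever "ip route show default" reports in your environment):

DEFROUTE=$(ip route show default)   # e.g. "default via 172.18.0.1 dev eth0" on KinD
ip ro del default
curl -k https://10.96.0.1           # fails as above
ip route add $DEFROUTE              # unquoted on purpose: word splitting restores the saved route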

Anything else we need to know?

IMO this is a documentation issue.

/sig network
/area kube-proxy
/area documentation

Kubernetes version

All?

Cloud provider

N/A

OS version

N/A

Install tools

N/A

Container runtime (CRI) and version (if applicable)

crio version 1.28.1

Related plugins (CNI, CSI, ...) and versions (if applicable)

Tested with Calico and Flannel (neither works without a default route)

@uablrek uablrek added the kind/bug Categorizes issue or PR as related to a bug. label Feb 4, 2024
@k8s-ci-robot k8s-ci-robot added sig/network Categorizes an issue or PR as relevant to SIG Network. area/kube-proxy labels Feb 4, 2024
k8s-ci-robot (Contributor)

@uablrek: The label(s) area/documentation cannot be applied, because the repository doesn't have them.


@k8s-ci-robot k8s-ci-robot added the needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. label Feb 4, 2024
k8s-ci-robot (Contributor)

This issue is currently awaiting triage.

If a SIG or subproject determines this is a relevant issue, they will accept it by applying the triage/accepted label and provide further guidance.

The triage/accepted label can be added by org members by writing /triage accepted in a comment.


uablrek (Contributor, Author) commented Feb 4, 2024

/kind documentation

@k8s-ci-robot k8s-ci-robot added the kind/documentation Categorizes issue or PR as related to documentation. label Feb 4, 2024
uablrek (Contributor, Author) commented Feb 4, 2024

to me it seems OK to require a default route, but the problem must be documented somewhere where cluster admins will see it.

@sftim

cyclinder (Contributor) commented Feb 4, 2024

I don't think this is about the default route specifically; rather, a route covering the destination is needed on the host so that packets to the service are accepted by netfilter.

Before the packet reaches the PREROUTING chain, the kernel makes a routing decision (similar to ip r get <dst_ip>); if the destination is found to be unreachable, the packet is dropped.

➜  test git:(e687fb89) ✗ curl -k https://10.233.0.1:443
curl: (7) Couldn't connect to server
➜  test git:(e687fb89) ✗
➜  test git:(e687fb89) ✗ ip r get 10.233.0.1
RTNETLINK answers: Network is unreachable
➜  test git:(e687fb89) ✗
➜  test git:(e687fb89) ✗ ip r add 10.233.0.1 dev eth0
➜  test git:(e687fb89) ✗ curl -k https://10.233.0.1:443
{
  "kind": "Status",
  "apiVersion": "v1",
  "metadata": {},
  "status": "Failure",
  "message": "forbidden: User \"system:anonymous\" cannot get path \"/\"",
  "reason": "Forbidden",
  "details": {},
  "code": 403
}#

cyclinder (Contributor)

ipvs works properly because the host's kube-ipvs0 interface is configured with the addresses of the services, so packets destined for a service are accepted at the PREROUTING chain and then handed to the IPVS program on the INPUT chain for processing.
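A quick way to see this on an ipvs-mode node (a sketch; kube-ipvs0 is the dummy interface kube-proxy creates in ipvs mode, and 10.96.0.1 is the KinD ClusterIP used above):

ip addr show dev kube-ipvs0   # every ClusterIP is assigned here as a local address
ip route get 10.96.0.1        # resolves via the local routing table, so no default route is needed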

uablrek (Contributor, Author) commented Feb 4, 2024

Good observation. It seems any route that matches the ClusterIP range will do. I tested on KinD:

ip ro del default
ip ro add 10.96.0.0/16 dev lo
curl -k https://10.96.0.1  # (works)

But then externalIPs will not be routed, and since they are hard to predict, I think a default route really is a requirement.
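For illustration, a sketch of the failure (192.0.2.10 is a made-up externalIP; the 10.96.0.0/16 route is the one from above):

ip ro del default
ip ro add 10.96.0.0/16 dev lo
curl -k https://10.96.0.1   # ClusterIPs still work
ip route get 192.0.2.10     # a hypothetical externalIP
# RTNETLINK answers: Network is unreachable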

uablrek (Contributor, Author) commented Feb 4, 2024

ip ro add default dev lo

works 😄 And it should be good enough for security, since stray packets will be discarded as martians.
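If you want to watch the martians being discarded, a small sketch (log_martians is a standard kernel sysctl; enabling it is my addition, not something the trick needs):

sysctl -w net.ipv4.conf.all.log_martians=1   # log discarded packets with impossible routes
dmesg | grep -i martian                      # e.g. "IPv4: martian source ..." entries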

cyclinder (Contributor)

Yeah, we may need to document that a default route is a basic requirement; most of the time we don't create special routes for the services.

aojea (Member) commented Feb 7, 2024

hmm, we may need to document that nodes need IPs to communicate too :)

People may want to use custom routes, and this should still work. I don't think we should make this a requirement; it is part of the cluster's network design, and it is not Kubernetes that should define it.

uablrek (Contributor, Author) commented Feb 8, 2024

But the custom route must cover the ClusterIP CIDR, which a network admin may not be aware of.

aojea (Member) commented Feb 8, 2024

But the custom route must cover the ClusterIP CIDR, which a network admin may not be aware of.

This is an installation and routing problem; why should they not be aware of it?

In addition, there are deployments that use multihoming on nodes, with one interface facing the internet and another for internal routing, where the Service CIDR should be routed through the internal interface for services to work, as sketched below.
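For concreteness, a sketch of such a dual-homed node (the interface names, gateway addresses, and the 10.96.0.0/12 Service CIDR are all assumptions):

ip route add default via 198.51.100.1 dev eth0    # internet-facing default route
ip route add 10.96.0.0/12 via 10.0.0.1 dev eth1   # Service CIDR via the internal interface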

What I'm trying to point out is that cluster setup and installation is the responsibility of cluster-api, kubeadm, kube-spray, home-made scripts, etc.; these are responsible for designing the cluster and its routing. The behavior of the components is a different thing.

neolit123 (Member)

Well, to me it seems OK to require a default route, but the problem must be documented somewhere where cluster admins will see it.

related:
https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#network-setup

Note: If the host does not have a default gateway and if a custom IP address is not passed to a Kubernetes component, the component may exit with an error.

The same page recommends having a default route rather than passing custom IPs to all components.

uablrek (Contributor, Author) commented Feb 10, 2024

The same page recommends having a default route rather than passing custom IPs to all components.

It's a bit worse, actually. If you don't have a default route but you do pass custom IPs to all components, the installation with kubeadm will work (according to the docs), but K8s will not.

aojea (Member) commented Feb 10, 2024

My point is that k8s should not mandate how a cluster is implemented, just define the architecture and best practices. Asserting that k8s cannot live without a default route is not true: it is complex to set up, but it is feasible, and in more complex scenarios it is sometimes required.

uablrek (Contributor, Author) commented Feb 10, 2024

So, an acceptable best-practice recommendation would be something like:

Unless the K8s nodes have a default route for the base family of the K8s cluster, a specific route has to be specified for the ClusterIP CIDR.

aojea (Member) commented Feb 10, 2024

or something like "when setting up your cluster, you must ensure that your nodes are able to forward packets to the Service CIDR; this is commonly done by defining a default route, but in more complex network setups you may want to set it explicitly"

MikeZappa87

What are we going to do with this issue? It sounds like we are going to update documentation?

uablrek (Contributor, Author) commented Feb 15, 2024

Yes, that's my proposal. But we could also just ignore it, since having no default route is rare. I can only think of security reasons.

@thockin thockin added triage/accepted Indicates an issue or PR is ready to be actively worked on. and removed needs-triage Indicates an issue or PR lacks a `triage/foo` label and requires one. labels Feb 15, 2024
uablrek (Contributor, Author) commented Feb 15, 2024

The thing that breaks without a default route is load balancing with kube-proxy (replacements may work). That in itself may not be a big deal, since K8s doesn't require load balancing afaik, and other K8s components work, but...

Many crucial components, like CNI plugins, assume that the "kubernetes" service can be used to access the API server.

I can explain why it doesn't work, and which routes must be set up if a default route doesn't exist (summarized in the sketch below), but this should go in the kube-proxy documentation rather than generic K8s documentation. Then the kubeadm documentation can cross-reference it.
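For reference, the two workarounds found above as a sketch (10.96.0.0/16 is the KinD ClusterIP range from the examples; substitute your own Service CIDR):

ip route add 10.96.0.0/16 dev lo   # option 1: a route covering the ClusterIP range (externalIPs still break)
ip route add default dev lo        # option 2: a sink default route; stray packets are dropped as martians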

@danwinship Is there documentation for kube-proxy that can be extended, and referred to from the kubeadm documentation?

aojea (Member) commented Feb 17, 2024

K8s doesn't require load balancing afaik, and other K8s components work, but...

It does require it; Services are part of Conformance.

I can explain why it doesn't work, and which routes must be set up if a default route doesn't exist, but this should go in the kube-proxy documentation rather than generic K8s documentation

I think just the opposite: it would be a great addition to the generic documentation at

https://kubernetes.io/docs/concepts/cluster-administration/networking/ ; there is already a diagram there with the 3 conceptual networks: nodes, pods, and services...

uablrek (Contributor, Author) commented Feb 17, 2024

It does require it; Services are part of Conformance.

Yes, if you want to sell a conformant K8s platform. But the slogan on GitHub is:

Production-Grade Container Scheduling and Management

And I have heard of installations where K8s is used for SW management only (though I admit I haven't actually seen one).

aojea (Member) commented Feb 17, 2024

It's not about selling; the project defines APIs and behaviors. That is why e2e tests are so important: they explain and assert how the APIs behave, so that platforms have consistency...

Services are an integral part of Kubernetes. If people want to do custom things with pieces of it, or strip parts out, of course they can, no problem at all, but then all the bugs are theirs ;)

uablrek (Contributor, Author) commented Apr 24, 2024

As @neolit123 points out in #123120 (comment), the kubeadm documentation is probably sufficient (nobody is likely to look elsewhere).

Ref https://kubernetes.io/docs/setup/production-environment/tools/kubeadm/create-cluster-kubeadm/#network-setup

uablrek (Contributor, Author) commented Apr 24, 2024

/close

k8s-ci-robot (Contributor)

@uablrek: Closing this issue.

In response to this:

/close
