Control plane load balancer SSL health check fails #17
The load balancer created for the k3s cluster:
{
"LoadBalancerName": "k3-test-8-apiserver",
"DNSName": "k3-test-8-apiserver-2070937331.us-east-1.elb.amazonaws.com",
"CanonicalHostedZoneName": "k3-test-8-apiserver-2070937331.us-east-1.elb.amazonaws.com",
"CanonicalHostedZoneNameID": "Z35SXDOTRQ7X7K",
"ListenerDescriptions": [
{
"Listener": {
"Protocol": "TCP",
"LoadBalancerPort": 6443,
"InstanceProtocol": "TCP",
"InstancePort": 6443
},
"PolicyNames": []
}
],
"Policies": {
"AppCookieStickinessPolicies": [],
"LBCookieStickinessPolicies": [],
"OtherPolicies": []
},
"BackendServerDescriptions": [],
"AvailabilityZones": [
"us-east-1a"
],
"Subnets": [
"subnet-0b9782cedb617555a"
],
"VPCId": "vpc-06f0de2c855d7990c",
"Instances": [
{
"InstanceId": "i-0fe77be91e59bf886"
}
],
"HealthCheck": {
"Target": "SSL:6443",
"Interval": 10,
"Timeout": 5,
"UnhealthyThreshold": 3,
"HealthyThreshold": 5
},
"SourceSecurityGroup": {
"OwnerAlias": "908067222188",
"GroupName": "k3-test-8-apiserver-lb"
},
"SecurityGroups": [
"sg-0cf2f8a9f723f1240"
],
"CreatedTime": "2022-12-12T14:24:33.550000+00:00",
"Scheme": "internet-facing"
}
I can successfully create aws+kubeadm clusters, which use a similarly configured LB:
{
"LoadBalancerName": "test4-apiserver",
"DNSName": "test4-apiserver-833021180.eu-central-1.elb.amazonaws.com",
"CanonicalHostedZoneName": "test4-apiserver-833021180.eu-central-1.elb.amazonaws.com",
"CanonicalHostedZoneNameID": "Z215JYRZR1TBD5",
"ListenerDescriptions": [
{
"Listener": {
"Protocol": "TCP",
"LoadBalancerPort": 6443,
"InstanceProtocol": "TCP",
"InstancePort": 6443
},
"PolicyNames": []
}
],
"Policies": {
"AppCookieStickinessPolicies": [],
"LBCookieStickinessPolicies": [],
"OtherPolicies": []
},
"BackendServerDescriptions": [],
"AvailabilityZones": [
"eu-central-1b",
"eu-central-1c",
"eu-central-1a"
],
"Subnets": [
"subnet-02183db7a7be39f9f",
"subnet-097f8c9c6eabd1b1e",
"subnet-0ee4ca0f1bd467507"
],
"VPCId": "vpc-0ad0425bf1e41496c",
"Instances": [
{
"InstanceId": "i-01a294ee6d59db373"
}
],
"HealthCheck": {
"Target": "SSL:6443",
"Interval": 10,
"Timeout": 5,
"UnhealthyThreshold": 3,
"HealthyThreshold": 5
},
"SourceSecurityGroup": {
"OwnerAlias": "908067222188",
"GroupName": "test4-apiserver-lb"
},
"SecurityGroups": [
"sg-0fc014bfd24c431e6"
],
"CreatedTime": "2022-12-07T16:30:18.010000+00:00",
"Scheme": "internet-facing"
}
Wireshark: the health check request from the AWS load balancer (screenshot).
Wireshark: the response from the k3s apiserver (screenshot).
For comparison, this is what a kubeadm based deployment responds to the AWS LB (screenshot).
This correlates with the k3s logs:

$ journalctl -u k3s
....
Dec 12 16:06:52 ip-10-0-149-210 k3s[1337]: time="2022-12-12T16:06:52.065175784Z" level=info msg="Cluster-Http-Server 2022/12/12 16:06:52 http: TLS handshake error from 10.0.11.159:9252: tls: no cipher suite supported by both client and server"
# cat /etc/rancher/k3s/config.yaml
cluster-init: true
disable-cloud-controller: true
kube-apiserver-arg:
- anonymous-auth=true
- tls-cipher-suites=TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,TLS_RSA_WITH_AES_128_GCM_SHA256,TLS_RSA_WITH_AES_256_GCM_SHA384
So, the k3s apiserver is explicitly configured to accept these cipher suites, and the same cipher is proposed by the client in its handshake. There must be something else going on.
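One thing worth keeping in mind (standard TLS behaviour, not something stated in this thread): which of the configured suites a server can actually offer also depends on the key type of its serving certificate. A minimal sketch that labels each suite from the config above with the certificate type it requires:

```go
package main

import (
	"crypto/tls"
	"fmt"
	"strings"
)

func main() {
	// The suites listed in the kube-apiserver tls-cipher-suites argument above.
	configured := []uint16{
		tls.TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,
		tls.TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,
		tls.TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,
		tls.TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,
		tls.TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,
		tls.TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,
		tls.TLS_RSA_WITH_AES_128_GCM_SHA256,
		tls.TLS_RSA_WITH_AES_256_GCM_SHA384,
	}
	for _, id := range configured {
		name := tls.CipherSuiteName(id)
		// ECDHE_ECDSA suites can only be offered with an EC certificate;
		// ECDHE_RSA and plain RSA key-exchange suites need an RSA certificate.
		need := "RSA certificate"
		if strings.Contains(name, "ECDHE_ECDSA") {
			need = "ECDSA (EC) certificate"
		}
		fmt.Printf("%-48s -> needs %s\n", name, need)
	}
}
```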
Another example: the following works:

$ curl -k https://localhost:6443 --ciphers DHE-RSA-AES128-GCM-SHA256

but it fails with:

$ curl -k https://localhost:6443 --ciphers DHE-RSA-AES128-GCM-SHA256 --tlsv1.2 --tls-max 1.2
curl: (35) error:14094410:SSL routines:ssl3_read_bytes:sslv3 alert handshake failure

I find the TLS cipher/version matrix very hard to keep up with; I'll try to figure out which ciphers need to be enabled so that k3s works with TLS 1.2.
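A Go variant of the same per-cipher probing, in case it helps sidestep the curl/OpenSSL naming matrix. This is a sketch that assumes the apiserver is reachable on localhost:6443 and skips certificate verification:

```go
package main

import (
	"crypto/tls"
	"fmt"
)

func main() {
	// Try a TLS 1.2 handshake with each suite Go supports, one at a time,
	// and report which ones the server is willing to negotiate.
	for _, cs := range tls.CipherSuites() {
		tls12 := false
		for _, v := range cs.SupportedVersions {
			if v == tls.VersionTLS12 {
				tls12 = true
			}
		}
		if !tls12 {
			continue // skip TLS 1.3-only suites
		}
		cfg := &tls.Config{
			InsecureSkipVerify: true, // self-signed apiserver certificate
			MinVersion:         tls.VersionTLS12,
			MaxVersion:         tls.VersionTLS12,
			CipherSuites:       []uint16{cs.ID},
		}
		conn, err := tls.Dial("tcp", "localhost:6443", cfg)
		if err != nil {
			fmt.Printf("%-48s rejected (%v)\n", cs.Name, err)
			continue
		}
		conn.Close()
		fmt.Printf("%-48s accepted\n", cs.Name)
	}
}
```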
Thanks @mkmik, I got caught on this one as well. I meant to post an issue when I was first looking into running this on AWS. The only workaround I got working was changing the health check from SSL to TCP after cluster creation. I started a thread in Slack about this: https://kubernetes.slack.com/archives/CD6U2V71N/p1637946333265800. I never cracked it and had to move on to other things, but it looks like you are already at the point where I gave up. @richardcase works on cluster-api aws and is building an rke2 provider. He may have some insights into this issue.
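For reference, that post-creation workaround can also be scripted. A hedged sketch using the AWS SDK for Go (classic ELB API), reusing the load balancer name and thresholds from the dump above; the region and credential configuration are assumptions:

```go
package main

import (
	"log"

	"github.com/aws/aws-sdk-go/aws"
	"github.com/aws/aws-sdk-go/aws/session"
	"github.com/aws/aws-sdk-go/service/elb"
)

func main() {
	sess := session.Must(session.NewSession(&aws.Config{Region: aws.String("us-east-1")}))
	svc := elb.New(sess)

	// Switch the classic ELB health check from SSL:6443 to TCP:6443,
	// keeping the same interval/timeout/thresholds as the generated LB.
	_, err := svc.ConfigureHealthCheck(&elb.ConfigureHealthCheckInput{
		LoadBalancerName: aws.String("k3-test-8-apiserver"),
		HealthCheck: &elb.HealthCheck{
			Target:             aws.String("TCP:6443"),
			Interval:           aws.Int64(10),
			Timeout:            aws.Int64(5),
			UnhealthyThreshold: aws.Int64(3),
			HealthyThreshold:   aws.Int64(5),
		},
	})
	if err != nil {
		log.Fatal(err)
	}
	log.Println("health check switched to TCP:6443")
}
```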
According to sslyze, the apiserver only supports the elliptic-curve cipher suites (possibly because the certificate has been created that way?).
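One way to double-check sslyze's finding without extra tooling is to look at what a default Go client negotiates and what kind of key the presented certificate carries. A minimal sketch, assuming the apiserver is reachable on localhost:6443:

```go
package main

import (
	"crypto/tls"
	"fmt"
	"log"
)

func main() {
	conn, err := tls.Dial("tcp", "localhost:6443", &tls.Config{InsecureSkipVerify: true})
	if err != nil {
		log.Fatal(err)
	}
	defer conn.Close()

	state := conn.ConnectionState()
	leaf := state.PeerCertificates[0]
	fmt.Println("negotiated suite:   ", tls.CipherSuiteName(state.CipherSuite))
	fmt.Println("server key type:    ", leaf.PublicKeyAlgorithm) // ECDSA vs RSA
	fmt.Println("cert signature alg: ", leaf.SignatureAlgorithm)
}
```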
I methodically compared the difference in behavior between a CAPA kubeadm-based deployment and a cluster-api-k3s based one. I ran a sample Go program with the config generated by https://ssl-config.mozilla.org/#server=go&version=1.19&config=intermediate&hsts=false&guideline=5.6, which is the same config generator used by k3s itself, see https://github.com/k3s-io/k3s/blob/f8b661d590ecd1ed2ed04b3c51ff5e6d67cb092b/pkg/cli/server/server.go#L380

// generated 2022-12-13, Mozilla Guideline v5.6, Go 1.14.4, intermediate configuration, no HSTS
// https://ssl-config.mozilla.org/#server=go&version=1.14.4&config=intermediate&hsts=false&guideline=5.6
package main

import (
	"crypto/tls"
	"log"
	"net/http"
)

func main() {
	mux := http.NewServeMux()
	mux.HandleFunc("/", func(w http.ResponseWriter, req *http.Request) {
		w.Write([]byte("This server is running the Mozilla intermediate configuration.\n"))
	})

	cfg := &tls.Config{
		MinVersion:               tls.VersionTLS10,
		PreferServerCipherSuites: true,
		CipherSuites: []uint16{
			tls.TLS_ECDHE_ECDSA_WITH_AES_128_GCM_SHA256,
			tls.TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256,
			tls.TLS_ECDHE_ECDSA_WITH_AES_256_GCM_SHA384,
			tls.TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384,
			tls.TLS_ECDHE_ECDSA_WITH_CHACHA20_POLY1305,
			tls.TLS_ECDHE_RSA_WITH_CHACHA20_POLY1305,
			tls.TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA256,
			tls.TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA256,
			tls.TLS_ECDHE_ECDSA_WITH_AES_128_CBC_SHA,
			tls.TLS_ECDHE_RSA_WITH_AES_128_CBC_SHA,
			tls.TLS_ECDHE_ECDSA_WITH_AES_256_CBC_SHA,
			tls.TLS_ECDHE_RSA_WITH_AES_256_CBC_SHA,
			tls.TLS_RSA_WITH_AES_128_GCM_SHA256,
			tls.TLS_RSA_WITH_AES_256_GCM_SHA384,
			tls.TLS_RSA_WITH_AES_128_CBC_SHA256,
			tls.TLS_RSA_WITH_AES_128_CBC_SHA,
			tls.TLS_RSA_WITH_AES_256_CBC_SHA,
			tls.TLS_RSA_WITH_3DES_EDE_CBC_SHA,
		},
	}

	srv := &http.Server{
		Addr:      ":6443",
		Handler:   mux,
		TLSConfig: cfg,
		// Consider setting ReadTimeout, WriteTimeout, and IdleTimeout
		// to prevent connections from taking resources indefinitely.
	}

	log.Fatal(srv.ListenAndServeTLS(
		"/root/apiserver.crt",
		"/root/apiserver.key",
	))
}
I shut down the api-server and ran this program on port 6443, looking at the TLS handshakes performed by the AWS LB using tcpdump. Let's call the two control plane machines HostA and HostB: the same program worked on HostA and didn't work on HostB. tcpdump revealed which cipher suite was being negotiated in each case. I then modified the test program to include the ciphers included by default by k3s:
It didn't work until I added an additional cipher suite. This cipher is not chosen when I use the certificate created during k3s initialization. I conclude the certificate is not compatible with ECDHE.
I have only layperson knowledge about how TLS works, and I assumed that the DH exchange didn't depend on the asymmetric crypto used to verify the certificate signature (RSA here, I assumed). Yesterday I quickly checked the CA certificate and found nothing unusual:
However, when looking at the actual certificate that is used by the server (which is signed by the CA but is not the CA certificate), I can see it's using elliptic curve crypto:
The file name is that of the certificate used by the server. There are still some things to investigate.
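If the EC serving certificate really is the culprit, the k3s log message can be reproduced in isolation: a Go TLS server holding an ECDSA certificate has no suite in common with a client that only offers RSA-key-exchange ciphers. A self-contained sketch of that situation (it does not claim to use the exact cipher list the AWS health check offers):

```go
package main

import (
	"crypto/ecdsa"
	"crypto/elliptic"
	"crypto/rand"
	"crypto/tls"
	"crypto/x509"
	"crypto/x509/pkix"
	"fmt"
	"math/big"
	"net"
	"time"
)

func main() {
	// Self-signed ECDSA certificate, standing in for the EC serving cert
	// that k3s generated for the apiserver.
	key, err := ecdsa.GenerateKey(elliptic.P256(), rand.Reader)
	if err != nil {
		panic(err)
	}
	tmpl := &x509.Certificate{
		SerialNumber: big.NewInt(1),
		Subject:      pkix.Name{CommonName: "kube-apiserver"},
		NotBefore:    time.Now(),
		NotAfter:     time.Now().Add(time.Hour),
	}
	der, err := x509.CreateCertificate(rand.Reader, tmpl, tmpl, &key.PublicKey, key)
	if err != nil {
		panic(err)
	}
	serverCfg := &tls.Config{
		Certificates: []tls.Certificate{{Certificate: [][]byte{der}, PrivateKey: key}},
	}

	// Client that only offers an RSA-key-exchange suite at TLS 1.2, as a
	// stand-in for a probe with no ECDSA-capable suites in common with us.
	clientCfg := &tls.Config{
		InsecureSkipVerify: true,
		MinVersion:         tls.VersionTLS12,
		MaxVersion:         tls.VersionTLS12,
		CipherSuites:       []uint16{tls.TLS_RSA_WITH_AES_128_GCM_SHA256},
	}

	c, s := net.Pipe()
	done := make(chan struct{})
	go func() {
		defer close(done)
		// Typically fails with "tls: no cipher suite supported by both client and server".
		fmt.Println("server:", tls.Server(s, serverCfg).Handshake())
	}()
	fmt.Println("client:", tls.Client(c, clientCfg).Handshake())
	<-done
}
```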
kubernetes-sigs/cluster-api-provider-aws#3124 implemented the healthCheckProtocol field for the control plane load balancer.
Trying out:

apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
kind: AWSCluster
metadata:
  name: k3-test-8
spec:
  # bastion:
  #   enabled: true
  network:
    vpc:
      availabilityZoneUsageLimit: 1
  region: us-east-1
  sshKeyName: default
+  controlPlaneLoadBalancer:
+    healthCheckProtocol: TCP
Confirmed: the load balancer no longer uses the SSL health check.
Ok, successfully created a k3s cluster!

$ clusterctl describe cluster k3-test-9
NAME READY SEVERITY REASON SINCE MESSAGE
Cluster/k3-test-9 True 113s
├─ClusterInfrastructure - AWSCluster/k3-test-9 True 3m1s
├─ControlPlane - KThreesControlPlane/k3-test-9-control-plane True 113s
│ └─Machine/k3-test-9-control-plane-8lgzc True 2m35s
└─Workers
└─MachineDeployment/k3-test-9-md-0 True 47s
└─Machine/k3-test-9-md-0-f8c778d8f-mp4qb True 83s
@zawachte thanks for the links; they contained the necessary references to learn that CAPA can actually control the healthcheck protocol. I propose closing this issue with https://github.com/zawachte/cluster-api-k3s/pull/18
Oh! This was the feature I was looking for years ago! Happy to see they implemented it and everything is working smoothly. Great find!
After applying the sample config

$ kubectl apply -f samples/aws/k3s-cluster.yaml

cluster-api-k3s successfully creates the VPC, control plane instance and load balancer.
However, the load balancer doesn't like how the apiserver on the control plane machine speaks HTTPS: the SSL:6443 health check never reports the instance as healthy.
When I change the health check type to TCP it works just fine: the rest of the CAPI machinery successfully connects to the apiserver and proceeds with the bootstrap of the worker node. The CA and the certificates appear to me to be correct.
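To make the SSL-vs-TCP distinction concrete, this is roughly what the two classic ELB health check types amount to from the instance's point of view. A sketch only: the exact TLS versions and ciphers the ELB probe offers are not documented in this thread, and the address below is the control plane node seen in the logs (substitute your own):

```go
package main

import (
	"crypto/tls"
	"fmt"
	"net"
	"time"
)

const addr = "10.0.149.210:6443" // control plane node from this thread; substitute your own

func main() {
	// TCP health check: the probe only needs the TCP connection to be accepted.
	c, err := net.DialTimeout("tcp", addr, 5*time.Second)
	if err != nil {
		fmt.Println("TCP check would fail:", err)
	} else {
		c.Close()
		fmt.Println("TCP check would pass")
	}

	// SSL health check: the probe additionally requires a completed TLS handshake.
	tc, err := tls.DialWithDialer(&net.Dialer{Timeout: 5 * time.Second}, "tcp", addr,
		&tls.Config{InsecureSkipVerify: true})
	if err != nil {
		fmt.Println("SSL check would fail:", err)
		return
	}
	tc.Close()
	fmt.Println("SSL check would pass")
}
```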