docs: Initial docs for bare-metal support
justinsb committed Nov 30, 2023
1 parent 3cf96b5 commit b1349c6
Showing 1 changed file with 159 additions and 0 deletions.
159 changes: 159 additions & 0 deletions docs/metal.md
# Bare Metal Support

***Bare metal support is experimental, and may be removed at any time***

## Introduction

kOps has some experimental bare-metal support, specifically for nodes. The idea
we are exploring is that you can run your control plane in a cloud, but join
physical machines as nodes to that control plane, even though those nodes
are not located in the cloud.

This approach has some limitations and complexities; for example, the
cloud-controller-manager for the control plane won't be able to attach volumes
to the nodes, because they aren't cloud VMs. The advantage is that we can
implement bare-metal support for nodes first, before tackling the complexities
of the control plane.

## Walkthrough

Create a "normal" kOps cluster; here we are using GCE:

```
kops create cluster foo.k8s.local --cloud gce --zones us-east4-a
kops update cluster foo.k8s.local --yes --admin
```
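
Before moving on, it can help to confirm the control plane comes up healthy; a quick (optional) check, assuming the kubeconfig was exported by the command above:

```
kops validate cluster foo.k8s.local --wait 10m
kubectl get nodes
```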

Create a kops-system namespace, to hold the secrets that are generated as part
of joining a machine. Technically these aren't really secrets, because they
only hold public keys, so we will create a CRD for them in the future:

```
kubectl create ns kops-system
```

Create a RoleBinding and Role to allow kops-controller to read secrets:
```
kubectl apply --server-side -f - <<EOF
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: kops-controller
  namespace: kops-system
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: Role
  name: kops-controller:pki-verifier
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: User
  name: system:serviceaccount:kube-system:kops-controller
---
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
  name: kops-controller:pki-verifier
  namespace: kops-system
rules:
- apiGroups:
  - ""
  resources:
  - secrets
  verbs:
  - get
  - list
  - watch
EOF
```
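
To sanity-check the binding, you can ask the API server whether the kops-controller identity can now read secrets in the namespace, using ordinary `kubectl auth can-i` impersonation (the identity matches the subject above):

```
kubectl auth can-i get secrets \
  --namespace kops-system \
  --as system:serviceaccount:kube-system:kops-controller
```

This should print `yes`.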


### Create a VM

When first trying this out, we recommend creating a local VM instead of a true
bare-metal machine.

```
mkdir vm1
cd vm1
wget -O debian11.qcow2 https://cloud.debian.org/images/cloud/bullseye/20231013-1532/debian-11-nocloud-amd64-20231013-1532.qcow2
qemu-img create -o backing_file=debian11.qcow2,backing_fmt=qcow2 -f qcow2 vm1-root.qcow2 10G
qemu-system-x86_64 \
  -smp 2 \
  -enable-kvm \
  -netdev user,id=net0,net=192.168.76.0/24,dhcpstart=192.168.76.9,hostfwd=tcp::2222-:22 \
  -device rtl8139,netdev=net0 \
  -m 4G \
  -drive file=vm1-root.qcow2,if=virtio,format=qcow2 \
  -nographic -serial mon:stdio
```

Now log in as root (there is no password), and set up SSH and the machine name:

```
ssh-keygen -A
systemctl restart sshd
echo "vm1" > /etc/hostname
hostname vm1
```

Currently the `kops toolbox enroll` command only supports SSH agents for
the private key; get your public key with `ssh-add -L`, then manually add it
to the `authorized_keys` file on the VM.

```
mkdir ~/.ssh/
vim ~/.ssh/authorized_keys
```
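
Alternatively, instead of editing the file in `vim`, you can paste the key into a heredoc on the VM's serial console; this sketch assumes the output of `ssh-add -L` on the host is the single key you want:

```
mkdir -p ~/.ssh
cat >> ~/.ssh/authorized_keys <<'EOF'
<paste the output of `ssh-add -L` from the host here>
EOF
chmod 700 ~/.ssh && chmod 600 ~/.ssh/authorized_keys
```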

After you've done this, open a new terminal on the host; SSH should now work:
`ssh root@127.0.0.1 -p 2222 uptime`


### Joining the VM to the cluster

```
go run ./cmd/kops toolbox enroll --cluster foo.k8s.local --instance-group nodes-us-east4-a --ssh-user root --host 127.0.0.1 --ssh-port 2222
```

Within a minute or so, the node should appear in `kubectl get nodes`.
If the node doesn't appear, first check the kops-configuration log:
`ssh root@127.0.0.1 -p 2222 journalctl -u kops-configuration`

And then if that looks OK (ends in "success"), check the kubelet log:
`ssh root@127.0.0.1 -p 2222 journalctl -u kubelet`.
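
If the node never registers at all, the kops-controller logs on the cluster side can also be useful; assuming the pods carry the usual `k8s-app=kops-controller` label in `kube-system`, something like:

```
kubectl logs -n kube-system -l k8s-app=kops-controller --tail=100
```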

### The state of the node

You should observe that the node is running, and pods are scheduled to the node.

```
kubectl get pods -A --field-selector spec.nodeName=vm1
```
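
It can also be useful to look at the node object itself; the internal IP it registered (the QEMU user-network address) is the address the control plane will later try to reach:

```
kubectl get node vm1 -o wide
```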

Cilium will likely be running on the node.

The GCE PD CSI driver is scheduled, but is likely crash-looping
because it can't reach the GCE metadata service. You can see this from the
logs on the VM in `/var/log/containers`
(e.g. `ssh root@127.0.0.1 -p 2222 cat /var/log/containers/*gce-pd-driver*.log`).

If you try to use `kubectl logs`, you will see an error like the one below; this
indicates another problem, namely that the control plane cannot reach the kubelet:
`Error from server: Get "https://192.168.76.9:10250/containerLogs/gce-pd-csi-driver/csi-gce-pd-node-l2rm8/csi-driver-registrar": dial tcp 192.168.76.9:10250: i/o timeout`
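
You can confirm that the kubelet itself is listening inside the VM, which shows the timeout is a connectivity problem between the control plane and the VM rather than a kubelet problem; this assumes `ss` (from iproute2) is present on the image:

```
ssh root@127.0.0.1 -p 2222 "ss -tlnp | grep 10250"
```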

### Cleanup

Quit the qemu VM with Ctrl-a x.

Delete the node and the secret:
```
kubectl delete node vm1
kubectl delete secret -n kops-system debian
```

If you're done with the cluster, delete it as well:
```
kops delete cluster foo.k8s.local --yes
```
