Rolling update for Talos nodes #20

Open
kvaps opened this issue Feb 19, 2024 · 1 comment
Labels
help wanted Extra attention is needed

Comments

@kvaps
Member

kvaps commented Feb 19, 2024

During cluster setup, the user has to upload secrets.yaml and cluster.conf into Kubernetes, e.g.:

kubectl create secret generic -n cozy-system cozy-talos-bootstrap --from-file secrets.yaml --from-file cluster.conf

This will start a reconciliation controller, which checks all the nodes in the cluster and performs a rolling update on them:

talosctl -e <node_address> -n <node_address> upgrade --preserve=<true|false> -i <image>

During the upgrade, the Talos config on the node should also be updated to contain the new image.
In the talos-bootstrap script we usually do that immediately before the upgrade operation.

@kvaps
Member Author

kvaps commented Feb 19, 2024

I think we need to introduce a new resource to track the process of updates:

apiVersion: cozystack.io/v1alpha1
kind: TalosNode
metadata:
  name: srv1
  ownerReferences:
  - apiVersion: v1
    kind: Node
    name: srv1
    uid: d7ba2238-d45a-4edc-a348-aa5157afc730
spec:
  image: ghcr.io/aenix-io/cozystack/installer:v0.0.2
  suspend: false
status:
  image: ghcr.io/aenix-io/cozystack/installer:v0.0.2
  lastAppliedPatchHash: 01ba4719c80b6fe911b091a7c05124b64eeece964e09c058ef8f9805daca546b
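
To make the intent of the fields concrete, here is a minimal sketch of how a controller might model this resource and decide whether a node still needs work. It assumes that `suspend: true` means "skip this node" and that a node is up to date when `status.image` matches `spec.image` and `status.lastAppliedPatchHash` matches the hash of the current patch; neither rule is spelled out above, and all class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class TalosNodeSpec:
    image: str
    suspend: bool = False


@dataclass
class TalosNodeStatus:
    image: str = ""
    last_applied_patch_hash: str = ""


@dataclass
class TalosNode:
    name: str
    spec: TalosNodeSpec
    status: TalosNodeStatus


def from_manifest(manifest: dict) -> TalosNode:
    """Build a TalosNode from an already-parsed manifest dictionary."""
    spec = manifest.get("spec", {})
    status = manifest.get("status", {})
    return TalosNode(
        name=manifest["metadata"]["name"],
        spec=TalosNodeSpec(image=spec["image"], suspend=spec.get("suspend", False)),
        status=TalosNodeStatus(
            image=status.get("image", ""),
            last_applied_patch_hash=status.get("lastAppliedPatchHash", ""),
        ),
    )


def needs_update(node: TalosNode, desired_patch_hash: str) -> bool:
    """Assumed rule: suspended nodes are skipped; otherwise compare image and patch hash."""
    if node.spec.suspend:
        return False
    return (
        node.status.image != node.spec.image
        or node.status.last_applied_patch_hash != desired_patch_hash
    )
```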

The talos-updater controller should take the following arguments:

--image=ghcr.io/aenix-io/cozystack/installer:v0.0.2
--preserve=false
--patch-file=common-node-parameters.yaml
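
For illustration only, these three arguments could be parsed as sketched below. The flag names come from the list above; the choice of which flags are required, the default for `--preserve`, and the accepted boolean spellings are assumptions.

```python
import argparse


def parse_bool(value: str) -> bool:
    """Parse 'true'/'false' style values, as in --preserve=false."""
    lowered = value.strip().lower()
    if lowered in ("true", "1", "yes"):
        return True
    if lowered in ("false", "0", "no"):
        return False
    raise argparse.ArgumentTypeError(f"expected true or false, got {value!r}")


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="talos-updater")
    parser.add_argument("--image", required=True,
                        help="installer image every node should run")
    parser.add_argument("--preserve", type=parse_bool, default=False,
                        help="value passed to talosctl upgrade --preserve")
    parser.add_argument("--patch-file", required=True,
                        help="path to the common node parameters patch")
    return parser
```

Calling `build_parser().parse_args()` in the controller's entry point would then expose the values as `args.image`, `args.preserve` and `args.patch_file`.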

When a new version of cozystack is installed, it will update talos-updater to run with the new arguments; talos-updater should then update all the nodes in the cluster one by one.

common-node-parameters.yaml contains parameters that must be applied as a merge patch, for example:

machine:
  kubelet:
    nodeIP:
      validSubnets:
      - 192.168.100.0/24
  kernel:
    modules:
    - name: openvswitch
    - name: drbd
      parameters:
        - usermode_helper=disabled
    - name: zfs
  install:
    image: ghcr.io/aenix-io/cozystack/talos:v1.6.4
  files:
  - content: |
      [plugins]
        [plugins."io.containerd.grpc.v1.cri"]
          device_ownership_from_security_context = true
    path: /etc/cri/conf.d/20-customization.part
    op: create

cluster:
  network:
    cni:
      name: none
    podSubnets:
    - 10.244.0.0/16
    serviceSubnets:
    - 10.96.0.0/16
  allowSchedulingOnControlPlanes: true
  controllerManager:
    extraArgs:
      bind-address: 0.0.0.0
  scheduler:
    extraArgs:
      bind-address: 0.0.0.0
  proxy:
    disabled: true
  discovery:
    enabled: false
  etcd:
    advertisedSubnets:
    - 192.168.100.0/24
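
To show what applying such a file "as a merge patch" could mean, the sketch below implements a simplified recursive merge over already-parsed dictionaries, plus a SHA-256 digest of the patch file. Both are assumptions: Talos has its own patching rules, which may differ (for example in how lists are combined), and nothing above states how `lastAppliedPatchHash` is computed. The function names are hypothetical.

```python
import copy
import hashlib


def merge_patch(base: dict, patch: dict) -> dict:
    """Return a new dict: nested dicts merge recursively, other values are replaced."""
    result = copy.deepcopy(base)
    for key, value in patch.items():
        if isinstance(value, dict) and isinstance(result.get(key), dict):
            result[key] = merge_patch(result[key], value)
        else:
            # Scalars and lists from the patch replace the base value wholesale.
            result[key] = copy.deepcopy(value)
    return result


def patch_hash(patch_bytes: bytes) -> str:
    """Hex SHA-256 digest of the raw patch file contents (assumed hash scheme)."""
    return hashlib.sha256(patch_bytes).hexdigest()
```

Storing the digest in the node's status would let the controller notice that the patch file changed even when the image did not.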

Only one node per cluster may be upgraded at a time.
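
One possible way to enforce this in a reconcile loop is sketched below: on each pass the controller picks at most one node, and picks none while another upgrade is still running. The `NodeState` fields (in particular `upgrading`) and the name-based ordering are hypothetical and not part of the proposed resource.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class NodeState:
    name: str
    desired_image: str
    current_image: str
    upgrading: bool = False   # hypothetical in-progress marker
    suspended: bool = False


def next_node_to_upgrade(nodes: Sequence[NodeState]) -> Optional[str]:
    """Return the name of the single node to upgrade now, or None."""
    # Never start a second upgrade while one is in progress.
    if any(node.upgrading for node in nodes):
        return None
    # Deterministic order so repeated reconciles pick the same node.
    for node in sorted(nodes, key=lambda n: n.name):
        if not node.suspended and node.current_image != node.desired_image:
            return node.name
    return None
```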

@kvaps kvaps added the help wanted Extra attention is needed label Apr 17, 2024