Enable the ksail floating-IP stable API endpoint on prod (one-line ksail.prod.yaml change) #2492

Description

@devantler

🤖 Generated by the Daily AI Assistant

Problem. Part of #2120: prod's kubeconfig/talosconfig point at the first control-plane node's public IP, so a CP-node recreate breaks the API endpoint (DR runbook Scenario 9 exists because of this). The ksail-side work is complete and released: floatingIPEnabled shipped in ksail v7.120.0 (PRs devantler-tech/ksail#5699, devantler-tech/ksail#5719, devantler-tech/ksail#5723) and this repo already pins ksail 7.135.0 in CI.

Proposed change — one line in ksail.prod.yaml (agent-protected file, so this is deliberately left as a maintainer-applied change rather than an agent PR):

spec:
  provider:
    hetzner:
      floatingIPEnabled: true   # add; floatingIPLocation defaults to the cluster location (fsn1)

What it does on the next ksail cluster update: it reserves a Hetzner floating IP (small monthly cost) and renders it as the cluster endpoint and as a certificate SAN. Existing CP node IPs stay in the SANs, so current kubeconfigs keep working. It also adds the Talos VIP block to the control planes so the elected leader claims the address; this embeds the hcloud token in the CP machine config.
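For reviewers who want a picture of what lands on the control planes, here is a rough sketch of the kind of Talos machine-config patch this implies. It uses the legacy Talos `machine.network.interfaces[].vip` schema with the `hcloud` provider. The interface name, the placeholder address (203.0.113.10, a documentation range) and the exact layout are assumptions for illustration; the patch ksail actually renders may differ.

```yaml
# Illustrative only: not copied from ksail output.
machine:
  network:
    interfaces:
      - interface: eth0              # assumed NIC name
        dhcp: true
        vip:
          ip: 203.0.113.10           # placeholder for the reserved floating IP
          hcloud:
            apiToken: <hcloud-token> # the token ends up in the CP machine config
cluster:
  controlPlane:
    endpoint: https://203.0.113.10:6443   # floating IP becomes the cluster endpoint
  apiServer:
    certSANs:
      - 203.0.113.10                 # new SAN; existing node-IP SANs are kept alongside it
```

The token line is the part worth a second look during review, since anyone who can read the control-plane machine config can read that token.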

Blast radius / rollout notes (why this is promotion-gated): applying the change rolls the control-plane machine configs (Talos VIP + cert SANs). Existing KUBE_CONFIG/TALOS_CONFIG CI secrets stay valid because the node-IP SANs are retained. After the rollout settles, re-point the kubeconfig context at the floating IP so Scenario-9-class endpoint breakage goes away for CP recreates. Full rebuilds still need the Scenario 9 secret refresh, because the PKI changes.
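As a sketch of the re-pointing step, only the `server` field of the kubeconfig cluster entry needs to change; the CA data and user credentials stay as they are because the PKI is untouched. The cluster name `prod` and the address 203.0.113.10 below are placeholders, and port 6443 assumes the default Kubernetes API port.

```yaml
# Illustrative kubeconfig fragment; names and address are placeholders.
apiVersion: v1
kind: Config
clusters:
  - name: prod                              # hypothetical cluster name
    cluster:
      server: https://203.0.113.10:6443     # was: https://<first-CP-node-IP>:6443
      certificate-authority-data: <unchanged>
```

The talosconfig `endpoints` list presumably needs the same treatment, and the KUBE_CONFIG/TALOS_CONFIG CI secrets would be refreshed from the updated files.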

Acceptance criteria:

  • floatingIPEnabled: true live in ksail.prod.yaml, deploy green
  • Floating IP answers kubectl get --raw=/readyz via a kubeconfig pointed at it
  • DR runbook Scenario 9 updated to reflect the stable endpoint (follow-up docs PR — I can ship that once this lands)

Verified constraints: prod is Talos×Hetzner with 3 CPs (VIP-compatible), there are no existing VIP/floating-IP Talos patches to conflict with, and the Hetzner server limit is unaffected (a floating IP is a virtual address, not a server).
