Move agent bootstrap lifecycle into binary (#157)
Pull request overview
This PR moves the host bootstrap/unbootstrap lifecycle (systemd unit install/removal + Azure CLI auth staging) into the aks-flex-node binary via a new bootstrap command, leaving agent as the long-lived systemd-launched daemon. It also updates install/uninstall scripts, e2e harnesses, and docs to follow the new bootstrap/daemon split.
Changes:
- Add `bootstrap` command and `pkg/daemon` lifecycle helpers to install/start/uninstall the embedded systemd unit and stage Azure CLI auth into a root-owned directory.
- Refactor command execution utilities (removing `pkg/utils/utils.go`, adding/expanding `pkg/utils/utilexec` and `utilio.FileExists`) and update Arc/drift call sites.
- Update install/uninstall scripts, e2e join/unjoin flows, and docs to use `bootstrap` for initial setup and systemd-managed operation.
Reviewed changes
Copilot reviewed 20 out of 21 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| scripts/uninstall.sh | Adjust uninstall flow to rely on unbootstrap for systemd removal; add SKIP_AZCLI support. |
| scripts/install.sh | Add local binary/version overrides and SKIP_AZCLI; remove systemd/unit installation from the script. |
| pkg/utils/utils.go | Remove legacy utility helpers (systemd + exec wrappers, FileExists, cleanup helpers). |
| pkg/utils/utilio/fs.go | Add utilio.FileExists helper used by drift codepaths. |
| pkg/utils/utilexec/exec.go | Expand exec helpers with slog streaming, systemctl helpers, and safe remove helpers. |
| pkg/drift/node_maintenance.go | Switch kubeconfig existence check to utilio.FileExists. |
| pkg/daemon/lifecycle.go | New: install/start/uninstall embedded systemd unit and copy Azure CLI auth files. |
| pkg/daemon/assets/aks-flex-node-agent.service | New embedded unit file used by bootstrap. |
| pkg/arc/helpers.go | Migrate Arc service checks to utilexec + slog logging. |
| pkg/arc/arc_uninstaller.go | Migrate cleanup execution/systemctl interactions to utilexec. |
| pkg/arc/arc_installer.go | Migrate dpkg/bash/curl/wget/azcmagent execution to utilexec. |
| main.go | Add bootstrap command registration; suppress exit on context.Canceled. |
| hack/qemu/README.md | Update VM instructions to run bootstrap instead of agent. |
| hack/e2e/lib/node-join.sh | Update E2E join to install via script and bootstrap via transient unit; add service validation. |
| hack/e2e/lib/node-join-token.sh | Update E2E unjoin to run uninstall script and validate removal. |
| hack/e2e/lib/node-join-msi.sh | Update E2E unjoin to run uninstall script and validate removal. |
| hack/e2e/lib/node-join-kubeadm.sh | Update E2E unjoin to run uninstall script and validate removal. |
| hack/e2e/README.md | Update join-method documentation to reference bootstrap. |
| docs/usage.md | Update usage guide to reflect bootstrap (systemd-managed) vs agent (daemon). |
| commands.go | Add bootstrap cobra command; make agent daemon-only; uninstall systemd unit during unbootstrap. |
| README.md | Update quickstart to use bootstrap. |
Summary
- Add a `bootstrap` command that installs the embedded systemd unit, bootstraps the nspawn node, then enables and starts the agent service.
- Change `agent` to only run the long-lived daemon loop used by systemd, and move unit install/uninstall behavior into Go.
Validation
- `go test ./...`
- `make lint`