feat: cocoon vm logs to print per-VM hypervisor log#29
Conversation
There was a problem hiding this comment.
Pull request overview
This PR adds a cocoon vm logs <VM> [-f|--follow] command to surface the per-VM hypervisor (VMM) process log file via the CLI, removing the need for users to manually find and tail backend-specific paths under log_dir.
Changes:
- Extend the
hypervisor.Hypervisorinterface withLogPath(ctx, ref)and implement shared path resolution inhypervisor.Backend. - Standardize hypervisor log file naming by deriving the launch-time log path from the same helper used for reads.
- Add
vm logscommand (with optional follow streaming) and document it in the README.
Reviewed changes
Copilot reviewed 7 out of 7 changed files in this pull request and generated 2 comments.
Show a summary per file
| File | Description |
|---|---|
| README.md | Documents the new vm logs command and flags. |
| hypervisor/hypervisor.go | Adds LogPath to the Hypervisor interface. |
| hypervisor/helpers.go | Implements shared log path building and ref→VM ID resolution. |
| hypervisor/firecracker/start.go | Uses shared helper for Firecracker log file path. |
| hypervisor/cloudhypervisor/start.go | Uses shared helper for Cloud Hypervisor log file path. |
| cmd/vm/lifecycle.go | Implements vm logs handler and streaming logic. |
| cmd/vm/commands.go | Registers the vm logs subcommand and --follow/-f flag. |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
There was a problem hiding this comment.
Pull request overview
Copilot reviewed 7 out of 7 changed files in this pull request and generated 1 comment.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
Adds a new subcommand that streams the per-VM cloud-hypervisor.log / firecracker.log already written under <log_dir>/<backend>/<vmID>/. The file captures VMM-side activity (device init, API errors, virtio warnings, shutdown) — guest console output stays on cocoon vm console. Hypervisor interface gains LogPath(ctx, ref); both backends resolve the ref to a VM ID and return <VMLogDir>/<backend>.log. The launch-time filename literals in start.go are switched to the new logFileName constant so the read and write paths can't drift. -f/--follow polls the file every 500ms and exits on Ctrl-C through the existing root-level signal context.
The CH and FC LogPath methods were structurally identical (ResolveRef →
VMLogDir → join filename); only the literal log filename differed, and
that already follows the backend type ('cloud-hypervisor' / 'firecracker').
Move both helpers onto Backend so the embedded struct satisfies the
Hypervisor interface, derive the filename from b.Typ + '.log', and let
launch-time write paths reuse LogFilePath instead of carrying their own
filename consts. Drops the duplicated logFileName constants and the
per-backend method bodies.
- Fold LogFilePath/LogPath into hypervisor/helpers.go alongside
PIDFilePath / SocketPath (matches sibling-style: same shape, same
Backend method, all per-VM path builders live in one file).
- Replace tailFollow polling with utils.WatchFile (fsnotify, already
used by vm status). 100ms debounce, exits on ctx.Done() or watcher
channel close. Eliminates ~120 idle read syscalls/min.
- Reword the os.IsNotExist error to lead with the action verb
("open log %s: VM may not have been started yet"), matching the
rest of cocoon's error format.
Open the log file first; only start the WatchFile goroutine after the open succeeds, and bound it to a child ctx canceled on streamLog return. Previously the watcher fired up before os.Open, so a missing log file returned the error but left the watcher running until the parent ctx finished — fine for the CLI exit path, but a leak under the cocoon library API and pointless work either way.
Both backends'\'' start paths reopen the VM log with O_TRUNC (same inode), so a follower whose fd was past the previous EOF would silently miss everything the new boot writes until the file grew past the old offset. Compare current fd position against the file'\''s current size on every fsnotify event; if size shrunk, seek back to 0 before io.Copy so the new boot'\''s log is replayed from byte 0.
LogPath now reads logDir from the persisted VMRecord (with a fallback to b.Conf.VMLogDir for legacy records that predate LogDir persistence). A subsequent --log-dir config change no longer points vm logs at a path that doesn't hold the running VM's actual log. README clarifies the file lives under the VM's log dir while the VM exists but is truncated on every vm run / vm start; -f handles the rewind already (added in 27eb5e2).
Agent-Logs-Url: https://github.com/cocoonstack/cocoon/sessions/8eea672e-ec8b-43da-8b67-0cb238f855d0 Co-authored-by: CMGS <506216+CMGS@users.noreply.github.com>
…he file Agent-Logs-Url: https://github.com/cocoonstack/cocoon/sessions/c602c1f4-ba97-42ba-8306-d81e66623ebf Co-authored-by: CMGS <506216+CMGS@users.noreply.github.com>
Codex review on PR #29 surfaced a real race: when stop+start truncates and regrows the log to the same byte length within the 100ms WatchFile debounce window, info.Size() < pos is false and the size-only rewind misses the new boot's lines entirely. Repro: deterministic CH boot warnings, follower stuck at old EOF, new boot's timestamped lines silently dropped. Switch the regen check to a head-signature compare: ReadAt the first 64 bytes (CH/FC stamp a unique boot timestamp at line 1) and rewind when the signature changes. ReadAt doesn't move the fd position, so the catch-up io.Copy continues from where it left off when no rewrite occurred. Drop the copyCtx 32-KiB chunked io.Copy wrapper that copilot added — hypervisor logs are KB-scale, not a critical path, and the parent ctx already cancels the surrounding select; chunked copying was over- engineering. Drop the watchCtx WithCancel wrapper for the same reason: parent ctx is enough. Lift the ReadAt-from-zero pattern (already in cloudimg/inspect.peekHead) into utils.FileHead and reuse from both call sites.
SKILL.md requires package-level const at the top of the file (after imports). I'd placed it mid-file by streamLog. Moving it up. Full-codebase audit (203 Go files via 5 parallel agents) found no other violations.
|
@copilot only review, no changes, no overdesign |
* feat: cocoon vm logs to print per-VM hypervisor log
Adds a new subcommand that streams the per-VM cloud-hypervisor.log /
firecracker.log already written under <log_dir>/<backend>/<vmID>/. The
file captures VMM-side activity (device init, API errors, virtio
warnings, shutdown) — guest console output stays on cocoon vm console.
Hypervisor interface gains LogPath(ctx, ref); both backends resolve the
ref to a VM ID and return <VMLogDir>/<backend>.log. The launch-time
filename literals in start.go are switched to the new logFileName
constant so the read and write paths can't drift.
-f/--follow polls the file every 500ms and exits on Ctrl-C through the
existing root-level signal context.
* refactor: pull LogPath/LogFilePath into hypervisor.Backend
The CH and FC LogPath methods were structurally identical (ResolveRef →
VMLogDir → join filename); only the literal log filename differed, and
that already follows the backend type ('cloud-hypervisor' / 'firecracker').
Move both helpers onto Backend so the embedded struct satisfies the
Hypervisor interface, derive the filename from b.Typ + '.log', and let
launch-time write paths reuse LogFilePath instead of carrying their own
filename consts. Drops the duplicated logFileName constants and the
per-backend method bodies.
* refactor: apply /simplify findings on vm logs
- Fold LogFilePath/LogPath into hypervisor/helpers.go alongside
PIDFilePath / SocketPath (matches sibling-style: same shape, same
Backend method, all per-VM path builders live in one file).
- Replace tailFollow polling with utils.WatchFile (fsnotify, already
used by vm status). 100ms debounce, exits on ctx.Done() or watcher
channel close. Eliminates ~120 idle read syscalls/min.
- Reword the os.IsNotExist error to lead with the action verb
("open log %s: VM may not have been started yet"), matching the
rest of cocoon's error format.
* fix: don't leak fsnotify watcher when log file is missing
Open the log file first; only start the WatchFile goroutine after the
open succeeds, and bound it to a child ctx canceled on streamLog return.
Previously the watcher fired up before os.Open, so a missing log file
returned the error but left the watcher running until the parent ctx
finished — fine for the CLI exit path, but a leak under the cocoon
library API and pointless work either way.
* fix: vm logs -f handles backend log truncation across stop/start
Both backends'\'' start paths reopen the VM log with O_TRUNC (same inode),
so a follower whose fd was past the previous EOF would silently miss
everything the new boot writes until the file grew past the old offset.
Compare current fd position against the file'\''s current size on every
fsnotify event; if size shrunk, seek back to 0 before io.Copy so the
new boot'\''s log is replayed from byte 0.
* fix: vm logs uses persisted VMRecord.LogDir, doc clarifies truncate
LogPath now reads logDir from the persisted VMRecord (with a fallback
to b.Conf.VMLogDir for legacy records that predate LogDir persistence).
A subsequent --log-dir config change no longer points vm logs at a path
that doesn't hold the running VM's actual log.
README clarifies the file lives under the VM's log dir while the VM
exists but is truncated on every vm run / vm start; -f handles the
rewind already (added in 27eb5e2).
* fix: make streamLog io.Copy context-aware for prompt Ctrl-C cancellation
* docs: clarify -f seeks to start on truncation rather than reopening the file
* fix: detect log truncate-then-regrow via head signature, drop ctx wrap
Codex review on PR #29 surfaced a real race: when stop+start truncates
and regrows the log to the same byte length within the 100ms WatchFile
debounce window, info.Size() < pos is false and the size-only rewind
misses the new boot's lines entirely. Repro: deterministic CH boot
warnings, follower stuck at old EOF, new boot's timestamped lines
silently dropped.
Switch the regen check to a head-signature compare: ReadAt the first
64 bytes (CH/FC stamp a unique boot timestamp at line 1) and rewind
when the signature changes. ReadAt doesn't move the fd position, so
the catch-up io.Copy continues from where it left off when no rewrite
occurred.
Drop the copyCtx 32-KiB chunked io.Copy wrapper that copilot added —
hypervisor logs are KB-scale, not a critical path, and the parent ctx
already cancels the surrounding select; chunked copying was over-
engineering. Drop the watchCtx WithCancel wrapper for the same reason:
parent ctx is enough.
Lift the ReadAt-from-zero pattern (already in cloudimg/inspect.peekHead)
into utils.FileHead and reuse from both call sites.
* style: move const logHeadSigLen to top of lifecycle.go
SKILL.md requires package-level const at the top of the file (after
imports). I'd placed it mid-file by streamLog. Moving it up.
Full-codebase audit (203 Go files via 5 parallel agents) found no other
violations.
Summary
Adds
cocoon vm logs <VM> [-f]to print the per-VM hypervisor process log that cocoon already writes (cloud-hypervisor.log or firecracker.log under<log_dir>/<backend>/<vmID>/). Until now there was no CLI shortcut — users had to know the path layout andtailit themselves.The log captures VMM-side activity (device init, API errors, virtio warnings, shutdown) — guest console output stays on
cocoon vm console.What changed
hypervisor.HypervisorgainsLogPath(ctx, ref) (string, error). Both backends resolve the ref to a canonical VM ID and return<VMLogDir>/<backend>.log.cloudhypervisor/start.goandfirecracker/start.goare switched to a newlogFileNameconst in each backend's package, so the read and write paths can't drift.Logsaction oncmd/vm.Actions, registered asvm logs. Default is one-shot cat;-f/--followpolls every 500 ms and exits cleanly on Ctrl-C via the existing root-levelsignal.NotifyContext.logs [-f] VMentry; new### Logs Flagssection explains scope and shows examples.Test plan
make lintclean on darwin + linuxgo test ./...all passvm logs --helpshows-f/--followvm logs <existing-CH-vm>prints accumulated CH warningsvm logs nonexistent→ clearVM not foundvm logs <fresh-CH-vm>→ prints content right aftervm runvm logs -f <vm>→ live tail, Ctrl-C / timeout exits cleanlyvm logs <fresh-FC-vm>→ prints firecracker's own log lines (Async file IO,vcpuetc.)