Skip to content

refactor: full-codebase simplify pass — reuse + quality#30

Merged
CMGS merged 7 commits into
masterfrom
chore/full-simplify-pass
May 6, 2026
Merged

refactor: full-codebase simplify pass — reuse + quality#30
CMGS merged 7 commits into
masterfrom
chore/full-simplify-pass

Conversation

@CMGS
Copy link
Copy Markdown
Contributor

@CMGS CMGS commented May 6, 2026

Summary

Audited 200+ Go files via 5 parallel agents (reuse / quality / efficiency / modernize / generics). Applied the Tier 1 findings — high-confidence fixes with bounded scope and zero behavior change. Skipped findings that were already-accepted trade-offs (config.json re-reads on clone, accepted in PR #27 Copilot review), or separate-PR scope (generic DoAPIJSON[T] across CH/FC, status event-loop dedup, IndexLayout helper across 4 backends), or churn-beyond-value (info.Hypervisor JSON tag — touches 5+ construction sites for cosmetic gain).

What changed

Reuse (cross-file dedup)

  • utils.ReadJSONFile(path, v) — 5+ os.ReadFile + json.Unmarshal duplicates collapsed. Sites:
    • snapshot/envelope.go::ReadSnapshotEnvelope
    • hypervisor/snapshot.go::LoadSnapshotMeta
    • hypervisor/cloudhypervisor/clone.go::parseCHConfig
  • errors.Is(err, fs.ErrNotExist) still works through %w so callers (e.g. ReadSnapshotEnvelopeErrEnvelopeMissing) keep their sentinel mapping.

Quality

  • Drop dead VerifyProcess + verifyProcessExe (utils/process.go, process_linux.go, process_other.go). All production callers go through VerifyProcessCmdline with a non-empty expectArg (sockPath); the bare variant was unreachable. Tests of dead code removed.
  • types.ImageTypeOCI / types.ImageTypeCloudImg constants replace bare "oci" / "cloudimg" string comparisons in cmd/core/helpers.go::digestPullRef and cmd/vm/run.go::printPostCloneHints. Backend typ consts in images/oci and images/cloudimg now reference the public constant.
  • cloneResumeOpts cleanup (hypervisor/cloudhypervisor/clone.go): drops 5 fields (cpu, diskQueueSize, noDirectIO, windows, onDemand) that were just transcribed from vmCfg. Pass *types.VMConfig directly; restoreAndResumeClone reads the originals via opts.vmCfg.*.

Modernize / Generics

Both audits returned zero actionable findings. Codebase already uses slices, maps, cmp.Or, errors.Join, range-over-int idiomatically. Existing generics (Store[T], ResolveRef[T], ForEach[T]) cover the structural-dedup space; remaining candidates are either single-use, semantically distinct per type, or the boxed-any was an intentional encoder sink.

Net diff

13 files changed, 36 insertions(+), 106 deletions(-)

Test plan

  • make lint clean on darwin + linux
  • go test ./... passes (incl. utils/process_test.go after dead-test removal)
  • Testbed regression suite: 25/25 PASS
    • lifecycle (run/inspect/exec/exit-code/stdin/env)
    • vm logs (oneshot, follow, follow with truncate-regrow detection)
    • snapshot save/list/clone/restore (resource flags rejected, defaults inherited)
    • network (--nics 0)
    • help text invariants

Skipped findings (separate PRs / accepted trade-offs)

  • Efficiency F1/F2 (multi-read of config.json / cocoon.json on clone/restore) — Copilot review on PR refactor: clone/restore inherit all resources from snapshot #27 accepted the trade-off (~1 KB cached, sub-ms, clone is cold path).
  • Generics F1 (utils.DoAPIJSON[T] for HTTP+JSON pattern across CH/FC) — bigger refactor, separate PR.
  • Quality F13 (statusEventLoop / statusEventLoopJSON near-duplicates) — needs design for shared diff abstraction, separate PR.
  • Quality F12 (info.Hypervisor JSON tag) — would touch 5+ construction sites for cosmetic gain.
  • Reuse F2 (IndexLayout helper across 4 backends) — embedding shapes differ, marginal win.

Tier 1 fixes from 5-agent simplify audit (200+ Go files). Skipped
findings that were either already-accepted trade-offs (config.json
re-reads on clone — Copilot review accepted), separate-PR scope
(generic DoAPIJSON across CH/FC, status event-loop dedup), or change
churn beyond their value (info.Hypervisor JSON tag, IndexLayout
helper across 4 backends).

Reuse:
- utils.ReadJSONFile: 5+ os.ReadFile + json.Unmarshal duplicates
  collapsed (snapshot/envelope.go, hypervisor/snapshot.go,
  hypervisor/cloudhypervisor/clone.go.parseCHConfig). errors.Is
  through %w preserves the fs.ErrNotExist signal callers rely on.

Quality:
- Drop dead VerifyProcess + verifyProcessExe (utils/process.go,
  process_linux.go, process_other.go). All production callers go
  through VerifyProcessCmdline with non-empty expectArg; the bare
  variant was unreachable. Tests of dead code removed.
- types.ImageTypeOCI / ImageTypeCloudImg constants replace bare
  'oci' / 'cloudimg' string comparisons in cmd/core/helpers.go,
  cmd/vm/run.go; backend typ consts in images/oci, images/cloudimg
  reference the public constant.
- cloneResumeOpts in hypervisor/cloudhypervisor/clone.go drops 5
  fields (cpu, diskQueueSize, noDirectIO, windows, onDemand) that
  were just transcribed from vmCfg. Pass *types.VMConfig directly;
  restoreAndResumeClone reads the originals via opts.vmCfg.

Modernize and generics audits returned zero actionable findings.
Codebase is already idiomatic Go 1.25 (slices/maps/cmp/errors.Join
all in active use, range-over-int adopted).

Lint: 0 issues both platforms. Tests: full suite passes.
Testbed: 25/25 regression tests green.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR performs a codebase-wide simplification/refactor pass focused on deduplication and consistency, while aiming to keep runtime behavior unchanged.

Changes:

  • Added utils.ReadJSONFile and refactored multiple JSON read+unmarshal call sites to use it.
  • Removed dead VerifyProcess / verifyProcessExe logic and related tests, leaving VerifyProcessCmdline as the single entry point.
  • Centralized image backend type strings via types.ImageTypeOCI / types.ImageTypeCloudImg and updated call sites to use the constants.
  • Simplified Cloud Hypervisor clone resume options by passing *types.VMConfig instead of copying multiple fields.

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated no comments.

Show a summary per file
File Description
utils/process.go Removes VerifyProcess and simplifies VerifyProcessCmdline behavior.
utils/process_test.go Deletes tests for removed dead APIs; keeps cmdline-verification tests.
utils/process_other.go Removes now-unused non-Linux verifyProcessExe stub.
utils/process_linux.go Removes now-unused Linux verifyProcessExe implementation and related import.
utils/atomic.go Adds ReadJSONFile helper for shared JSON file loading.
types/config.go Introduces exported image type constants used across the codebase.
snapshot/envelope.go Refactors snapshot envelope loading to use utils.ReadJSONFile.
hypervisor/snapshot.go Refactors snapshot metadata loading to use utils.ReadJSONFile.
hypervisor/cloudhypervisor/clone.go Simplifies clone resume options and refactors CH config parsing to use utils.ReadJSONFile.
images/oci/oci.go Replaces hardcoded "oci" type string with types.ImageTypeOCI.
images/cloudimg/cloudimg.go Replaces hardcoded "cloudimg" type string with types.ImageTypeCloudImg.
cmd/vm/run.go Uses types.ImageTypeCloudImg constant in post-clone hint logic.
cmd/core/helpers.go Uses types.ImageTypeOCI constant in OCI digest pinning logic.
Comments suppressed due to low confidence (1)

utils/process.go:57

  • VerifyProcessCmdline no longer special-cases expectArg == "". On Linux, verifyProcessCmdline uses strings.Contains(cmdline, expectArg), so an empty expectArg always passes and the only remaining check is a substring match on binaryName, which is weaker than the previous /proc/{pid}/exe basename check. Consider restoring a strict executable-name verification path for expectArg == "" (e.g., readlink(/proc/pid/exe) or exact argv[0] basename), or explicitly reject empty expectArg and update callers/docs accordingly to avoid misidentifying/killing the wrong process.
// VerifyProcessCmdline checks binary name and that expectArg appears in
// /proc/{pid}/cmdline. This prevents cross-instance misidentification when
// multiple processes of the same binary are running (e.g. multiple VMs).
// On non-Linux platforms, falls back to IsProcessAlive.
func VerifyProcessCmdline(pid int, binaryName, expectArg string) bool {
	if pid <= 0 {
		return false
	}
	if match, ok := verifyProcessCmdline(pid, binaryName, expectArg); ok {
		return match
	}
	return IsProcessAlive(pid)
}

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot review on PR #30 surfaced that VerifyProcessCmdline with
expectArg="" silently degrades to a binary-name substring match
(strings.Contains(cmdline, "") is always true), risking cross-VM
PID confusion. No production caller passes empty today, but the
guard is cheap and prevents future misuse.

Also trim the godoc to 3 lines and shrink the ImageType const
header to 1 line.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 13 out of 13 changed files in this pull request and generated 1 comment.

Comment thread utils/process.go Outdated
CMGS added 2 commits May 6, 2026 18:14
The labels claimed root/cocoon credentials for SSH access. Only the
24.04-picoclaw variant actually installed openssh-server; the other four
(22.04, 24.04, 24.04-chrome, 24.04-xface) had the labels but no sshd, so
any external tool that took them at face value would hit a closed port.

Verified zero readers across the cocoonstack: glance, vk-cocoon, and
epoch all grep clean for cocoon.ssh / Labels["cocoon.ssh"]. The README
claim that the labels exist 'for external tooling' was aspirational and
never landed in code.

Drop the label everywhere (including picoclaw, where the credentials are
real but still unread by tooling) and rewrite the README note to point
at the actual control-plane path: cocoon-agent over vsock for kubectl
exec/logs. picoclaw's sshd is preserved for its existing serial/network
use cases.
CI fail: empty expectArg now returns false (defensive guard added in
prior commit), so terminateWithPidfd returns handled=true without
killing, and TestTerminateProcess_SIGTERMIgnored_FallsBackToKill hangs
waiting for the bash process to exit.
Follow-up to 180fd4c which removed the LABEL from all 5 ubuntu image
variants. Verified across the whole tree (vk-cocoon, glance, epoch,
cocoon-operator, cocoon-webhook, cocoon-agent, cocoon-net,
cocoon-common, cocoon-specs, cocoonv2 itself): no code reads the
label, but glance operators consult it as the human-facing source of
truth when populating their own glance-credentials Secret.

picoclaw is the one variant that actually bakes openssh-server +
chpasswd's root:cocoon, so the label is honest documentation there.
The four lying variants (22.04, 24.04, chrome, xface) stay stripped.
Copilot finished work on behalf of CMGS May 6, 2026 10:24
@CMGS CMGS force-pushed the chore/full-simplify-pass branch from 818dadd to 38648e2 Compare May 6, 2026 10:24
Earlier attempt at this revert (818dadd) didn't land in the branch
history — codex re-flagged it via direct probe (TerminateProcess with
empty expectArg still no-ops). TerminateProcess godoc explicitly says
'optional cmdline arg check', so empty expectArg must remain a valid
binary-only path. Production callers all pass non-empty sockPath, so
the substring-collision risk Copilot raised earlier is unreachable in
practice.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 1 comment.

Comment thread utils/process_linux.go
Copilot pointed out that strings.Contains(cmdline, binaryName) hits
'binaryName anywhere in argv' rather than argv0 specifically — a
process running 'bash -c "cloud-hypervisor --version"' would
false-positive. Real risk during PID reuse: a recycled PID whose
cmdline happens to mention the VMM binary as a string would be
treated as the VMM.

Split /proc/<pid>/cmdline on NUL, compare filepath.Base(argv0) to
binaryName, then run the optional expectArg substring check on the
remainder. ~3 lines, strictly more correct, no API change.
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 18 out of 18 changed files in this pull request and generated 2 comments.

Comment thread utils/process.go
Comment thread os-image/ubuntu/24.04/Dockerfile
Copy link
Copy Markdown

Copilot AI commented May 6, 2026

@CMGS Unfortunately I hit an unexpected error while processing your comment. I've automatically reported this to GitHub.

You can ask me to try again later by mentioning me in a new comment.

If you want to contact GitHub about this error, please mention the following identifier so they can better serve you: cc357837-c54f-4735-b2dc-4a7249d3fa30

Sorry for the inconvenience!

@CMGS CMGS merged commit 34ee0f1 into master May 6, 2026
10 of 11 checks passed
@CMGS CMGS deleted the chore/full-simplify-pass branch May 6, 2026 11:12
Copilot stopped work on behalf of CMGS due to an error May 6, 2026 11:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants