Reconcile terminal Kubernetes agent state #68
Tracking issue: #85
PR #68 Review: Reconcile terminal Kubernetes agent state
Executive Summary
This PR maps terminal Kubernetes pod states (…
Critical Issues
1. TOCTOU race in …
mfreeman451
left a comment
Implemented the follow-up hardening for terminal-state persistence in pkg/agent/list.go.
Changes:
- switched persistAgentInfoState to temp-file-plus-rename so the write is atomic
- preserved the original agent-info.json mode instead of hardcoding 0644
- stopped silently discarding persist failures; best-effort persistence errors now emit a debug log
- added a comment documenting the assumption behind this path
I also added focused coverage in pkg/agent/list_test.go to verify the helper rewrites phase/activity, preserves file mode, and leaves no temp file behind.
On the race concern: this path only runs when the runtime has already reported a terminal Kubernetes state, so in the normal case the harness process is already gone or in its final shutdown window and is no longer continuously heartbeating agent-info.json. The remaining read-modify-write TOCTOU window is therefore real in theory but narrow in practice, limited to a last-flush edge case during pod termination. The atomic rename removes the partial-write/truncation risk, which was the more actionable filesystem issue here, without broadening this PR into a larger locking or cross-process coordination change.
Summary
Testing
Context