feat: Add PID tracking and process management#8
Conversation
- Add process_id field to Session struct for tracking spawned terminals - Implement cross-platform process management using sysinfo crate - Add process module with health checking and kill operations - Extend terminal handler to capture and return PIDs - Add 'shards status' command for detailed process information - Update 'shards list' to show process status (Running/Stopped/No PID) - Update 'shards destroy' to kill processes before cleanup - Add comprehensive process lifecycle tests - All 107 existing tests pass + 4 new process tests
Code Review ReportScope: PR #8 - feat: Add PID tracking and process management Executive SummaryOverall Assessment: NEEDS CHANGES
Recommendation: This PR implements essential process tracking functionality but contains critical race conditions, type safety issues, and silent error handling that must be addressed before merge. The architecture follows project guidelines well (vertical slice, handler/operations pattern), but the process management implementation has fundamental safety issues that could lead to killing wrong processes, resource leaks, and difficult debugging. Fix critical issues before merging. Critical Issues (Must Fix Before Merge)Issue 1: Race Condition in Process Operations - PID Reuse RiskLocation: Problem: Why This Matters: Risk If Unfixed:
Fix Options:
Recommended Fix: // src/sessions/types.rs
pub struct Session {
pub process_id: Option<u32>,
pub process_name: Option<String>, // NEW: Store process name
pub process_start_time: Option<u64>, // NEW: Store start time
}
// src/process/operations.rs
pub fn kill_process_safe(
pid: u32,
expected_name: Option<&str>,
expected_start_time: Option<u64>
) -> Result<(), ProcessError> {
let mut system = System::new();
system.refresh_processes(ProcessesToUpdate::All, true);
let pid_obj = Pid::from_u32(pid);
match system.process(pid_obj) {
Some(process) => {
// Validate this is the same process
if let Some(name) = expected_name {
if process.name().to_string_lossy() != name {
return Err(ProcessError::PidReused {
pid,
expected: name.to_string(),
actual: process.name().to_string_lossy().to_string()
});
}
}
if let Some(start_time) = expected_start_time {
if process.start_time() != start_time {
return Err(ProcessError::PidReused { pid });
}
}
if process.kill() {
Ok(())
} else {
Err(ProcessError::KillFailed {
pid,
message: "Process kill signal failed".to_string()
})
}
}
None => Err(ProcessError::NotFound { pid }),
}
}Issue 2: Silent Error Masking in Process Status ChecksLocation: Problem: let process_status = if let Some(pid) = session.process_id {
match process::is_process_running(pid) {
Ok(true) => format!("Running({})", pid),
Ok(false) => format!("Stopped({})", pid),
Err(_) => format!("Error({})", pid), // SILENT ERROR - no logging
}
} else {
"No PID".to_string()
};Why This Matters: Risk If Unfixed:
Fix Options:
Recommended Fix: let process_status = if let Some(pid) = session.process_id {
match process::is_process_running(pid) {
Ok(true) => format!("Running({})", pid),
Ok(false) => format!("Stopped({})", pid),
Err(e) => {
// Log the actual error for debugging
tracing::warn!(
event = "cli.list_process_check_failed",
pid = pid,
session_branch = session.branch,
error = %e,
error_type = std::any::type_name_val(&e)
);
format!("Error({})", pid)
}
}
} else {
"No PID".to_string()
};Issue 3: Platform-Unsafe PID Type (u32 vs i32)Location: Problem:
The sysinfo crate's Why This Matters: Risk If Unfixed:
Fix Options:
Recommended Fix: // src/process/types.rs (NEW FILE)
use sysinfo::Pid as SysinfoPid;
/// Platform-safe process ID wrapper
#[derive(Debug, Clone, Copy, PartialEq, Eq, Serialize, Deserialize)]
#[serde(transparent)]
pub struct Pid(u32);
impl Pid {
pub fn new(pid: u32) -> Self {
Self(pid)
}
pub fn as_u32(&self) -> u32 {
self.0
}
pub fn to_sysinfo_pid(&self) -> SysinfoPid {
SysinfoPid::from_u32(self.0)
}
}
impl From<u32> for Pid {
fn from(pid: u32) -> Self {
Self(pid)
}
}
// Update Session
pub struct Session {
pub process_id: Option<Pid>, // Changed from Option<u32>
}Issue 4: Continuing Cleanup Despite Process Kill FailureLocation: Problem: Err(e) => {
tracing::warn!(
event = "session.destroy_kill_failed",
pid = pid,
error = %e,
message = "Failed to kill process, continuing with cleanup"
);
}Why This Matters: Risk If Unfixed:
Fix Options:
Recommended Fix: // Option A: Fail fast with force flag
if let Some(pid) = session.process_id {
match crate::process::kill_process(pid) {
Ok(()) => {
info!(event = "session.destroy_kill_completed", pid = pid);
}
Err(crate::process::ProcessError::NotFound { .. }) => {
info!(event = "session.destroy_kill_already_dead", pid = pid);
}
Err(e) => {
if !force {
error!(
event = "session.destroy_kill_failed_blocking",
pid = pid,
error = %e
);
return Err(SessionError::ProcessKillFailed {
pid,
message: format!("Process still running. Use --force to destroy anyway: {}", e)
});
} else {
tracing::warn!(
event = "session.destroy_kill_failed_forced",
pid = pid,
error = %e,
message = "Process kill failed but --force specified, continuing"
);
}
}
}
}Issue 5: Stringly-Typed Process StatusLocation: Problem: pub struct ProcessInfo {
pub pid: u32,
pub name: String,
pub status: String, // Could be anything
}Why This Matters: Risk If Unfixed:
Fix Options:
Recommended Fix: // src/process/types.rs
#[derive(Debug, Clone, PartialEq, Eq, Serialize, Deserialize)]
pub enum ProcessStatus {
Running,
Sleeping,
Stopped,
Zombie,
Dead,
Unknown(String), // Fallback for unexpected states
}
impl From<sysinfo::ProcessStatus> for ProcessStatus {
fn from(status: sysinfo::ProcessStatus) -> Self {
let status_str = status.to_string();
match status_str.as_str() {
"Run" | "Running" => ProcessStatus::Running,
"Sleep" | "Sleeping" => ProcessStatus::Sleeping,
"Stop" | "Stopped" => ProcessStatus::Stopped,
"Zombie" => ProcessStatus::Zombie,
"Dead" => ProcessStatus::Dead,
_ => ProcessStatus::Unknown(status_str),
}
}
}
pub struct ProcessInfo {
pub pid: Pid,
pub name: String,
pub status: ProcessStatus, // Changed from String
}Important Issues (Should Fix)Issue 1: Performance Impact of Full Process RefreshLocation: Problem: Impact: Fix Options:
Suggested Fix: pub fn is_process_running(pid: u32) -> Result<bool, ProcessError> {
let mut system = System::new();
let pid_obj = Pid::from_u32(pid);
// Only refresh the specific PID we care about
system.refresh_processes(ProcessesToUpdate::Some(&[pid_obj]), true);
Ok(system.process(pid_obj).is_some())
}Issue 2: Missing Process Name/Start Time CaptureLocation: Problem: Impact: Fix Options:
Suggested Fix: let child = cmd.spawn().map_err(|e| TerminalError::SpawnFailed {
message: format!("Failed to execute {}: {}", spawn_command[0], e),
})?;
let process_id = child.id();
// Capture process metadata immediately
let process_metadata = if let Ok(info) = process::get_process_info(process_id) {
Some(ProcessMetadata {
name: info.name,
start_time: info.start_time,
})
} else {
None
};
let result = SpawnResult::new(
terminal_type.clone(),
command.to_string(),
working_directory.to_path_buf(),
Some(process_id),
process_metadata,
);Issue 3: No PID ValidationLocation: Problem: Impact: Fix Options:
Suggested Fix: impl Pid {
pub fn new(pid: u32) -> Result<Self, ProcessError> {
if pid == 0 {
return Err(ProcessError::InvalidPid { pid });
}
Ok(Self(pid))
}
}Issue 4: Missing Documentation for process_id FieldLocation: Problem: Impact: Suggested Fix: pub struct Session {
pub id: String,
pub branch: String,
pub worktree_path: PathBuf,
pub command: String,
pub agent: String,
pub status: SessionStatus,
pub created_at: String,
/// Process ID of the spawned terminal/agent process.
///
/// This is `None` if:
/// - The session was created before PID tracking was implemented
/// - The terminal spawn failed to capture the PID
/// - The session is in a stopped state
///
/// Note: PIDs can be reused by the OS, so this should be validated
/// against process name/start time before use.
pub process_id: Option<u32>,
}Suggestions (Nice to Have)Suggestion 1: Add Process Health Check CommandLocation: New feature Current State: Users can see process status in Suggestion 2: Improve Error Messages with Actionable GuidanceLocation: Current State: Error messages state what went wrong Example: #[error("Process '{pid}' not found. The process may have already exited. Use 'shards cleanup' to remove stale sessions.")]
NotFound { pid: u32 },Suggestion 3: Add Process Metadata to ProcessInfoLocation: Current State: ProcessInfo only has pid, name, status Detailed Agent ReportsCode Quality Analysis (code-reviewer)Files Reviewed: process/operations.rs, sessions/handler.rs, terminal/handler.rs Findings Summary: Patterns Observed:
Documentation Analysis (comment-analyzer)Comments Reviewed: 3 doc comments, 0 inline comments Findings Summary: Comment Quality Score: 4/10 Error Handling Analysis (error-hunter)Error Handlers Reviewed: 4 error paths Findings Summary: Silent Failure Risk: HIGH Type Design Analysis (type-analyzer)Types Reviewed: Session, ProcessError, ProcessInfo, SpawnResult Findings Summary: Overall Type Safety Score: 3/10 What's Done Well
Action Items (Prioritized)Must Do (Blocking)
Should Do (Before Merge)
Consider (Optional)
Decision GuideIf you have limited time, focus on:
If you want thorough improvement, also address:
Quick wins (easy fixes with good impact):
Review generated by Kiro AI agents |
…error handling Critical fixes: - Add PID reuse protection by storing and validating process name + start time - Use platform-safe Pid newtype wrapper instead of raw u32 - Replace stringly-typed ProcessStatus with proper enum - Fail destroy command if process kill fails (prevents orphaned processes) - Add proper error logging in list command (no more silent failures) Performance improvements: - Use ProcessesToUpdate::Some for single PID operations (O(1) vs O(n)) Type safety improvements: - New process/types.rs module with Pid and ProcessStatus types - Add PID validation (reject PID 0) - ProcessInfo now uses typed Pid and ProcessStatus Documentation: - Add comprehensive doc comments for process_id field - Document when process_id is None and PID reuse risks All 107 tests passing.
Code Review Issues Addressed ✅All critical and important issues from the code review have been fixed: Critical Issues Fixed (5/5)
Important Issues Fixed (4/4)
Test ResultsAll existing tests pass with the new implementation. Files Changed
The implementation now has robust process management with protection against PID reuse attacks, proper error handling, and type safety. Ready for re-review! 🚀 |
Summary
Implements comprehensive process tracking for spawned terminals to enable lifecycle management, prevent stale processes, and provide reliable cleanup.
Changes
Testing
Files Changed
Closes: Process tracking feature implementation