feat(kernel): add process kill semantics with recursive child cleanup #71

Merged
FlareCoding merged 3 commits into stellux-3.0-prototyping from pr/proc-kill-support on Mar 15, 2026

FlareCoding (Owner) commented Mar 15, 2026

Replace the kernel panic in proc_close (when a parent exits with running attached children) with fire-and-forget kill via a new force_wake_for_kill() mechanism. Children are terminated recursively through the existing reaper close_all -> proc_close chain.


Note

High Risk
High risk because it changes core scheduler/timer/syscall paths and alters process lifecycle semantics (exit/wait), which can affect task teardown, blocking behavior, and system stability across architectures.

Overview
Implements asynchronous process termination via a new per-task kill_pending flag and sched::force_wake_for_kill(), which cancels timer sleeps (timer::cancel_sleep) and wakes blocked tasks so they can be reaped.

Scheduler and syscall return paths (aarch64/x86_64 on_yield/on_tick and stlx_syscall_handler) now auto-exit(0x9) user tasks with kill_pending, and common blocking sites (ring_buffer waits, socket accept, proc_wait) become kill-aware and return EINTR where appropriate.
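To make the "kill-aware blocking site" idea concrete, here is a minimal single-threaded sketch of the pattern, not the actual Stellux code. Every name other than `kill_pending` and `force_wake_for_kill` is hypothetical, `ERR_EINTR` is a placeholder value, and the `park` callback stands in for sleeping on a wait queue until something wakes the task.

```cpp
#include <atomic>
#include <cstdint>
#include <deque>

// Placeholder error value; the kernel's real constant and sign convention are not shown in this PR page.
constexpr int64_t ERR_EINTR = -4;

// Hypothetical, stripped-down task: only the fields this sketch needs.
struct task {
    std::atomic<bool> kill_pending{false};
    bool blocked = false;  // true while parked on a wait queue or timer sleep
};

// Hypothetical stand-in for a ring buffer with blocking reads.
struct ring_buffer {
    std::deque<uint8_t> data;
};

// Models the idea behind force_wake_for_kill(): flag the task, then make it
// runnable so it reaches a kill check. The real mechanism also cancels timer sleeps.
inline void force_wake_for_kill(task& t) {
    t.kill_pending.store(true, std::memory_order_release);
    t.blocked = false;
}

// Kill-aware blocking read: re-check kill_pending after every wakeup, before
// looking at the wait condition, and bail out with EINTR if a kill arrived.
template <typename Park>
int64_t ring_read_blocking(task& self, ring_buffer& rb, uint8_t& out, Park park) {
    for (;;) {
        if (self.kill_pending.load(std::memory_order_acquire)) {
            return ERR_EINTR;  // killed while waiting (or before waiting)
        }
        if (!rb.data.empty()) {
            out = rb.data.front();
            rb.data.pop_front();
            return 1;  // one byte read
        }
        self.blocked = true;
        park();  // returns when the task is woken for any reason
    }
}
```

The point of the loop shape is that a wakeup alone says nothing about why the task woke, so the flag has to be re-read each time around. Without that check, a forced wake on an empty buffer would simply put the task back to sleep.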

Process resources switch from raw exit_code to Linux-compatible wait_status, proc_close no longer fatals when a parent exits with an attached running child (it triggers kill instead), and userland gains proc_kill() plus wait-status decode macros; new kernel tests cover sleep/wait-queue kill wakeups and status encoding.
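For readers unfamiliar with the Linux wait-status layout, the sketch below shows what decode helpers for it can look like. The helper names are hypothetical (the PR's actual userland macro names are not shown here); the bit layout is the conventional Linux one, with the terminating signal in the low 7 bits and the exit code in bits 8 to 15.

```cpp
#include <cstdint>

// Hypothetical helper names; only the bit layout follows the Linux convention.

// Normal exit: the low 7 bits (signal field) are zero.
constexpr bool wait_exited(int32_t status) {
    return (status & 0x7F) == 0;
}

// Exit code of a normally exited process (bits 8..15).
constexpr int wait_exit_code(int32_t status) {
    return (status >> 8) & 0xFF;
}

// Killed by a signal: signal field is non-zero and not 0x7F
// (0x7F in the low bits marks a stopped process on Linux).
constexpr bool wait_signaled(int32_t status) {
    return (status & 0x7F) != 0 && (status & 0x7F) != 0x7F;
}

// Number of the signal that terminated the process.
constexpr int wait_term_signal(int32_t status) {
    return status & 0x7F;
}
```

With this layout, a child that exits with code 42 yields status `0x2A00`, and a child killed with signal 9 yields status `9`.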

Written by Cursor Bugbot for commit f40cb61.

FlareCoding merged commit 9fbd80b into stellux-3.0-prototyping on Mar 15, 2026
11 checks passed
FlareCoding deleted the pr/proc-kill-support branch on March 15, 2026, 22:36
cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 2 potential issues.

pr->wait_status = exit_code & 0x7F;
} else {
pr->wait_status = (exit_code & 0xFF) << 8;
}
Wait status misencodes voluntary exit during concurrent kill

Low Severity · Logic Bug

When kill_pending is true, wait_status is set to exit_code & 0x7F, treating exit_code as a signal number. This is correct when the scheduler/syscall kill checks call sched::exit(0x9), but races with a voluntary sched::exit(code) from the user sys_exit syscall. If the child calls exit(42) while kill_pending is concurrently set by force_wake_for_kill, the wait status becomes 42 — reporting "killed by signal 42" rather than a normal exit or SIGKILL. The killed branch likely needs to use a hardcoded signal value (e.g., 9) instead of relying on exit_code.
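One way to read the suggested fix is sketched below. This is an illustration under assumptions, not the merged code: `encode_wait_status` and `KILL_SIGNAL` are hypothetical names, and the value 9 is taken from the `exit(0x9)` used by the auto-exit path described in the overview.

```cpp
#include <cstdint>

// Hypothetical constant; 9 matches the 0x9 passed by the kill checks.
constexpr int32_t KILL_SIGNAL = 9;

// Hypothetical helper showing the suggested encoding: when the task was
// killed, the signal field no longer depends on whatever exit_code the
// task happened to pass to exit().
inline int32_t encode_wait_status(bool kill_pending, int32_t exit_code) {
    if (kill_pending) {
        return KILL_SIGNAL & 0x7F;      // "terminated by signal 9"
    }
    return (exit_code & 0xFF) << 8;     // normal exit, code in bits 8..15
}
```

Under this sketch, a child that calls `exit(42)` while a kill is pending is reported as killed by signal 9 rather than by "signal 42". Whether a voluntary exit that races with a kill should instead be reported as a normal exit is a policy choice this sketch does not settle.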

if (!pr->child || pr->exited) {
sync::spin_unlock_irqrestore(pr->lock, irq);
resource::resource_release(obj);
return pr->exited ? 0 : syscall::EINVAL;
Reading pr->exited after releasing spinlock is a data race

Low Severity · Logic Bug

pr->exited is checked under pr->lock in the if condition on line 429, but read again on line 432 after the lock is released on line 430. The child's sched::exit() can modify pr->exited concurrently under pr->lock, making this a TOCTOU race and technically undefined behavior. The value of pr->exited observed at line 432 could differ from what was seen at line 429. A local variable capturing the value under the lock before releasing it would fix this.
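The local-variable fix can be sketched as follows. This is a userspace model under assumptions: `std::mutex` stands in for the kernel spinlock with IRQ save/restore, `proc_resource` is reduced to the fields involved, `early_return_status` is a hypothetical helper, and `ERR_EINVAL` is a placeholder for `syscall::EINVAL`.

```cpp
#include <cstdint>
#include <mutex>

// Placeholder for syscall::EINVAL; the real value and sign are not shown in this PR page.
constexpr int64_t ERR_EINVAL = -22;

// Hypothetical, reduced version of the process resource.
struct proc_resource {
    std::mutex lock;        // stand-in for the kernel spinlock
    void* child = nullptr;  // non-null while a child task is attached
    bool exited = false;    // written by the child's exit path under lock
};

// Returns true and fills `result` when the early-return path applies
// (no attached child, or the child has already exited).
inline bool early_return_status(proc_resource& pr, int64_t& result) {
    bool was_exited;
    {
        std::lock_guard<std::mutex> guard(pr.lock);
        if (pr.child && !pr.exited) {
            return false;            // child still running: no early return
        }
        was_exited = pr.exited;      // snapshot while the lock is still held
    }
    // Only the snapshot is used after the lock is released.
    result = was_exited ? 0 : ERR_EINVAL;
    return true;
}
```

The return value is then derived from the same observation that decided the branch, so a concurrent update of `exited` after the unlock cannot change the outcome.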
