feat(sched): global task registry to be able to enumerate system-wide processes + stellux proc kill system call #119

Merged
FlareCoding merged 4 commits into stellux-3.0-prototyping from pr/global-task-registry
Apr 1, 2026

@FlareCoding FlareCoding commented Apr 1, 2026

Note

High Risk
Touches core scheduler/task lifecycle by inserting/removing tasks from a new global registry during creation and reaping, which could introduce race conditions or leaks if ordering is wrong. Also adds a new process-control syscall (proc_kill_tid) that can terminate tasks system-wide, so correctness and permission checks are security/stability relevant.

Overview
Introduces a global sched::g_task_registry (hashmap keyed by TID) to track all tasks and support system-wide enumeration via snapshotting.
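To make the registry's contract concrete, here is a minimal user-space sketch of a TID-keyed registry with insert, remove, count, and a capped snapshot. It is not the code from this PR: `std::unordered_map` and `std::mutex` stand in for the kernel's own hashmap and IRQ-safe lock, and the `task` struct and the `task_registry` class name are assumptions made for illustration. Only the operation names (`insert`, `remove`, `count`, `snapshot_tids`) are taken from the test list in the commit message.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <mutex>
#include <unordered_map>

// Hypothetical stand-in for the kernel task struct; only the TID matters here.
struct task {
    std::uint32_t tid;
};

// Illustrative registry: a TID-keyed map guarded by a single lock.
class task_registry {
public:
    // Registers a task. Returns false if its TID is already present.
    bool insert(task* t) {
        assert(t != nullptr);
        std::lock_guard<std::mutex> guard(lock_);
        return tasks_.emplace(t->tid, t).second;
    }

    // Unregisters a TID. Removing an unknown or already-removed TID is a
    // harmless no-op that returns false.
    bool remove(std::uint32_t tid) {
        std::lock_guard<std::mutex> guard(lock_);
        return tasks_.erase(tid) == 1;
    }

    std::size_t count() const {
        std::lock_guard<std::mutex> guard(lock_);
        return tasks_.size();
    }

    // Copies at most `max` TIDs into `out` while holding the lock and returns
    // how many were written. The caller can then walk the copy without
    // holding the registry lock.
    std::size_t snapshot_tids(std::uint32_t* out, std::size_t max) const {
        std::lock_guard<std::mutex> guard(lock_);
        std::size_t n = 0;
        for (const auto& entry : tasks_) {
            if (n == max) break;
            out[n++] = entry.first;
        }
        return n;
    }

private:
    mutable std::mutex lock_;
    std::unordered_map<std::uint32_t, task*> tasks_;
};
```

Snapshotting TIDs rather than task pointers means an enumerator never holds a pointer that the reaper could free after the lock is released; a stale TID simply fails a later lookup.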

Task creation and teardown paths now register tasks on creation and unregister them on cleanup (including the unstarted-task destroy path), and scheduler init now initializes the registry.

Adds a new syscall SYS_PROC_KILL_TID / proc_kill_tid that looks up a task by TID via the registry and forces a kill-wake for userland tasks, rejecting idle/kernel/self targets with appropriate errors; userland gains a proc_kill_tid() wrapper and a new kill app.
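The target checks described above can be sketched as a small decision function. This is an illustration under stated assumptions, not the syscall from this PR: the `kill_result` codes, the `task_kind` enum, the `kill_pending` flag (a stand-in for the kill-wake), and the function name `proc_kill_tid_sketch` are all hypothetical, and the order of the checks is a guess. The description only says that idle, kernel, and self targets are rejected "with appropriate errors".

```cpp
#include <cassert>
#include <cstdint>
#include <unordered_map>

enum class task_kind { idle, kernel, user };

// Hypothetical task record for this sketch.
struct task {
    std::uint32_t tid;
    task_kind kind;
    bool kill_pending;  // stand-in for "kill-wake was forced on this task"
};

// Hypothetical result codes; the real error values are not shown in the PR.
enum class kill_result { ok, no_such_task, self_target, not_user_task };

using task_table = std::unordered_map<std::uint32_t, task*>;

// Looks up the target by TID, rejects self and non-user targets, and marks
// the target for termination otherwise.
inline kill_result proc_kill_tid_sketch(const task_table& table,
                                        const task& caller,
                                        std::uint32_t target_tid) {
    auto it = table.find(target_tid);
    if (it == table.end()) return kill_result::no_such_task;

    task* target = it->second;
    assert(target != nullptr);

    if (target->tid == caller.tid) return kill_result::self_target;
    if (target->kind != task_kind::user) return kill_result::not_user_task;

    target->kill_pending = true;
    return kill_result::ok;
}
```

In the kernel the lookup and the kill-wake presumably happen while the registry lock is held, so the task cannot be reaped between the two steps; the sketch leaves locking out to keep the decision logic visible.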

Fixes a poll/wake race by resolving aborted poll_wait blocking so the task doesn’t continue running with a queued sched_link, and adds debug-only detection for ready-list double-enqueue; includes extensive new unit/integration tests covering the registry and the poll race.
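As a rough illustration of what double-enqueue detection on an intrusive ready list can look like, the sketch below tracks a `queued` flag per link and refuses to link a node twice. The `sched_link` name comes from the description; the `queued` field, the `ready_list` type, and the choice to return `false` are assumptions for this example. A kernel debug build would more likely panic or log at that point.

```cpp
#include <cassert>

// Intrusive link embedded in each task. `queued` is debug bookkeeping that is
// true while the link sits on a ready list (hypothetical field).
struct sched_link {
    sched_link* next = nullptr;
    bool queued = false;
};

// Minimal FIFO ready list over intrusive links.
struct ready_list {
    sched_link* head = nullptr;
    sched_link* tail = nullptr;

    // Appends the link. Returns false without touching the list if the link
    // is already queued; relinking it would truncate the list or form a cycle.
    bool enqueue(sched_link* link) {
        assert(link != nullptr);
        if (link->queued) return false;
        link->next = nullptr;
        link->queued = true;
        if (tail) {
            tail->next = link;
        } else {
            head = link;
        }
        tail = link;
        return true;
    }

    // Pops the oldest link, or returns nullptr if the list is empty.
    sched_link* dequeue() {
        sched_link* link = head;
        if (!link) return nullptr;
        head = link->next;
        if (!head) tail = nullptr;
        link->next = nullptr;
        link->queued = false;
        return link;
    }
};
```

This matches the failure mode the description hints at: a task that keeps running while its link is still queued can be enqueued a second time at its next block or yield, and the flag turns that silent corruption into a detectable event.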

Written by Cursor Bugbot for commit 8acf33e.

@FlareCoding FlareCoding changed the title feat(sched): global task registry to be able to enumerate system-wide processes feat(sched): global task registry to be able to enumerate system-wide processes + kill system call Apr 1, 2026
@FlareCoding FlareCoding changed the title feat(sched): global task registry to be able to enumerate system-wide processes + kill system call feat(sched): global task registry to be able to enumerate system-wide processes + stellux proc kill system call Apr 1, 2026
FlareCoding and others added 4 commits March 31, 2026 23:03
* test(sched): exhaustive unit and integration tests for global task registry

DS-tier tests (29 tests, no scheduler required):
- init/count: empty registry starts at zero
- insert: single, multiple, TID identity preservation, boundary TID values
- remove: single, from-many, first/last, never-inserted safety, empty registry,
  remove-all, double-remove safety
- reinsert: after remove, multiple cycles
- snapshot_tids: empty, single, multiple, max capping, zero max, after remove
- count tracking: incremental insert/removal, mixed operations
- stress: 64-task insert/remove patterns, full cycle reinsert with new TIDs
- consistency: snapshot agrees with count, reinit clears state
- link integrity: pprev null after remove

SCHED-tier integration tests (11 tests, real kernel tasks):
- created task appears in global registry before enqueue
- multiple created tasks all visible in registry
- count increases on task creation
- exited task eventually removed by reaper
- task visible while alive, gone after exit+reap
- batch exits all cleaned up
- snapshot captures running/sleeping tasks
- concurrent inserts from multiple CPUs via sub-task creation
- rapid create/exit waves don't corrupt registry
- count consistent with snapshot in quiescent state
- no duplicate TIDs in snapshot

All 500 tests pass on both x86_64 and aarch64.

Co-authored-by: Albert Slepak <FlareCoding@users.noreply.github.com>

* fix(tests): suppress nodiscard warnings in dma and poll_resource tests

Cast intentionally-discarded [[nodiscard]] return values to (void):
- dma.test.cpp: dma::alloc_pages in page_alloc_free_no_leak
- poll_resource.test.cpp: ring_buffer_write in 4 test setup sites
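For readers unfamiliar with the pattern, a small sketch of the `(void)` cast follows. `write_bytes` is a hypothetical stand-in for a `[[nodiscard]]` function such as `ring_buffer_write`; its signature is invented here, since the real one does not appear in this PR page.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical [[nodiscard]] API: copies up to `cap` bytes and returns the
// number of bytes actually written.
[[nodiscard]] inline std::size_t write_bytes(char* dst, std::size_t cap,
                                             const char* src, std::size_t len) {
    std::size_t n = len < cap ? len : cap;
    std::memcpy(dst, src, n);
    return n;
}

// Test setup that only needs the side effect. The cast states that the return
// value is dropped on purpose and silences the nodiscard warning.
inline void prefill(char* buf, std::size_t cap) {
    (void)write_bytes(buf, cap, "abc", 3);
}

// Where the result matters, check it instead of casting it away.
inline void checked_write(char* buf, std::size_t cap) {
    std::size_t written = write_bytes(buf, cap, "abc", 3);
    assert(written == (cap < 3 ? cap : 3));
    (void)written;  // avoids an unused-variable warning when NDEBUG is set
}
```

The cast fits setup code where a failed write would surface in a later assertion anyway; in the code under test itself, the return value is usually the thing worth checking.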

All 500 tests pass warning-free on both x86_64 and aarch64.

Co-authored-by: Albert Slepak <FlareCoding@users.noreply.github.com>

* fix(tests): fix task leak in multiple_tasks_appear integration test

Save task pointers across the RUN_ELEVATED block so all created tasks
can be enqueued after TID verification. Previously the pointers were
lost after the block exited, leaking 4 task structs, stacks, and
handles per test run and permanently polluting g_task_registry.

Co-authored-by: Albert Slepak <FlareCoding@users.noreply.github.com>

* fix(boot): add 1s delay before unit tests to let APs finish initialization

Some CI runners use slow machines and smp::init() may not be able to
initialize all CPUs quickly enough. Sleep for 1 second to give the
APs a chance to fully initialize before running unit tests.

Co-authored-by: Albert Slepak <FlareCoding@users.noreply.github.com>

---------

Co-authored-by: Cursor Agent <cursoragent@cursor.com>
Co-authored-by: Albert Slepak <FlareCoding@users.noreply.github.com>
@FlareCoding FlareCoding force-pushed the pr/global-task-registry branch from 4f3ab06 to 8acf33e Compare April 1, 2026 06:04
@FlareCoding FlareCoding merged commit 3e829fd into stellux-3.0-prototyping Apr 1, 2026
3 checks passed
@FlareCoding FlareCoding deleted the pr/global-task-registry branch April 1, 2026 06:04

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

sched::g_task_registry.unlock(irq);
});

return result;

No authorization check in proc_kill_tid syscall

Medium Severity · Security Issue

The proc_kill_tid syscall lets any userland process kill any other userland process by TID, with no ownership or permission check. The existing proc_kill uses handle-based authorization (callers can only kill processes they created), but proc_kill_tid bypasses this entirely and checks only that the target is not a kernel or idle task. Any unprivileged process that knows a TID can terminate an arbitrary user process.
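One possible shape for the missing check is sketched below. It is only an illustration of the idea, not a proposed patch: the `creator_tid` and `privileged` fields and the `may_kill` helper are hypothetical, and the policy (a caller may kill tasks it created, and a privileged caller may kill any user task) is an assumption modeled loosely on the handle-based rule described for proc_kill.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical task record carrying just enough state for the policy.
struct task {
    std::uint32_t tid;
    std::uint32_t creator_tid;  // hypothetical: TID of the task that spawned it
    bool privileged;            // hypothetical: e.g. an init or admin task
};

// Hypothetical policy check, meant to run after the syscall has already
// rejected self, kernel, and idle targets.
inline bool may_kill(const task& caller, const task& target) {
    assert(caller.tid != target.tid);  // self targets are rejected earlier
    if (caller.privileged) return true;
    return target.creator_tid == caller.tid;
}
```

A system-wide `kill` tool would then need to run as a privileged task (or through some capability the kernel grants it), while ordinary processes would stay limited to their own children.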
