-
Notifications
You must be signed in to change notification settings - Fork 1
EEVDF stack issue #144
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
EEVDF stack issue #144
Conversation
Caution Review failedThe pull request is closed. WalkthroughUnified scheduler abstraction added (include/Scheduler.h); kernel-wide replacements of MLFQ APIs with Scheduler wrappers; EEVDF scheduler hardened (RB-tree insert/delete, validation, termination/cleanup, token checks); VFCompositor init introduced; shared privilege/flag macros moved to Shared.h; many modules updated to call Scheduler APIs. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Module as Kernel Module
participant SchedulerAPI as include/Scheduler.h
participant Impl as Scheduler Impl (MLFQ/EEVDF)
Note right of SchedulerAPI #bfe0d6: Unified wrapper layer
Module->>SchedulerAPI: CreateProcess(name, entry)
SchedulerAPI->>Impl: config-specific CreateProcess(...)
Impl-->>SchedulerAPI: pid
SchedulerAPI-->>Module: pid
sequenceDiagram
autonumber
participant CPU
participant EEVDF as EEVDFSchedule
participant RB as Runqueue (RB-Tree)
participant Curr as Current Task
participant Next as Next Task
participant Idle as Idle Task
CPU->>EEVDF: tick / Schedule(regs)
EEVDF->>EEVDF: update clock
alt Curr was running
EEVDF->>RB: enqueue(Curr)
end
EEVDF->>RB: pick_leftmost()
RB-->>EEVDF: Next
alt Next invalid (no stack or RIP==0)
EEVDF->>RB: remove(Next)
EEVDF->>EEVDF: retry pick (bounded attempts)
alt no valid task found
EEVDF->>Idle: select Idle
EEVDF-->>CPU: context switch to Idle
end
else Next valid
EEVDF-->>CPU: context switch to Next
end
Note right of EEVDF #fff4d1: every 100 schedules -> cleanup terminated queue
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Poem
Pre-merge checks and finishing touches❌ Failed checks (2 warnings)
✅ Passed checks (1 passed)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (5)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
kernel/sched/EEVDF.c
(13 hunks)kernel/sched/MLFQ.h
(1 hunks)
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-08-10T08:52:13.313Z
Learnt from: assembler-0
PR: assembler-0/VoidFrame#63
File: kernel/etc/Shell.c:24-28
Timestamp: 2025-08-10T08:52:13.313Z
Learning: In VoidFrame's Process.c, PIDs are allocated using `__sync_fetch_and_add(&next_pid, 1)` which only increments and never reuses freed PIDs, causing PID exhaustion after MAX_PROCESSES terminations.
Applied to files:
kernel/sched/EEVDF.c
🧬 Code graph analysis (1)
kernel/sched/EEVDF.c (6)
mm/MemOps.c (1)
FastMemcpy
(212-447)kernel/atomic/Atomics.c (1)
AtomicInc
(10-12)kernel/etc/Shell.c (1)
ShellProcess
(1252-1280)mm/VMem.c (1)
VMemAllocStack
(709-741)kernel/atomic/Spinlock.h (1)
SpinUnlockIrqRestore
(161-164)kernel/etc/Format.c (1)
FormatA
(280-286)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
♻️ Duplicate comments (1)
kernel/sched/EEVDF.c (1)
734-761
: Critical: Incomplete fallback-to-idle logicLines 734-735 update
rq->current_slot
andcurrent_process
tonext_slot
before validation. When validation fails at lines 741-742, the code setsnext_slot = 0
and jumps toswitch_to_idle
, but never updatesrq->current_slot
/current_process
back to 0 or loads the idle context intoregs
. This leaves stale state and can cause the "already in tree" panic on the next tick, as flagged in previous reviews.Move the assignments and context load inside the validation block:
- rq->current_slot = next_slot; - current_process = next_slot; - if (LIKELY(next_slot != 0)) { EEVDFProcessControlBlock* new_proc = &processes[next_slot]; // Validate process before switching if (UNLIKELY(!new_proc->stack || new_proc->context.rip == 0)) { - // Invalid process, fall back to idle - next_slot = 0; - goto switch_to_idle; + goto pick_next; } + rq->current_slot = next_slot; + current_process = next_slot; new_proc->state = PROC_RUNNING; ready_process_bitmap &= ~(1ULL << next_slot); new_proc->exec_start = GetNS(); new_proc->slice_ns = EEVDFCalcSlice(rq, new_proc); FastMemcpy(regs, &new_proc->context, sizeof(Registers)); AtomicInc(&context_switches); eevdf_scheduler.switch_count++; + } else { + rq->current_slot = 0; + current_process = 0; + FastMemcpy(regs, &processes[0].context, sizeof(Registers)); } switch_to_idle:
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (19)
arch/x86_64/interrupts/Interrupts.c
(3 hunks)arch/x86_64/syscall/Syscall.c
(1 hunks)fs/EXT/Ext2.c
(21 hunks)fs/VFRFS.c
(2 hunks)include/Scheduler.h
(1 hunks)kernel/core/Compositor.c
(2 hunks)kernel/core/Compositor.h
(1 hunks)kernel/core/Kernel.c
(4 hunks)kernel/etc/Console.c
(2 hunks)kernel/etc/Editor.c
(4 hunks)kernel/etc/Shell.c
(8 hunks)kernel/execf/aout/AoutLoader.c
(2 hunks)kernel/execf/elf/ELFloader.c
(3 hunks)kernel/execf/pe/PEloader.c
(3 hunks)kernel/sched/EEVDF.c
(13 hunks)kernel/sched/MLFQ.c
(2 hunks)kernel/sched/MLFQ.h
(1 hunks)kernel/sched/Shared.h
(1 hunks)mm/PageFaultHandler.c
(2 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- kernel/sched/MLFQ.h
🧰 Additional context used
🧠 Learnings (4)
📓 Common learnings
Learnt from: assembler-0
PR: assembler-0/VoidFrame#111
File: kernel/sched/MLFQ.c:1514-1514
Timestamp: 2025-09-06T05:31:15.868Z
Learning: In the MLFQ scheduler, removing MLFQYield() from critical system processes like DynamoX and Astra improves system responsiveness, indicating that the yield mechanism may be too aggressive for system-critical processes that need quick response times.
📚 Learning: 2025-08-10T08:52:13.313Z
Learnt from: assembler-0
PR: assembler-0/VoidFrame#63
File: kernel/etc/Shell.c:24-28
Timestamp: 2025-08-10T08:52:13.313Z
Learning: In VoidFrame's Process.c, PIDs are allocated using `__sync_fetch_and_add(&next_pid, 1)` which only increments and never reuses freed PIDs, causing PID exhaustion after MAX_PROCESSES terminations.
Applied to files:
kernel/sched/EEVDF.c
📚 Learning: 2025-09-19T11:55:43.328Z
Learnt from: assembler-0
PR: assembler-0/VoidFrame#132
File: arch/x86_64/syscall/Syscall.c:12-19
Timestamp: 2025-09-19T11:55:43.328Z
Learning: In the VoidFrame kernel project, assembler-0 is implementing basic POSIX-compliant syscalls in ring 0 without user/kernel memory protection, prioritizing getting basic functionality working before adding full security features.
Applied to files:
kernel/core/Kernel.c
📚 Learning: 2025-08-13T07:26:56.892Z
Learnt from: assembler-0
PR: assembler-0/VoidFrame#79
File: drivers/PCI/PCI.c:154-154
Timestamp: 2025-08-13T07:26:56.892Z
Learning: VoidFrame kernel uses direct interrupt dispatch in arch/x86_64/interrupts/Interrupts.c rather than RegisterInterruptHandler pattern. Hardware interrupts are handled in switch cases by interrupt number (32+ for IRQs), with proper PIC EOI handling required.
Applied to files:
kernel/core/Kernel.c
🧬 Code graph analysis (9)
kernel/core/Compositor.h (1)
kernel/core/Compositor.c (1)
VFCompositorRequestInit
(32-61)
kernel/sched/MLFQ.c (1)
kernel/etc/Console.c (1)
PrintKernelSuccess
(203-208)
kernel/core/Compositor.c (1)
kernel/etc/Console.c (6)
PrintKernelError
(210-215)Snooze
(47-55)PrintKernelWarning
(217-222)PrintKernel
(171-189)PrintKernelSuccess
(203-208)PrintKernelInt
(295-319)
kernel/sched/EEVDF.c (7)
mm/MemOps.c (1)
FastMemcpy
(212-447)kernel/atomic/Atomics.c (1)
AtomicInc
(10-12)kernel/atomic/Spinlock.h (1)
SpinUnlockIrqRestore
(161-164)kernel/etc/Console.c (4)
PrintKernel
(171-189)PrintKernelError
(210-215)PrintKernelSuccess
(203-208)PrintKernelInt
(295-319)kernel/etc/Shell.c (1)
ShellProcess
(1251-1279)mm/VMem.c (1)
VMemAllocStack
(709-741)kernel/etc/Format.c (1)
FormatA
(280-286)
kernel/etc/Editor.c (1)
drivers/PS2.c (1)
HasInput
(280-282)
arch/x86_64/interrupts/Interrupts.c (1)
kernel/etc/Console.c (2)
PrintKernelInt
(295-319)PrintKernelWarning
(217-222)
kernel/etc/Shell.c (3)
drivers/PCI/PCI.c (1)
PciEnumerate
(150-154)drivers/xHCI/xHCI.c (1)
xHCIEnumerate
(852-928)mm/MemOps.c (1)
FastMemcpy
(212-447)
fs/EXT/Ext2.c (1)
kernel/atomic/Spinlock.h (3)
WriteLock
(121-134)ReadLock
(100-111)ReadUnlock
(113-119)
include/Scheduler.h (2)
kernel/sched/MLFQ.c (11)
MLFQSchedInit
(1716-1800)MLFQCreateProcess
(1174-1176)MLFQGetCurrentProcess
(1233-1238)MLFQGetCurrentProcessByPID
(1240-1252)MLFQYield
(1001-1004)MLFQSchedule
(771-958)MLFQKillProcess
(462-464)MLFQKillCurrentProcess
(1916-1920)MLFQListProcesses
(1843-1880)MLFQDumpPerformanceStats
(1802-1829)MLFQDumpSchedulerState
(1882-1914)kernel/sched/EEVDF.c (11)
EEVDFSchedInit
(764-833)EEVDFCreateProcess
(835-941)EEVDFGetCurrentProcess
(943-948)EEVDFGetCurrentProcessByPID
(950-961)EEVDFYield
(963-968)EEVDFSchedule
(677-762)EEVDFKillProcess
(970-973)EEVDFKillCurrentProcess
(975-980)EEVDFListProcesses
(1027-1061)EEVDFDumpPerformanceStats
(1080-1092)EEVDFDumpSchedulerState
(1011-1025)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: build
🔇 Additional comments (20)
mm/PageFaultHandler.c (1)
8-8
: LGTM! Clean API migration.The replacement of MLFQ-specific calls with the unified Scheduler API is correct and maintains the existing logic for tracking page fault information.
Also applies to: 30-30
kernel/etc/Console.c (1)
5-5
: LGTM! Scheduler API migration in console snooze logic.The migration correctly updates the privilege level check in
Snooze()
to use the unified Scheduler API while preserving the original control flow.Also applies to: 51-51
kernel/sched/Shared.h (1)
4-6
: Good abstraction for privilege levels.These macros provide clear, well-documented privilege level constants that improve code readability and maintainability. The naming is consistent (PROC_PRIV_* prefix) and the values follow a logical ordering.
kernel/core/Compositor.h (1)
39-39
: LGTM! New compositor initialization API.The
VFCompositorRequestInit
function declaration adds a clean public interface for compositor initialization. The implementation (seen in Compositor.c) properly handles edge cases like disabled builds and duplicate initialization attempts.arch/x86_64/syscall/Syscall.c (1)
3-3
: LGTM! Syscall handler migrated to unified Scheduler API.The SYS_EXIT handler correctly uses the new
KillCurrentProcess()
andYield()
functions from the unified Scheduler interface, maintaining the existing syscall semantics.Also applies to: 8-9
kernel/core/Kernel.c (1)
3-3
: LGTM! Kernel initialization migrated to unified Scheduler API.The kernel initialization and main loop correctly transition to the unified Scheduler interface:
SchedulerInit()
replacesMLFQInit()
in the PXS2 initialization phaseYield()
replacesMLFQYield()
in the main loop- TSC initialization added for precise timing support
Note: The PR title mentions "EEVDF stack issue" but these changes appear to be API migration. The actual stack-related fixes may be in other files not shown in this review.
Also applies to: 7-7, 24-24, 29-29, 713-713, 783-783, 785-785
kernel/etc/Editor.c (1)
6-6
: LGTM! Editor yield points migrated to Scheduler API.The editor's cooperative yield calls are correctly updated to use
Yield()
from the unified Scheduler interface. These occur at appropriate points during input waiting, maintaining the existing editor behavior.Also applies to: 51-51, 241-241, 254-254
fs/VFRFS.c (1)
4-4
: LGTM! Filesystem PID tracking migrated to Scheduler API.The file open operation correctly captures the owner process ID using
GetCurrentProcess()->pid
from the unified Scheduler interface, maintaining existing process ownership tracking behavior.Also applies to: 210-210
kernel/sched/MLFQ.c (1)
1717-1717
: LGTM!The initialization logging additions are consistent with the codebase patterns and improve observability during scheduler initialization.
Also applies to: 1798-1798
kernel/execf/aout/AoutLoader.c (1)
5-5
: LGTM!The migration from MLFQ-specific APIs to the unified Scheduler API is consistent with the broader refactoring objectives.
Also applies to: 143-143
fs/EXT/Ext2.c (1)
6-6
: LGTM!The migration to the unified Scheduler API is applied consistently across all locking operations throughout the file.
Also applies to: 68-68, 86-86, 90-90, 95-95, 98-98
arch/x86_64/interrupts/Interrupts.c (1)
10-10
: LGTM!The scheduler API migration is correct, and the added return statement on line 31 properly prevents fall-through after handling the timer interrupt.
Also applies to: 27-31, 110-114
kernel/execf/pe/PEloader.c (1)
6-6
: LGTM!The scheduler API migration is consistent with changes across the codebase.
Also applies to: 112-112, 216-216
kernel/core/Compositor.c (1)
11-11
: LGTM!The new
VFCompositorRequestInit
function is well-structured with appropriate conditional compilation guards, cached PID management to prevent duplicate initialization, and proper error handling.Also applies to: 32-61
kernel/execf/elf/ELFloader.c (1)
6-6
: LGTM!The scheduler API migration is consistent with the broader refactoring.
Also applies to: 122-122, 269-269
kernel/sched/EEVDF.c (5)
325-328
: LGTM!The defensive check prevents double-insertion into the red-black tree, which could corrupt the tree structure.
446-453
: LGTM!The rb_leftmost update logic correctly walks up ancestors while coming from the right child, fixing the corruption issue mentioned in previous reviews. The use of a separate
current
pointer prevents overwriting thenode
variable that's needed for the deletion.
815-829
: LGTM!The shell process creation is properly guarded by conditional compilation and includes appropriate error handling and logging.
884-884
: LGTM!Using the creator's PID for the token is correct. Since
EEVDFGetCurrentProcess()
never returns NULL (it panics on invalid indices), the dereference at line 903 is safe.Also applies to: 903-903
910-910
: LGTM - Critical stack fix!Removing the addition of
EEVDF_STACK_SIZE
is correct.VMemAllocStack
already returns the stack top (highest address) asbase_addr + total_size
, so addingEEVDF_STACK_SIZE
again would move the stack pointer beyond the allocated region, causing stack corruption. This fix aligns with the PR title "EEVDF stack issue".
Removed adding EEVDF_STACK_SIZE to VMemAllocStack() allocated stack, fix logic issues
Summary by CodeRabbit