Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
42 changes: 22 additions & 20 deletions arch/x86_64/interrupts/Interrupts.c
Original file line number Diff line number Diff line change
@@ -1,11 +1,11 @@
#include "Interrupts.h"
#include "Console.h"
#include "Ide.h"
#include "Io.h"
#include "MemOps.h"
#include "PS2.h"
#include "Panic.h"
#include "Pic.h"
Copy link

@coderabbitai coderabbitai bot Aug 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Critical: EOI path is hardwired to PIC; breaks APIC mode

All EOI calls here go to PICSendEOI(), bypassing the controller abstraction. In APIC mode, failing to write LAPIC EOI will wedge further interrupts. Route through InterruptControllerSendEOI(regs->interrupt_number) instead, and include InterruptController.h.

Apply this diff:

-#include "Pic.h"
+#include "InterruptController.h"
@@
-            PICSendEOI(regs->interrupt_number);
+            InterruptControllerSendEOI(regs->interrupt_number);
@@
-            PICSendEOI(regs->interrupt_number);
+            InterruptControllerSendEOI(regs->interrupt_number);
@@
-        case 34:
-            PICSendEOI(regs->interrupt_number);
+        case 34:
+            InterruptControllerSendEOI(regs->interrupt_number);
             return;
@@
-            PICSendEOI(regs->interrupt_number);
+            InterruptControllerSendEOI(regs->interrupt_number);
@@
-            PICSendEOI(regs->interrupt_number);
+            InterruptControllerSendEOI(regs->interrupt_number);
@@
-            PICSendEOI(regs->interrupt_number);
+            InterruptControllerSendEOI(regs->interrupt_number);

Optional: If you want to keep direct PIC usage for PIC-only builds, add a fast path chosen at compile- or run-time, but keep one public API for EOIs.

Also applies to: 18-18, 23-25, 26-28, 32-33, 37-38, 42-43, 46-47, 50-51

🤖 Prompt for AI Agents
In arch/x86_64/interrupts/Interrupts.c around line 7 (and also apply to the
other listed sites: 18, 23-25, 26-28, 32-33, 37-38, 42-43, 46-47, 50-51), the
file currently includes Pic.h and calls PICSendEOI() directly; replace that with
the controller abstraction by including InterruptController.h at the top and
replace each PICSendEOI(...) call with
InterruptControllerSendEOI(regs->interrupt_number). If you must keep a PIC-only
fast path, implement it behind a compile- or run-time branch but expose and use
the single public API InterruptControllerSendEOI(...) everywhere in this file.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well i acknowledge that but the thing is, the APIC, does NOT work, when STI, it just dies, no kernel panic, no crashing, no continue and just stops (not crashed btw, if it were to crash, qemu would said that its paused)

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

@assembler-0, I understand the APIC hang issue - that's a classic symptom of APIC initialization problems. The "dies on STI" behavior typically indicates interrupt routing or LAPIC configuration issues.

However, we can solve both problems properly:

For the architecture issue: Instead of hardcoding PIC calls, let's use conditional compilation or runtime detection:

#include "InterruptController.h"

// In interrupt handlers, use:
InterruptControllerSendEOI(regs->interrupt_number);

The InterruptController.c already handles the APIC/PIC selection, so this maintains the abstraction while letting you force PIC mode through the interrupt controller's logic.

For debugging the APIC hang: The STI hang suggests:

  1. LAPIC not accepting interrupts - Check LAPIC_SPURIOUS_INTERRUPT_VECTOR register
  2. Missing spurious interrupt handler - APIC requires handling spurious vectors
  3. IOAPIC routing issues - Interrupt redirection table misconfigured
  4. LAPIC timer misconfigured - Can cause immediate hangs

Want me to help investigate the specific APIC hang issue? The proper fix is getting APIC working, not bypassing the abstraction layer.


🏁 Script executed:

#!/bin/bash
# Let's check the APIC initialization code
echo "=== Searching for APIC initialization ==="
rg -n -A5 -B5 "ApicInstall|APIC.*install" --type=c

echo -e "\n=== Looking for spurious interrupt handling ==="
rg -n -A3 -B3 "spurious|SPURIOUS" --type=c

echo -e "\n=== Checking LAPIC/IOAPIC setup ==="
rg -n -A5 -B5 "LAPIC|IOAPIC|Local.*APIC" --type=c

Length of output: 12325


I found the issue! The APIC initialization is setting all interrupt entries as MASKED initially, but then never enables/unmasks them when setting up the interrupts.

Looking at the code:

  1. Line 151: IoApicWrite(IOAPIC_REG_REDTBL_BASE + 0*2, 32 | APIC_DELMODE_FIXED | **APIC_INT_MASKED**);
  2. Lines 155, 159, 163, 167, 171: All other IRQs also set APIC_INT_MASKED

But when you call APIC_enable_irq(), it only unmasks if the IRQ is < ioapic_max_redirections, and ioapic_max_redirections is never initialized!

The fix is simple: Initialize ioapic_max_redirections and actually unmask the interrupts during APIC setup.

In your APIC installation, you need to:

  1. Read the IOAPIC version register to get max redirections
  2. Actually unmask the interrupts you configure
  3. Call the enable functions for required IRQs

This explains the STI hang - interrupts are routed to APIC but all masked, so no interrupts fire and the system appears hung.

Want me to show the specific fixes needed in the APIC code?


🏁 Script executed:

#!/bin/bash
# Check if ioapic_max_redirections is ever initialized
echo "=== Checking ioapic_max_redirections initialization ==="
rg -n -A3 -B3 "ioapic_max_redirections" --type=c

echo -e "\n=== Check where APIC_enable_irq is called ==="
rg -n -A2 -B2 "APIC_enable_irq|ApicEnable" --type=c

echo -e "\n=== Check interrupt enabling ==="
rg -n -A3 -B3 "enable.*irq|unmask.*irq" --type=c

Length of output: 8212


Perfect! I found the exact issue. The APIC setup is almost correct, but there's a critical missing piece:

The kernel calls PIC_enable_irq() directly instead of using the abstraction layer!

Look at kernel/core/Kernel.c lines 505-510:

static void IRQUnmaskCoreSystems() {
    PrintKernel("Unmasking IRQs...");
    PIC_enable_irq(0);  // ← These bypass the abstraction!
    PIC_enable_irq(1);
    PIC_enable_irq(12);
    PIC_enable_irq(2);
    PIC_enable_irq(14);
    PIC_enable_irq(15);

When APIC is active, these PIC_enable_irq() calls do nothing to the APIC - they only modify the PIC mask state, but the APIC interrupts remain masked!

The Fix

Replace the direct PIC calls with the abstraction layer:

// In kernel/core/Kernel.c, change IRQUnmaskCoreSystems():
static void IRQUnmaskCoreSystems() {
    PrintKernel("Unmasking IRQs...");
    IC_enable_irq(0);   // Timer
    IC_enable_irq(1);   // Keyboard  
    IC_enable_irq(12);  // Mouse
    IC_enable_irq(2);   // Cascade/FAT12
    IC_enable_irq(14);  // IDE Primary
    IC_enable_irq(15);  // IDE Secondary
    PrintKernelSuccess("System: IRQs unmasked\n");
}

And fix the interrupt handler to use the abstraction:

// In arch/x86_64/interrupts/Interrupts.c
#include "InterruptController.h"  // Instead of "Pic.h"

// Replace all PICSendEOI calls with:
InterruptControllerSendEOI(regs->interrupt_number);

This explains the STI hang perfectly:

  1. APIC initializes with all interrupts masked
  2. Kernel tries to unmask IRQs using PIC_enable_irq()
  3. APIC interrupts stay masked, PIC is disabled
  4. STI enables interrupts but none can fire → system hangs

The ioapic_max_redirections is being initialized correctly (line 61), so the APIC code itself is fine - it's just never being called!

Want me to create the full fix patch?


🧬 Code Graph Analysis Results
  • File: drivers/Pic.c, lines 67-70
void PICSendEOI(uint64_t interrupt_number) {
    if (interrupt_number >= 40) outb(PIC2_COMMAND, 0x20);
    outb(PIC1_COMMAND, 0x20);
}
  • File: drivers/Ide.c, lines 257-261
void IDEPrimaryIRQH(void) {
    // Primary IDE channel interrupt
    // Read status to acknowledge
    inb(IDE_PRIMARY_BASE + IDE_REG_STATUS);
}
  • File: drivers/Ide.c, lines 263-267
void IDESecondaryIRQH(void) {
    // Secondary IDE channel interrupt  
    // Read status to acknowledge
    inb(IDE_SECONDARY_BASE + IDE_REG_STATUS);
}
  • File: kernel/etc/Console.c, lines 190-195
void PrintKernelWarning(const char* str) {
    ConsoleSetColor(VGA_COLOR_WARNING);
    PrintKernelBadge("WARNING");
    PrintKernel(str);
    ConsoleSetColor(VGA_COLOR_DEFAULT);
}
  • File: kernel/etc/Console.c, lines 259-283
void PrintKernelInt(int64_t num) {
    char buffer[21];

    if (num == 0) {
        PrintKernel("0");
        return;
    }

    bool negative = num < 0;
    if (negative) num = -num;

    int pos = 20;
    buffer[pos--] = '\0';

    while (num > 0 && pos >= 0) {
        buffer[pos--] = '0' + (num % 10);
        num /= 10;
    }

    if (negative && pos >= 0) {
        buffer[pos--] = '-';
    }

    PrintKernel(&buffer[pos + 1]);
}
  • File: kernel/memory/MemOps.c, lines 44-67
void htoa(uint64_t n, char* buffer) {
    static const char hex_chars[16] = "0123456789ABCDEF";
    buffer[0] = '0';
    buffer[1] = 'x';

    // Unroll the loop for better performance
    buffer[2]  = hex_chars[(n >> 60) & 0xF];
    buffer[3]  = hex_chars[(n >> 56) & 0xF];
    buffer[4]  = hex_chars[(n >> 52) & 0xF];
    buffer[5]  = hex_chars[(n >> 48) & 0xF];
    buffer[6]  = hex_chars[(n >> 44) & 0xF];
    buffer[7]  = hex_chars[(n >> 40) & 0xF];
    buffer[8]  = hex_chars[(n >> 36) & 0xF];
    buffer[9]  = hex_chars[(n >> 32) & 0xF];
    buffer[10] = hex_chars[(n >> 28) & 0xF];
    buffer[11] = hex_chars[(n >> 24) & 0xF];
    buffer[12] = hex_chars[(n >> 20) & 0xF];
    buffer[13] = hex_chars[(n >> 16) & 0xF];
    buffer[14] = hex_chars[(n >> 12) & 0xF];
    buffer[15] = hex_chars[(n >> 8) & 0xF];
    buffer[16] = hex_chars[(n >> 4) & 0xF];
    buffer[17] = hex_chars[n & 0xF];
    buffer[18] = '\0';
}
  • File: kernel/memory/MemOps.c, lines 6-36
void strcpy(char* dest, const char* src) {
    // Optimize for 64-bit aligned copies when possible
    if (((uintptr_t)dest & 7) == 0 && ((uintptr_t)src & 7) == 0) {
        uint64_t* d64 = (uint64_t*)dest;
        const uint64_t* s64 = (const uint64_t*)src;

        uint64_t val;
        while ((val = *s64++) != 0) {
            // Check if any byte in the 64-bit value is zero
            if ((val & 0xFF00000000000000ULL) == 0 ||
                (val & 0x00FF000000000000ULL) == 0 ||
                (val & 0x0000FF0000000000ULL) == 0 ||
                (val & 0x000000FF00000000ULL) == 0 ||
                (val & 0x00000000FF000000ULL) == 0 ||
                (val & 0x0000000000FF0000ULL) == 0 ||
                (val & 0x000000000000FF00ULL) == 0 ||
                (val & 0x00000000000000FFULL) == 0) {
                // Found null terminator, fall back to byte copy
                char* d = (char*)d64;
                const char* s = (const char*)(s64 - 1);
                while ((*d++ = *s++));
                return;
            }
            *d64++ = val;
        }
        *(char*)d64 = '\0';
    } else {
        // Original byte-by-byte copy for unaligned data
        while ((*dest++ = *src++));
    }
}
  • File: kernel/memory/MemOps.c, lines 38-42
void strcat(char* dest, const char* src) {
    // Find end of dest string more efficiently
    while (*dest) dest++;
    strcpy(dest, src);  // Reuse optimized strcpy
}
  • File: kernel/memory/MemOps.c, lines 69-89
void itoa(uint64_t n, char* buffer) {
    if (n == 0) {
        buffer[0] = '0';
        buffer[1] = '\0';
        return;
    }

    char temp_buffer[21];
    char* p = &temp_buffer[20];
    *p = '\0';

    // Use faster division by avoiding modulo when possible
    while (n >= 10) {
        uint64_t q = n / 10;
        *--p = '0' + (n - q * 10);  // Faster than n % 10
        n = q;
    }
    *--p = '0' + n;

    strcpy(buffer, p);
}
  • File: drivers/PS2.c, lines 118-168
void KeyboardHandler(void) {
    uint8_t status = inb(KEYBOARD_STATUS_PORT);
    if (!(status & 0x01)) return;

    // If bit 5 is set, the data is for the mouse.
    // In that case, the MouseHandler will deal with it, so we should ignore it.
    if (status & 0x20) {
        return;
    }

    uint8_t scancode = inb(KEYBOARD_DATA_PORT);
    int key_released = scancode & 0x80;
    scancode &= 0x7F;

    // Handle modifier keys
    if (scancode == 0x2A || scancode == 0x36) { // Left/Right Shift
        shift_pressed = !key_released;
        return;
    }
    if (scancode == 0x1D) { // Ctrl
        ctrl_pressed = !key_released;
        return;
    }
    if (scancode == 0x38) { // Alt
        alt_pressed = !key_released;
        return;
    }

    if (key_released) return;
    if (scancode >= sizeof(scancode_to_ascii)) return;

    char c;
    if (shift_pressed) {
        c = stcancode_to_ascii_shift[scancode];
    } else {
        c = scancode_to_ascii[scancode];
    }

    // Handle Ctrl combinations
    if (ctrl_pressed && c >= 'a' && c <= 'z') {
        c = c - 'a' + 1;
    } else if (ctrl_pressed && c >= 'A' && c <= 'Z') {
        c = c - 'A' + 1;
    }

    if (c && buffer_count < 255) {
        input_buffer[buffer_tail] = c;
        buffer_tail = (buffer_tail + 1) % 256;
        buffer_count++;
    }
}
  • File: drivers/PS2.c, lines 170-226
void MouseHandler(void) {
    uint8_t status = inb(KEYBOARD_STATUS_PORT);
    if (!(status & 0x01)) return;

    // Check if this is mouse data (bit 5 set in status)
    if (!(status & 0x20)) return;

    uint8_t data = inb(KEYBOARD_DATA_PORT);

    mouse.packet[mouse.packet_index] = data;
    mouse.packet_index++;

    // Standard PS2 mouse sends 3-byte packets
    if (mouse.packet_index >= 3) {
        // Parse packet
        uint8_t flags = mouse.packet[0];
        int8_t delta_x = (int8_t)mouse.packet[1];
        int8_t delta_y = (int8_t)mouse.packet[2];

        // Check if packet is valid (bit 3 should be set)
        if (flags & 0x08) {
            // Handle X overflow
            if (flags & 0x40) {
                delta_x = (flags & 0x10) ? -256 : 255;
            } else if (flags & 0x10) {
                delta_x = (delta_x == 0) ? 0 : (delta_x | 0xFFFFFF00);
            }

            // Handle Y overflow
            if (flags & 0x80) {
                delta_y = (flags & 0x20) ? -256 : 255;
            } else if (flags & 0x20) {
                delta_y = (delta_y == 0) ? 0 : (delta_y | 0xFFFFFF00);
            }

            // Y axis is inverted in PS2 mouse
            delta_y = -delta_y;

            // Update position
            mouse.delta_x = delta_x;
            mouse.delta_y = delta_y;
            mouse.x += delta_x;
            mouse.y += delta_y;

            // Update button state
            mouse.buttons = flags & 0x07; // Lower 3 bits are button states

            // Clamp position to reasonable bounds (you can adjust these)
            if (mouse.x < 0) mouse.x = 0;
            if (mouse.y < 0) mouse.y = 0;
            if (mouse.x > 1023) mouse.x = 1023;  // Adjust for your screen resolution
            if (mouse.y > 767) mouse.y = 767;
        }

        mouse.packet_index = 0;
    }
}
  • File: kernel/process/Process.c, lines 705-882
    Summary (docstring-style):
  • Function: void FastSchedule(struct Registers* regs)
  • Purpose: Perform MLFQ-based scheduling, including context switching, quantum management, and various priority/aging adjustments.
  • Parameters: regs - pointer to the current processor state to be saved/restored as part of scheduling.
  • Return: None (void)
  • Behavior details:
    • Acquires scheduler_lock, increments scheduling counters, and applies aging/fairness adjustments.
    • Computes CPU burst history, validates process tokens, and may terminate a process on token corruption.
    • Saves the current process context, decides whether to preempt, and, if preempting, demotes/promotes priorities, re-queues, and prepares the next process.
    • Chooses the next ready process with FindBestQueue / DeQueue, and performs a context switch (FastMemcpy / AstraPreflightCheck / Astro consistent checks).
    • Updates quantum_remaining, last_scheduled_tick, context, and context_switch_overhead.
    • Handles idle case when no ready process exists.
  • Exceptions/Errors: Uses goto select_next for retry in some edge cases; may terminate a process on token corruption via ASTerminate.
  • Important implementation notes: Contains multiple FIXED comments indicating adjustments to preemption, aging, and quantum behavior. Interaction with other subsystems includes token validation, scheduler queues, and context saving/restoration.

🧠 Learnings used
Learnt from: assembler-0
PR: assembler-0/VoidFrame#79
File: drivers/PCI/PCI.c:154-154
Timestamp: 2025-08-13T07:26:56.892Z
Learning: VoidFrame kernel uses direct interrupt dispatch in arch/x86_64/interrupts/Interrupts.c rather than RegisterInterruptHandler pattern. Hardware interrupts are handled in switch cases by interrupt number (32+ for IRQs), with proper PIC EOI handling required.

Learnt from: assembler-0
PR: assembler-0/VoidFrame#62
File: kernel/core/Panic.c:69-79
Timestamp: 2025-08-10T06:22:28.816Z
Learning: In VoidFrame kernel, panic functions must call "asm volatile("cli")" as the very first instruction to disable interrupts immediately. Without this, if interrupts are enabled (after "sti"), an interrupt can fire during panic handling causing the panic to fail, hang, or behave unpredictably. This must be done in ALL panic entry points: Panic(), PanicWithCode(), PanicWithContext(), PanicFromInterrupt(), and at the start of KernelPanicHandler().

Learnt from: assembler-0
PR: assembler-0/VoidFrame#62
File: kernel/core/Panic.c:69-79
Timestamp: 2025-08-10T06:22:28.816Z
Learning: In VoidFrame kernel, panic functions must disable interrupts with "cli" as the very first operation, before any other code executes. If interrupts are enabled (after "sti"), an interrupt can fire between the panic call and the cli instruction in KernelPanicHandler, causing the panic to fail or behave unpredictably.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

bruh just skip it, do i look like i forgot to change the IRQ handler?, anyways, room for improvements for the MLFQ sched and security features, since this is a syscall less kernel, and this uint32_t new_priority = 0;, if its not 0, it would just blew up, hmm no user process will be scheduled

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

@assembler-0, got it - skipping the interrupt controller abstraction for now. Let me investigate that MLFQ priority issue you mentioned.


🏁 Script executed:

#!/bin/bash
# Find the specific priority line mentioned
echo "=== Searching for 'new_priority = 0' hardcoded value ==="
rg -n -A3 -B3 "new_priority.*=.*0" --type=c

echo -e "\n=== Check priority handling in scheduler ==="
rg -n -A5 -B5 "new_priority|priority.*=" kernel/process/Process.c

echo -e "\n=== Look for user process creation and priority assignment ==="
rg -n -A3 -B3 "user.*process|PROC.*USER|priority.*user" --type=c

echo -e "\n=== Check RT_PRIORITY_THRESHOLD and priority constants ==="
rg -n -A2 -B2 "RT_PRIORITY_THRESHOLD|PRIORITY_.*=" --type=c

Length of output: 18711


I found the exact issue you're referring to! The problem is in the SmartAging() function at line 645:

// To prevent starvation, user processes must be boosted
// to the highest priority (0) to guarantee they get to run. Boosting
// them to a lower priority was ineffective.
uint32_t new_priority = 0;

The Issue: This hardcodes user processes to boost to priority 0 (RT level), but your MLFQ design has RT_PRIORITY_THRESHOLD = 3. This means:

  • Priority 0-2: Real-time queues (for system processes)
  • Priority 3-4: Regular queues (for user processes)

When user processes get boosted to priority 0, they're jumping into the system-level RT queue, which breaks the isolation between system and user processes in your syscall-less kernel.

The Fix: Change the hardcoded new_priority = 0 to respect the RT threshold:

// In SmartAging() around line 645
uint32_t new_priority = RT_PRIORITY_THRESHOLD; // User processes boost to highest user priority (3)

This aligns with your other user process handling:

  • Line 724: processes[i].priority = RT_PRIORITY_THRESHOLD; // User processes to user RT
  • Line 808: old_proc->priority = RT_PRIORITY_THRESHOLD;
  • Line 1035: processes[slot].priority = (privilege == PROC_PRIV_SYSTEM) ? 0 : RT_PRIORITY_THRESHOLD;

Security Implications: The current hardcoded 0 accidentally elevates user processes to system-level priority during anti-starvation boosts, potentially breaking your privilege separation in the syscall-less design.

Want me to create an issue for this MLFQ priority isolation fix?


🧬 Code Graph Analysis Results

Here are the snippets and summaries from other parts of the codebase that are relevant to understanding arch/x86_64/interrupts/Interrupts.c and its handling of IRQs, scheduling, and security checks.

  • File: drivers/Pic.c [lines 67-70]
    Full snippet:
    void PICSendEOI(uint64_t interrupt_number) {
    if (interrupt_number >= 40) outb(PIC2_COMMAND, 0x20);
    outb(PIC1_COMMAND, 0x20);
    }

  • File: drivers/Ide.c [lines 257-261]
    Full snippet:
    void IDEPrimaryIRQH(void) {
    // Primary IDE channel interrupt
    // Read status to acknowledge
    inb(IDE_PRIMARY_BASE + IDE_REG_STATUS);
    }

  • File: drivers/Ide.c [lines 263-267]
    Full snippet:
    void IDESecondaryIRQH(void) {
    // Secondary IDE channel interrupt
    // Read status to acknowledge
    inb(IDE_SECONDARY_BASE + IDE_REG_STATUS);
    }

  • File: kernel/etc/Console.c [lines 190-195]
    Full snippet:
    void PrintKernelWarning(const char* str) {
    ConsoleSetColor(VGA_COLOR_WARNING);
    PrintKernelBadge("WARNING");
    PrintKernel(str);
    ConsoleSetColor(VGA_COLOR_DEFAULT);
    }

  • File: kernel/etc/Console.c [lines 259-283]
    Full snippet:
    void PrintKernelInt(int64_t num) {
    char buffer[21];

    if (num == 0) {
        PrintKernel("0");
        return;
    }
    
    bool negative = num < 0;
    if (negative) num = -num;
    
    int pos = 20;
    buffer[pos--] = '\0';
    
    while (num > 0 && pos >= 0) {
        buffer[pos--] = '0' + (num % 10);
        num /= 10;
    }
    
    if (negative && pos >= 0) {
        buffer[pos--] = '-';
    }
    
    PrintKernel(&buffer[pos + 1]);
    

    }

  • File: kernel/memory/MemOps.c [lines 44-67]
    Full snippet:
    void htoa(uint64_t n, char* buffer) {
    static const char hex_chars[16] = "0123456789ABCDEF";
    buffer[0] = '0';
    buffer[1] = 'x';

    // Unroll the loop for better performance
    buffer[2]  = hex_chars[(n >> 60) & 0xF];
    buffer[3]  = hex_chars[(n >> 56) & 0xF];
    buffer[4]  = hex_chars[(n >> 52) & 0xF];
    buffer[5]  = hex_chars[(n >> 48) & 0xF];
    buffer[6]  = hex_chars[(n >> 44) & 0xF];
    buffer[7]  = hex_chars[(n >> 40) & 0xF];
    buffer[8]  = hex_chars[(n >> 36) & 0xF];
    buffer[9]  = hex_chars[(n >> 32) & 0xF];
    buffer[10] = hex_chars[(n >> 28) & 0xF];
    buffer[11] = hex_chars[(n >> 24) & 0xF];
    buffer[12] = hex_chars[(n >> 20) & 0xF];
    buffer[13] = hex_chars[(n >> 16) & 0xF];
    buffer[14] = hex_chars[(n >> 12) & 0xF];
    buffer[15] = hex_chars[(n >> 8)  & 0xF];
    buffer[16] = hex_chars[(n >> 4)  & 0xF];
    buffer[17] = hex_chars[n & 0xF];
    buffer[18] = '\0';
    

    }

  • File: kernel/memory/MemOps.c [lines 6-36]
    Full snippet:
    void strcpy(char* dest, const char* src) {
    // Optimize for 64-bit aligned copies when possible
    if (((uintptr_t)dest & 7) == 0 && ((uintptr_t)src & 7) == 0) {
    uint64_t* d64 = (uint64_t*)dest;
    const uint64_t* s64 = (const uint64_t*)src;

        uint64_t val;
        while ((val = *s64++) != 0) {
            // Check if any byte in the 64-bit value is zero
            if ((val & 0xFF00000000000000ULL) == 0 ||
                (val & 0x00FF000000000000ULL) == 0 ||
                (val & 0x0000FF0000000000ULL) == 0 ||
                (val & 0x000000FF00000000ULL) == 0 ||
                (val & 0x00000000FF000000ULL) == 0 ||
                (val & 0x0000000000FF0000ULL) == 0 ||
                (val & 0x000000000000FF00ULL) == 0 ||
                (val & 0x00000000000000FFULL) == 0) {
                // Found null terminator, fall back to byte copy
                char* d = (char*)d64;
                const char* s = (const char*)(s64 - 1);
                while ((*d++ = *s++));
                return;
            }
            *d64++ = val;
        }
        *(char*)d64 = '\0';
    } else {
        // Original byte-by-byte copy for unaligned data
        while ((*dest++ = *src++));
    }
    

    }

  • File: kernel/memory/MemOps.c [lines 38-42]
    Full snippet:
    void strcat(char* dest, const char* src) {
    // Find end of dest string more efficiently
    while (*dest) dest++;
    strcpy(dest, src); // Reuse optimized strcpy
    }

  • File: kernel/memory/MemOps.c [lines 69-89]
    Full snippet:
    void itoa(uint64_t n, char* buffer) {
    if (n == 0) {
    buffer[0] = '0';
    buffer[1] = '\0';
    return;
    }

    char temp_buffer[21];
    char* p = &temp_buffer[20];
    *p = '\0';
    
    // Use faster division by avoiding modulo when possible
    while (n >= 10) {
        uint64_t q = n / 10;
        *--p = '0' + (n - q * 10);  // Faster than n % 10
        n = q;
    }
    *--p = '0' + n;
    
    strcpy(buffer, p);
    

    }

  • File: drivers/PS2.c [lines 118-168]
    Summary:
    KeyboardHandler(void) reads the PS/2 keyboard status, filters out mouse input, decodes scancodes, tracks modifier keys (Shift, Ctrl, Alt), translates to ASCII with or without shift, applies Ctrl-key combinations, and buffers input into input_buffer with wraparound. It updates internal state such as shift_pressed, ctrl_pressed, alt_pressed, and buffer_tail/buffer_count.

  • File: drivers/PS2.c [lines 170-226]
    Summary:
    MouseHandler(void) reads PS/2 mouse data when available, accumulates 3-byte packets, validates the packet, computes deltas with overflow handling, inverts Y axis, updates mouse state (x, y, delta_x, delta_y, buttons), and clamps coordinates to a defined range. It then resets the packet index.

  • File: kernel/process/Process.c [lines 705-882]
    Summary:
    FastSchedule(struct Registers* regs) implements a multi-level feedback queue (MLFQ) scheduler with various illness/fairness adjustments:

    • Maintains scheduler_lock and tick counters
    • Performs fairness boosts at intervals and handles aging
    • Tracks CPU bursts and CPU time per process
    • Validates process tokens after a run (ValidateToken) and may terminate corrupt tokens (ASTerminate)
    • Preserves and updates process context during preemption decisions
    • Chooses next process via FindBestQueue, DeQueue, AstraPreflightCheck
    • Performs context switch, resets quantum, and tracks context switch overhead
    • Special handling of system vs. user processes, including priority adjustments and demotions
    • Uses various constants/thresholds (FAIRNESS_BOOST_ACTUAL_INTERVAL, FAIRNESS_WAIT_THRESHOLD, STARVATION_THRESHOLD, RT_PRIORITY_THRESHOLD, PREEMPTION_MIN_PRIORITY_GAP, etc.)

Note: The Interrupts.c file calls:

  • FastSchedule(regs) on timer interrupts (IRQ 0, interrupt_number 32)
  • KeyboardHandler() on keyboard interrupts (IRQ 1, interrupt_number 33)
  • MouseHandler() on mouse interrupts (IRQ 12, interrupt_number 44)
  • IDEPrimaryIRQH() / IDESecondaryIRQH() on IDE IRQs (interrupt_numbers 46 and 47)
  • PICSendEOI(regs->interrupt_number) to acknowledge each interrupt

Also relevant for understanding: Token validation and AstraPreflightCheck usage indicate security checks during scheduling and interrupt handling.


✏️ Learnings added
Learnt from: assembler-0
PR: assembler-0/VoidFrame#95
File: arch/x86_64/interrupts/Interrupts.c:7-7
Timestamp: 2025-08-23T10:11:58.964Z
Learning: In VoidFrame kernel's MLFQ scheduler, user processes should be boosted to RT_PRIORITY_THRESHOLD (priority 3) during anti-starvation handling, not hardcoded priority 0. Priority 0-2 are reserved for system processes (RT queues), while priority 3-4 are for user processes (regular queues). The SmartAging function at line 645 incorrectly uses "uint32_t new_priority = 0;" which breaks privilege separation in the syscall-less kernel design.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: assembler-0
PR: assembler-0/VoidFrame#79
File: drivers/PCI/PCI.c:154-154
Timestamp: 2025-08-13T07:26:56.892Z
Learning: VoidFrame kernel uses direct interrupt dispatch in arch/x86_64/interrupts/Interrupts.c rather than RegisterInterruptHandler pattern. Hardware interrupts are handled in switch cases by interrupt number (32+ for IRQs), with proper PIC EOI handling required.

Learnt from: assembler-0
PR: assembler-0/VoidFrame#62
File: kernel/core/Panic.c:69-79
Timestamp: 2025-08-10T06:22:28.816Z
Learning: In VoidFrame kernel, panic functions must call "asm volatile("cli")" as the very first instruction to disable interrupts immediately. Without this, if interrupts are enabled (after "sti"), an interrupt can fire during panic handling causing the panic to fail, hang, or behave unpredictably. This must be done in ALL panic entry points: Panic(), PanicWithCode(), PanicWithContext(), PanicFromInterrupt(), and at the start of KernelPanicHandler().

Learnt from: assembler-0
PR: assembler-0/VoidFrame#62
File: kernel/core/Panic.c:69-79
Timestamp: 2025-08-10T06:22:28.816Z
Learning: In VoidFrame kernel, panic functions must disable interrupts with "cli" as the very first operation, before any other code executes. If interrupts are enabled (after "sti"), an interrupt can fire between the panic call and the cli instruction in KernelPanicHandler, causing the panic to fail or behave unpredictably.

Learnt from: assembler-0
PR: assembler-0/VoidFrame#82
File: kernel/process/Process.h:0-0
Timestamp: 2025-08-14T02:11:27.395Z
Learning: In VoidFrame kernel, FastMemcpy was using SSE2 movdqu instructions without preserving XMM registers, causing crashes when called from interrupt context (like the scheduler's FastSchedule function). The fix is to use only general-purpose registers with 64-bit unrolled copies to maintain performance while being interrupt-safe.

Learnt from: assembler-0
PR: assembler-0/VoidFrame#63
File: kernel/etc/Shell.c:24-28
Timestamp: 2025-08-10T08:52:13.313Z
Learning: The VoidFrame kernel has critical issues with PID allocation - it uses a simple incrementing counter without recycling freed PIDs, causing PID 4 to become PID 5 after death instead of reusing slot 4.

Learnt from: assembler-0
PR: assembler-0/VoidFrame#80
File: README.md:7-8
Timestamp: 2025-08-13T08:43:54.750Z
Learning: User assembler-0 is working on VoidFrame kernel with a tight deadline (2-3 days before summer ends) and specifically requested runtime safety checks similar to __stack_chk_fail but for deadlocks and other kernel issues. The kernel already has extensive safety mechanisms including stack canaries, guard pages, heap integrity checks, and process security validation.

Learnt from: assembler-0
PR: assembler-0/VoidFrame#82
File: kernel/process/Process.h:0-0
Timestamp: 2025-08-14T02:11:58.963Z
Learning: In VoidFrame kernel's FastMemcpy implementation (kernel/memory/MemOps.c), memory corruption occurs under high load due to interrupts occurring between SSE2 movdqu load and store instructions, corrupting XMM register state. The fix requires disabling interrupts around SSE operations and adding memory barriers (sfence) to ensure stores complete atomically.

Learnt from: assembler-0
PR: assembler-0/VoidFrame#70
File: kernel/process/Process.c:1224-1228
Timestamp: 2025-08-11T09:51:51.217Z
Learning: In VoidFrame kernel, avoid using floating-point arithmetic in kernel space functions like SystemService. Use fixed-point arithmetic instead (e.g., 16.16 format with integer operations) to prevent FPU state corruption, improve performance, and ensure predictable behavior. This is especially important for system-level processes running with PROC_PRIV_SYSTEM privileges.

Learnt from: assembler-0
PR: assembler-0/VoidFrame#72
File: arch/x86_64/syscall/SyscallEntry.asm:24-30
Timestamp: 2025-08-12T01:43:39.847Z
Learning: User prefers to defer syscall implementation complexity until next year and is considering making a ring 0 kernel instead of continuing with user-mode syscall approach in the VoidFrame kernel project.

Learnt from: assembler-0
PR: assembler-0/VoidFrame#70
File: kernel/process/Process.c:1224-1228
Timestamp: 2025-08-11T09:51:32.713Z
Learning: In the VoidFrame kernel (kernel/process/Process.c), avoid using floating-point arithmetic in kernel space. Use fixed-point arithmetic instead with integer scaling (e.g., multiplying by 1000 for 3 decimal places) to avoid FPU state management overhead and potential exceptions.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how is that useful when i literally says uint32_t new_priority = RT_PRIORITY_THRESHOLD; literally blew up the entire scheduler?, it just only doesnt starve, it simply refuses to switch, and the problem is uint32_t new_priority = RT_PRIORITY_THRESHOLD; is the problem, that i found interesting

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

🧩 Analysis chain

@assembler-0 That's a fascinating observation! You're right - there's clearly something deeper going on with the scheduler's queue management at RT_PRIORITY_THRESHOLD. Let me investigate why boosting to priority 3 breaks process switching entirely.

The issue might be in how the scheduler differentiates between RT and regular queues at the boundary. When you boost to exactly RT_PRIORITY_THRESHOLD (priority 3), it could be:

  1. Bitmap confusion: The scheduler uses separate rt_bitmap and active_bitmap - boosting to priority 3 might not set the right bitmap bits
  2. FindBestQueue logic: The queue selection algorithm might have edge case handling at the RT/regular boundary
  3. Preemption rules: Priority 3 might be treated inconsistently in preemption decisions

What specific behavior do you see when it "refuses to switch"? Does it:

  • Get stuck on one process indefinitely?
  • Stop scheduling entirely (idle loop)?
  • Only schedule system processes but ignore user processes?

This will help pinpoint whether it's a queue selection issue or bitmap management problem.


🏁 Script executed:

#!/bin/bash
# Let's examine the scheduler logic around RT_PRIORITY_THRESHOLD handling
echo "=== FindBestQueue implementation and RT_PRIORITY_THRESHOLD handling ==="
rg -n -A10 -B5 "FindBestQueue" kernel/process/Process.c

echo -e "\n=== Queue bitmap handling around RT_PRIORITY_THRESHOLD ==="
rg -n -A5 -B5 "active_bitmap|rt_bitmap.*RT_PRIORITY_THRESHOLD" kernel/process/Process.c

echo -e "\n=== Preemption logic with RT_PRIORITY_THRESHOLD ==="
rg -n -A8 -B3 "preempt.*RT_PRIORITY_THRESHOLD|RT_PRIORITY_THRESHOLD.*preempt" kernel/process/Process.c

echo -e "\n=== SmartAging boost logic and queue management ==="
ast-grep --pattern $'SmartAging($_) {
  $$$
  new_priority = $_;
  $$$
}'

Length of output: 6774


@assembler-0 Found the root cause! The issue is a critical bitmap management bug in AddToScheduler().

The Bug: When adding processes to the scheduler, rt_bitmap is never set, only cleared:

// Line 507 - active_bitmap gets set
MLFQscheduler.active_bitmap |= (1U << priority);

// But rt_bitmap is NEVER set for RT priorities!
// Only cleared in RemoveFromScheduler (lines 555-557)

Why Priority 0 "Works":

  • FindBestQueue() checks rt_active = active_bitmap & rt_bitmap (line 569)
  • Since rt_bitmap is always 0, rt_active is always 0
  • So it always falls through to regular queue selection (line 575)
  • Priority 0 processes end up in regular queue logic by accident

Why RT_PRIORITY_THRESHOLD "Breaks":

  • When SmartAging boosts to priority 3, it exposes that RT queues (0-2) are completely broken
  • The scheduler tries to find RT processes but rt_bitmap is empty
  • Regular queue selection at priority 3 hits edge cases in the load balancing logic

The Fix: Add rt_bitmap initialization in AddToScheduler():

void AddToScheduler(uint32_t slot, uint32_t priority) {
    // ... existing code ...
    EnQueue(&MLFQscheduler.queues[priority], slot);
    MLFQscheduler.active_bitmap |= (1U << priority);
    
    // FIX: Set rt_bitmap for RT priorities
    if (priority < RT_PRIORITY_THRESHOLD) {
        MLFQscheduler.rt_bitmap |= (1U << priority);
    }
    
    MLFQscheduler.total_processes++;
}

This explains why your entire RT scheduling has been accidentally broken but seemed to work - everything was falling through to regular queue selection! Want me to check if there are other bitmap inconsistencies?


🧬 Code Graph Analysis Results
  • File: kernel/process/Process.c, lines 705-882
    Summary (docstring-style):

    • Function: FastSchedule(struct Registers* regs)
    • Parameters: regs - pointer to thread/process register state
    • Return: void
    • Behavior: Acquires scheduler_lock, updates scheduling counters, and performs a fairness-by-timeout pass over processes. For each non-idle process, if its wait_time since last scheduled exceeds FAIRNESS_WAIT_THRESHOLD or STARVATION_THRESHOLD, adjusts its priority:
      • If the process is a system process and priority > 0, set priority to 0 (RT) to favor system tasks.
      • If the process is not a system process and priority > RT_PRIORITY_THRESHOLD, clamp its priority to RT_PRIORITY_THRESHOLD (prevent runaway user tasks from staying non-RT while still allowing RT-capable tasks to be scheduled).
    • Notes:
      • The logic explicitly uses RT_PRIORITY_THRESHOLD as a lower bound cap for non-system processes under fairness/ starvation conditions.
      • The function interacts with MLFQscheduler data (quantums, tick counters, last_scheduled_tick, etc.), may call FindBestQueue, AddToScheduler, AstraPreflightCheck, and potentially terminate a process on token corruption.
      • It ultimately performs a context switch by copying context, updating state, and returning via the scheduler flow.
    • Important details relevant to user comment: The RT_PRIORITY_THRESHOLD clamp is triggered when processes exceed wait thresholds, which is the behavior the user pointed to as influencing scheduler switches.
  • File: drivers/Pic.c, lines 67-70
    Complete snippet:
    void PICSendEOI(uint64_t interrupt_number) {
    if (interrupt_number >= 40) outb(PIC2_COMMAND, 0x20);
    outb(PIC1_COMMAND, 0x20);
    }

  • File: drivers/Ide.c, lines 257-261
    Complete snippet:
    void IDEPrimaryIRQH(void) {
    // Primary IDE channel interrupt
    // Read status to acknowledge
    inb(IDE_PRIMARY_BASE + IDE_REG_STATUS);
    }

  • File: drivers/Ide.c, lines 263-267
    Complete snippet:
    void IDESecondaryIRQH(void) {
    // Secondary IDE channel interrupt
    // Read status to acknowledge
    inb(IDE_SECONDARY_BASE + IDE_REG_STATUS);
    }


🧠 Learnings used
Learnt from: assembler-0
PR: assembler-0/VoidFrame#79
File: drivers/PCI/PCI.c:154-154
Timestamp: 2025-08-13T07:26:56.892Z
Learning: VoidFrame kernel uses direct interrupt dispatch in arch/x86_64/interrupts/Interrupts.c rather than RegisterInterruptHandler pattern. Hardware interrupts are handled in switch cases by interrupt number (32+ for IRQs), with proper PIC EOI handling required.

Learnt from: assembler-0
PR: assembler-0/VoidFrame#62
File: kernel/core/Panic.c:69-79
Timestamp: 2025-08-10T06:22:28.816Z
Learning: In VoidFrame kernel, panic functions must call "asm volatile("cli")" as the very first instruction to disable interrupts immediately. Without this, if interrupts are enabled (after "sti"), an interrupt can fire during panic handling causing the panic to fail, hang, or behave unpredictably. This must be done in ALL panic entry points: Panic(), PanicWithCode(), PanicWithContext(), PanicFromInterrupt(), and at the start of KernelPanicHandler().

Learnt from: assembler-0
PR: assembler-0/VoidFrame#62
File: kernel/core/Panic.c:69-79
Timestamp: 2025-08-10T06:22:28.816Z
Learning: In VoidFrame kernel, panic functions must disable interrupts with "cli" as the very first operation, before any other code executes. If interrupts are enabled (after "sti"), an interrupt can fire between the panic call and the cli instruction in KernelPanicHandler, causing the panic to fail or behave unpredictably.

Learnt from: assembler-0
PR: assembler-0/VoidFrame#95
File: arch/x86_64/interrupts/Interrupts.c:7-7
Timestamp: 2025-08-23T10:11:58.964Z
Learning: In VoidFrame kernel's MLFQ scheduler, user processes should be boosted to RT_PRIORITY_THRESHOLD (priority 3) during anti-starvation handling, not hardcoded priority 0. Priority 0-2 are reserved for system processes (RT queues), while priority 3-4 are for user processes (regular queues). The SmartAging function at line 645 incorrectly uses "uint32_t new_priority = 0;" which breaks privilege separation in the syscall-less kernel design.

Learnt from: assembler-0
PR: assembler-0/VoidFrame#82
File: kernel/process/Process.h:0-0
Timestamp: 2025-08-14T02:11:27.395Z
Learning: In VoidFrame kernel, FastMemcpy was using SSE2 movdqu instructions without preserving XMM registers, causing crashes when called from interrupt context (like the scheduler's FastSchedule function). The fix is to use only general-purpose registers with 64-bit unrolled copies to maintain performance while being interrupt-safe.

Learnt from: assembler-0
PR: assembler-0/VoidFrame#82
File: kernel/process/Process.h:0-0
Timestamp: 2025-08-14T02:11:58.963Z
Learning: In VoidFrame kernel's FastMemcpy implementation (kernel/memory/MemOps.c), memory corruption occurs under high load due to interrupts occurring between SSE2 movdqu load and store instructions, corrupting XMM register state. The fix requires disabling interrupts around SSE operations and adding memory barriers (sfence) to ensure stores complete atomically.

Learnt from: assembler-0
PR: assembler-0/VoidFrame#80
File: README.md:7-8
Timestamp: 2025-08-13T08:43:54.750Z
Learning: User assembler-0 is working on VoidFrame kernel with a tight deadline (2-3 days before summer ends) and specifically requested runtime safety checks similar to __stack_chk_fail but for deadlocks and other kernel issues. The kernel already has extensive safety mechanisms including stack canaries, guard pages, heap integrity checks, and process security validation.

Learnt from: assembler-0
PR: assembler-0/VoidFrame#70
File: kernel/process/Process.c:1224-1228
Timestamp: 2025-08-11T09:51:32.713Z
Learning: In the VoidFrame kernel (kernel/process/Process.c), avoid using floating-point arithmetic in kernel space. Use fixed-point arithmetic instead with integer scaling (e.g., multiplying by 1000 for 3 decimal places) to avoid FPU state management overhead and potential exceptions.

Learnt from: assembler-0
PR: assembler-0/VoidFrame#70
File: kernel/process/Process.c:1224-1228
Timestamp: 2025-08-11T09:51:51.217Z
Learning: In VoidFrame kernel, avoid using floating-point arithmetic in kernel space functions like SystemService. Use fixed-point arithmetic instead (e.g., 16.16 format with integer operations) to prevent FPU state corruption, improve performance, and ensure predictable behavior. This is especially important for system-level processes running with PROC_PRIV_SYSTEM privileges.

Learnt from: assembler-0
PR: assembler-0/VoidFrame#63
File: kernel/etc/Shell.c:24-28
Timestamp: 2025-08-10T08:52:13.313Z
Learning: The VoidFrame kernel has critical issues with PID allocation - it uses a simple incrementing counter without recycling freed PIDs, causing PID 4 to become PID 5 after death instead of reusing slot 4.

#include "Process.h"
#include "MemOps.h"

// The C-level interrupt handler, called from the assembly stub
void InterruptHandler(Registers* regs) {
Expand All @@ -15,50 +15,43 @@ void InterruptHandler(Registers* regs) {
switch (regs->interrupt_number) {
case 32: // Timer interrupt (IRQ 0)
FastSchedule(regs);
outb(0x20, 0x20); // EOI to master PIC
PICSendEOI(regs->interrupt_number);
return;

case 33: // Keyboard interrupt (IRQ 1)
KeyboardHandler();
outb(0x20, 0x20); // EOI to master PIC
PICSendEOI(regs->interrupt_number);
return;

case 34:
PICSendEOI(regs->interrupt_number);
return;

case 44: // mouse (IRQ 12)
MouseHandler();
outb(0xA0, 0x20); // EOI to slave PIC
outb(0x20, 0x20); // EOI to master PIC
PICSendEOI(regs->interrupt_number);
return;

case 46: // IDE Primary (IRQ 14)
IDEPrimaryIRQH();
outb(0xA0, 0x20); // EOI to slave PIC
outb(0x20, 0x20); // EOI to master PIC
PICSendEOI(regs->interrupt_number);
return;

case 47: // IDE Secondary (IRQ 15)
IDESecondaryIRQH();
outb(0xA0, 0x20); // EOI to slave PIC
outb(0x20, 0x20); // EOI to master PIC
PICSendEOI(regs->interrupt_number);
return;

// Handle other hardware interrupts (34-45)
case 34 ... 43: case 45: // passthrough
case 35 ... 43: case 45: // passthrough
PrintKernelWarning("[IRQ] Unhandled hardware interrupt: ");
PrintKernelInt(regs->interrupt_number - 32);
PrintKernelWarning("\n");

// Send EOI to the appropriate PIC
if (regs->interrupt_number >= 40) {
outb(0xA0, 0x20); // EOI to slave PIC
}
outb(0x20, 0x20); // EOI to master PIC
PICSendEOI(regs->interrupt_number);
return;
}

// Handle CPU exceptions (0-31)

// Buffer for creating descriptive panic messages.
// Made static to reside in .bss rather than on a potentially corrupt stack.
static char panic_message[256];

switch (regs->interrupt_number) {
Expand All @@ -72,6 +65,15 @@ void InterruptHandler(Registers* regs) {
break;
}

case 8:
{
char rip_str[20];
htoa(regs->rip, rip_str);
strcpy(panic_message, "Double Fault at ");
strcat(panic_message, rip_str);
PanicFromInterrupt(panic_message, regs);
}

Comment on lines +68 to +76
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Missing break after Double Fault handler; avoid accidental fall-through

If PanicFromInterrupt ever returns (e.g., in debug builds), execution will fall through to case 13. Add a break for consistency with other cases (6, 13, 14, default).

         case 8:
         {
             char rip_str[20];
             htoa(regs->rip, rip_str);
             strcpy(panic_message, "Double Fault at ");
             strcat(panic_message, rip_str);
             PanicFromInterrupt(panic_message, regs);
+            break;
         }
📝 Committable suggestion

‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.

Suggested change
case 8:
{
char rip_str[20];
htoa(regs->rip, rip_str);
strcpy(panic_message, "Double Fault at ");
strcat(panic_message, rip_str);
PanicFromInterrupt(panic_message, regs);
}
case 8:
{
char rip_str[20];
htoa(regs->rip, rip_str);
strcpy(panic_message, "Double Fault at ");
strcat(panic_message, rip_str);
PanicFromInterrupt(panic_message, regs);
break;
}
🤖 Prompt for AI Agents
In arch/x86_64/interrupts/Interrupts.c around lines 68 to 76, the case 8 (Double
Fault) handler calls PanicFromInterrupt but lacks a break, risking fall-through
to case 13 if PanicFromInterrupt returns; add a break statement after the
PanicFromInterrupt(...) call (matching the pattern used in cases 6, 13, 14, and
default) so execution does not accidentally continue into the next case.

case 13: // General Protection Fault
{
char ec_str[20];
Expand Down
22 changes: 1 addition & 21 deletions drivers/APIC.c
Original file line number Diff line number Diff line change
@@ -1,6 +1,6 @@
#include "APIC.h"
#include "Console.h"

#include "Io.h"
// APIC base address (will be detected from MSR)
static volatile uint32_t* apic_base = NULL;
static volatile uint32_t* ioapic_base = NULL;
Expand All @@ -10,26 +10,6 @@ static uint32_t ioapic_max_redirections = 0;
// IRQ masking state (same as PIC for compatibility)
static uint32_t irq_mask = 0xFFFFFFFF; // All masked initially

// CPUID detection
static inline void cpuid(uint32_t leaf, uint32_t* eax, uint32_t* ebx, uint32_t* ecx, uint32_t* edx) {
__asm__ volatile("cpuid"
: "=a"(*eax), "=b"(*ebx), "=c"(*ecx), "=d"(*edx)
: "a"(leaf));
}

// MSR access
static inline uint64_t rdmsr(uint32_t msr) {
uint32_t low, high;
__asm__ volatile("rdmsr" : "=a"(low), "=d"(high) : "c"(msr));
return ((uint64_t)high << 32) | low;
}

static inline void wrmsr(uint32_t msr, uint64_t value) {
uint32_t low = value & 0xFFFFFFFF;
uint32_t high = value >> 32;
__asm__ volatile("wrmsr" :: "a"(low), "d"(high), "c"(msr));
}

int ApicDetect(void) {
uint32_t eax, ebx, ecx, edx;

Expand Down
13 changes: 8 additions & 5 deletions drivers/InterruptController.c
Original file line number Diff line number Diff line change
@@ -1,7 +1,8 @@
#include "InterruptController.h"
#include "APIC.h"
#include "Pic.h"
#include "Console.h"
#include "Io.h"
#include "Pic.h"

static interrupt_controller_t current_controller = INTC_PIC;
static int apic_available = 0;
Expand All @@ -13,6 +14,7 @@ void InterruptControllerInstall(void) {
if (ApicDetect()) {
apic_available = 1;
ApicInstall();
PICMaskAll();
current_controller = INTC_APIC;
PrintKernelSuccess("IC: Using APIC interrupt controller\n");
} else {
Expand Down Expand Up @@ -47,15 +49,16 @@ void IC_disable_irq(uint8_t irq_line) {
}
}

void InterruptControllerSendEOI(void) {
void InterruptControllerSendEOI(uint64_t interrupt_number) {
switch (current_controller) {
case INTC_APIC:
ApicSendEOI();
break;
case INTC_PIC:
default:
// PIC EOI is handled directly in interrupt handler
// via outb(0x20, 0x20) / outb(0xA0, 0x20)
if (interrupt_number >= 40) {
outb(0xA0, 0x20); // EOI to slave PIC
}
outb(0x20, 0x20); // EOI to master PIC
break;
}
}
Expand Down
2 changes: 1 addition & 1 deletion drivers/InterruptController.h
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ void InterruptControllerEnable(void);
void InterruptControllerDisable(void);
void IC_enable_irq(uint8_t irq_line);
void IC_disable_irq(uint8_t irq_line);
void InterruptControllerSendEOI(void);
void InterruptControllerSendEOI(uint64_t interrupt_number);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Verification agent

🧩 Analysis chain

API change to InterruptControllerSendEOI(vec) is fine; ensure callsites updated and centralize PIC EOI.

  • Signature change is consistent with PIC-EOI flow. Please verify all callsites now pass the vector.
  • To avoid duplication and keep a single point of truth, route PIC EOI via PICSendEOI() instead of direct outb in the PIC branch of InterruptController.c (the current implementation still does raw outb per snippet).

Run:


🏁 Script executed:

#!/bin/bash
# 1) Ensure all callsites provide an argument now
rg -nP --type=c -C1 '\bInterruptControllerSendEOI\s*\('

# 2) Find lingering direct PIC EOI writes (should be replaced with PICSendEOI)
rg -nP --type=c -C2 'outb\s*\(\s*(0x20|0xA0)\s*,\s*0x20\s*\)'

# 3) Prefer using PICSendEOI in PIC path
rg -nP --type=c 'PICSendEOI\s*\('

Length of output: 1782


Centralize EOI logic through the InterruptControllerSendEOI API

To fully leverage the new InterruptControllerSendEOI(uint64_t interrupt_number) abstraction and eliminate duplicated raw PIC EOI writes, please:

• In drivers/InterruptController.c
– Replace the direct outb(…) calls in the INTC_PIC branch with a single call to PICSendEOI(interrupt_number).
– This ensures PIC-specific behavior remains in one place and is testable via the PICSendEOI driver.

• In arch/x86_64/interrupts/Interrupts.c (and any other arch-specific handlers)
– Swap out all calls to PICSendEOI(regs->interrupt_number) in favor of the higher-level InterruptControllerSendEOI(regs->interrupt_number).
– This funnels all EOI traffic—PIC, APIC, or future controllers—through a single entry point.

• Verify that there are no remaining callsites of InterruptControllerSendEOI( without an argument (and vice versa), and that the compiler no longer issues warnings about unused functions or missing definitions.

Suggested diffs:

--- drivers/InterruptController.c
+++ drivers/InterruptController.c
@@ -57,8 +57,7 @@ void InterruptControllerSendEOI(uint64_t interrupt_number) {
     case INTC_PIC:
-        if (interrupt_number >= 40) {
-            outb(0xA0, 0x20); // EOI to slave PIC
-        }
-        outb(0x20, 0x20); // EOI to master PIC
+        PICSendEOI(interrupt_number);
         break;
--- arch/x86_64/interrupts/Interrupts.c
+++ arch/x86_64/interrupts/Interrupts.c
@@ -18,7 +18,7 @@ void handle_interrupt(struct regs *regs) {
-    PICSendEOI(regs->interrupt_number);
+    InterruptControllerSendEOI(regs->interrupt_number);
🤖 Prompt for AI Agents
In drivers/InterruptController.h around line 18, the project should centralize
EOI handling via InterruptControllerSendEOI(uint64_t interrupt_number); update
drivers/InterruptController.c: in the INTC_PIC branch replace raw outb(...)
sequences with a single call PICSendEOI(interrupt_number) routed through the
public InterruptControllerSendEOI API; update
arch/x86_64/interrupts/Interrupts.c (and any arch-specific handlers) to call
InterruptControllerSendEOI(regs->interrupt_number) instead of
PICSendEOI(regs->interrupt_number); finally scan for and fix any callsites that
pass no argument or call the wrong function signature and rebuild to ensure no
unused/missing definition warnings remain.

void InterruptControllerSetTimer(uint32_t frequency_hz);

// Query functions
Expand Down
10 changes: 10 additions & 0 deletions drivers/Pic.c
Original file line number Diff line number Diff line change
Expand Up @@ -59,6 +59,16 @@ void PIC_disable_irq(uint8_t irq_line) {
pic_write_mask();
}

void PICMaskAll() {
s_irq_mask = 0xFFFF;
pic_write_mask();
}

void PICSendEOI(uint64_t interrupt_number) {
if (interrupt_number >= 40) outb(PIC2_COMMAND, 0x20);
outb(PIC1_COMMAND, 0x20);
}

void PicInstall() {
// Standard initialization sequence
outb(PIC1_COMMAND, ICW1_INIT | ICW1_ICW4);
Expand Down
2 changes: 2 additions & 0 deletions drivers/Pic.h
Original file line number Diff line number Diff line change
Expand Up @@ -7,5 +7,7 @@ void PitInstall();
void PitSetFrequency(uint16_t hz);
void PIC_enable_irq(uint8_t irq_line);
void PIC_disable_irq(uint8_t irq_line);
void PICMaskAll();
void PICSendEOI(uint64_t interrupt_number);

#endif
20 changes: 20 additions & 0 deletions include/Io.h
Original file line number Diff line number Diff line change
Expand Up @@ -53,5 +53,25 @@ static inline void sti(void) {
asm volatile("sti");
}

// CPUID detection
static inline void __attribute__((always_inline)) cpuid(uint32_t leaf, uint32_t* eax, uint32_t* ebx, uint32_t* ecx, uint32_t* edx) {
__asm__ volatile("cpuid"
: "=a"(*eax), "=b"(*ebx), "=c"(*ecx), "=d"(*edx)
: "a"(leaf));
}

// MSR access
static inline __attribute__((always_inline)) uint64_t rdmsr(uint32_t msr) {
uint32_t low, high;
__asm__ volatile("rdmsr" : "=a"(low), "=d"(high) : "c"(msr));
return ((uint64_t)high << 32) | low;
}

static inline __attribute__((always_inline)) void wrmsr(uint32_t msr, uint64_t value) {
uint32_t low = value & 0xFFFFFFFF;
uint32_t high = value >> 32;
__asm__ volatile("wrmsr" :: "a"(low), "d"(high), "c"(msr));
}

#endif

6 changes: 4 additions & 2 deletions kernel/core/Kernel.c
Original file line number Diff line number Diff line change
Expand Up @@ -7,6 +7,7 @@
#include "ISA.h"
#include "Ide.h"
#include "Idt.h"
#include "Io.h"
#include "KernelHeap.h"
#include "LPT/LPT.h"
#include "MemOps.h"
Expand Down Expand Up @@ -640,15 +641,16 @@ static InitResultT PXS2(void) {
LPT_Init();
PrintKernelSuccess("System: LPT Driver initialized\n");

// NEW: Final memory health check
PrintKernel("Info: Final memory health check...\n");
GetDetailedMemoryStats(&stats);
if (stats.fragmentation_score > 50) {
PrintKernelWarning("[WARN] High memory fragmentation detected\n");
}

// Unmask IRQs
IRQUnmaskCoreSystems();

// Memory protection
StackGuardInit();
SetupMemoryProtection();

Expand Down Expand Up @@ -685,7 +687,7 @@ void KernelMainHigherHalf(void) {
PrintKernelSuccess("System: Kernel initialization complete\n");
PrintKernelSuccess("System: Initializing interrupts...\n");

asm volatile("sti");
sti();

while (1) { // redundant but added for worst case scenario, should not reach here
Yield();
Expand Down
4 changes: 2 additions & 2 deletions kernel/core/Multiboot2.h
Original file line number Diff line number Diff line change
@@ -1,8 +1,8 @@
#ifndef MB2_H
#define MB2_H
#include "stdint.h"
#define MULTIBOOT2_MAGIC_HEADER 0xE85250D6 // in your boot.asm
#define MULTIBOOT2_BOOTLOADER_MAGIC 0x36D76289 // passed to your kernel
#define MULTIBOOT2_MAGIC_HEADER 0xE85250D6
#define MULTIBOOT2_BOOTLOADER_MAGIC 0x36D76289
#define MULTIBOOT2_TAG_TYPE_END 0
#define MULTIBOOT2_TAG_TYPE_CMDLINE 1
#define MULTIBOOT2_TAG_TYPE_BOOTLOADER_NAME 2
Expand Down
12 changes: 6 additions & 6 deletions kernel/core/Panic.c
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
#include "Panic.h"

#include "Io.h"
// The ONLY necessary includes for display are now Console and Serial
#include "Console.h"
#include "KernelHeap.h"
Expand Down Expand Up @@ -61,7 +61,7 @@ static void U32ToDecStr(uint32_t value, char* buffer) {
// --- High-Level Panic Screen Composition ---

void __attribute__((noreturn)) KernelPanicHandler(const char* message, uint64_t error_code, PanicContext* ctx) {
asm volatile("cli");
cli();
ClearScreen();
ConsoleSetColor(VGA_COLOR_WHITE);

Comment on lines 65 to 67
Copy link

@coderabbitai coderabbitai bot Aug 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ Potential issue

Critical: mixing VBE graphics path with Console text output causes framebuffer corruption during panic.

In the VBE branch, after VBEShowPanic(), the code continues to use PrintKernel/PrintKernelAt and other console-based dumps. Per prior guidance in this codebase, the panic path must choose either pure VBE graphics or pure text console, not both.

Apply this minimal, safe diff to avoid mixing:

  • Move ClearScreen/ConsoleSetColor into the text-mode path only.
  • In graphics mode, log to serial and immediately jump to the hang loop.
@@
-    cli();
-    ClearScreen();
-    ConsoleSetColor(VGA_COLOR_WHITE);
+    cli();
@@
-    if (use_vbe_graphics) {
-        // Pure VBE graphics path - show panic image ONLY
-        VBEShowPanic();
-
-        delay(600000000); // to give "real" look
-        PrintKernel("[FATAL] - [KERNEL PANIC] - ");
-        if (message) PrintKernel(message);
-        else PrintKernel("No message provided.");
-        PrintKernel("\n");
-        if (ctx) {
-            char hex[20];
-            char dec_buffer[12];
-            char temp_buffer[128];
-            temp_buffer[0] = '\0';
-            PrintKernel("[CPU CONTEXT]\n");
-            PrintKernel("----------------------\n");
-            U64ToHexStr(ctx->rip, hex);
-            temp_buffer[0] = '\0'; strcat(temp_buffer, "RIP:  "); strcat(temp_buffer, hex);
-            PrintKernel(temp_buffer);
-            PrintKernel("\n");
-            U64ToHexStr(ctx->rsp, hex);
-            temp_buffer[0] = '\0'; strcat(temp_buffer, "RSP:  "); strcat(temp_buffer, hex);
-            PrintKernel(temp_buffer);
-            PrintKernel("\n");
-            U64ToHexStr(ctx->rbp, hex);
-            temp_buffer[0] = '\0'; strcat(temp_buffer, "RBP:  "); strcat(temp_buffer, hex);
-            PrintKernel(temp_buffer);
-            PrintKernel("\n");
-            U64ToHexStr(error_code, hex);
-            temp_buffer[0] = '\0'; strcat(temp_buffer, "CODE: "); strcat(temp_buffer, hex);
-            PrintKernel(temp_buffer);
-            PrintKernel("\n");
-            // --- Source Location ---
-            PrintKernel("[SOURCE LOCATION]\n");
-            PrintKernel("----------------------\n");
-            if (ctx->file && ctx->file[0] != '\0') {
-                temp_buffer[0] = '\0'; strcat(temp_buffer, "File: "); strcat(temp_buffer, ctx->file);
-                PrintKernel(temp_buffer);
-                PrintKernel("\n");
-                temp_buffer[0] = '\0'; strcat(temp_buffer, "Func: "); strcat(temp_buffer, ctx->function);
-                PrintKernel(temp_buffer);
-                PrintKernel("\n");
-                U32ToDecStr(ctx->line, dec_buffer);
-                temp_buffer[0] = '\0'; strcat(temp_buffer, "Line: "); strcat(temp_buffer, dec_buffer);
-                PrintKernel(temp_buffer);
-                PrintKernel("\n");
-            } else {
-                PrintKernel("Unavailable\n");
-            }
-            PrintKernel("[SYSTEM INFORMATION]\n");
-            PrintKernel("----------------------\n");
-            DumpSchedulerState();
-            DumpPerformanceStats();
-            MemoryStats stats;
-            GetDetailedMemoryStats(&stats);
-            PrintKernel("  Physical: ");
-            PrintKernelInt(stats.free_physical_bytes / (1024*1024));
-            PrintKernel("MB free, ");
-            PrintKernelInt(stats.fragmentation_score);
-            PrintKernel("% fragmented\n");
-            PrintVMemStats();
-            PrintHeapStats();
-            PrintKernelAt("SYSTEM HALTED.", 50, 45);
-            ConsoleSetColor(VGA_COLOR_DEFAULT);
-        }
-    } else {
+    if (use_vbe_graphics) {
+        // Pure VBE graphics path - image only, no console text
+        VBEShowPanic();
+        // Log to serial so the info is still captured
+        SerialWrite("[FATAL] - [KERNEL PANIC] -- ");
+        if (message) SerialWrite(message);
+        SerialWrite("\n");
+        goto hang;
+    } else {
+        ClearScreen();
+        ConsoleSetColor(VGA_COLOR_WHITE);

If you prefer, we can also strip the now-unreachable text block instead of relying on goto.

Do you want me to follow up with a cleanup patch that removes the now-dead code in the VBE branch?

Also applies to: 77-151

🤖 Prompt for AI Agents
In kernel/core/Panic.c around lines 65-67 (and likewise for the VBE-related
block spanning ~77-151), the panic path is mixing VBE graphics output with
console text calls (ClearScreen/ConsoleSetColor/PrintKernel*), which corrupts
the framebuffer; move ClearScreen() and ConsoleSetColor(VGA_COLOR_WHITE) into
the text-mode branch only, and in the VBE graphics branch ensure code calls
VBEShowPanic(), emits any needed serial log, and then immediately jumps to the
hang loop (do not fall through to any console-based dumps); optionally remove
the now-unreachable text-only block in the VBE branch to clean up dead code.

Copy link
Owner Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well theres a reason for that, i want it to overlap, so the stack trace can be viewed

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@assembler-0, I understand now - you intentionally want the stack trace and debugging information to be displayed over the VBE panic image so that critical diagnostic information remains visible even in graphics mode. This is a reasonable design choice for a panic handler where debugging information takes priority over visual presentation.

I'll note this for future reference so I don't flag this pattern again in your panic handling code.


✏️ Learnings added
Learnt from: assembler-0
PR: assembler-0/VoidFrame#95
File: kernel/core/Panic.c:65-67
Timestamp: 2025-08-23T10:06:02.975Z
Learning: In VoidFrame kernel's panic handler, the VBE graphics path intentionally mixes VBEShowPanic() with console text operations (PrintKernel*, PrintKernelAt) to overlay stack traces and debugging information on top of the panic image. This overlap is deliberate to ensure critical diagnostic information remains visible during kernel panics, prioritizing debugging capability over visual presentation.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.


🧠 Learnings used
Learnt from: assembler-0
PR: assembler-0/VoidFrame#62
File: kernel/core/Panic.c:69-79
Timestamp: 2025-08-10T06:22:28.816Z
Learning: In VoidFrame kernel, panic functions must call "asm volatile("cli")" as the very first instruction to disable interrupts immediately. Without this, if interrupts are enabled (after "sti"), an interrupt can fire during panic handling causing the panic to fail, hang, or behave unpredictably. This must be done in ALL panic entry points: Panic(), PanicWithCode(), PanicWithContext(), PanicFromInterrupt(), and at the start of KernelPanicHandler().

Learnt from: assembler-0
PR: assembler-0/VoidFrame#62
File: kernel/core/Panic.c:69-79
Timestamp: 2025-08-10T06:22:28.816Z
Learning: In VoidFrame kernel, panic functions must disable interrupts with "cli" as the very first operation, before any other code executes. If interrupts are enabled (after "sti"), an interrupt can fire between the panic call and the cli instruction in KernelPanicHandler, causing the panic to fail or behave unpredictably.

Learnt from: assembler-0
PR: assembler-0/VoidFrame#62
File: kernel/core/Panic.c:69-79
Timestamp: 2025-08-10T06:03:22.076Z
Learning: In VoidFrame kernel, after VBEShowPanic() draws a panic image to the framebuffer, continuing to use text console operations like PrintKernelAt() causes display corruption because it tries to render text on top of the raw pixel data. The panic handler must choose either pure VBE graphics mode (show image only) or pure text console mode, but not mix both.

Learnt from: assembler-0
PR: assembler-0/VoidFrame#62
File: kernel/core/Panic.c:69-79
Timestamp: 2025-08-10T06:03:22.076Z
Learning: In VoidFrame kernel, mixing VBE graphics mode operations (like VBEShowPanic drawing to framebuffer) with text console operations causes display corruption with random pixels. The panic handler must choose either pure VBE graphics mode or pure text mode, not both simultaneously.

Learnt from: assembler-0
PR: assembler-0/VoidFrame#62
File: kernel/core/Panic.c:69-79
Timestamp: 2025-08-10T06:03:22.076Z
Learning: In VoidFrame kernel's VBEConsoleClear() function, the color is hardcoded to 0x08 (gray on black) instead of using the current console color state, causing color washing issues. The function should use console.color or the current color state to maintain color consistency across screen clears.

Expand Down Expand Up @@ -253,7 +253,7 @@ static inline uint64_t __get_rip(void) {
}

void __attribute__((noreturn)) PanicFromInterrupt(const char* message, Registers* regs) {
asm volatile("cli");
cli();
PanicContext ctx = {0};
ctx.rip = regs->rip;
ctx.rsp = regs->rsp;
Expand All @@ -266,7 +266,7 @@ void __attribute__((noreturn)) PanicFromInterrupt(const char* message, Registers
}

void __attribute__((noreturn)) Panic(const char* message) {
asm volatile("cli");
cli();
PanicContext ctx = {0};
ctx.rip = __get_rip();
asm volatile("movq %%rsp, %0" : "=r"(ctx.rsp));
Expand All @@ -275,7 +275,7 @@ void __attribute__((noreturn)) Panic(const char* message) {
}

void __attribute__((noreturn)) PanicWithCode(const char* message, uint64_t error_code) {
asm volatile("cli");
cli();
PanicContext ctx = {0};
ctx.rip = __get_rip();
asm volatile("movq %%rsp, %0" : "=r"(ctx.rsp));
Expand All @@ -284,7 +284,7 @@ void __attribute__((noreturn)) PanicWithCode(const char* message, uint64_t error
}

void __attribute__((noreturn)) PanicWithContext(const char* message, uint64_t error_code, const char* function, const char* file, int line) {
asm volatile("cli");
cli();
PanicContext ctx = {0};
ctx.rip = __get_rip();
asm volatile("movq %%rsp, %0" : "=r"(ctx.rsp));
Expand Down
Loading