Development #90
Conversation
Walkthrough
The PR updates the roadmap document, adjusts comments in two source files, and significantly rewrites kernel/memory/MemOps.c to add CPU feature-aware, alignment-optimized implementations for memset, memcpy, memcmp, zero-page, and string helpers, including a signature change for memset to use a uint64_t size.
Changes
Sequence Diagram(s)
sequenceDiagram
participant Caller
participant MemOps as memset()
participant CPU as CPU Feature Check
participant AVX2 as AVX2 Path
participant SSE2 as SSE2 Path
participant Fallback as Fallback/Scalar
Caller->>MemOps: memset(dest, val, size)
MemOps->>CPU: Detect AVX2/SSE2
alt AVX2 available and size ≥ 32
MemOps->>AVX2: Broadcast val, 256-bit stores
AVX2-->>MemOps: vzeroupper
else SSE2 available and size ≥ 16
MemOps->>SSE2: Broadcast val, 128-bit stores
else
MemOps->>Fallback: 64-bit/aligned then bytes
end
MemOps-->>Caller: dest
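To make the dispatch order in the diagram concrete, here is a hypothetical C skeleton of that size/feature selection (the feature flags are passed in as plain booleans for illustration; the vector bodies are elided, and the real FastMemset in kernel/memory/MemOps.c is the authoritative version):

#include <stdbool.h>
#include <stdint.h>

/* Sketch of the memset dispatch shown above: AVX2 for size >= 32,
 * SSE2 for size >= 16, otherwise a scalar 64-bit/byte fallback. */
static void* MemsetDispatchSketch(void* dest, int value, uint64_t size,
                                  bool have_avx2, bool have_sse2) {
    uint8_t* d = (uint8_t*)dest;
    if (have_avx2 && size >= 32) {
        /* broadcast value into YMM0, stream 32-byte stores, finish with vzeroupper */
    } else if (have_sse2 && size >= 16) {
        /* broadcast value into XMM0, stream 16-byte stores */
    } else {
        /* scalar fallback: 8-byte stores while the pointer is 8-byte aligned */
        uint64_t val64 = 0x0101010101010101ULL * (uint8_t)value;
        while (size >= 8 && ((uintptr_t)d & 7) == 0) {
            *(uint64_t*)d = val64;
            d += 8;
            size -= 8;
        }
    }
    /* byte tail (and the whole job here, since the vector bodies are elided) */
    while (size--) *d++ = (uint8_t)value;
    return dest;
}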
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~55 minutes
Possibly related PRs
Actionable comments posted: 2
🧹 Nitpick comments (5)
drivers/LPT/LPT.c (1)
57-64
: Fix strobe bit index inconsistency in comments
LPT_CONTROL_STROBE is 0x01 (bit 0); Line 63 correctly says “bit 0”, but Line 57 says “bit 1”. Align the comments for clarity.
- // Assert strobe (bit 1 -> pin LOW due to inversion)
+ // Assert strobe (bit 0 -> pin LOW due to inversion)
...
- // Release strobe (bit 0 -> pin HIGH)
+ // Release strobe (bit 0 -> pin HIGH)
docs/ROADMAP.md (1)
122-132
: Tighten wording and fix minor grammar/style; resolve duplication with earlier sections
- “setup” → “set up”
- Prefer sentence-style capitalization: “Better fault handling”
- “Framebuffer graphics support” is already marked done in “Extra Spice”. Consider reframing this final-goal item to avoid duplication.
- “Virtual files” is ambiguous; “Virtual filesystems” is clearer.
Proposed edits:
- [ ] Usable VMM & PMM
- [ ] Userspace C runtime (libc-lite)
- [ ] XFS support
- [ ] Install and setup to disk
- [ ] Framebuffer graphics support
- [ ] Virtual files
- [ ] HPET/APIC support
- [ ] ACPI support
- [ ] Modern drivers
- [ ] Networking
- [ ] Better Fault handling
+ [ ] Usable VMM & PMM
+ [ ] Userspace C runtime (libc-lite)
+ [ ] XFS support
+ [ ] Install and set up to disk
+ [ ] Advanced framebuffer features (modesetting, double-buffering)
+ [ ] Virtual filesystems
+ [ ] HPET/APIC support
+ [ ] ACPI support
+ [ ] Modern drivers
+ [ ] Networking
+ [ ] Better fault handling
Follow-up:
- If “Advanced framebuffer features” isn’t intended, alternatively remove that line to avoid duplicating the already-completed checklist item above. Do you want me to adjust the wording based on your intended scope?
kernel/memory/MemOps.c (3)
14-21
: Use branchless zero-byte detection to reduce branches in strcpy fast path
The eight per-byte checks introduce branchy code. A standard branchless test is faster and simpler.
- if ((val & 0xFF00000000000000ULL) == 0 ||
-     (val & 0x00FF000000000000ULL) == 0 ||
-     (val & 0x0000FF0000000000ULL) == 0 ||
-     (val & 0x000000FF00000000ULL) == 0 ||
-     (val & 0x00000000FF000000ULL) == 0 ||
-     (val & 0x0000000000FF0000ULL) == 0 ||
-     (val & 0x000000000000FF00ULL) == 0 ||
-     (val & 0x00000000000000FFULL) == 0) {
+ // HasZeroByte trick:
+ // ((x - 0x0101...) & ~x & 0x8080...) != 0 => some byte is zero
+ if (((val - 0x0101010101010101ULL) & ~val & 0x8080808080808080ULL) != 0) {
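For reference, the trick the suggestion relies on can be isolated into a tiny self-contained helper (an illustration, not code from the PR):

#include <stdbool.h>
#include <stdint.h>

/* True if any byte of x is 0x00. Byte-wise, (b - 1) has its high bit set when
 * b is 0x00 or >= 0x81; ANDing with ~x keeps only the b < 0x80 cases, so a
 * surviving 0x80 bit means that byte was zero. A borrow from a lower byte can
 * only occur when that lower byte was itself zero, so the answer to
 * "any zero byte?" stays correct. */
static inline bool HasZeroByte(uint64_t x) {
    return ((x - 0x0101010101010101ULL) & ~x & 0x8080808080808080ULL) != 0;
}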
147-152
: Align then 8-byte fill in memset small/medium path
Currently 8-byte stores only happen when d is already 8-byte aligned. You can cheaply align by byte stores, then stream 8-byte stores for the bulk.
- while (size >= 8 && ((uintptr_t)d & 7) == 0) {
-     *(uint64_t*)d = val64;
-     d += 8;
-     size -= 8;
- }
+ while (((uintptr_t)d & 7) != 0 && size > 0) {
+     *d++ = val;
+     size--;
+ }
+ while (size >= 8) {
+     *(uint64_t*)d = val64;
+     d += 8;
+     size -= 8;
+ }
221-233
: Nice 64-byte unrolling; consider handling unaligned src too
If src isn’t 8-byte aligned, you skip the 64-bit bulk path entirely. x86-64 tolerates unaligned 64-bit loads; you could still use 64-bit copies for most payloads with minimal penalty.
If you want, I can draft a variant that uses unaligned 64-bit loads when d is aligned, falling back to 4B/1B only for the tail.
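A possible shape for that variant (a sketch under the reviewer’s assumption that unaligned 64-bit loads are acceptable on x86-64; __builtin_memcpy of a constant 8 or 4 bytes compiles to a single mov on GCC/Clang, and the function name is hypothetical):

#include <stdint.h>

/* Bulk-copy with 64-bit loads even when src is not 8-byte aligned; only the
 * tail drops to 4-byte/1-byte copies. Not the code currently in MemOps.c. */
static void* MemcpyUnalignedSrc(void* dest, const void* src, uint64_t size) {
    uint8_t* d = (uint8_t*)dest;
    const uint8_t* s = (const uint8_t*)src;

    /* Align the destination first so the wide stores are aligned. */
    while (size && ((uintptr_t)d & 7)) { *d++ = *s++; size--; }

    while (size >= 8) {                    /* unaligned load, aligned store */
        uint64_t w;
        __builtin_memcpy(&w, s, 8);
        *(uint64_t*)d = w;
        d += 8; s += 8; size -= 8;
    }
    if (size >= 4) {
        uint32_t w;
        __builtin_memcpy(&w, s, 4);
        *(uint32_t*)d = w;
        d += 4; s += 4; size -= 4;
    }
    while (size--) *d++ = *s++;
    return dest;
}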
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
💡 Knowledge Base configuration:
- MCP integration is disabled by default for public repositories
- Jira integration is disabled by default for public repositories
- Linear integration is disabled by default for public repositories
You can enable these sources in your CodeRabbit configuration.
📒 Files selected for processing (4)
- docs/ROADMAP.md (1 hunks)
- drivers/LPT/LPT.c (1 hunks)
- kernel/core/Kernel.c (0 hunks)
- kernel/memory/MemOps.c (6 hunks)
💤 Files with no reviewable changes (1)
- kernel/core/Kernel.c
🧰 Additional context used
🧠 Learnings (2)
📚 Learning: 2025-08-14T02:11:27.395Z
Learnt from: assembler-0
PR: assembler-0/VoidFrame#82
File: kernel/process/Process.h:0-0
Timestamp: 2025-08-14T02:11:27.395Z
Learning: In VoidFrame kernel, FastMemcpy was using SSE2 movdqu instructions without preserving XMM registers, causing crashes when called from interrupt context (like the scheduler's FastSchedule function). The fix is to use only general-purpose registers with 64-bit unrolled copies to maintain performance while being interrupt-safe.
Applied to files:
kernel/memory/MemOps.c
📚 Learning: 2025-08-14T02:11:58.963Z
Learnt from: assembler-0
PR: assembler-0/VoidFrame#82
File: kernel/process/Process.h:0-0
Timestamp: 2025-08-14T02:11:58.963Z
Learning: In VoidFrame kernel's FastMemcpy implementation (kernel/memory/MemOps.c), memory corruption occurs under high load due to interrupts occurring between SSE2 movdqu load and store instructions, corrupting XMM register state. The fix requires disabling interrupts around SSE operations and adding memory barriers (sfence) to ensure stores complete atomically.
Applied to files:
kernel/memory/MemOps.c
🧬 Code Graph Analysis (1)
kernel/memory/MemOps.c (1)
arch/x86_64/cpu/Cpu.c (1)
GetCpuFeatures
(40-42)
🪛 LanguageTool
docs/ROADMAP.md
[grammar] ~122-~122: There might be a mistake here.
Context: ...GRUB2 EFI support - [ ] Usable VMM & PMM - [ ] Userspace C runtime (libc-lite) - [ ...
(QB_NEW_EN)
[grammar] ~123-~123: There might be a mistake here.
Context: ...MM - [ ] Userspace C runtime (libc-lite) - [ ] XFS support - [ ] Install and setup ...
(QB_NEW_EN)
[grammar] ~124-~124: There might be a mistake here.
Context: ... C runtime (libc-lite) - [ ] XFS support - [ ] Install and setup to disk - [ ] Fram...
(QB_NEW_EN)
[grammar] ~125-~125: There might be a mistake here.
Context: ... support - [ ] Install and setup to disk - [ ] Framebuffer graphics support - [ ] V...
(QB_NEW_EN)
[grammar] ~126-~126: There might be a mistake here.
Context: ... disk - [ ] Framebuffer graphics support - [ ] Virtual files - [ ] HPET/APIC suppor...
(QB_NEW_EN)
[grammar] ~127-~127: There might be a mistake here.
Context: ...fer graphics support - [ ] Virtual files - [ ] HPET/APIC support - [ ] ACPI support...
(QB_NEW_EN)
[grammar] ~128-~128: There might be a mistake here.
Context: ... ] Virtual files - [ ] HPET/APIC support - [ ] ACPI support - [ ] Modern drivers - ...
(QB_NEW_EN)
[grammar] ~129-~129: There might be a mistake here.
Context: ...[ ] HPET/APIC support - [ ] ACPI support - [ ] Modern drivers - [ ] Networking - [ ...
(QB_NEW_EN)
[grammar] ~130-~130: There might be a mistake here.
Context: ... - [ ] ACPI support - [ ] Modern drivers - [ ] Networking - [ ] Better Fault handli...
(QB_NEW_EN)
[grammar] ~131-~131: There might be a mistake here.
Context: ...rt - [ ] Modern drivers - [ ] Networking - [ ] Better Fault handling - [ ] At least...
(QB_NEW_EN)
[grammar] ~132-~132: There might be a mistake here.
Context: ...] Networking - [ ] Better Fault handling - [ ] At least one demo program
(QB_NEW_EN)
🔇 Additional comments (6)
drivers/LPT/LPT.c (1)
60-61
: LGTM on comment tweak
The added space in the unit comment improves readability. No functional changes.
kernel/memory/MemOps.c (5)
6-34
: Sanity check: word-wise reads can cross page boundaries
The 64-bit read path can read up to 7 bytes past the first NUL within the current 8-byte word. If a string terminates at a page end and the next page is unmapped, the word read can fault. If this function is ever used on memory that may be adjacent to guard pages, consider guarding against cross-page reads or limiting the fast path near page boundaries.
Do we guarantee kernel strings come from fully-mapped regions (including the trailing bytes within the 8-byte word)? If not, I can propose a guarded path that falls back to byte-wise copy when near a page boundary.
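If that guarantee cannot be made, one option is to gate the word-wise reads on staying inside the current page; a minimal sketch, assuming 4 KiB pages (not code from the PR):

#include <stdint.h>

#define PAGE_SIZE_4K 0x1000ULL   /* assumed page size */

/* An 8-byte read starting at p stays within one page only if at least
 * 8 bytes remain before the next page boundary. */
static inline int WordReadStaysInPage(const void* p) {
    return (PAGE_SIZE_4K - ((uintptr_t)p & (PAGE_SIZE_4K - 1))) >= 8;
}

/* In the strcpy fast path, the 64-bit loop would then be guarded like:
 *     while (((uintptr_t)src & 7) == 0 && WordReadStaysInPage(src)) { ... }
 * falling back to byte-wise copying for the last few bytes of each page. */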
38-41
: LGTM: strcat reuses optimized strcpy
Looks correct and keeps logic simple.
44-65
: LGTM: htoa unroll
Produces fixed-width 0x-prefixed 16-hex-digit strings; buffer[18] terminator is correct.
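For context, a fixed-width formatter of that shape looks roughly like the following (a sketch, not the MemOps.c implementation); the 2-character "0x" prefix plus 16 digits is why the terminator lands at index 18 of a 19-byte buffer:

#include <stdint.h>

/* Sketch of a fixed-width hex formatter; buffer must hold 19 bytes. */
static void HexToStringSketch(uint64_t value, char buffer[19]) {
    static const char digits[] = "0123456789ABCDEF";
    buffer[0] = '0';
    buffer[1] = 'x';
    for (int i = 0; i < 16; i++) {
        /* Highest nibble first: shifts of 60, 56, ..., 4, 0. */
        buffer[2 + i] = digits[(value >> (60 - 4 * i)) & 0xF];
    }
    buffer[18] = '\0';   /* index 18 = 2 prefix chars + 16 digits */
}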
75-86
: LGTM: itoa with div-only digit extraction
Handles zero, maximum 20 digits for uint64, and reuses strcpy. No issues spotted.
301-333
: LGTM: FastMemcmp aligned-chunk path
Semantics align with memcmp: returns negative/positive on first differing byte; byte-wise tail is correct. Looks good.
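For readers without the diff open, the aligned-chunk pattern being approved has this general shape (a sketch matching those semantics, not the exact FastMemcmp code):

#include <stddef.h>
#include <stdint.h>

/* Compare 8-byte chunks while both pointers are 8-byte aligned, then finish
 * byte-wise; on the first differing chunk we drop to bytes, so the sign of
 * the result comes from the first differing byte, matching memcmp. */
static int MemcmpSketch(const void* a, const void* b, size_t n) {
    const uint8_t* pa = (const uint8_t*)a;
    const uint8_t* pb = (const uint8_t*)b;

    if ((((uintptr_t)pa | (uintptr_t)pb) & 7) == 0) {
        while (n >= 8 && *(const uint64_t*)pa == *(const uint64_t*)pb) {
            pa += 8; pb += 8; n -= 8;
        }
    }
    while (n--) {
        if (*pa != *pb) return (int)*pa - (int)*pb;
        pa++; pb++;
    }
    return 0;
}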
void* memset(void* dest, int value, unsigned long size) {
    return FastMemset(dest, value, size);
}
💡 Verification agent
🧩 Analysis chain
memset signature mismatch with PR intent; unify on uint64_t
The wrapper still declares size as unsigned long, but the PR summary says memset now uses uint64_t. This can cause prototype/ABI mismatches with MemOps.h and call sites.
Recommended change:
-void* memset(void* dest, int value, unsigned long size) {
+void* memset(void* dest, int value, uint64_t size) {
return FastMemset(dest, value, size);
}
Follow-up:
- Ensure MemOps.h and all extern declarations use the same signature.
- Please run the script below to locate and review all prototypes/usages.
🏁 Script executed:
#!/bin/bash
# Verify memset/fast memset prototypes and call sites
rg -nP -C2 'void\s*\*\s*memset\s*\('
rg -nP -C2 'void\s*\*\s*FastMemset\s*\('
rg -nP -C2 '\bmemset\s*\(' | sed 's/^/call: /'
# Check for size type mismatches in headers
fd MemOps.h | xargs -I{} rg -nC2 'memset\s*\(' {}
Length of output: 2142
🏁 Script executed:
#!/bin/bash
head -n 20 kernel/memory/MemOps.c
Length of output: 851
Unify memset signature to uint64_t for ABI consistency
The memset wrapper currently uses unsigned long size, while FastMemset and the PR summary expect uint64_t. Update both the definition and prototype to avoid mismatches:
• kernel/memory/MemOps.c (around line 90)
- void* memset(void* dest, int value, unsigned long size) {
+ void* memset(void* dest, int value, uint64_t size) {
return FastMemset(dest, value, size);
}
• kernel/memory/MemOps.h (line 10)
- void* memset(void* dest, int value, unsigned long size);
+ void* memset(void* dest, int value, uint64_t size);
Follow-up:
- Verify no remaining declarations or call sites use unsigned long for this wrapper.
- Ensure all externs, headers, and any inline usages match the updated signature.
🤖 Prompt for AI Agents
In kernel/memory/MemOps.c around lines 90-92, the memset wrapper uses unsigned
long for size but FastMemset and the ABI expect uint64_t; change the function
definition to use uint64_t size and update the prototype in
kernel/memory/MemOps.h (line 10) to match. Also search the codebase for any
remaining declarations, prototypes, externs, headers, or call sites that use
unsigned long for this wrapper and update them to uint64_t, ensure you include
<stdint.h> where needed, and rebuild to confirm no mismatches remain.
// Use AVX2 if available for even better performance
if (features->avx2 && size >= 32) {
    // Create 256-bit value
    uint64_t val64 = 0x0101010101010101ULL * val;

    asm volatile(
        "vmovq %0, %%xmm0\n"
        "vpbroadcastq %%xmm0, %%ymm0\n"
        :
        : "r"(val64)
        : "xmm0", "ymm0"
    );

    while (size >= 32) {
        asm volatile("vmovdqu %%ymm0, (%0)" : : "r"(d) : "memory");
        d += 32;
        size -= 32;
    }

    // Clean up YMM registers
    asm volatile("vzeroupper" ::: "memory");
}
💡 Verification agent
🧩 Analysis chain
Unsafe SIMD usage in kernel paths without FPU/SIMD state management
Using XMM/YMM in kernel memory ops is risky unless the OS sets up FPU/SSE/AVX context management (CR0/CR4 flags, XSAVE/FXSAVE, XCR0 gating) and prevents corruption across interrupts. Retrieved learnings from prior PRs indicate crashes and corruption when SSE was used in memcpy without saving state or masking interrupts.
Action options (preferred first):
- Disable SIMD fast paths by default and use 64-bit GP-register loops; keep SIMD behind a compile-time flag meant for environments with proper FPU context handling.
- If retaining SIMD:
- Gate AVX2 path with OSXSAVE and XCR0 checks: AVX usable iff CPUID.OSXSAVE=1 and (XCR0.XMM|XCR0.YMM) are set. SSE path must ensure FPU enabled in CR0 and that the kernel saves XMM state on interrupt/ctxswitch.
- Either save/restore FPU state around these ops or ensure they only run with interrupts disabled and in non-preemptible contexts. Note: disabling interrupts around large copies is generally unacceptable due to latency.
- Consider mfence/sfence only if using non-temporal stores (not currently used).
Proposed structure (compile-time guard):
- if (features->avx2 && size >= 32) {
+#if defined(CONFIG_ENABLE_KERNEL_SIMD_MEMOPS)
+ if (features->avx2 && size >= 32 && CpuAvxUsableInKernel()) {
...
- }
- else if (features->sse2 && size >= 16) {
+ } else if (features->sse2 && CpuSseUsableInKernel() && size >= 16) {
...
- }
+ }
+#endif
And add helpers (outside this hunk, in a suitable .c/.h):
static inline uint64_t ReadXcr0(void) {
uint32_t eax, edx;
asm volatile("xgetbv" : "=a"(eax), "=d"(edx) : "c"(0));
return ((uint64_t)edx << 32) | eax;
}
static inline bool CpuAvxUsableInKernel(void) {
CpuFeatures* f = GetCpuFeatures();
if (!f->avx2 || !f->osxsave) return false;
uint64_t xcr0 = ReadXcr0();
return ((xcr0 & 0x6) == 0x6); // XMM (bit1), YMM (bit2)
}
static inline bool CpuSseUsableInKernel(void) {
CpuFeatures* f = GetCpuFeatures();
// Also ensure kernel has enabled FPU/SSE and manages XMM state.
return f->sse2 && f->fxsave_enabled;
}
Verification script to check for existing FPU/SIMD state management:
I can provide a SIMD-free FastMemcpy/FastMemset variant using unrolled 64-bit GP stores if you want to keep performance without touching FPU state.
Also applies to: 125-142, 170-192, 193-213, 268-276, 278-290
🏁 Script executed:
#!/bin/bash
# Look for XSAVE/FXSAVE usage and OSXSAVE/XCR0 checks
rg -nP -C3 'xsave|xrstor|fxsave|fxrstor|xgetbv|XCR0|OSXSAVE|CR0|CR4|TS|kernel_fpu|preempt_disable|cli|sti'
Length of output: 54857
Kernel SIMD Memory Ops Require FPU/SIMD Context Management
The AVX2/SSE2 fast paths in kernel/memory/MemOps.c (e.g., lines 103–124, 125–142, 170–192, 193–213, 268–276, 278–290) invoke XMM/YMM instructions without any CPU context or FPU state handling. This can corrupt the XMM/YMM registers on interrupts or context switches, leading to memory corruption and crashes under load.
Action items:
- Wrap all SIMD memcpy/set loops in a compile-time guard and runtime checks that ensure AVX/SSE is usable in kernel context.
- Fall back to an unrolled 64-bit general-purpose register loop when SIMD cannot be safely used.
- Provide helpers to read XCR0 and verify OSXSAVE, SSE/AVX state bits, for example:
static inline uint64_t ReadXcr0(void) {
uint32_t eax, edx;
asm volatile("xgetbv" : "=a"(eax), "=d"(edx) : "c"(0));
return ((uint64_t)edx << 32) | eax;
}
static inline bool CpuAvxUsableInKernel(CpuFeatures* f) {
if (!f->osxsave || !f->avx2) return false;
uint64_t xcr0 = ReadXcr0();
return (xcr0 & (1ULL<<1 | 1ULL<<2)) == (1ULL<<1 | 1ULL<<2);
}
static inline bool CpuSseUsableInKernel(CpuFeatures* f) {
return f->sse2 && f->fxsave_enabled;
}
Suggested hunk around the AVX2/SSE2 blocks:
- if (features->avx2 && size >= 32) {
+#if defined(CONFIG_ENABLE_KERNEL_SIMD_MEMOPS)
+ if (features->avx2 && size >= 32 && CpuAvxUsableInKernel(GetCpuFeatures())) {
// AVX2 broadcast + vmovdqu loop
} else if (features->sse2 && size >= 16) {
+# if CpuSseUsableInKernel(GetCpuFeatures())
// SSE2 movdqu loop
- }
+ #endif
+#endif
Alternatively, disable SIMD paths unconditionally and use a carefully unrolled 64-bit GP-register fallback to maintain interrupt safety without FPU state management.
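For completeness, a GP-register-only fallback along those lines might look like this (a sketch consistent with the earlier learnings; the function name is hypothetical, and since it never touches XMM/YMM state it needs no XSAVE/FXSAVE handling or interrupt masking):

#include <stdint.h>

/* Unrolled 64-bit copy using only general-purpose registers; safe to call
 * from interrupt context because no FPU/SIMD state is involved. */
static void* GpMemcpy(void* dest, const void* src, uint64_t size) {
    uint8_t* d = (uint8_t*)dest;
    const uint8_t* s = (const uint8_t*)src;

    while (size >= 32) {               /* 4 x 8-byte moves per iteration */
        uint64_t a, b, c, e;
        __builtin_memcpy(&a, s,      8);
        __builtin_memcpy(&b, s + 8,  8);
        __builtin_memcpy(&c, s + 16, 8);
        __builtin_memcpy(&e, s + 24, 8);
        __builtin_memcpy(d,      &a, 8);
        __builtin_memcpy(d + 8,  &b, 8);
        __builtin_memcpy(d + 16, &c, 8);
        __builtin_memcpy(d + 24, &e, 8);
        d += 32; s += 32; size -= 32;
    }
    while (size--) *d++ = *s++;
    return dest;
}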
i see, also we have a weird issue with VMemMapMMIO, AGAIN, so the first run:
VMemMapMMIO: Mapping MMIO 0xFEBD0000 -> 0xFFFF800000040000 (size: 0x4000)
VMemMapMMIO: Mapped page 0 - PTE[64] = 0xFEBD001B
VMemMapMMIO: Mapped page 1 - PTE[65] = 0xFEBD101B
VMemMapMMIO: Mapped page 2 - PTE[66] = 0xFEBD201B
VMemMapMMIO: Mapped page 3 - PTE[67] = 0xFEBD301B
[ SUCCESS ] VMemMapMMIO: Successfully mapped 4 pages
and the second run
VMemMapMMIO: Mapping MMIO 0xFEBD0000 -> 0xFFFF800000050000 (size: 0x4000) // also isnt it supposed to be the same as above since the address above is unmapped already
VMemMapMMIO: Mapped page 0 - PTE[80] = 0xFEBD001B
VMemMapMMIO: Mapped page 1 - PTE[81] = 0xFEBD101B
VMemMapMMIO: Mapped page 2 - PTE[82] = 0xFEBD201B
VMemMapMMIO: Mapped page 3 - PTE[83] = 0xFEBD301B
// DIES!
no kernel panic, just dies
it was the xHCI controller
🧩 Analysis chain
@assembler-0, this looks like a virtual address allocation/unmapping issue. The fact that the second run gets a different virtual address (0xFFFF800000050000 vs 0xFFFF800000040000) suggests the first mapping wasn't properly unmapped or the virtual address allocator isn't reusing freed ranges.
Let me investigate the VMemMapMMIO and VMemUnmapMMIO implementations:
🏁 Script executed:
#!/bin/bash
# Find VMemMapMMIO and VMemUnmapMMIO implementations
echo "=== Searching for VMemMapMMIO implementation ==="
ast-grep --pattern 'VMemMapMMIO($$$) {
$$$
}'
echo -e "\n=== Searching for VMemUnmapMMIO implementation ==="
ast-grep --pattern 'VMemUnmapMMIO($$$) {
$$$
}'
echo -e "\n=== Looking for virtual address allocation logic ==="
rg -nP -C3 'virtual.*address.*alloc|virt.*addr.*alloc|mmio.*alloc'
echo -e "\n=== Checking for virtual address tracking structures ==="
rg -nP -C3 'mmio.*map|virtual.*range|virt.*range'
Length of output: 2123
Let me search more broadly for these functions and examine the VMem implementation:
🏁 Script executed:
#!/bin/bash
# Search for VMemMapMMIO function more broadly
echo "=== Searching for VMemMapMMIO function ==="
rg -nP -C5 'VMemMapMMIO'
echo -e "\n=== Searching for VMemUnmapMMIO function ==="
rg -nP -C5 'VMemUnmapMMIO'
echo -e "\n=== Looking at VMem.c implementation ==="
cat kernel/memory/VMem.c | head -200
echo -e "\n=== Searching for next_vaddr usage ==="
rg -nP -C3 'next_vaddr'
Length of output: 20525
@assembler-0, I found the issue! The problem is in the virtual address space management, not the page table handling itself.
Root Cause:
VMemMapMMIO expects a pre-allocated virtual address, but there's no mechanism to reclaim virtual address space when unmapping:
- First run: xHCI calls VMemAlloc() → gets 0xFFFF800000040000, then VMemMapMMIO() maps it
- VMemUnmapMMIO: Properly clears page table entries but doesn't free the virtual address space
- Second run: VMemAlloc() allocates a new address 0xFFFF800000050000 because kernel_space.next_vaddr only moves forward
- Crash: The linear allocator has no free list - virtual addresses are never reclaimed
The Issue:
// VMemAlloc - only moves forward
const uint64_t vaddr = kernel_space.next_vaddr; // Gets new address
kernel_space.next_vaddr = vaddr + size; // Never goes backward
// VMemUnmapMMIO - clears pages but no virtual address reclaim
// No code to mark virtual address as free for reuse
Solution Options:
- Quick fix: Add virtual address tracking to VMemUnmapMMIO
- Better fix: Implement a proper virtual address space allocator with free lists
For the quick fix, you could add a simple free range tracker to reclaim unmapped virtual addresses. The page table logic itself looks correct based on the retrieved learnings about using the same access pattern.
Would you like me to help design a virtual address free list, or would you prefer to implement a different approach?
ノ(◕ヮ◕)ノ*:・゚✧
🧬 Code Graph Analysis Results
- File: arch/x86_64/cpu/Cpu.c (lines 40-42)
CpuFeatures* GetCpuFeatures(void) {
return &cpu_features;
}
🧠 Learnings used
Learnt from: assembler-0
PR: assembler-0/VoidFrame#82
File: kernel/process/Process.h:0-0
Timestamp: 2025-08-14T02:11:58.963Z
Learning: In VoidFrame kernel's FastMemcpy implementation (kernel/memory/MemOps.c), memory corruption occurs under high load due to interrupts occurring between SSE2 movdqu load and store instructions, corrupting XMM register state. The fix requires disabling interrupts around SSE operations and adding memory barriers (sfence) to ensure stores complete atomically.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#82
File: kernel/process/Process.h:0-0
Timestamp: 2025-08-14T02:11:27.395Z
Learning: In VoidFrame kernel, FastMemcpy was using SSE2 movdqu instructions without preserving XMM registers, causing crashes when called from interrupt context (like the scheduler's FastSchedule function). The fix is to use only general-purpose registers with 64-bit unrolled copies to maintain performance while being interrupt-safe.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#79
File: drivers/PCI/PCI.c:154-154
Timestamp: 2025-08-13T07:19:55.274Z
Learning: VMemUnmapMMIO in VoidFrame kernel should use the same page table access pattern as the working VMemUnmap function: `uint64_t* pt_table = (uint64_t*)(pt_phys + KERNEL_VIRTUAL_OFFSET);` rather than PHYS_TO_VIRT macro or identity mapping checks.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#47
File: kernel/core/Kernel.c:434-434
Timestamp: 2025-07-16T01:04:23.790Z
Learning: VMemAlloc() in VoidFrame kernel fails with error code -2 (VMEM_ERROR_INVALID_ADDR) due to virtual address space layout conflict where the heap start address (_bss_end + KERNEL_VIRTUAL_OFFSET + 1MB margin) exceeds VIRT_ADDR_SPACE_END (0xFFFFFFFF80000000ULL), causing VMemMap validation to fail.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#80
File: README.md:7-8
Timestamp: 2025-08-13T08:43:54.750Z
Learning: User assembler-0 is working on VoidFrame kernel with a tight deadline (2-3 days before summer ends) and specifically requested runtime safety checks similar to __stack_chk_fail but for deadlocks and other kernel issues. The kernel already has extensive safety mechanisms including stack canaries, guard pages, heap integrity checks, and process security validation.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#53
File: kernel/core/Kernel.c:256-259
Timestamp: 2025-08-09T04:47:56.883Z
Learning: The VoidFrame kernel memory crash occurs when VIRT_ADDR_SPACE_START is set to 0xFFFF800000000000ULL instead of 0xFFFFFFFF80000000ULL, causing the heap to allocate 128TB away from where the kernel is mapped, resulting in page faults when accessing heap memory.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#57
File: kernel/memory/VMem.h:64-66
Timestamp: 2025-08-09T09:29:00.630Z
Learning: In the VoidFrame kernel, removing locks from KernelMemoryAlloc/KernelMemoryFree causes heap corruption where shell command strings (like "clear" = 0x61656C63) overwrite heap block magic values due to race conditions between allocation and free operations.
🧩 Analysis chain
@assembler-0, understood! The xHCI controller being the culprit makes sense - those controllers can be particularly finicky with MMIO mappings and timing-sensitive operations.
The SIMD safety concerns I raised for the memory operations are still valid for the kernel's long-term stability, but they're separate from the immediate xHCI/VMemMapMMIO issue you were debugging.
Regarding the virtual address difference you noticed (0xFFFF800000040000 vs 0xFFFF800000050000), that could indicate either the previous mapping wasn't fully cleaned up or the virtual allocator is intentionally advancing to avoid potential issues.
Have you resolved the xHCI controller issue, or would you like me to investigate further into the controller's MMIO usage patterns?
🏁 Script executed:
#!/bin/bash
# Look for xHCI controller implementation and MMIO usage
rg -nP -C5 'xHCI|xhci|XHCI' --type=c
Length of output: 58220
@assembler-0, that makes perfect sense! The xHCI controller's MMIO mapping sequence is indeed complex and could easily cause the VMemMapMMIO crashes you observed.
Looking at the xHCI code, I can see several potential issues:
- Repeated initialization: Both xHCIInit() and xHCIEnumerate() call xHCIControllerInit(), which could cause conflicts if the same controller is initialized multiple times without proper cleanup.
- MMIO cleanup issues: The virtual address advancing from 0xFFFF800000040000 to 0xFFFF800000050000 suggests the previous mapping wasn't fully cleaned up, so the allocator moved to the next available region.
- Controller state: The xHCI driver performs a complex sequence: VMemAlloc → VMemUnmap → VMemMapMMIO, and any failure in this chain could leave the system in an inconsistent state.
The SIMD safety concerns I raised for the memory operations are still valid for long-term kernel stability, but they're separate from this immediate xHCI/MMIO issue.
Have you resolved the xHCI controller issue, or would you like me to investigate the specific MMIO mapping conflicts in the xHCI driver? I could look into potential race conditions or cleanup issues in the controller initialization sequence.
🧬 Code Graph Analysis Results
- arch/x86_64/cpu/Cpu.c (lines 40-42)
CpuFeatures* GetCpuFeatures(void) {
return &cpu_features;
}
🧠 Learnings used
Learnt from: assembler-0
PR: assembler-0/VoidFrame#82
File: kernel/process/Process.h:0-0
Timestamp: 2025-08-14T02:11:58.963Z
Learning: In VoidFrame kernel's FastMemcpy implementation (kernel/memory/MemOps.c), memory corruption occurs under high load due to interrupts occurring between SSE2 movdqu load and store instructions, corrupting XMM register state. The fix requires disabling interrupts around SSE operations and adding memory barriers (sfence) to ensure stores complete atomically.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#80
File: README.md:7-8
Timestamp: 2025-08-13T08:43:54.750Z
Learning: User assembler-0 is working on VoidFrame kernel with a tight deadline (2-3 days before summer ends) and specifically requested runtime safety checks similar to __stack_chk_fail but for deadlocks and other kernel issues. The kernel already has extensive safety mechanisms including stack canaries, guard pages, heap integrity checks, and process security validation.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#82
File: kernel/process/Process.h:0-0
Timestamp: 2025-08-14T02:11:27.395Z
Learning: In VoidFrame kernel, FastMemcpy was using SSE2 movdqu instructions without preserving XMM registers, causing crashes when called from interrupt context (like the scheduler's FastSchedule function). The fix is to use only general-purpose registers with 64-bit unrolled copies to maintain performance while being interrupt-safe.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#47
File: kernel/core/Kernel.c:434-434
Timestamp: 2025-07-16T01:04:23.790Z
Learning: VMemAlloc() in VoidFrame kernel fails with error code -2 (VMEM_ERROR_INVALID_ADDR) due to virtual address space layout conflict where the heap start address (_bss_end + KERNEL_VIRTUAL_OFFSET + 1MB margin) exceeds VIRT_ADDR_SPACE_END (0xFFFFFFFF80000000ULL), causing VMemMap validation to fail.
oh yeah i would be interested in a permanent and global fix, (the problem is kind of similar to the pid problem, but this time, one wrong move and we all gone)
🧩 Analysis chain
@assembler-0, I need to examine the VMemMapMMIO/VMemUnmapMMIO implementation to understand why the virtual addresses are different and why the system is dying. Let me gather the relevant code:
🏁 Script executed:
#!/bin/bash
# Search for VMemMapMMIO and VMemUnmapMMIO implementations
echo "=== VMemMapMMIO implementation ==="
ast-grep --pattern 'VMemMapMMIO($$$) {
$$$
}'
echo -e "\n=== VMemUnmapMMIO implementation ==="
ast-grep --pattern 'VMemUnmapMMIO($$$) {
$$$
}'
echo -e "\n=== MMIO virtual address allocation logic ==="
rg -nP -C5 'MMIO.*virtual|virtual.*MMIO|mmio.*alloc|alloc.*mmio'
echo -e "\n=== Page table access patterns in MMIO functions ==="
rg -nP -C3 'pt_table.*KERNEL_VIRTUAL_OFFSET|PHYS_TO_VIRT.*pt|pt.*PHYS_TO_VIRT'
Length of output: 5259
Let me search more specifically for these functions and examine the VMem implementation:
🏁 Script executed:
#!/bin/bash
# Search for VMemMapMMIO and VMemUnmapMMIO more broadly
echo "=== Searching for VMemMapMMIO function ==="
rg -nP -A20 'VMemMapMMIO\s*\(' --type=c
echo -e "\n=== Searching for VMemUnmapMMIO function ==="
rg -nP -A20 'VMemUnmapMMIO\s*\(' --type=c
echo -e "\n=== Look at VMem.c structure ==="
rg -nP '^(int|void).*VMemMap|^(int|void).*VMemUnmap' kernel/memory/VMem.c
echo -e "\n=== Virtual address allocation logic ==="
rg -nP -B5 -A10 'VMemAlloc.*size|mmio.*base.*VMemAlloc' --type=c
Length of output: 25232
Now let me examine the complete VMemUnmapMMIO implementation to identify the issue:
🏁 Script executed:
#!/bin/bash
# Get the complete VMemUnmapMMIO implementation
echo "=== VMemUnmapMMIO complete implementation ==="
rg -nP -A50 'void VMemUnmapMMIO\(uint64_t vaddr, uint64_t size\)' kernel/memory/VMem.c
echo -e "\n=== Compare with VMemUnmap implementation ==="
rg -nP -A30 'int VMemUnmap\(uint64_t vaddr, uint64_t size\)' kernel/memory/VMem.c
echo -e "\n=== Look for page table access patterns ==="
rg -nP 'pt_table.*=.*\(' kernel/memory/VMem.c
Length of output: 4096
Let me get the complete VMemUnmapMMIO implementation to see what's happening after the page table access:
🏁 Script executed:
#!/bin/bash
# Get the complete VMemUnmapMMIO implementation with more lines
echo "=== Complete VMemUnmapMMIO implementation ==="
rg -nP -A70 'void VMemUnmapMMIO\(uint64_t vaddr, uint64_t size\)' kernel/memory/VMem.c
echo -e "\n=== Check VMemAlloc implementation to see why virtual addresses differ ==="
rg -nP -A20 'kernel_space\.next_vaddr' kernel/memory/VMem.c
Length of output: 5956
@assembler-0, I found the issue! There are actually two separate problems here:
1. Different Virtual Addresses (Expected Behavior)
The different virtual addresses (0xFFFF800000040000 vs 0xFFFF800000050000) are actually expected behavior. Your VMemAlloc is a simple bump allocator that only moves kernel_space.next_vaddr forward - it never reuses freed virtual address ranges. When you unmap MMIO, the page table entries are cleared but VMemAlloc doesn't track freed ranges for reuse.
2. System Death (The Real Bug)
The system death is caused by the page table access pattern in VMemUnmapMMIO. Based on your retrieved learning, VMemUnmapMMIO should use the same pattern as VMemUnmap. Looking at line 685:
pt_table = (uint64_t*)PHYS_TO_VIRT(pt_phys);
This should be:
pt_table = (uint64_t*)(pt_phys + KERNEL_VIRTUAL_OFFSET);
Here's the fix for VMemUnmapMMIO:
// Lines 680-686 should be:
// Access PT through identity mapping or kernel virtual offset
uint64_t* pt_table;
if (pt_phys < IDENTITY_MAP_SIZE) {
pt_table = (uint64_t*)pt_phys;
} else {
pt_table = (uint64_t*)(pt_phys + KERNEL_VIRTUAL_OFFSET);
}
This matches the pattern that works in other VMem functions and aligns with your previous learning about VMemUnmapMMIO page table access.
🧬 Code Graph Analysis Results
- File: arch/x86_64/cpu/Cpu.c, lines 40-42
CpuFeatures* GetCpuFeatures(void) {
return &cpu_features;
}
✏️ Learnings added
Learnt from: assembler-0
PR: assembler-0/VoidFrame#90
File: kernel/memory/MemOps.c:103-124
Timestamp: 2025-08-16T05:27:10.739Z
Learning: In VoidFrame kernel VMemAlloc implementation, different virtual addresses on consecutive allocations after freeing is expected behavior because VMemAlloc uses a simple bump allocator (kernel_space.next_vaddr) that only moves forward and doesn't track or reuse freed virtual address ranges.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: assembler-0
PR: assembler-0/VoidFrame#82
File: kernel/process/Process.h:0-0
Timestamp: 2025-08-14T02:11:58.963Z
Learning: In VoidFrame kernel's FastMemcpy implementation (kernel/memory/MemOps.c), memory corruption occurs under high load due to interrupts occurring between SSE2 movdqu load and store instructions, corrupting XMM register state. The fix requires disabling interrupts around SSE operations and adding memory barriers (sfence) to ensure stores complete atomically.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#82
File: kernel/process/Process.h:0-0
Timestamp: 2025-08-14T02:11:27.395Z
Learning: In VoidFrame kernel, FastMemcpy was using SSE2 movdqu instructions without preserving XMM registers, causing crashes when called from interrupt context (like the scheduler's FastSchedule function). The fix is to use only general-purpose registers with 64-bit unrolled copies to maintain performance while being interrupt-safe.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#80
File: README.md:7-8
Timestamp: 2025-08-13T08:43:54.750Z
Learning: User assembler-0 is working on VoidFrame kernel with a tight deadline (2-3 days before summer ends) and specifically requested runtime safety checks similar to __stack_chk_fail but for deadlocks and other kernel issues. The kernel already has extensive safety mechanisms including stack canaries, guard pages, heap integrity checks, and process security validation.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#47
File: kernel/core/Kernel.c:434-434
Timestamp: 2025-07-16T01:04:23.790Z
Learning: VMemAlloc() in VoidFrame kernel fails with error code -2 (VMEM_ERROR_INVALID_ADDR) due to virtual address space layout conflict where the heap start address (_bss_end + KERNEL_VIRTUAL_OFFSET + 1MB margin) exceeds VIRT_ADDR_SPACE_END (0xFFFFFFFF80000000ULL), causing VMemMap validation to fail.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#53
File: kernel/core/Kernel.c:256-259
Timestamp: 2025-08-09T04:47:56.883Z
Learning: The VoidFrame kernel memory crash occurs when VIRT_ADDR_SPACE_START is set to 0xFFFF800000000000ULL instead of 0xFFFFFFFF80000000ULL, causing the heap to allocate 128TB away from where the kernel is mapped, resulting in page faults when accessing heap memory.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#79
File: drivers/PCI/PCI.c:154-154
Timestamp: 2025-08-13T07:19:55.274Z
Learning: VMemUnmapMMIO in VoidFrame kernel should use the same page table access pattern as the working VMemUnmap function: `uint64_t* pt_table = (uint64_t*)(pt_phys + KERNEL_VIRTUAL_OFFSET);` rather than PHYS_TO_VIRT macro or identity mapping checks.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#57
File: kernel/memory/VMem.h:64-66
Timestamp: 2025-08-09T09:29:00.630Z
Learning: In the VoidFrame kernel, removing locks from KernelMemoryAlloc/KernelMemoryFree causes heap corruption where shell command strings (like "clear" = 0x61656C63) overwrite heap block magic values due to race conditions between allocation and free operations.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#62
File: kernel/core/Panic.c:69-79
Timestamp: 2025-08-10T06:22:28.816Z
Learning: In VoidFrame kernel, panic functions must call "asm volatile("cli")" as the very first instruction to disable interrupts immediately. Without this, if interrupts are enabled (after "sti"), an interrupt can fire during panic handling causing the panic to fail, hang, or behave unpredictably. This must be done in ALL panic entry points: Panic(), PanicWithCode(), PanicWithContext(), PanicFromInterrupt(), and at the start of KernelPanicHandler().
🧩 Analysis chain
@assembler-0, I understand the critical nature - virtual address space exhaustion can bring down the entire kernel. Let me design a comprehensive virtual address space allocator to replace the current linear-only system.
Here's a comprehensive solution for permanent virtual address space management:
Virtual Address Space Allocator Design
// Add to VMem.h
typedef struct VAddrRange {
uint64_t start;
uint64_t size;
struct VAddrRange* next;
} VAddrRange;
typedef struct {
uint64_t start_addr;
uint64_t end_addr;
uint64_t next_vaddr; // For new allocations beyond free ranges
VAddrRange* free_list; // Free ranges available for reuse
SpinLock lock; // Thread safety
} VAddrSpace;
Core Functions:
// Enhanced VMemAlloc with free list support
uint64_t VMemAlloc(uint64_t size) {
size = ALIGN_UP(size, PAGE_SIZE);
AcquireSpinLock(&kernel_space.lock);
// 1. Try to find suitable free range
VAddrRange** current = &kernel_space.free_list;
while (*current) {
if ((*current)->size >= size) {
uint64_t addr = (*current)->start;
if ((*current)->size == size) {
// Exact match - remove from free list
VAddrRange* to_free = *current;
*current = (*current)->next;
KernelMemoryFree(to_free);
} else {
// Split range - update remaining free space
(*current)->start += size;
(*current)->size -= size;
}
ReleaseSpinLock(&kernel_space.lock);
return addr;
}
current = &(*current)->next;
}
// 2. No suitable free range - allocate from end of space
if (kernel_space.next_vaddr + size > kernel_space.end_addr) {
ReleaseSpinLock(&kernel_space.lock);
return 0; // Out of virtual address space
}
uint64_t addr = kernel_space.next_vaddr;
kernel_space.next_vaddr += size;
ReleaseSpinLock(&kernel_space.lock);
return addr;
}
// Enhanced VMemFree to add ranges back to free list
void VMemFree(uint64_t vaddr, uint64_t size) {
size = ALIGN_UP(size, PAGE_SIZE);
AcquireSpinLock(&kernel_space.lock);
// Create new free range
VAddrRange* new_range = (VAddrRange*)KernelMemoryAlloc(sizeof(VAddrRange));
if (!new_range) {
ReleaseSpinLock(&kernel_space.lock);
return; // Memory allocation failed - range lost
}
new_range->start = vaddr;
new_range->size = size;
// Insert and coalesce with adjacent ranges
VMemCoalesceRange(new_range);
ReleaseSpinLock(&kernel_space.lock);
}
// Coalesce adjacent free ranges to prevent fragmentation
static void VMemCoalesceRange(VAddrRange* new_range) {
VAddrRange** current = &kernel_space.free_list;
VAddrRange* prev = NULL;
// Find insertion point and check for adjacency
while (*current && (*current)->start < new_range->start) {
prev = *current;
current = &(*current)->next;
}
// Check if we can merge with previous range
if (prev && (prev->start + prev->size == new_range->start)) {
prev->size += new_range->size;
KernelMemoryFree(new_range);
new_range = prev;
} else {
// Insert new range
new_range->next = *current;
if (prev) {
prev->next = new_range;
} else {
kernel_space.free_list = new_range;
}
}
// Check if we can merge with next range
if (new_range->next &&
(new_range->start + new_range->size == new_range->next->start)) {
VAddrRange* to_free = new_range->next;
new_range->size += to_free->size;
new_range->next = to_free->next;
KernelMemoryFree(to_free);
}
}
Integration Points:
// Modified VMemUnmapMMIO to free virtual address space
void VMemUnmapMMIO(uint64_t vaddr, uint64_t size) {
// ... existing page table cleanup code ...
// NEW: Free the virtual address space for reuse
VMemFree(vaddr, size);
}
// Initialization in kernel startup
void VMemInit(void) {
kernel_space.start_addr = VIRT_ADDR_SPACE_START;
kernel_space.end_addr = VIRT_ADDR_SPACE_END;
kernel_space.next_vaddr = VIRT_ADDR_SPACE_START;
kernel_space.free_list = NULL;
InitSpinLock(&kernel_space.lock);
}
Benefits:
- No virtual address space exhaustion - freed ranges are reused
- Fragmentation prevention - coalescing merges adjacent free ranges
- Thread safety - proper locking prevents races
- Performance - free list lookup is fast for most cases
- Fallback safety - still allocates linearly when no free ranges available
This eliminates the "one wrong move and we all gone" scenario by ensuring virtual address space is properly managed and reused.
Would you like me to implement any specific part of this design or help integrate it with your existing VMem code?
ノ(◕ヮ◕)ノ*:・゚✧
🏁 Script executed:
#!/bin/bash
# Let's examine the current VMem implementation more thoroughly
echo "=== Current VMem structures and allocation logic ==="
rg -nP -A10 -B5 'struct.*kernel_space|next_vaddr|VMemAlloc.*{' kernel/memory/
echo -e "\n=== Current VMemUnmapMMIO implementation ==="
rg -nP -A20 'VMemUnmapMMIO.*{' kernel/memory/
echo -e "\n=== Examining any existing free list or range tracking ==="
rg -nP -C5 'free.*list|range.*track|virt.*free' kernel/memory/
Length of output: 10866
@assembler-0, perfect! Now I can see the exact structure and provide a complete solution that integrates seamlessly with your existing VMem implementation.
Complete Virtual Address Space Allocator Implementation
1. Update VMem.h - Add Free Range Tracking:
// Add this structure before VirtAddrSpace
typedef struct VAddrRange {
uint64_t start;
uint64_t size;
struct VAddrRange* next;
} VAddrRange;
// Update VirtAddrSpace structure in VMem.h
typedef struct {
uint64_t* pml4; /**< Physical address of PML4 table */
uint64_t next_vaddr; /**< Next virtual address for allocation */
uint64_t used_pages; /**< Number of pages currently allocated */
uint64_t total_mapped; /**< Total bytes mapped in this space */
VAddrRange* free_list; /**< Free virtual address ranges */
} VirtAddrSpace;
2. Update VMem.c - Enhanced Allocation Logic:
// Add to the top of VMem.c after includes
static void VMemCoalesceRange(VAddrRange* new_range);
static VAddrRange* VMemFindFreeRange(uint64_t size);
static void VMemAddFreeRange(uint64_t start, uint64_t size);
// Replace the existing VMemAlloc function
void* VMemAlloc(uint64_t size) {
if (size == 0) return NULL;
size = PAGE_ALIGN_UP(size);
irq_flags_t flags = SpinLockIrqSave(&vmem_lock);
// First try to find a suitable free range
VAddrRange** current = &kernel_space.free_list;
VAddrRange* prev_link = NULL;
while (*current) {
if ((*current)->size >= size) {
uint64_t vaddr = (*current)->start;
if ((*current)->size == size) {
// Exact match - remove entire range from free list
VAddrRange* to_free = *current;
*current = (*current)->next;
KernelMemoryFree(to_free);
} else {
// Split range - keep remaining portion
(*current)->start += size;
(*current)->size -= size;
}
SpinUnlockIrqRestore(&vmem_lock, flags);
// Map pages for the allocated range
uint64_t allocated_size = 0;
for (uint64_t offset = 0; offset < size; offset += PAGE_SIZE) {
void* paddr = AllocPage();
if (!paddr) {
if (allocated_size > 0) {
VMemFree((void*)vaddr, allocated_size);
}
return NULL;
}
int result = VMemMap(vaddr + offset, (uint64_t)paddr, PAGE_PRESENT | PAGE_WRITABLE);
if (result != VMEM_SUCCESS) {
FreePage(paddr);
if (allocated_size > 0) {
VMemFree((void*)vaddr, allocated_size);
}
return NULL;
}
allocated_size += PAGE_SIZE;
}
// Update tracking
flags = SpinLockIrqSave(&vmem_lock);
kernel_space.used_pages += size / PAGE_SIZE;
kernel_space.total_mapped += size;
SpinUnlockIrqRestore(&vmem_lock, flags);
FastMemset((void*)vaddr, 0, size);
return (void*)vaddr;
}
prev_link = *current;
current = &(*current)->next;
}
// No suitable free range - allocate from linear space
if (kernel_space.next_vaddr < VIRT_ADDR_SPACE_START) {
kernel_space.next_vaddr = VIRT_ADDR_SPACE_START;
}
if (kernel_space.next_vaddr + size > VIRT_ADDR_SPACE_END) {
SpinUnlockIrqRestore(&vmem_lock, flags);
return NULL; // Out of virtual address space
}
const uint64_t vaddr = kernel_space.next_vaddr;
kernel_space.next_vaddr = vaddr + size;
SpinUnlockIrqRestore(&vmem_lock, flags);
// Map pages (same as existing code)
uint64_t allocated_size = 0;
for (uint64_t offset = 0; offset < size; offset += PAGE_SIZE) {
void* paddr = AllocPage();
if (!paddr) {
if (allocated_size > 0) {
VMemFree((void*)vaddr, allocated_size);
}
return NULL;
}
int result = VMemMap(vaddr + offset, (uint64_t)paddr, PAGE_PRESENT | PAGE_WRITABLE);
if (result != VMEM_SUCCESS) {
FreePage(paddr);
if (allocated_size > 0) {
VMemFree((void*)vaddr, allocated_size);
}
return NULL;
}
allocated_size += PAGE_SIZE;
}
// Update tracking
flags = SpinLockIrqSave(&vmem_lock);
kernel_space.used_pages += size / PAGE_SIZE;
kernel_space.total_mapped += size;
SpinUnlockIrqRestore(&vmem_lock, flags);
FastMemset((void*)vaddr, 0, size);
return (void*)vaddr;
}
// Enhanced VMemFree to reclaim virtual address space
void VMemFree(void* vaddr, uint64_t size) {
if (!vaddr || size == 0) return;
size = PAGE_ALIGN_UP(size);
uint64_t addr = (uint64_t)vaddr;
// Unmap and free physical pages (existing logic)
for (uint64_t offset = 0; offset < size; offset += PAGE_SIZE) {
uint64_t paddr = VMemGetPhysAddr(addr + offset);
if (paddr != INVALID_PHYS_ADDR) {
VMemUnmap(addr + offset);
FreePage((void*)paddr);
}
}
// Update tracking and add range to free list
irq_flags_t flags = SpinLockIrqSave(&vmem_lock);
kernel_space.used_pages -= size / PAGE_SIZE;
kernel_space.total_mapped -= size;
// Add freed range to free list
VMemAddFreeRange(addr, size);
SpinUnlockIrqRestore(&vmem_lock, flags);
}
// Updated VMemUnmapMMIO to reclaim virtual address space
void VMemUnmapMMIO(uint64_t vaddr, uint64_t size) {
PrintKernel("VMemUnmapMMIO: Unmapping MMIO at 0x"); PrintKernelHex(vaddr);
PrintKernel(" (size: 0x"); PrintKernelHex(size); PrintKernel(")\n");
if (!IS_PAGE_ALIGNED(vaddr) || !IS_PAGE_ALIGNED(size)) {
PrintKernel("VMemUnmapMMIO: ERROR - Address or size not page-aligned\n");
return;
}
// ... existing page table cleanup code ...
// NEW: Add freed virtual address range back to free list
irq_flags_t flags = SpinLockIrqSave(&vmem_lock);
VMemAddFreeRange(vaddr, size);
SpinUnlockIrqRestore(&vmem_lock, flags);
PrintKernel("VMemUnmapMMIO: Successfully unmapped and reclaimed virtual address space\n");
}
// Helper function to add a range to the free list with coalescing
static void VMemAddFreeRange(uint64_t start, uint64_t size) {
VAddrRange* new_range = (VAddrRange*)KernelMemoryAlloc(sizeof(VAddrRange));
if (!new_range) {
PrintKernel("VMemAddFreeRange: WARNING - Could not allocate free range tracking\n");
return; // Range is lost but system continues
}
new_range->start = start;
new_range->size = size;
new_range->next = NULL;
// Find insertion point and coalesce
VAddrRange** current = &kernel_space.free_list;
VAddrRange* prev = NULL;
// Insert in sorted order by address
while (*current && (*current)->start < start) {
prev = *current;
current = &(*current)->next;
}
// Try to merge with previous range
if (prev && (prev->start + prev->size == start)) {
prev->size += size;
KernelMemoryFree(new_range);
new_range = prev;
} else {
// Insert new range
new_range->next = *current;
if (prev) {
prev->next = new_range;
} else {
kernel_space.free_list = new_range;
}
}
// Try to merge with next range
if (new_range->next && (new_range->start + new_range->size == new_range->next->start)) {
VAddrRange* to_free = new_range->next;
new_range->size += to_free->size;
new_range->next = to_free->next;
KernelMemoryFree(to_free);
}
}
3. Initialize Free List in VMemInit:
// Add to VMemInit function after existing initialization
void VMemInit(void) {
// ... existing initialization code ...
// Initialize free list
kernel_space.free_list = NULL;
// ... rest of existing code ...
}
4. Optional: Add Debugging Function:
void VMemPrintFreeRanges(void) {
irq_flags_t flags = SpinLockIrqSave(&vmem_lock);
PrintKernel("Free virtual address ranges:\n");
VAddrRange* current = kernel_space.free_list;
uint64_t total_free = 0;
while (current) {
PrintKernel(" 0x"); PrintKernelHex(current->start);
PrintKernel(" - 0x"); PrintKernelHex(current->start + current->size - 1);
PrintKernel(" (size: 0x"); PrintKernelHex(current->size); PrintKernel(")\n");
total_free += current->size;
current = current->next;
}
PrintKernel("Total free virtual space: 0x"); PrintKernelHex(total_free); PrintKernel("\n");
SpinUnlockIrqRestore(&vmem_lock, flags);
}
Benefits of This Solution:
- ✅ Virtual Address Space Reclamation - Freed ranges are immediately available for reuse
- ✅ Automatic Coalescing - Adjacent free ranges merge to prevent fragmentation
- ✅ Thread Safety - Uses existing vmem_lock for all operations
- ✅ Backward Compatibility - Doesn't break existing VMemAlloc/VMemFree calls
- ✅ Performance - Free list lookup is O(n) but typically small, linear allocation as fallback
- ✅ Robustness - Handles edge cases like allocation failures gracefully
This eliminates the "one wrong move and we all gone" scenario by ensuring your xHCI controller (and any other MMIO devices) can be properly initialized, unmapped, and re-initialized without exhausting virtual address space.
Want me to help you integrate this into your codebase, or would you like me to explain any specific part of the implementation?
ノ(◕ヮ◕)ノ*:・゚✧
🧬 Code Graph Analysis Results
- File: arch/x86_64/cpu/Cpu.c, lines 40-42
CpuFeatures* GetCpuFeatures(void) {
return &cpu_features;
}
✏️ Learnings added
Learnt from: assembler-0
PR: assembler-0/VoidFrame#90
File: kernel/memory/MemOps.c:103-124
Timestamp: 2025-08-16T05:28:34.904Z
Learning: In VoidFrame kernel, the virtual address space management uses a simple linear allocator (kernel_space.next_vaddr) without any reclamation mechanism. VMemFree and VMemUnmapMMIO clear page tables but don't free virtual address space for reuse, causing eventual virtual address space exhaustion. The solution requires implementing a free list with coalescing in the VirtAddrSpace structure and updating allocation/deallocation functions to reuse freed ranges.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
🧠 Learnings used
Learnt from: assembler-0
PR: assembler-0/VoidFrame#82
File: kernel/process/Process.h:0-0
Timestamp: 2025-08-14T02:11:58.963Z
Learning: In VoidFrame kernel's FastMemcpy implementation (kernel/memory/MemOps.c), memory corruption occurs under high load due to interrupts occurring between SSE2 movdqu load and store instructions, corrupting XMM register state. The fix requires disabling interrupts around SSE operations and adding memory barriers (sfence) to ensure stores complete atomically.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#80
File: README.md:7-8
Timestamp: 2025-08-13T08:43:54.750Z
Learning: User assembler-0 is working on VoidFrame kernel with a tight deadline (2-3 days before summer ends) and specifically requested runtime safety checks similar to __stack_chk_fail but for deadlocks and other kernel issues. The kernel already has extensive safety mechanisms including stack canaries, guard pages, heap integrity checks, and process security validation.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#82
File: kernel/process/Process.h:0-0
Timestamp: 2025-08-14T02:11:27.395Z
Learning: In VoidFrame kernel, FastMemcpy was using SSE2 movdqu instructions without preserving XMM registers, causing crashes when called from interrupt context (like the scheduler's FastSchedule function). The fix is to use only general-purpose registers with 64-bit unrolled copies to maintain performance while being interrupt-safe.
🧩 Analysis chain
@assembler-0, you're right to question this! Let me examine what PHYS_TO_VIRT actually does and compare the complete implementations to find the real issue.
🏁 Script executed:
#!/bin/bash
# Find the PHYS_TO_VIRT macro definition
echo "=== PHYS_TO_VIRT macro definition ==="
rg -nP '#define.*PHYS_TO_VIRT|PHYS_TO_VIRT.*=' --type=c
echo -e "\n=== Compare VMemUnmap vs VMemUnmapMMIO implementations side by side ==="
echo "--- VMemUnmap implementation ---"
rg -nP -A50 'int VMemUnmap\(uint64_t vaddr, uint64_t size\)' kernel/memory/VMem.c
echo -e "\n--- VMemUnmapMMIO implementation ---"
rg -nP -A50 'void VMemUnmapMMIO\(uint64_t vaddr, uint64_t size\)' kernel/memory/VMem.c
echo -e "\n=== Look for KERNEL_VIRTUAL_OFFSET definition ==="
rg -nP 'KERNEL_VIRTUAL_OFFSET.*=' --type=c
Length of output: 4938
You're absolutely right! PHYS_TO_VIRT does exactly the same thing - it adds KERNEL_VIRTUAL_OFFSET. And both functions use the identical page table access pattern. Let me get the complete VMemUnmapMMIO implementation to find the real issue:
🏁 Script executed:
#!/bin/bash
# Get the complete VMemUnmapMMIO implementation - need more lines
echo "=== Complete VMemUnmapMMIO implementation ==="
rg -nP -A80 'void VMemUnmapMMIO\(uint64_t vaddr, uint64_t size\)' kernel/memory/VMem.c
echo -e "\n=== Check for IDENTITY_MAP_SIZE definition ==="
rg -nP 'IDENTITY_MAP_SIZE.*=' --type=c
echo -e "\n=== Look for any differences in the page table clearing logic ==="
echo "VMemUnmap clearing logic:"
rg -nP -A5 'pt_virt\[pt_index\].*=' kernel/memory/VMem.c
echo -e "\n=== Check if there are any TLB flush differences ==="
rg -nP -C3 'VMemFlushTLB' kernel/memory/VMem.c
Length of output: 6079
@assembler-0, you're absolutely right about PHYS_TO_VIRT! I found the real issue now. Looking at the complete implementations, there's a critical difference in the locking and TLB flush pattern:
VMemUnmap (working):
for (uint64_t i = 0; i < num_pages; i++) {
irq_flags_t flags = SpinLockIrqSave(&vmem_lock); // Lock per page
// ... page table access ...
if (pt_virt[pt_index] & PAGE_PRESENT) {
pt_virt[pt_index] = 0;
kernel_space.used_pages--;
kernel_space.total_mapped -= PAGE_SIZE;
}
SpinUnlockIrqRestore(&vmem_lock, flags); // Unlock BEFORE TLB flush
VMemFlushTLBSingle(current_vaddr); // TLB flush OUTSIDE lock
}
VMemUnmapMMIO (crashing):
irq_flags_t irq_flags = SpinLockIrqSave(&vmem_lock); // Lock ONCE for entire operation
for (uint64_t i = 0; i < num_pages; i++) {
// ... page table access ...
if (pt_table[pt_index] & PAGE_PRESENT) {
pt_table[pt_index] = 0;
VMemFlushTLBSingle(current_vaddr); // TLB flush INSIDE lock!
}
}
SpinUnlockIrqRestore(&vmem_lock, irq_flags); // Unlock at end
The issue: VMemUnmapMMIO calls VMemFlushTLBSingle() while holding the vmem_lock. The invlpg instruction can be expensive, and holding a spinlock during TLB operations can cause deadlocks or timing issues.
Fix: Change VMemUnmapMMIO to match VMemUnmap's pattern:
void VMemUnmapMMIO(uint64_t vaddr, uint64_t size) {
// ... existing validation code ...
uint64_t num_pages = size / PAGE_SIZE;
uint64_t pml4_phys = VMemGetPML4PhysAddr();
for (uint64_t i = 0; i < num_pages; i++) {
uint64_t current_vaddr = vaddr + (i * PAGE_SIZE);
irq_flags_t flags = SpinLockIrqSave(&vmem_lock); // Lock per page
// ... existing page table walking code ...
if (pt_table[pt_index] & PAGE_PRESENT) {
pt_table[pt_index] = 0;
}
SpinUnlockIrqRestore(&vmem_lock, flags); // Unlock BEFORE TLB flush
VMemFlushTLBSingle(current_vaddr); // TLB flush OUTSIDE lock
}
}
This matches the proven working pattern from VMemUnmap and should fix the system death!
🧬 Code Graph Analysis Results
- File: arch/x86_64/cpu/Cpu.c, lines 40-42
CpuFeatures* GetCpuFeatures(void) {
return &cpu_features;
}
🧠 Learnings used
Learnt from: assembler-0
PR: assembler-0/VoidFrame#82
File: kernel/process/Process.h:0-0
Timestamp: 2025-08-14T02:11:58.963Z
Learning: In VoidFrame kernel's FastMemcpy implementation (kernel/memory/MemOps.c), memory corruption occurs under high load due to interrupts occurring between SSE2 movdqu load and store instructions, corrupting XMM register state. The fix requires disabling interrupts around SSE operations and adding memory barriers (sfence) to ensure stores complete atomically.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#79
File: drivers/PCI/PCI.c:154-154
Timestamp: 2025-08-13T07:19:55.274Z
Learning: VMemUnmapMMIO in VoidFrame kernel should use the same page table access pattern as the working VMemUnmap function: `uint64_t* pt_table = (uint64_t*)(pt_phys + KERNEL_VIRTUAL_OFFSET);` rather than PHYS_TO_VIRT macro or identity mapping checks.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#90
File: kernel/memory/MemOps.c:103-124
Timestamp: 2025-08-16T05:28:34.904Z
Learning: In VoidFrame kernel, the virtual address space management uses a simple linear allocator (kernel_space.next_vaddr) without any reclamation mechanism. VMemFree and VMemUnmapMMIO clear page tables but don't free virtual address space for reuse, causing eventual virtual address space exhaustion. The solution requires implementing a free list with coalescing in the VirtAddrSpace structure and updating allocation/deallocation functions to reuse freed ranges.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#53
File: kernel/core/Kernel.c:256-259
Timestamp: 2025-08-09T04:47:56.883Z
Learning: The VoidFrame kernel memory crash occurs when VIRT_ADDR_SPACE_START is set to 0xFFFF800000000000ULL instead of 0xFFFFFFFF80000000ULL, causing the heap to allocate 128TB away from where the kernel is mapped, resulting in page faults when accessing heap memory.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#82
File: kernel/process/Process.h:0-0
Timestamp: 2025-08-14T02:11:27.395Z
Learning: In VoidFrame kernel, FastMemcpy was using SSE2 movdqu instructions without preserving XMM registers, causing crashes when called from interrupt context (like the scheduler's FastSchedule function). The fix is to use only general-purpose registers with 64-bit unrolled copies to maintain performance while being interrupt-safe.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#47
File: kernel/core/Kernel.c:434-434
Timestamp: 2025-07-16T01:04:23.790Z
Learning: VMemAlloc() in VoidFrame kernel fails with error code -2 (VMEM_ERROR_INVALID_ADDR) due to virtual address space layout conflict where the heap start address (_bss_end + KERNEL_VIRTUAL_OFFSET + 1MB margin) exceeds VIRT_ADDR_SPACE_END (0xFFFFFFFF80000000ULL), causing VMemMap validation to fail.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#80
File: README.md:7-8
Timestamp: 2025-08-13T08:43:54.750Z
Learning: User assembler-0 is working on VoidFrame kernel with a tight deadline (2-3 days before summer ends) and specifically requested runtime safety checks similar to __stack_chk_fail but for deadlocks and other kernel issues. The kernel already has extensive safety mechanisms including stack canaries, guard pages, heap integrity checks, and process security validation.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#58
File: drivers/VesaBIOSExtension.c:77-85
Timestamp: 2025-08-09T10:31:44.439Z
Learning: In VoidFrame kernel, process stacks are allocated using VMemAllocWithGuards() which returns virtual addresses, but TerminateProcess incorrectly uses FreePage() (which expects physical addresses) to free them, causing "Page index out of bounds" errors. The correct function to use is VMemFreeWithGuards(proc->stack, STACK_SIZE).
Learnt from: assembler-0
PR: assembler-0/VoidFrame#90
File: kernel/memory/MemOps.c:103-124
Timestamp: 2025-08-16T05:27:10.739Z
Learning: In VoidFrame kernel VMemAlloc implementation, different virtual addresses on consecutive allocations after freeing is expected behavior because VMemAlloc uses a simple bump allocator (kernel_space.next_vaddr) that only moves forward and doesn't track or reuse freed virtual address ranges.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#57
File: kernel/memory/VMem.h:64-66
Timestamp: 2025-08-09T09:29:00.630Z
Learning: In the VoidFrame kernel, removing locks from KernelMemoryAlloc/KernelMemoryFree causes heap corruption where shell command strings (like "clear" = 0x61656C63) overwrite heap block magic values due to race conditions between allocation and free operations.