-
Notifications
You must be signed in to change notification settings - Fork 1
Development #139
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Development #139
Conversation
Caution Review failedThe pull request is closed. WalkthroughCMake now emits configure-time toolchain/generator/build-state messages and reorders includes; run target dependencies changed and runmin memory reduced. New macOS/Windows x64 toolchains added; Linux toolchain linker flags simplified. CPU features gain avx512f. New compile-time constants added. mm/MemOps implements SIMD-accelerated memcpy/memset/memcmp with public wrappers and restrict-qualified signatures. Changes
Sequence Diagram(s)sequenceDiagram
autonumber
participant Dev as Developer
participant CMake as CMake (configure)
participant TC as Toolchain File
participant RunCM as cmake/run.cmake
Dev->>CMake: cmake -S . -B build -DCMAKE_TOOLCHAIN_FILE=../cmake/toolchain/linux-x64.cmake -DCMAKE_BUILD_TYPE=Release -G Ninja
CMake->>TC: load toolchain (system, processor, compilers)
CMake->>RunCM: include run.cmake early
CMake-->>Dev: print toolchain/generator/build-type/source/bin/current-dir/host info
sequenceDiagram
autonumber
participant Caller as mem_* caller
participant Detect as CPU feature detection
participant Fast as FastMemcpy/FastMemset/FastMemcmp
participant Mem as Memory
Caller->>Fast: call memcpy/memset/memcmp(args)
Fast->>Detect: query features & size
alt AVX-512f & large & NT threshold
Fast->>Mem: prefetch → aligned NT stores → sfence
else AVX2
Fast->>Mem: AVX2 loads/stores → vzeroupper
else SSE2
Fast->>Mem: SSE2 loads/stores → vzeroupper
else scalar
Fast->>Mem: word/byte operations
end
Fast-->>Caller: return result
Estimated code review effort🎯 5 (Critical) | ⏱️ ~120 minutes Possibly related PRs
Poem
Pre-merge checks and finishing touches❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (1 passed)
📜 Recent review detailsConfiguration used: CodeRabbit UI Review profile: CHILL Plan: Pro 📒 Files selected for processing (2)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🧹 Nitpick comments (2)
CMakeLists.txt (1)
21-29
: Usemessage(STATUS ...)
for consistent CMake output formatting.The bare
message()
calls work but don't follow CMake conventions. Usingmessage(STATUS ...)
will:
- Prefix output with
--
for better readability- Integrate better with CMake's logging levels
- Align with the existing
message(STATUS ...)
used in lines 12, 16Apply this diff:
-message("CMake Toolchain: ${CMAKE_TOOLCHAIN_FILE}") -message("CMake Generator: ${CMAKE_GENERATOR}") -message("CMake Build Type: ${CMAKE_BUILD_TYPE}") -message("CMake Source Directory: ${CMAKE_SOURCE_DIR}") -message("CMake Binary Directory: ${CMAKE_BINARY_DIR}") -message("CMake Current Source Directory: ${CMAKE_CURRENT_SOURCE_DIR}") -message("CMake Current Binary Directory: ${CMAKE_CURRENT_BINARY_DIR}") -message("CMake Host System Name: ${CMAKE_HOST_SYSTEM_NAME}") -message("CMake Host System Processor: ${CMAKE_HOST_SYSTEM_PROCESSOR}") +message(STATUS "CMake Toolchain: ${CMAKE_TOOLCHAIN_FILE}") +message(STATUS "CMake Generator: ${CMAKE_GENERATOR}") +message(STATUS "CMake Build Type: ${CMAKE_BUILD_TYPE}") +message(STATUS "CMake Source Directory: ${CMAKE_SOURCE_DIR}") +message(STATUS "CMake Binary Directory: ${CMAKE_BINARY_DIR}") +message(STATUS "CMake Current Source Directory: ${CMAKE_CURRENT_SOURCE_DIR}") +message(STATUS "CMake Current Binary Directory: ${CMAKE_CURRENT_BINARY_DIR}") +message(STATUS "CMake Host System Name: ${CMAKE_HOST_SYSTEM_NAME}") +message(STATUS "CMake Host System Processor: ${CMAKE_HOST_SYSTEM_PROCESSOR}")README.md (1)
49-49
: Consider clarifying the toolchain selection with examples.The placeholder
<linux/windows/macos>-x64.cmake
might be unclear to new users. Consider adding a brief example or note.Apply this diff to add clarity:
-cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_TOOLCHAIN_FILE=cmake/toolchain/<linux/windows/macos>-x64.cmake -G Ninja +# Choose the appropriate toolchain file for your host system: +# Linux: -DCMAKE_TOOLCHAIN_FILE=cmake/toolchain/linux-x64.cmake +# macOS: -DCMAKE_TOOLCHAIN_FILE=cmake/toolchain/macos-x64.cmake +# Windows: -DCMAKE_TOOLCHAIN_FILE=cmake/toolchain/windows-x64.cmake +cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_TOOLCHAIN_FILE=cmake/toolchain/linux-x64.cmake -G Ninja
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (7)
CMakeLists.txt
(2 hunks)README.md
(1 hunks)cmake/run.cmake
(1 hunks)cmake/toolchain/linux-x64.cmake
(1 hunks)cmake/toolchain/macos-x64.cmake
(1 hunks)cmake/toolchain/windows-x64.cmake
(1 hunks)docs/CMAKE.md
(0 hunks)
💤 Files with no reviewable changes (1)
- docs/CMAKE.md
🔇 Additional comments (4)
cmake/toolchain/macos-x64.cmake (1)
1-9
: LGTM! Toolchain configuration is consistent with cross-platform approach.The macOS toolchain correctly sets up a bare-metal/kernel build environment using Generic system, x86_64 processor, and standard Clang toolchain with NASM assembler. This aligns well with the linux-x64.cmake and windows-x64.cmake approach.
cmake/run.cmake (1)
29-29
: LGTM! Explicit dependencies improve build correctness.Adding
img
andextra-img
to the run target's dependencies ensures disk images referenced in the QEMU command (VoidFrameDisk.img, SataDisk.img, VirtioDisk.img) are created before launching. This makes the dependency graph explicit and prevents potential build ordering issues.cmake/toolchain/linux-x64.cmake (1)
1-9
: LGTM! Simplified toolchain with target-level linker configuration.The removal of
CMAKE_EXE_LINKER_FLAGS_INIT
andCMAKE_SHARED_LINKER_FLAGS_INIT
is appropriate since linker configuration (-fuse-ld=lld
) is now handled at the target level in CMakeLists.txt (line 245-251). This provides better flexibility and aligns with modern CMake practices.The updated header comment and structure now match the macos-x64.cmake and windows-x64.cmake patterns.
CMakeLists.txt (1)
43-43
: Verify run.cmake variable definitions before include.
run.cmake references QEMU_IMG, MKFS_EXT2, MKFS_FAT, LLVM_OBJDUMP but noset
orfind_program
calls were found defining them; ensure these variables are initialized prior toinclude(run)
or adjust the include order.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 6
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
.github/workflows/main.yaml
(1 hunks)README.md
(1 hunks)arch/x86_64/features/x64.c
(1 hunks)arch/x86_64/features/x64.h
(1 hunks)cmake/run.cmake
(2 hunks)cmake/toolchain/windows-x64.cmake
(1 hunks)cmake/variable.cmake
(1 hunks)mm/MemOps.c
(4 hunks)
🚧 Files skipped from review as they are similar to previous changes (2)
- cmake/toolchain/windows-x64.cmake
- README.md
🧰 Additional context used
🧠 Learnings (1)
📚 Learning: 2025-09-01T01:10:22.123Z
Learnt from: assembler-0
PR: assembler-0/VoidFrame#103
File: README.md:9-9
Timestamp: 2025-09-01T01:10:22.123Z
Learning: The current VoidFrame Registers struct in arch/x86_64/cpu/Cpu.h only saves basic GPRs, segment registers, and exception context (interrupt_number, error_code, rip, cs, rflags, rsp, ss). It's missing critical registers for deep fault tracing: SIMD registers (XMM/YMM/ZMM), control registers (CR0/CR2/CR3/CR4), debug registers (DR0-DR7), and extended FPU state. The struct has a "DO NOT TOUCH" comment indicating it must match interrupt ASM stack layout.
Applied to files:
arch/x86_64/features/x64.h
🧬 Code graph analysis (2)
arch/x86_64/features/x64.c (1)
kernel/etc/Console.c (1)
PrintKernelWarning
(220-225)
mm/MemOps.c (2)
arch/x86_64/features/x64.c (1)
GetCpuFeatures
(102-104)include/Io.h (2)
save_irq_flags
(38-42)restore_irq_flags
(44-46)
🔇 Additional comments (3)
cmake/run.cmake (3)
29-29
: LGTM! Correct dependency declaration.Adding
img
andextra-img
dependencies ensures that all disk images used by therun
target (VoidFrameDisk.img, VirtioDisk.img, SataDisk.img) are created before QEMU launches. This prevents runtime errors.
39-39
: LGTM! Appropriate memory reduction for minimal target.Reducing memory from 4G to 1G aligns with the minimal testing purpose of the
runmin
target. 1G should be sufficient for basic functionality verification.
41-41
: LGTM! Correct dependency optimization.Removing the
img
andextra-img
dependencies is appropriate since therunmin
target doesn't use any disk images. This optimization eliminates unnecessary build steps for the minimal test configuration.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 4
♻️ Duplicate comments (2)
mm/MemOps.c (2)
238-248
:FastMemcpy
small-copy fallback dereferences unaligned pointers.Casting
d
/s
touint64_t*
oruint32_t*
without verifying alignment is UB and can crash under strict-alignment builds. Replace these with unaligned-safe copies—load into a temporary via__builtin_memcpy
and store back, or drop to the byte loop.
49-63
: Fix unaligned wide stores in the smallmemset
fast path.
*(uint64_t*)d
/*(uint32_t*)d
assume natural alignment. With arbitrary buffers those casts are undefined behavior and can fault on platforms with strict alignment. Use an unaligned-safe copy (e.g.,__builtin_memcpy
) or byte writes instead of typed stores.- if (size >= 8) { - uint64_t val64 = 0x0101010101010101ULL * val; - *(uint64_t*)d = val64; - *(uint64_t*)(d + size - 8) = val64; - return dest; - } + if (size >= 8) { + uint64_t val64 = 0x0101010101010101ULL * val; + __builtin_memcpy(d, &val64, 8); + __builtin_memcpy(d + size - 8, &val64, 8); + return dest; + } @@ - if (size >= 4) { - uint32_t val32 = 0x01010101U * val; - *(uint32_t*)d = val32; - *(uint32_t*)(d + size - 4) = val32; - return dest; - } + if (size >= 4) { + uint32_t val32 = 0x01010101U * val; + __builtin_memcpy(d, &val32, 4); + __builtin_memcpy(d + size - 4, &val32, 4); + return dest; + }
🧹 Nitpick comments (1)
arch/x86_64/features/x64.h (1)
65-68
: Document split full‐fence pattern and align attributes with cli()/sti()The
__full_mem_prot_init
/__full_mem_prot_end
pair implements a split open/close full‐fence + serialize, unlikecli()
/sti()
which each include the full fence sandwich andhot, flatten
attributes. Add a brief comment explaining the intended bracketing usage and consider addinghot, flatten
to these functions for consistency.
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (6)
.github/workflows/main.yaml
(1 hunks)CMakeLists.txt
(2 hunks)arch/x86_64/features/x64.h
(2 hunks)cmake/variable.cmake
(1 hunks)mm/MemOps.c
(4 hunks)mm/MemOps.h
(1 hunks)
🚧 Files skipped from review as they are similar to previous changes (1)
- CMakeLists.txt
🧰 Additional context used
🧠 Learnings (3)
📚 Learning: 2025-08-14T02:11:58.963Z
Learnt from: assembler-0
PR: assembler-0/VoidFrame#82
File: kernel/process/Process.h:0-0
Timestamp: 2025-08-14T02:11:58.963Z
Learning: In VoidFrame kernel's FastMemcpy implementation (kernel/memory/MemOps.c), memory corruption occurs under high load due to interrupts occurring between SSE2 movdqu load and store instructions, corrupting XMM register state. The fix requires disabling interrupts around SSE operations and adding memory barriers (sfence) to ensure stores complete atomically.
Applied to files:
mm/MemOps.c
📚 Learning: 2025-08-14T02:11:27.395Z
Learnt from: assembler-0
PR: assembler-0/VoidFrame#82
File: kernel/process/Process.h:0-0
Timestamp: 2025-08-14T02:11:27.395Z
Learning: In VoidFrame kernel, FastMemcpy was using SSE2 movdqu instructions without preserving XMM registers, causing crashes when called from interrupt context (like the scheduler's FastSchedule function). The fix is to use only general-purpose registers with 64-bit unrolled copies to maintain performance while being interrupt-safe.
Applied to files:
mm/MemOps.c
📚 Learning: 2025-09-01T01:10:22.123Z
Learnt from: assembler-0
PR: assembler-0/VoidFrame#103
File: README.md:9-9
Timestamp: 2025-09-01T01:10:22.123Z
Learning: The current VoidFrame Registers struct in arch/x86_64/cpu/Cpu.h only saves basic GPRs, segment registers, and exception context (interrupt_number, error_code, rip, cs, rflags, rsp, ss). It's missing critical registers for deep fault tracing: SIMD registers (XMM/YMM/ZMM), control registers (CR0/CR2/CR3/CR4), debug registers (DR0-DR7), and extended FPU state. The struct has a "DO NOT TOUCH" comment indicating it must match interrupt ASM stack layout.
Applied to files:
arch/x86_64/features/x64.h
🧬 Code graph analysis (3)
mm/MemOps.h (1)
mm/MemOps.c (7)
FastMemset
(23-208)FastMemcpy
(210-434)FastMemcmp
(526-645)FastZeroPage
(436-524)memset
(11-13)memcpy
(15-17)memcmp
(19-21)
mm/MemOps.c (2)
include/Io.h (2)
save_irq_flags
(38-42)restore_irq_flags
(44-46)arch/x86_64/features/x64.c (1)
GetCpuFeatures
(102-104)
arch/x86_64/features/x64.h (1)
include/Io.h (5)
always_inline
(48-55)always_inline
(57-64)always_inline
(67-71)always_inline
(74-78)always_inline
(80-84)
🔇 Additional comments (1)
arch/x86_64/features/x64.h (1)
70-74
: Memory barrier implementation is correct.The
__full_mem_prot_end
function correctly implements the closing memory barrier with serialization. Together with__full_mem_prot_init
, this pair provides a complete fence sandwich around protected code sections.The implementation properly orders:
- Memory fences (ensuring all memory operations are visible)
- Compiler barrier (
__sync_synchronize()
)- Instruction serialization (
__builtin_ia32_serialize()
- ensures all prior instructions complete)Note: The attribute and documentation concerns apply to both functions and were mentioned in the review of
__full_mem_prot_init
.
Summary by CodeRabbit
New Features
Documentation
Chores
Bug Fixes
Refactor