-
Notifications
You must be signed in to change notification settings - Fork 1
Development #69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Development #69
Conversation
Caution Review failedThe pull request is closed. WalkthroughAdds a reusable PCI bus scanner with callback-based enumeration and a PciFindDevice API. Introduces packed Ethernet/ARP packet structs and an RTL8139 driver (init, TX, device struct). Integrates NIC init into kernel boot, adds an “arptest” shell command to send a raw ARP request, and updates the build to compile the new driver and include headers. Changes
Sequence Diagram(s)sequenceDiagram
participant Caller
participant PCI as PCI Core
participant CB as Callback
Caller->>PCI: PciEnumerate() / PciFindDevice()
PCI->>PCI: PciScanBus(buses 0–255, devs 0–31, funcs per header_type)
PCI->>CB: invoke callback(device) for each present device
alt FindDevice mode
CB-->>PCI: match → set flag to stop
PCI-->>Caller: return found device / status
else Enumerate mode
PCI-->>Caller: enumeration complete
end
sequenceDiagram
participant Kernel
participant PCI as PCI Core
participant RTL as RTL8139 Driver
participant NIC as RTL8139 HW
Kernel->>PCI: PciFindDevice(0x10EC,0x8139)
PCI-->>Kernel: PciDevice (if found)
Kernel->>RTL: Rtl8139_Init()
RTL->>NIC: Read BAR0, reset, enable RX/TX
RTL->>NIC: Read MAC, setup RX/TX buffers
RTL-->>Kernel: device ready
sequenceDiagram
participant User
participant Shell
participant RTL as RTL8139 Driver
participant NIC as NIC
User->>Shell: arptest
Shell->>Shell: build Ethernet+ARP frame
Shell->>RTL: Rtl8139_SendPacket(frame, len)
RTL->>NIC: copy to TX buffer, write TX addr/len, kick
NIC-->>RTL: TX complete (ISR)
Estimated code review effort🎯 4 (Complex) | ⏱️ ~45 minutes Possibly related PRs
Poem
📜 Recent review detailsConfiguration used: CodeRabbit UI 📒 Files selected for processing (6)
✨ Finishing Touches
🧪 Generate unit tests
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. 🪧 TipsChatThere are 3 ways to chat with CodeRabbit:
SupportNeed help? Create a ticket on our support page for assistance with any issues or questions. CodeRabbit Commands (Invoked using PR comments)
Other keywords and placeholders
CodeRabbit Configuration File (
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 10
🧹 Nitpick comments (11)
drivers/ethernet/Packet.h (2)
1-2
: Use a more specific include guard to avoid collisionsPACKET_H is very generic and likely to collide. Prefer a project-scoped guard (e.g., VOIDFRAME_ETHERNET_PACKET_H).
-#ifndef PACKET_H -#define PACKET_H +#ifndef VOIDFRAME_ETHERNET_PACKET_H +#define VOIDFRAME_ETHERNET_PACKET_H ... -#endif // PACKET_H +#endif // VOIDFRAME_ETHERNET_PACKET_H
10-15
: Optional: define well-known constants for readabilityDefining ETHERTYPE_ARP, ETHERTYPE_IPV4, ARP_HTYPE_ETHERNET, ARP_OPCODE_REQUEST improves callsite clarity.
+#define ETHERTYPE_ARP 0x0806 +#define ETHERTYPE_IPV4 0x0800 +#define ARP_HTYPE_ETHERNET 0x0001 +#define ARP_OPCODE_REQUEST 0x0001Also applies to: 18-28
drivers/PCI/PCI.h (3)
1-2
: Use a project-scoped include guardPCI_H is too generic and prone to collision in larger codebases. Prefer VOIDFRAME_PCI_H.
-#ifndef PCI_H -#define PCI_H +#ifndef VOIDFRAME_PCI_H +#define VOIDFRAME_PCI_H ... -#endif // PCI_H +#endif // VOIDFRAME_PCI_H
4-4
: Prefer angle brackets for std headers for clarityUse <stdint.h> to denote system header (or your shim) consistently; the build already adds -I include for freestanding use.
-#include "stdint.h" // Or your equivalent for standard types +#include <stdint.h> // Or your equivalent for standard types
19-21
: Prototype style: explicitly take void for functions without parametersMinor C-style improvement for clarity and to prevent accidental K&R-style mismatches.
-void PciEnumerate(); +void PciEnumerate(void);kernel/etc/Shell.c (4)
17-18
: Include path consistencyElsewhere (Kernel.c) we use "ethernet/RTL8139.h". Consider using "ethernet/Packet.h" and "ethernet/RTL8139.h" here for consistency, or rely solely on the added include dirs but keep paths uniform.
-#include "Packet.h" -#include "RTL8139.h" +#include "ethernet/Packet.h" +#include "ethernet/RTL8139.h"
78-80
: Zero-initialize the packet to avoid stray bytesEven with packed structs and all fields set explicitly, zeroing the whole buffer is safer and future-proof if the layout changes.
void ArpRequestTestProcess() { - FullArpPacket packet; + FullArpPacket packet; + FastMemset(&packet, 0, sizeof(packet));
101-106
: Minor readability: use named constants for protocol types/opcodesUsing the constants in Packet.h makes intent obvious at callsites.
- packet.arp.hardware_type = HTONS(1); // 1 = Ethernet - packet.arp.protocol_type = HTONS(0x0800); // 0x0800 = IPv4 + packet.arp.hardware_type = HTONS(ARP_HTYPE_ETHERNET); + packet.arp.protocol_type = HTONS(ETHERTYPE_IPV4); ... - packet.eth.ethertype = HTONS(0x0806); + packet.eth.ethertype = HTONS(ETHERTYPE_ARP);Also applies to: 96-98
127-129
: Use sizeof(packet) to prevent size mismatch bugsKeeps packet length tied to the actual struct size, avoiding drift if the struct changes.
- uint32_t packet_size = sizeof(EthernetHeader) + sizeof(ArpPacket); + uint32_t packet_size = sizeof(packet); // or sizeof(FullArpPacket)drivers/PCI/PCI.c (2)
33-33
: Use appropriate type for bus iteration.The bus variable can reach 256 in the loop condition, but PCI bus numbers are 8-bit values (0-255). Use uint8_t and adjust the loop.
- for (int bus = 0; bus < 256; bus++) { + for (uint16_t bus = 0; bus <= 255; bus++) {
83-90
: Early termination optimization for device search.Once the device is found, the callback continues to be called for remaining devices. Consider adding a mechanism to stop the scan early.
Modify PciScanBus to return a status that allows early termination:
-static void PciScanBus(PciDeviceCallback callback) { +static int PciScanBus(PciDeviceCallback callback) { for (int bus = 0; bus < 256; bus++) { for (int device = 0; device < 32; device++) { // ... existing code ... - callback(pci_dev); + if (callback(pci_dev) != 0) { + return 1; // Early termination requested + } } } + return 0; }
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (8)
drivers/PCI/PCI.c
(2 hunks)drivers/PCI/PCI.h
(1 hunks)drivers/ethernet/Packet.h
(1 hunks)drivers/ethernet/RTL8139.c
(1 hunks)drivers/ethernet/RTL8139.h
(1 hunks)kernel/core/Kernel.c
(2 hunks)kernel/etc/Shell.c
(5 hunks)meson.build
(2 hunks)
🧰 Additional context used
🧬 Code Graph Analysis (6)
drivers/ethernet/RTL8139.c (5)
kernel/etc/Console.c (3)
PrintKernel
(173-189)PrintKernelHex
(218-241)PrintKernelInt
(243-267)drivers/PCI/PCI.c (2)
PciFindDevice
(93-108)PciConfigReadDWord
(14-25)include/Io.h (3)
outb
(6-8)inb
(10-14)outl
(26-28)kernel/memory/KernelHeap.c (1)
KernelMemoryAlloc
(185-227)kernel/memory/MemOps.c (1)
FastMemcpy
(74-97)
kernel/core/Kernel.c (2)
kernel/etc/Console.c (2)
PrintKernel
(173-189)PrintKernelSuccess
(199-203)drivers/ethernet/RTL8139.c (1)
Rtl8139_Init
(12-67)
kernel/etc/Shell.c (5)
drivers/ethernet/RTL8139.c (2)
GetRtl8139Device
(97-99)Rtl8139_SendPacket
(69-95)kernel/memory/MemOps.c (2)
FastMemset
(41-72)FastMemcpy
(74-97)kernel/etc/Console.c (1)
PrintKernel
(173-189)kernel/etc/StringOps.c (1)
FastStrCmp
(18-28)kernel/process/Process.c (1)
CreateProcess
(1064-1066)
drivers/PCI/PCI.h (1)
drivers/PCI/PCI.c (3)
PciEnumerate
(77-81)PciFindDevice
(93-108)PciConfigReadDWord
(14-25)
drivers/PCI/PCI.c (1)
kernel/etc/Console.c (2)
PrintKernel
(173-189)PrintKernelHex
(218-241)
drivers/ethernet/RTL8139.h (1)
drivers/ethernet/RTL8139.c (3)
Rtl8139_Init
(12-67)Rtl8139_SendPacket
(69-95)GetRtl8139Device
(97-99)
🔇 Additional comments (4)
meson.build (2)
48-50
: Include dirs look correctAdding drivers/PCI and drivers/ethernet makes the new headers reachable across the tree.
97-97
: Driver source added to buildRTL8139.c is properly included; matches run-time device '-device rtl8139'.
kernel/etc/Shell.c (1)
171-184
: Shell command wiring and help entry look goodHelp text and command handler for arptest are consistent; process creation is appropriate for isolation.
Also applies to: 275-279
drivers/ethernet/RTL8139.h (1)
1-60
: Well-structured header with clear register definitions.The header file is well-organized with clear register definitions, command bits, and a proper device structure. The API surface is minimal and appropriate for a network driver.
// Configure Rx register: accept broadcast, multicast, and packets for our MAC (AB+AM+APM) | ||
// and wrap packets that are too long. | ||
outl(rtl_device.io_base + REG_RX_CONFIG, 0x0F); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
RX configuration may be too restrictive.
The RX config value of 0x0F only accepts specific packet types. For a more robust configuration, consider using:
- Accept Broadcast (AB) = bit 3
- Accept Multicast (AM) = bit 2
- Accept Physical Match (APM) = bit 1
- Accept All Physical (AAP) = bit 0
- Wrap = bit 7
- // Configure Rx register: accept broadcast, multicast, and packets for our MAC (AB+AM+APM)
- // and wrap packets that are too long.
- outl(rtl_device.io_base + REG_RX_CONFIG, 0x0F);
+ // Configure RX: Accept Broadcast, Multicast, Physical Match, and enable wrap
+ // Bits: Wrap(7) | AB(3) | AM(2) | APM(1)
+ outl(rtl_device.io_base + REG_RX_CONFIG, (1 << 7) | (1 << 3) | (1 << 2) | (1 << 1));
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
// Configure Rx register: accept broadcast, multicast, and packets for our MAC (AB+AM+APM) | |
// and wrap packets that are too long. | |
outl(rtl_device.io_base + REG_RX_CONFIG, 0x0F); | |
// Configure RX: Accept Broadcast, Multicast, Physical Match, and enable wrap | |
// Bits: Wrap(7) | AB(3) | AM(2) | APM(1) | |
outl(rtl_device.io_base + REG_RX_CONFIG, (1 << 7) | (1 << 3) | (1 << 2) | (1 << 1)); |
🤖 Prompt for AI Agents
In drivers/ethernet/RTL8139.c around lines 62 to 65, the RX config is set to
0x0F which omits the Wrap and Accept All Physical (AAP) bits; update the written
value to include Wrap (bit 7) plus AB (bit 3), AM (bit 2), APM (bit 1) and AAP
(bit 0) so the register accepts broadcast, multicast, physical match,
all-physical and wraps long packets — i.e. set the combined bitmask
(Wrap|AB|AM|APM|AAP) instead of 0x0F.
// Get the I/O and memory addresses for the current transmit descriptor | ||
int tx_index = rtl_device.current_tx_buffer; | ||
uint32_t tx_addr_reg = REG_TX_ADDR_0 + (tx_index * 4); | ||
uint32_t tx_stat_reg = REG_TX_STATUS_0 + (tx_index * 4); | ||
uint8_t* tx_buffer = rtl_device.tx_buffers[tx_index]; | ||
|
||
// Copy the packet data to our DMA-safe buffer | ||
FastMemcpy(tx_buffer, data, len); | ||
|
||
// Tell the card where the data is (physical address) | ||
uint32_t tx_phys_addr = (uint32_t)tx_buffer; | ||
outl(rtl_device.io_base + tx_addr_reg, tx_phys_addr); | ||
|
||
// Tell the card the length of the data and start sending! | ||
outl(rtl_device.io_base + tx_stat_reg, len); | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Check if previous transmission completed before reusing TX buffer.
The code doesn't verify if the previous transmission using this buffer has completed. This could lead to corrupting an in-flight packet.
Add a check for the TX_HOST_OWNS bit in the status register before using the buffer:
// Get the I/O and memory addresses for the current transmit descriptor
int tx_index = rtl_device.current_tx_buffer;
uint32_t tx_addr_reg = REG_TX_ADDR_0 + (tx_index * 4);
uint32_t tx_stat_reg = REG_TX_STATUS_0 + (tx_index * 4);
+
+ // Check if the buffer is available (bit 13 = TX_HOST_OWNS)
+ uint32_t tx_status = inl(rtl_device.io_base + tx_stat_reg);
+ if (!(tx_status & (1 << 13))) {
+ PrintKernel("TX buffer not available, transmission in progress.\n");
+ return;
+ }
+
uint8_t* tx_buffer = rtl_device.tx_buffers[tx_index];
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
// Get the I/O and memory addresses for the current transmit descriptor | |
int tx_index = rtl_device.current_tx_buffer; | |
uint32_t tx_addr_reg = REG_TX_ADDR_0 + (tx_index * 4); | |
uint32_t tx_stat_reg = REG_TX_STATUS_0 + (tx_index * 4); | |
uint8_t* tx_buffer = rtl_device.tx_buffers[tx_index]; | |
// Copy the packet data to our DMA-safe buffer | |
FastMemcpy(tx_buffer, data, len); | |
// Tell the card where the data is (physical address) | |
uint32_t tx_phys_addr = (uint32_t)tx_buffer; | |
outl(rtl_device.io_base + tx_addr_reg, tx_phys_addr); | |
// Tell the card the length of the data and start sending! | |
outl(rtl_device.io_base + tx_stat_reg, len); | |
// Get the I/O and memory addresses for the current transmit descriptor | |
int tx_index = rtl_device.current_tx_buffer; | |
uint32_t tx_addr_reg = REG_TX_ADDR_0 + (tx_index * 4); | |
uint32_t tx_stat_reg = REG_TX_STATUS_0 + (tx_index * 4); | |
// Check if the buffer is available (bit 13 = TX_HOST_OWNS) | |
uint32_t tx_status = inl(rtl_device.io_base + tx_stat_reg); | |
if (!(tx_status & (1 << 13))) { | |
PrintKernel("TX buffer not available, transmission in progress.\n"); | |
return; | |
} | |
uint8_t* tx_buffer = rtl_device.tx_buffers[tx_index]; | |
// Copy the packet data to our DMA-safe buffer | |
FastMemcpy(tx_buffer, data, len); | |
// Tell the card where the data is (physical address) | |
uint32_t tx_phys_addr = (uint32_t)tx_buffer; | |
outl(rtl_device.io_base + tx_addr_reg, tx_phys_addr); | |
// Tell the card the length of the data and start sending! | |
outl(rtl_device.io_base + tx_stat_reg, len); |
🤖 Prompt for AI Agents
In drivers/ethernet/RTL8139.c around lines 75 to 90, the code writes a new
packet into the current TX buffer without verifying the previous transmission
finished; read the TX status register (REG_TX_STATUS_0 + tx_index*4) and check
the TX_HOST_OWNS (or equivalent "buffer in use") bit before copying/writing the
new packet, and if the bit indicates the NIC still owns the descriptor either
wait/poll with a timeout (or return an error) to avoid overwriting an in-flight
packet; only proceed with FastMemcpy and the outl() writes when the status shows
the host owns the buffer, and handle timeout/error paths (restore
state/increment index) appropriately.
drivers/PCI/PCI.c
Outdated
static PciDevice found_device; | ||
static int device_found_flag; | ||
static uint16_t target_vendor_id; | ||
static uint16_t target_device_id; | ||
// Callback function pointer type | ||
typedef void (*PciDeviceCallback)(PciDevice device); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Move typedef to header and address thread-safety concerns.
The PciDeviceCallback
typedef should be in the header file (PCI.h) since it's part of the API contract. Additionally, the static globals used for device searching are not thread-safe.
Move the typedef to PCI.h and consider passing search context through the callback or protecting with locks:
-static PciDevice found_device;
-static int device_found_flag;
-static uint16_t target_vendor_id;
-static uint16_t target_device_id;
-// Callback function pointer type
-typedef void (*PciDeviceCallback)(PciDevice device);
+// Move typedef to PCI.h
+
+// Search context structure
+typedef struct {
+ uint16_t vendor_id;
+ uint16_t device_id;
+ PciDevice* out_device;
+ int found;
+} PciSearchContext;
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
static PciDevice found_device; | |
static int device_found_flag; | |
static uint16_t target_vendor_id; | |
static uint16_t target_device_id; | |
// Callback function pointer type | |
typedef void (*PciDeviceCallback)(PciDevice device); | |
// Move typedef to PCI.h | |
// Search context structure | |
typedef struct { | |
uint16_t vendor_id; | |
uint16_t device_id; | |
PciDevice* out_device; | |
int found; | |
} PciSearchContext; |
🤖 Prompt for AI Agents
In drivers/PCI/PCI.c around lines 7 to 12, the PciDeviceCallback typedef belongs
in the public header and the static globals (found_device, device_found_flag,
target_vendor_id, target_device_id) make the search logic not thread-safe; move
the typedef declaration into PCI.h and update callers to include that header,
then eliminate the static globals by making search state local to the search
function and passing any needed context through the callback (e.g., change the
callback signature to accept a void* context or a search-state struct), or
alternatively protect shared state with a mutex if true shared state is
required; ensure all uses are updated to the new callback signature and that
concurrent searches use independent state or proper locking.
PrintKernel("[INFO] Initializing RTL8139 Driver...\n"); | ||
Rtl8139_Init(); | ||
PrintKernelSuccess("[SYSTEM] RTL8139 Driver initialized\n"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
Avoid unconditional “Driver initialized” message when device may be absent
Rtl8139_Init() returns void and, per driver docs, may early-return if the NIC is not found. Logging success unconditionally is misleading.
- PrintKernel("[INFO] Initializing RTL8139 Driver...\n");
- Rtl8139_Init();
- PrintKernelSuccess("[SYSTEM] RTL8139 Driver initialized\n");
+ PrintKernel("[INFO] Probing RTL8139 device...\n");
+ Rtl8139_Init(); // TODO: change to return status (0 on success), or expose Rtl8139_IsPresent()
+ PrintKernelSuccess("[SYSTEM] RTL8139 probe complete (see driver logs)\n");
Alternatively, change Rtl8139_Init to return an int and gate the success log accordingly.
📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
PrintKernel("[INFO] Initializing RTL8139 Driver...\n"); | |
Rtl8139_Init(); | |
PrintKernelSuccess("[SYSTEM] RTL8139 Driver initialized\n"); | |
PrintKernel("[INFO] Probing RTL8139 device...\n"); | |
Rtl8139_Init(); // TODO: change to return status (0 on success), or expose Rtl8139_IsPresent() | |
PrintKernelSuccess("[SYSTEM] RTL8139 probe complete (see driver logs)\n"); |
🤖 Prompt for AI Agents
In kernel/core/Kernel.c around lines 385-387, the code unconditionally logs
"[SYSTEM] RTL8139 Driver initialized" after calling Rtl8139_Init(), but the init
function may early-return when the NIC is absent; change Rtl8139_Init() to
return an int/bool status (0/false on not found or failure, 1/true on success),
update its prototype in the RTL8139 header and all callers, then call it like
`if (Rtl8139_Init()) PrintKernelSuccess(...)` otherwise log or skip the success
message so the success log is only emitted when the driver actually initialized.
const Rtl8139Device* nic = GetRtl8139Device(); | ||
if (!nic) { | ||
// Handle error: NIC not initialized | ||
return; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🛠️ Refactor suggestion
GetRtl8139Device() likely never returns NULL; add a real readiness check
Per provided snippet, GetRtl8139Device() returns the address of a static device, so this NULL check won’t fire. If the NIC isn’t present/initialized, SendPacket may touch invalid state. Expose and check a driver “present/initialized” flag or a Rtl8139_IsReady() API.
- const Rtl8139Device* nic = GetRtl8139Device();
- if (!nic) {
- // Handle error: NIC not initialized
- return;
- }
+ const Rtl8139Device* nic = GetRtl8139Device();
+ if (!Rtl8139_IsReady()) { // new API: returns true when device was found & initialized
+ PrintKernel("[NET] RTL8139 not ready; aborting ARP test.\n");
+ return;
+ }
If you prefer not to change the public API, export a const boolean (e.g., Rtl8139Present) or a getter.
Committable suggestion skipped: line range outside the PR's diff.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup ok, also we have the same problem again, if a process is killed normally, there should be 3 message
[SYSTEM] Process PID 4 exited normally
[SYSTEM] Terminating PID 4 Reason: 0
and clean up terminated process, but theres a chance that only 2 will shown and the kernel will crash, hmm interesting, could this be that the kernel pics up the zombie process before clean up terminated process gets called, this problem is literally so annoying
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find process termination and cleanup related code
echo "=== Searching for process termination messages ==="
rg "Process PID.*exited normally" -A 5 -B 5
echo -e "\n=== Searching for Terminating PID messages ==="
rg "Terminating PID.*Reason:" -A 5 -B 5
echo -e "\n=== Searching for cleanup terminated process ==="
rg -i "clean.*terminated.*process" -A 5 -B 5
echo -e "\n=== Looking for zombie process handling ==="
rg -i "zombie" -A 3 -B 3
echo -e "\n=== Finding process states/status definitions ==="
rg "PROCESS_STATE_|ProcessState|TERMINATED|ZOMBIE" -A 2 -B 2
Length of output: 12437
🏁 Script executed:
#!/bin/bash
# Look for process exit and termination functions
echo "=== Finding TerminateProcess function ==="
ast-grep --pattern 'void TerminateProcess($$$) {
$$$
}'
echo -e "\n=== Finding ExitProcess or process exit handling ==="
rg "ExitProcess|ProcessExit|exit.*normally" -A 10 -B 5
echo -e "\n=== Looking for termination reason codes ==="
rg "TERM_KILLED|TERM_NORMAL|TerminationReason" -A 2 -B 2
echo -e "\n=== Finding where 'Terminating PID' is printed ==="
rg "Terminating" -A 5 -B 5
echo -e "\n=== Check scheduler's process selection logic ==="
rg "FastSchedule|SelectNextProcess" -A 20 -B 5
Length of output: 18766
🏁 Script executed:
#!/bin/bash
# Look at termination queue implementation
echo "=== Finding termination queue implementation ==="
rg "AddToTerminationQueueAtomic|RemoveFromTerminationQueueAtomic" -A 10 -B 5
echo -e "\n=== Looking at process selection in scheduler ==="
ast-grep --pattern 'if ($_ == PROC_DYING || $_ == PROC_ZOMBIE || $_) {
$$$
}'
echo -e "\n=== Finding where current_running is set ==="
rg "MLFQscheduler\.current_running\s*=" -A 5 -B 5
echo -e "\n=== Check atomic operations ==="
rg "AtomicCmpxchg|AtomicRead|AtomicInc" -A 2 -B 2
echo -e "\n=== Look for process slot validation ==="
rg "slot >= MAX_PROCESSES|slot < MAX_PROCESSES" -A 3 -B 3
Length of output: 11106
@assembler-0, I've analyzed the race condition in your process termination flow. The issue is a critical ordering problem in TerminateProcess()
:
The Race Condition
In TerminateProcess()
(lines 262-263 of Process.c), the operations occur in this order:
AddToTerminationQueueAtomic(slot); // Line 262
proc->state = PROC_ZOMBIE; // Line 263
This creates a window where the process is in the termination queue but not yet marked as ZOMBIE. Here's the problematic sequence:
- CPU 1: Process exits normally → prints "Process PID X exited normally"
- CPU 1: Calls
TerminateProcess()
→ prints "Terminating PID X Reason: 0" - CPU 1: Adds process to termination queue (but state is still PROC_DYING)
- CPU 2:
CleanupTerminatedProcesses()
picks it up from queue - CPU 2: Checks
proc->state != PROC_ZOMBIE
and finds it's still PROC_DYING - CPU 2: Prints warning but continues cleanup anyway, clearing the process memory
- CPU 1: Tries to set
proc->state = PROC_ZOMBIE
on already-cleared memory - Crash: Scheduler might access the corrupted/cleared process structure
Suggested Fix
Reverse the order and add proper memory barriers:
// In TerminateProcess(), around line 262-263:
proc->state = PROC_ZOMBIE; // Set state FIRST
__sync_synchronize(); // Full memory barrier
AddToTerminationQueueAtomic(slot); // Then add to queue
// In CleanupTerminatedProcesses(), add stricter validation:
if (proc->state != PROC_ZOMBIE) {
PrintKernelWarning("[SYSTEM] Cleanup found non-zombie process...");
// Don't continue! Re-queue or skip this process
continue; // Skip cleanup instead of proceeding
}
Additionally, RemoveFromTerminationQueueAtomic()
lacks proper atomic operations - it should use atomic reads for the queue head.
This explains why you sometimes see only 2 messages before crashing - the cleanup happens before the zombie state is set, leading to memory corruption.
🧬 Code Graph Analysis Results
- File: kernel/process/Process.c, Lines 341-343
void KillProcess(uint32_t pid) {
TerminateProcess(pid, TERM_KILLED, 1);
}
- File: include/stdlib.h, Lines 18-44
static inline int atoi(const char* str) {
int result = 0;
int sign = 1;
int i = 0;
// Skip leading whitespace
while (str[i] == ' ' || str[i] == '\t' || str[i] == '\n' ||
str[i] == '\r' || str[i] == '\f' || str[i] == '\v') {
i++;
}
// Handle sign
if (str[i] == '-') {
sign = -1;
i++;
} else if (str[i] == '+') {
i++;
}
// Convert digits
while (str[i] >= '0' && str[i] <= '9') {
result = result * 10 + (str[i] - '0');
i++;
}
return result * sign;
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yeah its the same, still stucks after that, also when it works the [SYSTEM] Process PID 8 exited normaly, doesnt glow yellow, but when it dont, it glows, hmm very interesting
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
# Find where "exited normally" is printed and what function is used
echo "=== Finding 'exited normally' print statements ==="
rg "exited normally" -B 2 -A 2
echo -e "\n=== Looking for PrintKernelSuccess and PrintKernelWarning implementations ==="
rg "PrintKernelSuccess|PrintKernelWarning" --type c -A 3 -B 1 | head -50
echo -e "\n=== Finding console color codes or attributes ==="
rg "COLOR_YELLOW|YELLOW|0x0E|14" --type c -A 2 -B 2 | grep -E "(Console|Print|Color)" | head -30
echo -e "\n=== Check if there are multiple exit paths that print similar messages ==="
rg "Process PID|exited|terminated" --type c -B 3 -A 3 | grep -E "(Print|exit|Process PID)"
echo -e "\n=== Look for the actual exit function implementation ==="
ast-grep --pattern 'void $_($_) {
$$$
PrintKernelSuccess($$$exited normally$$$)
$$$
}'
echo -e "\n=== Check for race conditions in print functions ==="
rg "static.*current_color|console.*lock|print.*mutex" -A 3 -B 3
Length of output: 5195
🏁 Script executed:
#!/bin/bash
# Look at the exact termination sequence and PrintKernelWarning implementation
echo "=== Finding PrintKernelWarning implementation ==="
ast-grep --pattern 'void PrintKernelWarning($_) {
$$$
}'
echo -e "\n=== Finding where console color is stored ==="
rg "console\.color|current.*color|ConsoleSetColor" -B 3 -A 3
echo -e "\n=== Looking at the exact ExitProcess implementation ==="
ast-grep --pattern 'void ExitProcess($_) {
$$$
}'
echo -e "\n=== Finding TerminateProcess full implementation ==="
rg -A 30 "void TerminateProcess\("
echo -e "\n=== Check for unsafe memory operations in cleanup ==="
rg "FastMemset.*proc|memset.*process|bzero.*proc" -B 2 -A 2
echo -e "\n=== Look for the console structure definition ==="
rg "typedef struct.*[Cc]onsole|struct.*console.*{" -A 10
Length of output: 13982
@assembler-0, I found the critical bug! The issue is in CleanupTerminatedProcesses()
. Even though it detects non-zombie processes, it still continues with the cleanup:
// Line from CleanupTerminatedProcesses (based on earlier output)
if (proc->state != PROC_ZOMBIE) {
PrintKernelWarning("[SYSTEM] Cleanup found non-zombie process...");
// BUG: It continues anyway instead of skipping!
}
// Later in the same function:
FastMemset(proc, 0, sizeof(Process)); // This still happens!
The color inconsistency you're seeing (yellow when failing, no glow when working) is a symptom of memory corruption. Here's what's happening:
-
Race sequence:
- Process exits → prints "exited normally" with
PrintKernelWarning()
(should be yellow) TerminateProcess()
adds to queue (but state not yet ZOMBIE)CleanupTerminatedProcesses()
picks it up, sees it's not ZOMBIE, prints warning- BUG: Continues to
FastMemset(proc, 0, sizeof(Process))
anyway! - This corrupts memory while other operations are in flight
- Process exits → prints "exited normally" with
-
Why the color changes:
- When no race: Message prints yellow correctly
- When race occurs: Memory corruption affects console state during printing
Fix needed:
// In CleanupTerminatedProcesses(), change this:
if (proc->state != PROC_ZOMBIE) {
PrintKernelWarning("[SYSTEM] Cleanup found non-zombie process...");
continue; // MUST skip this iteration, don't process!
}
// And in TerminateProcess(), reverse the order:
proc->state = PROC_ZOMBIE; // Set state FIRST
__sync_synchronize(); // Memory barrier
AddToTerminationQueueAtomic(slot); // Then add to queue
The continue
statement is crucial - without it, the cleanup proceeds even for non-zombie processes, causing the crashes and weird behavior you're seeing.
🧬 Code Graph Analysis Results
-
File: drivers/PS2.c (lines 101-108)
- GetChar function
- Code:
char GetChar(void) { if (buffer_count == 0) return 0; char c = input_buffer[buffer_head]; buffer_head = (buffer_head + 1) % 256; buffer_count--; return c; }
-
File: drivers/PS2.c (lines 110-112)
- HasInput function
- Code:
int HasInput(void) { return buffer_count > 0; }
-
File: kernel/etc/Console.c (lines 173-189)
- PrintKernel function
- Code:
void PrintKernel(const char* str) { if (!str) return; SpinLock(&lock); if (use_vbe) { VBEConsolePrint(str); } else { const uint8_t original_color = console.color; for (const char* p = str; *p; p++) { ConsolePutchar(*p); } console.color = original_color; } SpinUnlock(&lock); SerialWrite(str); }
-
File: kernel/etc/Console.c (lines 199-203)
- PrintKernelSuccess function
- Code:
void PrintKernelSuccess(const char* str) { ConsoleSetColor(VGA_COLOR_WHITE); PrintKernel(str); ConsoleSetColor(VGA_COLOR_DEFAULT); }
-
File: kernel/process/Process.c (lines 908-919)
- Yield function
- Code:
void Yield() { Process* current = GetCurrentProcess(); if (current) { // A process that yields is ready to run again, just giving up its timeslice. // Setting it to PROC_BLOCKED was incorrect as there was no corresponding unblock mechanism. current->state = PROC_READY; } RequestSchedule(); // This instruction halts the CPU until the next interrupt (e.g., the timer), // which will then trigger the scheduler. __asm__ __volatile__("hlt"); }
-
File: kernel/process/Process.c (lines 1539-1574)
- ListProcesses function
- Code:
void ListProcesses(void) { PrintKernel("\n--- Enhanced Process List ---\n"); PrintKernel("PID\tState \tPrio\tCPU%\tI/O\tPreempt\n"); PrintKernel("-----------------------------------------------\n"); uint64_t total_cpu_time = 1; // Avoid division by zero for (int i = 0; i < MAX_PROCESSES; i++) { if (i == 0 || processes[i].pid != 0) { total_cpu_time += processes[i].cpu_time_accumulated; } } for (int i = 0; i < MAX_PROCESSES; i++) { if (i == 0 || processes[i].pid != 0) { const Process* p = &processes[i]; uint32_t cpu_percent = (uint32_t)((p->cpu_time_accumulated * 100) / total_cpu_time); PrintKernelInt(p->pid); PrintKernel("\t"); PrintKernel(GetStateString(p->state)); PrintKernel("\t"); PrintKernelInt(p->priority); PrintKernel("\t"); PrintKernelInt(cpu_percent); PrintKernel("%\t"); PrintKernelInt(p->io_operations); PrintKernel("\t"); PrintKernelInt(p->preemption_count); PrintKernel("\n"); } } PrintKernel("-----------------------------------------------\n"); PrintKernel("Total CPU time: "); PrintKernelInt((uint32_t)total_cpu_time); PrintKernel(" ticks\n"); }
-
File: kernel/memory/Memory.c (lines 319-369)
- GetDetailedMemoryStats function
- Code:
void GetDetailedMemoryStats(MemoryStats* stats) { irq_flags_t flags = SpinLockIrqSave(&memory_lock); stats->total_physical_bytes = total_pages * PAGE_SIZE; stats->used_physical_bytes = used_pages * PAGE_SIZE; stats->free_physical_bytes = (total_pages - used_pages) * PAGE_SIZE; stats->allocation_count = allocation_count; stats->free_count = free_count; stats->allocation_failures = allocation_failures; stats->huge_pages_allocated = huge_pages_allocated; // Calculate fragmentation score and largest free block uint64_t free_fragments = 0; uint64_t current_fragment = 0; uint64_t largest_fragment = 0; for (uint64_t i = 0x100000 / PAGE_SIZE; i < total_pages; i++) { if (IsPageFree(i)) { current_fragment++; } else { if (current_fragment > 0) { free_fragments++; if (current_fragment > largest_fragment) { largest_fragment = current_fragment; } current_fragment = 0; } } } // Handle case where last pages are free if (current_fragment > 0) { free_fragments++; if (current_fragment > largest_fragment) { largest_fragment = current_fragment; } } stats->largest_free_block = largest_fragment * PAGE_SIZE; // Fragmentation score: more fragments = higher score uint64_t total_free_pages = total_pages - used_pages; if (total_free_pages > 0) { stats->fragmentation_score = (free_fragments * 100) / (total_free_pages / 10 + 1); if (stats->fragmentation_score > 100) stats->fragmentation_score = 100; } else { stats->fragmentation_score = 0; } SpinUnlockIrqRestore(&memory_lock, flags); }
-
File: kernel/memory/KernelHeap.c (lines 310-340)
- PrintHeapStats function
- Code:
void PrintHeapStats(void) { const irq_flags_t flags = SpinLockIrqSave(&kheap_lock); size_t free_blocks = 0, used_blocks = 0; size_t free_bytes = 0, used_bytes = 0; size_t corrupted = 0; for (HeapBlock* block = heap_head; block; block = block->next) { if (!ValidateBlock(block, "stats")) { corrupted++; continue; } if (block->is_free) { free_blocks++; free_bytes += block->size; } else { used_blocks++; used_bytes += block->size; } } SpinUnlockIrqRestore(&kheap_lock, flags); PrintKernel("[HEAP] Blocks: "); PrintKernelInt(used_blocks); PrintKernel(" used, "); PrintKernelInt(free_blocks); PrintKernel(" free\n"); PrintKernel("[HEAP] Memory: "); PrintKernelInt(used_bytes); PrintKernel(" used, "); PrintKernelInt(free_bytes); PrintKernel(" free\n"); PrintKernel("[HEAP] Peak allocated: "); PrintKernelInt(peak_allocated); PrintKernel(", corrupted blocks: "); PrintKernelInt(corrupted); PrintKernel("\n"); }
-
File: kernel/memory/VMem.c (lines 525-539)
- PrintVMemStats function
- Code:
void PrintVMemStats(void) { irq_flags_t flags = SpinLockIrqSave(&vmem_lock); uint64_t used = kernel_space.used_pages; uint64_t mapped = kernel_space.total_mapped; uint64_t allocs = vmem_allocations; uint64_t frees = vmem_frees; uint64_t flushes = tlb_flushes; SpinUnlockIrqRestore(&vmem_lock, flags); PrintKernel("[VMEM] Stats:\n"); PrintKernel(" Used pages: "); PrintKernelInt(used); PrintKernel("\n"); PrintKernel(" Mapped: "); PrintKernelInt(mapped / (1024 * 1024)); PrintKernel("MB\n"); PrintKernel(" Allocs: "); PrintKernelInt(allocs); PrintKernel(", Frees: "); PrintKernelInt(frees); PrintKernel("\n"); PrintKernel(" TLB flushes: "); PrintKernelInt(flushes); PrintKernel("\n"); }
-
File: include/stdlib.h (lines 18-44)
- atoi function
- Code:
static inline int atoi(const char* str) { int result = 0; int sign = 1; int i = 0; // Skip leading whitespace while (str[i] == ' ' || str[i] == '\t' || str[i] == '\n' || str[i] == '\r' || str[i] == '\f' || str[i] == '\v') { i++; } // Handle sign if (str[i] == '-') { sign = -1; i++; } else if (str[i] == '+') { i++; } // Convert digits while (str[i] >= '0' && str[i] <= '9') { result = result * 10 + (str[i] - '0'); i++; } return result * sign; }
-
File: kernel/etc/StringOps.c (lines 18-28)
- FastStrCmp function
- Code:
int FastStrCmp(const char* str1, const char* str2) { if (!str1 || !str2) return (str1 == str2) ? 0 : (str1 ? 1 : -1); // Simple byte-by-byte comparison to avoid alignment issues while (*str1 && *str1 == *str2) { str1++; str2++; } return (unsigned char)*str1 - (unsigned char)*str2; }
-
File: fs/VFS.c (lines 267-286)
- VfsIsDir function
- Code:
int VfsIsDir(const char* path) { VfsMountStruct* mount = VfsFindMount(path); if (!mount) return 0; const char* local_path = VfsStripMount(path, mount); switch (mount->type) { case VFS_RAMFS: { FsNode* node = FsFind(local_path); return node && node->type == FS_DIRECTORY; } case VFS_FAT12: { extern int fat12_initialized; if (!fat12_initialized) return 0; // Use new directory detection function return Fat12IsDirectory(local_path); } } return 0; }
-
File: fs/VFS.c (lines 126-165)
- VfsReadFile function
- Code:
int VfsReadFile(const char* path, void* buffer, uint32_t max_size) { SerialWrite("[VFS] VfsReadFile called\n"); if (!path || !buffer || max_size == 0) { SerialWrite("[VFS] Invalid parameters\n"); return -1; } VfsMountStruct* mount = VfsFindMount(path); if (!mount) { SerialWrite("[VFS] No mount found\n"); return -1; } const char* local_path = VfsStripMount(path, mount); if (!local_path) { SerialWrite("[VFS] Path strip failed\n"); return -1; } switch (mount->type) { case VFS_RAMFS: { if (FastStrlen(local_path, 2) == 0) local_path = "/"; FsNode* node = FsFind(local_path); if (!node || node->type != FS_FILE) return -1; if (node->size == 0) return 0; if (node->data == NULL) return 0; uint32_t copy_size = (node->size < max_size) ? node->size : max_size; FastMemcpy(buffer, node->data, copy_size); return copy_size; } case VFS_FAT12: { extern int fat12_initialized; if (!fat12_initialized) return -1; // Use new path-aware function return Fat12ReadFile(local_path, buffer, max_size); } } return -1; }
-
File: drivers/ethernet/RTL8139.c (lines 97-99)
- GetRtl8139Device function
- Code:
const Rtl8139Device* GetRtl8139Device() { return &rtl_device; }
-
File: drivers/PCI/PCI.c (lines 77-81)
- PciEnumerate function
- Code:
void PciEnumerate() { PrintKernel("--- PCI Bus Enumeration ---\n"); PciScanBus(PrintPciDeviceInfo); PrintKernel("---------------------------\n"); }
✏️ Learnings added
Learnt from: assembler-0
PR: assembler-0/VoidFrame#69
File: kernel/etc/Shell.c:82-86
Timestamp: 2025-08-11T07:18:24.590Z
Learning: In VoidFrame's process cleanup system, `CleanupTerminatedProcesses()` must skip (using `continue`) when encountering non-zombie processes in the termination queue, as continuing with `FastMemset()` on non-zombie processes causes memory corruption and kernel crashes.
Learnt from: assembler-0
PR: assembler-0/VoidFrame#69
File: kernel/etc/Shell.c:82-86
Timestamp: 2025-08-11T07:18:24.590Z
Learning: The order of operations in VoidFrame's `TerminateProcess()` is critical: the process state must be set to PROC_ZOMBIE before adding to the termination queue with a memory barrier between them, otherwise CleanupTerminatedProcesses may process the entry before the zombie state is set.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
Summary by CodeRabbit