gemeni_study_plan
Here is a highly aggressive, code-first 6-month roadmap designed to bridge your expertise in userspace daemons and diagnostics with kernel internals and modern compute architecture.
This plan skips the beginner tutorials. It is structured around the heavy compute and system architecture requirements typical of major CPU/GPU semiconductor companies, ensuring you remain firmly on the core developer track.
Goal: Upgrade your C/C++11 foundation to C++20, focusing on zero-cost abstractions and lock-free programming for high-throughput environments.
- Week 1: Move Semantics & Memory Management
  - Action: Write code using `std::unique_ptr`, `std::shared_ptr`, perfect forwarding, and rvalue references.
  - Code: Refactor a legacy C-style struct parser to use purely modern C++ memory management without a single raw `new` or `delete`.
- Week 2: Compile-Time Execution & Templates
  - Action: Master `constexpr`, `consteval`, and C++20 Concepts.
  - Code: Build a compile-time static configuration parser. The goal is to parse a configuration string entirely at compile time, resulting in zero runtime overhead.
- Week 3: Advanced Concurrency & Memory Models
  - Action: Study `std::atomic`, memory barriers (acquire/release semantics), and C++ thread pools.
  - Code: Implement a custom thread pool using modern C++ that dispatches tasks to worker threads using `std::condition_variable`.
- Week 4: Lock-Free Data Structures
  - Action: Learn how mutex contention limits throughput in IPC and logging daemons.
  - Project 1: Lock-Free Logging Daemon. Rewrite the core of a multi-client logging daemon (similar to your previous work) using a Single-Producer/Single-Consumer (SPSC) lock-free ring buffer in C++20. Measure the latency difference against a mutex-based approach.
Goal: Evolve your DWARF/ELF parsing expertise into the kernel space by writing eBPF programs to trace execution without modifying kernel code.
- Week 5: Introduction to eBPF & BCC
  - Action: Set up an eBPF development environment. Learn the `bpf()` syscall and the eBPF verifier limits.
  - Code: Write a simple Python/BCC script to hook into `sys_clone` and print a message every time a new process is spawned.
- Week 6: libbpf & CO-RE (Compile Once, Run Everywhere)
  - Action: Move away from BCC (which requires LLVM on the target) to `libbpf` using C.
  - Code: Write a C-based eBPF program that hooks into the kernel's network stack to drop packets from a specific IP, utilizing BPF maps to store the IP addresses.
- Week 7: Tracepoints, Kprobes & Uprobes
  - Action: Learn how to attach eBPF to kernel functions (Kprobes) and userspace functions (Uprobes).
  - Code: Write an eBPF program that attaches to `malloc` (Uprobe) in a target C++ application to track memory allocations and detect potential leaks.
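Before writing the probes for Week 7, it can help to model the bookkeeping in plain C++. The sketch below is not eBPF code: it is a userspace model of the state a leak tracer typically keeps, on the assumption that the real tool attaches a uprobe to `malloc` entry, a uretprobe to `malloc` return, and a uprobe to `free`, with the two tables held in BPF hash maps. `AllocationTracker` and its method names are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>

class AllocationTracker {
public:
    // Models the uprobe at malloc entry: the size argument is visible here,
    // but the returned pointer is not, so stash the size per thread.
    void on_malloc_enter(std::uint32_t tid, std::size_t size) {
        pending_[tid] = size;
    }

    // Models the uretprobe at malloc return: pair the stashed size with the
    // pointer that malloc handed back. Failed allocations (null) are ignored.
    void on_malloc_return(std::uint32_t tid, std::uint64_t addr) {
        const auto it = pending_.find(tid);
        if (it == pending_.end()) return;  // missed the entry probe
        if (addr != 0) live_[addr] = it->second;
        pending_.erase(it);
    }

    // Models the uprobe at free entry: the allocation is no longer live.
    void on_free(std::uint64_t addr) { live_.erase(addr); }

    std::size_t outstanding_count() const { return live_.size(); }

    std::size_t outstanding_bytes() const {
        std::size_t total = 0;
        for (const auto& entry : live_) total += entry.second;
        return total;
    }

private:
    std::unordered_map<std::uint32_t, std::size_t> pending_;  // tid -> requested size
    std::unordered_map<std::uint64_t, std::size_t> live_;     // address -> size
};
```

Entries that stay in the live table for a long time are leak candidates, not proven leaks; a real tool would also record a stack ID per allocation so the candidates can be attributed to call sites.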
- Week 8: Advanced eBPF Observability
  - Project 2: System-Wide I/O Profiler. Build an eBPF tool using `libbpf` that intercepts Block I/O layer tracepoints to measure latency per block device. Connect this to your existing knowledge of ELF parsing to resolve kernel stack traces into readable function names in user space.
Goal: Cross the syscall boundary. Understand how the kernel manages memory and exposes hardware capabilities to userspace.
- Week 9: Loadable Kernel Modules (LKMs) & Character Drivers
  - Action: Learn the structure of modern kernel modules, `init`/`exit` macros, and registering a character device.
  - Code: Write a basic kernel module that creates a `/dev/custom_null` device. Implement `read`, `write`, and `ioctl` file operations.
- Week 10: Kernel Memory Allocation (`kmalloc`, `vmalloc`)
  - Action: Understand physical vs. virtual memory, page frames, and the SLUB allocator.
  - Code: Write a module that allocates pages of memory using `alloc_pages` and inspects the physical addresses.
- Week 11: Memory Mapping (`mmap`) & Zero-Copy
  - Action: Learn how to share memory directly between the kernel and userspace to bypass `copy_to_user`.
  - Code: Expand your character driver to support the `mmap` syscall. Allocate a contiguous buffer in the kernel and map it directly into a userspace C++ test application.
- Week 12: Concurrency in the Kernel
  - Project 3: High-Performance Shared Memory Bridge. Write a kernel module that utilizes spinlocks and atomic operations to safely manage a shared memory buffer between two concurrent userspace processes, demonstrating pure zero-copy IPC.
Goal: Master the backbone of modern heterogeneous compute systems (GPUs, NPUs).
- Week 13: PCIe Architecture & Enumeration
  - Action: Study the PCIe configuration space, Base Address Registers (BARs), and how the kernel enumerates devices.
  - Code: Write a kernel module that walks the PCI bus using `pci_get_device()`, reading and printing the configuration space (Vendor ID, Device ID, BAR addresses) of a specific device.
- Week 14: Interrupts & Tasklets
  - Action: Understand hardware interrupts, top-halves, bottom-halves (tasklets/workqueues), and MSI/MSI-X.
  - Code: Set up a virtual PCIe device using QEMU. Write a driver that registers an interrupt handler (`request_irq`) to catch virtual interrupts.
- Week 15: Direct Memory Access (DMA)
  - Action: Learn how hardware writes directly to RAM without CPU intervention (cache coherence, DMA mapping).
  - Code: Implement a basic DMA mapping sequence in your virtual PCIe driver using `dma_alloc_coherent`.
- Week 16: PCIe Compute Simulation
  - Project 4: Virtual PCIe Accelerator Driver. Using QEMU's `edu` (educational virtual PCIe device), write a complete Linux driver that maps the device's BARs, sets up a DMA transfer to pass a block of data to the "device", triggers a hardware interrupt, and reads the processed data back.
Goal: Connect your hardware/microcontroller expertise with kernel driver development, focusing on the USB subsystem.
- Week 17: Linux USB Subsystem & URBs
  - Action: Study USB Request Blocks (URBs), endpoints, and communication classes.
  - Code: Write a simple kernel module that registers with the USB core (`usb_register`) and probes for a specific Vendor ID/Product ID.
- Week 18: Writing a Custom USB Driver
  - Action: Bypass generic kernel drivers (like `cdc_acm`) to take direct control of a device.
  - Code: Take a microcontroller acting as a simple CDC device. Write a custom kernel driver that claims its interface and sends bulk USB transfers to toggle an LED or read a sensor state.
- Week 19: Netlink Sockets & Kernel-to-User Eventing
  - Action: Learn the modern way to push asynchronous events from kernel space to userspace.
  - Code: Modify your custom USB driver to broadcast a message over a Netlink socket whenever the hardware state changes.
- Week 20: Hardware Hotplug Integration
  - Project 5: Advanced USB Event Daemon. Write a modern C++20 userspace daemon that listens to the Netlink socket from your custom driver. When the CDC device connects or sends an event, the daemon updates a local state machine and logs the latency using a lock-free queue.
Goal: Combine Modern C++, eBPF, and Kernel Internals into a single, portfolio-defining architectural project.
- Weeks 21-24: System-Wide Hardware Trace Bridge
  - The Concept: Build a modern tracing infrastructure that bridges hardware events to userspace analysis without degrading system performance.
  - Component 1 (eBPF): Write an eBPF program that hooks into the kernel's PCIe or USB subsystems to capture fine-grained timing data on hardware interrupts or DMA completion events.
  - Component 2 (Kernel): If necessary, write a lightweight kernel module to expose specific hardware registers that eBPF cannot easily access.
  - Component 3 (Userspace C++20): Build a daemon using lock-free data structures that reads the eBPF maps in real time. Use your previous ELF/DWARF expertise to map the raw execution addresses back to human-readable symbols in the running applications.
  - Outcome: A low-overhead observability platform that proves you can architect solutions spanning from the physical hardware interconnect all the way up to advanced C++ userspace processing.
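The address-to-symbol step in Component 3 can be sketched as a binary search over a sorted symbol table. The example below assumes the `(start, size, name)` triples have already been extracted from the ELF symbol table by some other code; `Symbol`, `Resolved`, and `SymbolTable` are hypothetical names, and load-address relocation is left out.

```cpp
#include <algorithm>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

struct Symbol {
    std::uint64_t start;  // first address covered by the symbol
    std::uint64_t size;   // length in bytes
    std::string name;
};

struct Resolved {
    std::string name;
    std::uint64_t offset;  // distance of the address from the symbol start
};

class SymbolTable {
public:
    explicit SymbolTable(std::vector<Symbol> symbols) : symbols_(std::move(symbols)) {
        // Sort once by start address so each lookup is O(log n).
        std::sort(symbols_.begin(), symbols_.end(),
                  [](const Symbol& a, const Symbol& b) { return a.start < b.start; });
    }

    // Returns the symbol containing `addr`, or nullopt if it falls in a gap.
    std::optional<Resolved> resolve(std::uint64_t addr) const {
        // First symbol that starts strictly after addr; the candidate is the
        // one immediately before it.
        auto it = std::upper_bound(
            symbols_.begin(), symbols_.end(), addr,
            [](std::uint64_t a, const Symbol& s) { return a < s.start; });
        if (it == symbols_.begin()) return std::nullopt;
        --it;
        if (addr - it->start >= it->size) return std::nullopt;
        return Resolved{it->name, addr - it->start};
    }

private:
    std::vector<Symbol> symbols_;
};
```

A daemon would typically build one such table per loaded binary and subtract the mapping's base address from each sampled instruction pointer before calling `resolve`.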