
gemeni_study_plan

Madrajib Lab edited this page May 16, 2026 · 1 revision

6-Month Advanced Systems Engineering Study Plan

Here is a highly aggressive, code-first 6-month roadmap designed to bridge your expertise in userspace daemons and diagnostics with kernel internals and modern compute architecture.

This plan skips the beginner tutorials. It is structured around the heavy compute and system architecture requirements typical of major CPU/GPU semiconductor companies, ensuring you remain firmly on the core developer track.

Month 1: Modern C++17/20 & Low-Latency Concurrency

Goal: Upgrade your C/C++11 foundation to C++20, focusing on zero-cost abstractions and lock-free programming for high-throughput environments.

  • Week 1: Move Semantics & Memory Management
    • Action: Write code using std::unique_ptr, std::shared_ptr, perfect forwarding, and rvalue references.
    • Code: Refactor a legacy C-style struct parser to use purely modern C++ memory management without a single raw new or delete.
  • Week 2: Compile-Time Execution & Templates
    • Action: Master constexpr, consteval, and C++20 Concepts.
    • Code: Build a compile-time static configuration parser. The goal is to parse a string of configurations entirely at compile time, resulting in zero runtime overhead.
  • Week 3: Advanced Concurrency & Memory Models
    • Action: Study std::atomic, memory barriers (acquire/release semantics), and C++ thread pools.
    • Code: Implement a custom thread pool using modern C++ that dispatches tasks to worker threads using std::condition_variable.
  • Week 4: Lock-Free Data Structures
    • Action: Learn how mutex contention limits throughput and adds tail latency in IPC and logging daemons.
    • Project 1: Lock-Free Logging Daemon. Rewrite the core of a multi-client logging daemon (similar to your previous work) using a Single-Producer/Single-Consumer (SPSC) lock-free ring buffer in C++20. Measure the latency difference against a mutex-based approach.
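As a starting point for Project 1, here is a minimal sketch of a bounded SPSC ring buffer built on acquire/release atomics. The class name SpscRing, the power-of-two capacity requirement, and the 64-byte alignment (an assumed cache-line size) are illustrative choices, not requirements of the plan. It assumes T is default-constructible and copyable, and that exactly one thread pushes while exactly one thread pops.

```cpp
#include <array>
#include <atomic>
#include <cassert>
#include <cstddef>
#include <optional>

// Bounded single-producer/single-consumer queue (illustrative sketch).
// Capacity must be a power of two so indices can be wrapped with a mask.
template <typename T, std::size_t Capacity>
class SpscRing {
    static_assert(Capacity >= 2 && (Capacity & (Capacity - 1)) == 0,
                  "Capacity must be a power of two");

public:
    // Call only from the producer thread. Returns false when the ring is full.
    bool try_push(const T& value) {
        const std::size_t head = head_.load(std::memory_order_relaxed);
        const std::size_t tail = tail_.load(std::memory_order_acquire);
        if (head - tail == Capacity) return false;  // full
        slots_[head & (Capacity - 1)] = value;
        // Release: the slot write above becomes visible before the new head.
        head_.store(head + 1, std::memory_order_release);
        return true;
    }

    // Call only from the consumer thread. Returns nullopt when the ring is empty.
    std::optional<T> try_pop() {
        const std::size_t tail = tail_.load(std::memory_order_relaxed);
        const std::size_t head = head_.load(std::memory_order_acquire);
        if (head == tail) return std::nullopt;  // empty
        T value = slots_[tail & (Capacity - 1)];
        // Release: the slot read above completes before the slot is handed back.
        tail_.store(tail + 1, std::memory_order_release);
        return value;
    }

private:
    std::array<T, Capacity> slots_{};
    // Separate cache lines (64 bytes assumed) to limit false sharing.
    alignas(64) std::atomic<std::size_t> head_{0};  // written by the producer
    alignas(64) std::atomic<std::size_t> tail_{0};  // written by the consumer
};
```

For the latency comparison, one option is to wrap a std::queue in a std::mutex behind the same try_push/try_pop interface, so the benchmark harness stays identical for both variants.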

Month 2: eBPF & Next-Gen Observability

Goal: Evolve your DWARF/ELF parsing expertise into the kernel space by writing eBPF programs to trace execution without modifying kernel code.

  • Week 5: Introduction to eBPF & BCC
    • Action: Set up an eBPF development environment. Learn the bpf() syscall and the eBPF verifier limits.
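    • Code: Write a simple Python/BCC script that attaches a kprobe to the clone syscall and prints a message every time a new process or thread is created. The kernel symbol name is architecture-specific, so resolve it with BCC's get_syscall_fnname("clone") rather than hard-coding sys_clone.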
  • Week 6: libbpf & CO-RE (Compile Once, Run Everywhere)
    • Action: Move away from BCC (which requires LLVM on the target) to libbpf using C.
    • Code: Write a C-based eBPF program that hooks into the kernel's network stack to drop packets from a specific IP, utilizing BPF maps to store the IP addresses.
  • Week 7: Tracepoints, Kprobes & Uprobes
    • Action: Learn how to attach eBPF to kernel functions (Kprobes) and userspace functions (Uprobes).
    • Code: Write an eBPF program that attaches uprobes to malloc and free in a target C++ application, pairs each allocation with its release, and reports allocations that are never freed as potential leaks.
  • Week 8: Advanced eBPF Observability
    • Project 2: System-Wide I/O Profiler. Build an eBPF tool using libbpf that intercepts Block I/O layer tracepoints to measure latency per block device. Connect this to your existing knowledge of ELF parsing to resolve kernel stack traces into readable function names in user space.
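The bookkeeping behind Project 2 can be prototyped in plain userspace C++ before it is moved into eBPF maps. The sketch below is a model of that logic, not eBPF code: it pairs an issue event with its completion (as the block:block_rq_issue and block:block_rq_complete tracepoints would deliver them), then files the latency in a per-device log2 histogram. The BlockEvent layout, the (dev, sector) request key, and the class name are hypothetical.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>

// Hypothetical event shape; a real tool would fill this from tracepoint args.
struct BlockEvent {
    std::uint32_t dev;     // device number
    std::uint64_t sector;  // starting sector; (dev, sector) is the request key
    std::uint64_t ts_ns;   // timestamp in nanoseconds
};

// Power-of-two bucket index: 0 and 1 map to 0, 2..3 to 1, 4..7 to 2, and so on.
inline unsigned log2_bucket(std::uint64_t v) {
    unsigned bucket = 0;
    while (v >>= 1) ++bucket;
    return bucket;
}

class IoLatencyProfiler {
public:
    static constexpr std::size_t kBuckets = 64;
    using Histogram = std::array<std::uint64_t, kBuckets>;

    // Remember when the request was issued.
    void on_issue(const BlockEvent& e) { inflight_[{e.dev, e.sector}] = e.ts_ns; }

    // Match a completion to its issue event and record the latency in
    // microseconds. Returns false if no matching issue event was seen.
    bool on_complete(const BlockEvent& e) {
        const auto it = inflight_.find({e.dev, e.sector});
        if (it == inflight_.end()) return false;
        const std::uint64_t start_ns = it->second;
        inflight_.erase(it);
        if (e.ts_ns < start_ns) return false;  // out-of-order timestamps
        const std::uint64_t delta_us = (e.ts_ns - start_ns) / 1000;
        ++per_dev_[e.dev][log2_bucket(delta_us)];
        return true;
    }

    // Histogram for one device, or nullptr if nothing was recorded for it.
    const Histogram* histogram(std::uint32_t dev) const {
        const auto it = per_dev_.find(dev);
        return it == per_dev_.end() ? nullptr : &it->second;
    }

private:
    using Key = std::pair<std::uint32_t, std::uint64_t>;
    std::map<Key, std::uint64_t> inflight_;       // issue timestamps by request
    std::map<std::uint32_t, Histogram> per_dev_;  // latency buckets by device
};
```

In the eBPF version, inflight_ would likely become a BPF hash map and per_dev_ a second map read by the userspace loader. Keeping the bucket function identical on both sides makes the output easy to cross-check.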

Month 3: Kernel Modules & Memory Management Deep Dive

Goal: Cross the syscall boundary. Understand how the kernel manages memory and exposes hardware capabilities to userspace.

  • Week 9: Loadable Kernel Modules (LKMs) & Character Drivers
    • Action: Learn the structure of modern kernel modules, init/exit macros, and registering a character device.
    • Code: Write a basic kernel module that creates a /dev/custom_null device. Implement read, write, and ioctl file operations.
  • Week 10: Kernel Memory Allocation (kmalloc, vmalloc)
    • Action: Understand physical vs. virtual memory, page frames, and the SLUB allocator.
    • Code: Write a module that allocates pages of memory using alloc_pages and inspects the physical addresses.
  • Week 11: Memory Mapping (mmap) & Zero-Copy
    • Action: Learn how to share memory directly between the kernel and userspace to bypass copy_to_user.
    • Code: Expand your character driver to support the mmap syscall. Allocate a contiguous buffer in the kernel and map it directly into a userspace C++ test application.
  • Week 12: Concurrency in the Kernel
    • Project 3: High-Performance Shared Memory Bridge. Write a kernel module that utilizes spinlocks and atomic operations to safely manage a shared memory buffer between two concurrent userspace processes, demonstrating pure zero-copy IPC.
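For the userspace side of the Week 11 and Project 3 exercises, a small RAII wrapper around mmap keeps the test application tidy. This is a sketch under assumptions: the class name MappedRegion is hypothetical, the path would be whatever device node your driver creates, and the driver is assumed to implement the mmap file operation for the requested length at offset 0. The same code works on any mmap-able file, which is handy for testing without the module loaded.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <fcntl.h>
#include <stdexcept>
#include <string>
#include <sys/mman.h>
#include <unistd.h>

// Owns a shared read/write mapping of `length` bytes from `path`.
class MappedRegion {
public:
    MappedRegion(const std::string& path, std::size_t length) : length_(length) {
        const int fd = ::open(path.c_str(), O_RDWR);
        if (fd < 0) throw std::runtime_error("open failed: " + path);
        void* p = ::mmap(nullptr, length, PROT_READ | PROT_WRITE,
                         MAP_SHARED, fd, 0);
        ::close(fd);  // the mapping stays valid after the descriptor is closed
        if (p == MAP_FAILED) throw std::runtime_error("mmap failed: " + path);
        addr_ = p;
    }

    ~MappedRegion() {
        if (addr_ != nullptr) ::munmap(addr_, length_);
    }

    // Non-copyable: exactly one owner unmaps the region.
    MappedRegion(const MappedRegion&) = delete;
    MappedRegion& operator=(const MappedRegion&) = delete;

    unsigned char* data() { return static_cast<unsigned char*>(addr_); }
    const unsigned char* data() const {
        return static_cast<const unsigned char*>(addr_);
    }
    std::size_t size() const { return length_; }

private:
    void* addr_ = nullptr;
    std::size_t length_ = 0;
};
```

With the driver loaded, a test program would construct MappedRegion with the device path and the buffer size the driver exports, then read and write through data(). Two processes mapping the same device see the same pages, which is the zero-copy property Project 3 sets out to demonstrate.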

Month 4: High-Speed Interconnects (PCIe) & DMA

Goal: Master the backbone of modern heterogeneous compute systems (GPUs, NPUs).

  • Week 13: PCIe Architecture & Enumeration
    • Action: Study the PCIe configuration space, Base Address Registers (BARs), and how the kernel enumerates devices.
    • Code: Write a kernel module that walks the PCI bus using pci_get_device(), reading and printing the configuration space (Vendor ID, Device ID, BAR addresses) of a specific device.
  • Week 14: Interrupts & Tasklets
    • Action: Understand hardware interrupts, top-halves, bottom-halves (tasklets/workqueues), and MSI/MSI-X.
    • Code: Set up a virtual PCIe device using QEMU. Write a driver that registers an interrupt handler (request_irq) to catch virtual interrupts.
  • Week 15: Direct Memory Access (DMA)
    • Action: Learn how a device reads and writes system RAM directly without the CPU copying the data (cache coherence, DMA mapping).
    • Code: Implement a basic DMA mapping sequence in your virtual PCIe driver using dma_alloc_coherent.
  • Week 16: PCIe Compute Simulation
    • Project 4: Virtual PCIe Accelerator Driver. Using QEMU's edu (educational virtual PCIe device), write a complete Linux driver that maps the device's BARs, sets up a DMA transfer to pass a block of data to the "device", triggers a hardware interrupt, and reads the processed data back.
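Before writing the Week 13 module, it can help to decode a configuration-space dump in userspace so the field layout is familiar. The sketch below parses the first 64 bytes of a type 0 header: Vendor ID at offset 0x00, Device ID at 0x02, and the six BARs starting at 0x10. The type and function names are hypothetical. A zero BAR is treated as unimplemented, which is a simplification, since the size-probing step (writing all ones and reading back) is not modeled.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using ConfigHeader = std::array<std::uint8_t, 64>;

// Configuration space is little-endian.
inline std::uint16_t read16(const ConfigHeader& c, std::size_t off) {
    return static_cast<std::uint16_t>(c[off] | (c[off + 1] << 8));
}

inline std::uint32_t read32(const ConfigHeader& c, std::size_t off) {
    return static_cast<std::uint32_t>(c[off]) |
           (static_cast<std::uint32_t>(c[off + 1]) << 8) |
           (static_cast<std::uint32_t>(c[off + 2]) << 16) |
           (static_cast<std::uint32_t>(c[off + 3]) << 24);
}

inline std::uint16_t vendor_id(const ConfigHeader& c) { return read16(c, 0x00); }
inline std::uint16_t device_id(const ConfigHeader& c) { return read16(c, 0x02); }

struct Bar {
    unsigned index;      // BAR number, 0..5
    bool is_io;          // bit 0 set: I/O space, otherwise memory space
    bool is_64bit;       // memory BAR type field (bits 2:1) equals 0b10
    bool prefetchable;   // memory BAR bit 3
    std::uint64_t base;  // base address with the flag bits masked off
};

inline std::vector<Bar> decode_bars(const ConfigHeader& c) {
    std::vector<Bar> bars;
    for (unsigned i = 0; i < 6; ++i) {
        const std::uint32_t raw = read32(c, 0x10 + 4 * i);
        if (raw == 0) continue;  // treated as unimplemented (simplification)
        Bar bar{i, false, false, false, 0};
        if (raw & 0x1) {
            bar.is_io = true;
            bar.base = raw & ~std::uint32_t{0x3};
        } else {
            bar.is_64bit = ((raw >> 1) & 0x3) == 0x2;
            bar.prefetchable = (raw & 0x8) != 0;
            bar.base = raw & ~std::uint32_t{0xF};
            if (bar.is_64bit && i + 1 < 6) {
                // A 64-bit BAR uses the next register for its upper 32 bits.
                const std::uint32_t high = read32(c, 0x10 + 4 * (i + 1));
                bar.base |= static_cast<std::uint64_t>(high) << 32;
                ++i;
            }
        }
        bars.push_back(bar);
    }
    return bars;
}
```

On a Linux host, the bytes can come from the config file under /sys/bus/pci/devices/ for the device of interest. The in-kernel version reads the same offsets with pci_read_config_word and pci_read_config_dword.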

Month 5: Advanced Hardware Interfaces & Custom USB

Goal: Connect your hardware/microcontroller expertise with kernel driver development, focusing on the USB subsystem.

  • Week 17: Linux USB Subsystem & URBs
    • Action: Study USB Request Blocks (URBs), endpoints, and communication classes.
    • Code: Write a simple kernel module that registers with the USB core (usb_register) and probes for a specific Vendor ID/Product ID.
  • Week 18: Writing a Custom USB Driver
    • Action: Bypass generic kernel drivers (like cdc_acm) to take direct control of a device.
    • Code: Take a microcontroller acting as a simple CDC device. Write a custom kernel driver that claims its interface and sends bulk USB transfers to toggle an LED or read a sensor state.
  • Week 19: Netlink Sockets & Kernel-to-User Eventing
    • Action: Learn the modern way to push asynchronous events from kernel space to userspace.
    • Code: Modify your custom USB driver to broadcast a message over a Netlink socket whenever the hardware state changes.
  • Week 20: Hardware Hotplug Integration
    • Project 5: Advanced USB Event Daemon. Write a modern C++20 userspace daemon that listens to the Netlink socket from your custom driver. When the CDC device connects or sends an event, the daemon updates a local state machine and logs the latency using a lock-free queue.
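The event-handling core of the Project 5 daemon can be developed and unit-tested without hardware. The sketch below assumes the events arrive in the kernel's kobject uevent text format (NUL-separated KEY=VALUE records after an "action@devpath" header), which is one plausible transport. If your driver uses a custom Netlink protocol instead, only the parser changes. The names parse_uevent and DeviceTracker are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <string_view>

using UeventFields = std::map<std::string, std::string>;

// Split a NUL-separated payload into KEY=VALUE fields. Records without '='
// (such as the leading "action@devpath" header) are skipped.
inline UeventFields parse_uevent(std::string_view payload) {
    UeventFields fields;
    std::size_t pos = 0;
    while (pos < payload.size()) {
        std::size_t end = payload.find('\0', pos);
        if (end == std::string_view::npos) end = payload.size();
        const std::string_view record = payload.substr(pos, end - pos);
        const std::size_t eq = record.find('=');
        if (eq != std::string_view::npos && eq > 0) {
            fields[std::string(record.substr(0, eq))] =
                std::string(record.substr(eq + 1));
        }
        pos = end + 1;
    }
    return fields;
}

enum class DeviceState { Disconnected, Connected };

// Minimal connection state machine driven by parsed events.
class DeviceTracker {
public:
    // Applies one event. Returns true if the state changed.
    bool apply(const UeventFields& f) {
        const auto sub = f.find("SUBSYSTEM");
        const auto act = f.find("ACTION");
        if (sub == f.end() || act == f.end() || sub->second != "usb") {
            return false;  // not an event this tracker cares about
        }
        DeviceState next = state_;
        if (act->second == "add") {
            next = DeviceState::Connected;
        } else if (act->second == "remove") {
            next = DeviceState::Disconnected;
        }
        const bool changed = next != state_;
        state_ = next;
        return changed;
    }

    DeviceState state() const { return state_; }

private:
    DeviceState state_ = DeviceState::Disconnected;
};
```

In the daemon, the receive loop would read each datagram from the Netlink socket, pass the bytes to parse_uevent, and feed the result to the tracker. A real filter would probably also match on the device's vendor and product IDs rather than the subsystem alone.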

Month 6: The Capstone - Full Stack Compute Architecture

Goal: Combine Modern C++, eBPF, and Kernel Internals into a single, portfolio-defining architectural project.

  • Weeks 21-24: System-Wide Hardware Trace Bridge
    • The Concept: Build a modern tracing infrastructure that bridges hardware events to userspace analysis without noticeably degrading system performance.
    • Component 1 (eBPF): Write an eBPF program that hooks into the kernel's PCIe or USB subsystems to capture highly granular timing data on hardware interrupts or DMA completion events.
    • Component 2 (Kernel): If necessary, write a lightweight kernel module to expose specific hardware registers that eBPF cannot easily access.
    • Component 3 (Userspace C++20): Build a daemon using lock-free data structures that ingests the eBPF maps in real-time. Use your previous ELF/DWARF expertise to map these raw execution addresses back to human-readable symbols in the running applications.
    • Outcome: A low-overhead observability platform that proves you can architect solutions spanning from the physical hardware interconnect all the way up to advanced C++ userspace processing.
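The symbol-resolution step in Component 3 reduces to a range lookup once the symbols are loaded. Here is a minimal sketch, assuming the (start, size, name) triples have already been extracted from the ELF symbol table. The Symbol and SymbolTable names are hypothetical, and load-address relocation for position-independent binaries is left out.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>
#include <vector>

struct Symbol {
    std::uint64_t start;  // first address covered by the symbol
    std::uint64_t size;   // number of bytes covered
    std::string name;
};

class SymbolTable {
public:
    explicit SymbolTable(std::vector<Symbol> symbols)
        : symbols_(std::move(symbols)) {
        // Sort once by start address so each lookup is a binary search.
        std::sort(symbols_.begin(), symbols_.end(),
                  [](const Symbol& a, const Symbol& b) { return a.start < b.start; });
    }

    // Returns (name, offset into the symbol), or nullopt if the address
    // does not fall inside any known symbol.
    std::optional<std::pair<std::string, std::uint64_t>>
    resolve(std::uint64_t addr) const {
        // First symbol that starts strictly after addr.
        auto it = std::upper_bound(
            symbols_.begin(), symbols_.end(), addr,
            [](std::uint64_t a, const Symbol& s) { return a < s.start; });
        if (it == symbols_.begin()) return std::nullopt;  // below all symbols
        --it;  // candidate: last symbol starting at or before addr
        const std::uint64_t offset = addr - it->start;
        if (offset >= it->size) return std::nullopt;  // gap between symbols
        return std::pair<std::string, std::uint64_t>{it->name, offset};
    }

private:
    std::vector<Symbol> symbols_;
};
```

Each lookup costs O(log n), so the daemon can resolve addresses on the hot path. The usual name+0xoffset display form follows directly from the returned pair.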
