Course: CSE 4502 Base Project Authors: Bo Song, Haoliang Zhang (2015)
Platform: mCertikOS (a formally verified x86 microkernel OS)
- Base Project Description
- Added Modules & Features
- System Architecture
- Technologies Used
- Problems Encountered & Solutions
- Setup & Execution Guide (macOS)
The mCertikOS project is a formally verified, x86 teaching operating system originally developed at Yale. The base code includes:
- A bootloader (
boot0,boot1) that loads the kernel off a disk image - A microkernel (
kern/) with memory management (paging, containers), interrupt handling, a cooperative thread scheduler, and an IDE disk driver - A file system (
kern/fs/) implementing a simple Unix-like inode/directory structure - A collection of user-space processes (
user/): shell, ping/pong (IPC demo), ls, cat, rot13, echo, fstest - A basic shell (
user/shell/shell.c) that provided only a handful of hardcoded built-in commands in a read-eval-print loop (REPL), with no process isolation, no pipe support, and no signal handling
The provided shell was essentially a monolithic interpreter: every command (even cat) ran inside the shell process itself. A crash in any command would take down the entire shell.
What was added:
The shell was refactored to detect uppercase process commands (LS, CAT, ROT13) and launch them as fully isolated child processes using the spawn_with_fds() kernel interface, instead of executing them as C functions inside the shell.
Files modified:
user/shell/shell.c—elf_id_for_process(),is_process_cmd(),launch_process_stage()kern/trap/TSyscall/TSyscall.c—sys_spawn()
How it enhances the project:
If LS panics, only its child process dies. The parent shell survives and re-prints the >: prompt cleanly. This mirrors how real Unix shells use fork()/exec().
// Shell maps a command name to its compiled ELF binary ID
static int elf_id_for_process(const char *name) {
if (strcmp(name, "CAT") == 0) return 6;
if (strcmp(name, "ROT13") == 0) return 7;
if (strcmp(name, "LS") == 0) return 9;
return -1;
}What was added:
sys_spawn was given two extra parameters — stdin_fd and stdout_fd. When spawning a child:
- If the fd is
-1, the kernel assigns a console handle (FD_CONSOLE_IN/FD_CONSOLE_OUT). - If the fd is a valid file descriptor, the kernel duplicates the parent's
struct file *pointer into the child's Thread Control Block (TCB), incrementing the reference count.
Files modified:
kern/trap/TSyscall/TSyscall.c—sys_spawn()user/lib/proc.c—spawn_with_fds()
How it enhances the project:
Child processes are completely I/O-agnostic. CAT reads from FD 0 without knowing whether that's a keyboard, a file on disk, or a pipe. This is the fundamental abstraction that makes Unix pipelines possible.
// Inside sys_spawn — duplicate parent's FD into child's slot 0
if (stdin_fd >= 0 && parent_fds[stdin_fd] != 0) {
f = file_dup(parent_fds[stdin_fd]);
tcb_set_openfiles(new_pid, 0, f);
}What was added:
A full Unix-style pipeline parser and runner:
parse_pipeline()— splits the command line on|into ordered stagesrun_pipeline()— creates kernel pipes between stages, spawns each process with the correct FDs, then closes the parent's copies of pipe endslaunch_process_stage()— prepares and callsspawn_with_fdsfor each stage
Files modified:
user/shell/shell.ckern/trap/TSyscall/TSyscall.c—sys_pipe()
How it enhances the project:
Commands can now be chained. CAT test.txt | ROT13 streams the file through the cipher without either process knowing about the other. The crucial deadlock prevention — the parent shell must close its own copy of each pipe end after spawning — is correctly implemented.
// Parent creates pipe, uses write-end for stage i, then closes it immediately
sys_pipe(pipefd);
out_fd = pipefd[1];
pids[i] = launch_process_stage(&stages[i], in_fd, out_fd);
close(pipefd[1]); // <-- critical: drop parent's write reference
in_fd = pipefd[0]; // pass read-end to next stageWhat was added:
sys_exit()— properly closes all open file descriptors (preventing pipe deadlocks), marks the processTSTATE_DEAD, removes it from the scheduler queue, and refunds its resource quota to the parent containersys_waitpid()— spin-yields until the target process reachesTSTATE_DEAD
Files modified:
kern/trap/TSyscall/TSyscall.c
How it enhances the project:
The shell correctly pauses after spawning a pipeline and only re-prints the >: prompt when all children have exited. Without this, prompt output would interleave with process output.
// Shell blocks until each child is DEAD
for (i = 0; i < nstages; i++) {
if (pids[i] >= 0)
sys_waitpid(pids[i]);
}What was added:
sys_spawn now copies the parent shell's current working directory inode into the child's TCB so that relative paths resolve identically in every spawned process.
Files modified:
kern/trap/TSyscall/TSyscall.c
tcb_set_cwd(new_pid, inode_dup((struct inode*)tcb_get_cwd(curid)));What was added:
shell_kill()— sends a signal to any running process by PID (kill -9 <pid>)shell_trap()— registers a custom signal handler for the shell itself viasigactionsys_kill()in the kernel — handlesSIGKILL(immediate termination) and queues other signals as pendingsys_sigaction()— stores per-process signal handlers in the TCBsys_sigreturn()— restores saved register context after returning from a user-space signal handler
Files modified:
user/shell/shell.ckern/trap/TSyscall/TSyscall.c
How it enhances the project:
Processes can now be cleanly terminated or interrupted. SIGKILL is unblockable and immediately sets the process to TSTATE_DEAD. Other signals are delivered asynchronously the next time the process enters kernel mode.
The following built-in commands were added or hardened with correctness fixes:
| Command | Description |
|---|---|
write <str> <file> |
Write a string directly into a file (overwrite) |
append <str> <file> |
Append a string to an existing file |
kill -<sig> <pid> |
Send a signal to a process |
trap <signum> |
Register shell's own signal handler |
spawn <elf_id> |
Manually spawn a predefined user process |
help |
List all available commands |
Bug fixes applied to existing built-ins:
cd— fixed crash from passing acharas a pointerrm— fixed inverted bounds check (pathIdx > argc + 1→pathIdx >= argc)cat— fixed single-read truncation (now reads in a loop until EOF)is_dir()— fixed use of uninitialized fd whenis_file_exist()returns 0shell_touch()/shell_kill()— fixedreturn;inint-returning functions
+-------------------------------------------------------+
| User Space (Ring 3) |
| |
| shell.c ←→ ls ←→ cat ←→ rot13 ←→ echo |
| | File Descriptors (FD 0/1) |
| spawn_with_fds() → sys_spawn syscall |
+-------------------------------------------------------+
↕ Trap / Syscall Gate (int 0x30)
+-------------------------------------------------------+
| Kernel Space (Ring 0) |
| |
| TSyscall.c |
| sys_spawn → proc_create, file_dup, tcb_set_cwd |
| sys_pipe → pipe_alloc, file_dup |
| sys_waitpid → spin-yield on TSTATE_DEAD |
| sys_exit → file_close * NOFILE, thread_exit |
| sys_kill → tcb_add_pending_signal |
| sys_sigaction → tcb_set_sigaction |
| |
| Thread Scheduler (PTQueue / PThread) |
| File System (inode / dir / path / sysfile) |
| Memory Mgr (MContainer / MPTOp) |
| Device Drivers (IDE disk, keyboard, console, LAPIC) |
+-------------------------------------------------------+
↕ Hardware (x86 i386)
Shell Process (pid=2)
│
├── sys_pipe() → pipefd[0]=read, pipefd[1]=write
│
├── spawn CAT (pid=511)
│ FD 0 → test.txt (inode on disk)
│ FD 1 → pipefd[1] (write-end of pipe)
│
├── close(pipefd[1]) ← CRITICAL: drop parent's write ref
│
├── spawn ROT13 (pid=512)
│ FD 0 → pipefd[0] (read-end of pipe)
│ FD 1 → console (terminal screen)
│
├── close(pipefd[0]) ← drop parent's read ref
│
├── sys_waitpid(511) ← waits for CAT
└── sys_waitpid(512) ← waits for ROT13
| Component | Technology |
|---|---|
| OS Kernel | mCertikOS (x86 i386, formally verified C) |
| Compiler | i686-elf-gcc (cross-compiler for bare-metal) |
| Linker | i686-elf-ld |
| Emulator | QEMU (qemu-system-x86_64) |
| Build System | GNU Make + Python (make_image.py) |
| Language | C (GNU89 standard) |
| Disk Image | Raw x86 disk image with custom MBR |
| Shell Source | user/shell/shell.c |
| Kernel Syscalls | kern/trap/TSyscall/TSyscall.c |
Root cause: Shell was printing >: immediately after spawning child processes without waiting for them.
Solution: Implemented sys_waitpid() which spin-yields in the kernel until tcb_get_state(target_pid) == TSTATE_DEAD. The shell now calls this for every spawned PID before reprinting the prompt.
Root cause: After spawning CAT and ROT13, the parent shell still held an open write-end reference to the pipe. When CAT exited, the pipe's write reference count dropped from 2 to 1 (not 0), so ROT13's read() kept blocking waiting for more data.
Solution: The parent shell close(pipefd[1]) immediately after spawning each stage. This drops the reference count to 1 (owned only by the child), so when the child exits the count hits 0 and the read-end gets a proper EOF.
Root cause: The original sys_exit just marked the process dead and called thread_exit(), leaving pipe file structures with dangling open references.
Solution: Added a loop in sys_exit that iterates all NOFILE slots in the exiting process's TCB and calls file_close() on each open file before terminating.
Root cause: sys_spawn created a new process with a NULL CWD inode, so spawned LS or CAT couldn't resolve relative paths.
Solution: Added tcb_set_cwd(new_pid, inode_dup(...tcb_get_cwd(curid)...)) in sys_spawn to duplicate the parent shell's CWD inode into the child.
Root cause: The original code passed a char literal '\0' (integer 0) as a pointer to strcpy, causing a null-pointer dereference.
Solution: Changed to properly initialize path[0] = '\0' and pass the array pointer.
Root cause: The original _shell_cat called read() once with a 99-byte buffer and stopped.
Solution: Changed _shell_cat to loop with while ((n = read(fd, buf, sizeof(buf)-1)) > 0) until EOF.
Install all required tools using Homebrew:
# Install Homebrew if not already installed
/bin/bash -c "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/HEAD/install.sh)"
# Install QEMU (x86 emulator)
brew install qemu
# Install the i386-elf cross-compiler toolchain
brew tap nativeos/i386-elf-toolchain
brew install i386-elf-binutils i386-elf-gccVerify the tools are available:
qemu-system-x86_64 --version # Should print QEMU version
i686-elf-gcc --version # Should print GCC cross-compiler versionNote: Python 3 is also required for the disk image build script. It is typically pre-installed on macOS. Check with
python3 --version.
git clone <your-repo-url> mcertikos
cd mcertikosmakeA successful build ends with:
All targets of boot loader are done.
All targets of kernel are compiled.
All targets of user are done.
Building Certikos Image...
All done.
All targets are done.
This produces certikos.img (bootable disk image) and certikos_disk.img (file system disk).
Option A — With output in your terminal (no-X mode, recommended):
make qemu-noxOption B — With a separate QEMU window:
make qemuTo exit QEMU at any time, press
Ctrl-athenx.
After boot you will see:
********Welcome to left-mid-right shell*********
********This is the final project for CPSC 422/522 Operating Systems in Yale********
********Author: Bo Song, Haoliang Zhang********
********Date: 12/18/2015 ********
>:
The >: prompt means the shell is ready to accept commands.
>: ls # List files in current directory
>: pwd # Print working directory
>: mkdir mydir # Create a directory
>: cd mydir # Change into that directory
>: touch hello.txt # Create an empty file
>: write "hello world" hello.txt # Write text to the file
>: cat hello.txt # Print the file contents → hello world
>: append " again" hello.txt
>: cat hello.txt # → hello world again
>: rm hello.txt # Remove the file
>: cd .. # Go back up
>: rm -r mydir # Remove directory recursively>: LS # Spawns ls as a child process
>: CAT test.txt # Spawns cat to read test.txt>: CAT test.txt | ROT13Expected output: Contents of test.txt cipher-shifted (A→N, B→O, etc.)
>: CAT test.txt | ROT13
Gur dhvpx oebja sbk whzcf bire gur ynml qbt.
>:
>: spawn 1 # Spawn the ping process (elf_id=1)
>: spawn 2 # Spawn the pong process (elf_id=2)
>: kill -9 3 # Send SIGKILL to process PID 3
>: trap 2 # Register shell's SIGINT handler
>: help # Print all available commands| Issue | Fix |
|---|---|
i686-elf-gcc: command not found |
Run brew install i386-elf-gcc |
qemu-system-x86_64: command not found |
Run brew install qemu |
make: python3: No such file |
Run brew install python3 |
| QEMU freezes at boot | Try make clean && make then retry |
| Black screen in QEMU window | Use make qemu-nox instead |
certikos_disk.img not found |
Run make mkfs to regenerate the filesystem disk |
make clean # Remove all compiled objects and disk image
make # Rebuild from scratch| File | Purpose |
|---|---|
user/shell/shell.c |
Shell REPL, built-in commands, pipeline runner |
kern/trap/TSyscall/TSyscall.c |
Kernel syscall implementations (spawn, exit, waitpid, pipe, kill, sigaction) |
kern/fs/sysfile.c |
File system syscalls (open, read, write, pipe) |
kern/fs/file.c |
File struct management (file_dup, file_close, console_init_fds) |
user/lib/proc.c |
spawn_with_fds() user-library wrapper |
user/cat/cat.c |
Standalone CAT process binary |
user/ls/ls.c |
Standalone LS process binary |
user/rot13/rot13.c |
Standalone ROT13 process binary |
Makefile |
Top-level build orchestration |
make_image.py |
Assembles bootable disk image |