Custom C Compiler with Function Call Optimizations

A custom C compiler, written in Python on top of pycparser, that implements optimizations to reduce function call overhead and to give interrupt handlers register-only ("zero-latency") access to kernel flags.

Features

Indexed-Jump Function Calls: Functions smaller than 1024 bytes are co-located in memory and invoked via indexed-jump instructions.
Metamorphic Return Sites: For functions with a single return site, the caller writes the return address bytes directly into the callee's return jump instruction (self-modifying code) instead of pushing them onto the stack, which saves the 8 bytes a stack-based return address would occupy.
Quantized Call-Backs: Return sites are memory-aligned to 16 bytes, so the offset divided by 16 fits in a single byte and covers a 4096-byte window.
SIMD Bit-Packing: Global variables with 1-bit to 7-bit types are automatically packed into the last SIMD register (xmm15), which standard compilers seldom allocate unless register pressure is high. Reading a packed flag then needs no memory access, which benefits frequently accessed kernel flags.
Zero-Latency Kernels: Key kernel flags are accessed via inline assembly directly in the SIMD register, so hardware interrupt callbacks issue no memory reads for them. This avoids the pipeline stalls that a cache-missing load of an ordinary global variable can cause.
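
The arithmetic behind the Quantized Call-Backs entry can be sketched in Python. This is an illustrative model only, not the compiler's actual code: the names ALIGN, align_up, encode_return_site and decode_return_site are hypothetical, and measuring offsets from a region base address is an assumption.

```python
ALIGN = 16  # return sites are aligned to 16 bytes (from the feature list)


def align_up(addr: int, align: int = ALIGN) -> int:
    """Round addr up to the next multiple of align."""
    return (addr + align - 1) // align * align


def encode_return_site(base: int, site: int) -> int:
    """Return the one-byte index of a return site relative to base (hypothetical helper)."""
    offset = site - base
    if offset < 0 or offset % ALIGN != 0:
        raise ValueError("offset from base must be a non-negative multiple of 16")
    index = offset // ALIGN
    if index > 0xFF:
        raise ValueError("return site is outside the 4096-byte window")
    return index


def decode_return_site(base: int, index: int) -> int:
    """Recover the absolute address from the one-byte index."""
    return base + index * ALIGN
```

For example, a return site 0x30 bytes past the base encodes to index 3, and any site more than 255 * 16 bytes past the base is rejected.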

Installation

pip install -r requirements.txt

Usage

python compiler.py input.c -o output.asm

Architecture

parser.py: Parses C code with pycparser and extracts functions and global variables
analyzer.py: Analyzes functions (size, return sites) and identifies global variables eligible for SIMD bit-packing
codegen.py: Generates code with the optimizations applied, including SIMD bit-packing and zero-latency access
compiler.py: Main compiler entry point

SIMD Bit-Packing Details

Global variables with bit-widths of 1-7 bits are automatically detected and packed into the xmm15 SIMD register. This includes:

Bit-field declarations (e.g., int flag : 1)
Custom bit-width types (e.g., int3_t, uint5_t)
Small integer types that fit in 1-7 bits
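
One way to picture the packing step is as a layout pass that assigns each eligible global a bit offset inside the 128-bit register. The sketch below is an assumption about how such a pass could look, not the logic in analyzer.py: PackedField, assign_fields and the first-come, consecutive-offset policy are all hypothetical.

```python
from dataclasses import dataclass

REGISTER_BITS = 128  # width of xmm15


@dataclass(frozen=True)
class PackedField:
    name: str
    offset: int  # bit position of the field's least significant bit
    width: int   # field width in bits (1-7)

    @property
    def mask(self) -> int:
        """Mask selecting this field inside the 128-bit value."""
        return ((1 << self.width) - 1) << self.offset


def assign_fields(globals_by_width: dict[str, int]) -> list[PackedField]:
    """Assign consecutive bit offsets to every global that is 1-7 bits wide."""
    fields = []
    next_bit = 0
    for name, width in globals_by_width.items():
        if not 1 <= width <= 7:
            continue  # wider variables stay in ordinary memory
        if next_bit + width > REGISTER_BITS:
            raise ValueError("packed globals do not fit in 128 bits")
        fields.append(PackedField(name, next_bit, width))
        next_bit += width
    return fields


# Hypothetical globals: name -> bit width
layout = assign_fields({"irq_enabled": 1, "mode": 3, "counter": 32, "prio": 5})
print([(f.name, f.offset, f.width) for f in layout])
# → [('irq_enabled', 0, 1), ('mode', 1, 3), ('prio', 4, 5)]
```

This simple policy lets a field straddle the boundary between the two 64-bit halves of the register; a real layout would probably pad to avoid that, since the halves are typically extracted separately.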

The compiler generates:

Initialization code that packs variables into the SIMD register at startup
Inline assembly for zero-latency read/write operations
Special handling for interrupt callback functions (detected by naming patterns like isr_*, irq_*, *_handler, *_callback)
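
The naming-pattern detection listed above could be implemented with shell-style globs from the standard library. This is a minimal sketch, assuming case-sensitive matching; ISR_PATTERNS and is_interrupt_callback are hypothetical names, and only the four patterns come from this README.

```python
from fnmatch import fnmatchcase

# Patterns taken from the README; matching is case-sensitive by assumption.
ISR_PATTERNS = ("isr_*", "irq_*", "*_handler", "*_callback")


def is_interrupt_callback(func_name: str) -> bool:
    """Return True if the function name matches any interrupt naming pattern."""
    return any(fnmatchcase(func_name, pattern) for pattern in ISR_PATTERNS)
```

Name-based detection is a heuristic: an ordinary function such as a user-space event_handler would also match, so names should be chosen with these patterns in mind.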

Zero-Latency Access

During interrupt callbacks, accessing packed global variables uses direct SIMD register operations instead of memory reads. This eliminates:

Memory access latency
Pipeline stalls
Cache misses

All operations are register-to-register, so critical kernel flags are read and written without touching the memory hierarchy; this is what the project calls zero-latency access.
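
The generated inline assembly is not shown in this README, but the register-to-register logic it has to perform is a shift-and-mask read and a clear-then-OR write. The Python model below illustrates that logic on a 128-bit integer standing in for xmm15; read_field and write_field are hypothetical names, not functions from codegen.py.

```python
REGISTER_MASK = (1 << 128) - 1  # xmm15 holds 128 bits


def read_field(reg: int, offset: int, width: int) -> int:
    """Extract a width-bit field: shift right, then mask."""
    return (reg >> offset) & ((1 << width) - 1)


def write_field(reg: int, offset: int, width: int, value: int) -> int:
    """Return reg with the field replaced: clear the old bits, then OR in the new ones."""
    if value >> width:
        raise ValueError("value does not fit in the field")
    field_mask = ((1 << width) - 1) << offset
    return ((reg & ~field_mask) | (value << offset)) & REGISTER_MASK


reg = write_field(0, 1, 3, 5)    # store 5 in a 3-bit field at bit 1
reg = write_field(reg, 0, 1, 1)  # set a 1-bit flag at bit 0
print(bin(reg), read_field(reg, 1, 3))
# → 0b1011 5
```

Neither operation refers to a memory address, which is the property the section relies on.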
