A (partial) C compiler with a few backends to choose from.
- LLVM: A modular and reusable compiler toolchain (also used by clang, rustc, swift, etc).
- Cranelift: A newer compiler backend toolchain prioritising fast compilation over optimisation (used by wasmtime and maybe rustc debug builds eventually).
- Aarch64 ASM: Emitting the assembly myself. Very very poor performance compared to real backends.
- Custom VM: Directly execute my IR. Useful for narrowing down whether bugs are in parsing or use of backend apis. Performance is not a priority.
This is my first self-directed foray into writing compilers after reading Crafting Interpreters. It's not an optimising compiler, but it's certainly a compiling compiler, and that ought to be enough for anybody. Besides, LLVM can probably fix anything pathological on its own.
There's a simple CLI for compiling a single source file. See the bottom of bin/cli.rs for usage details.
- float/double, unsigned int/long
- `+ - * / % >= <= > < ==`
- variables, pointers (`&a, *b, c[2]`), arrays
- if/else, loops (for, while, do while), break/continue
- functions
- structs, typedefs
See examples in tests.rs.
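As a rough illustration of the supported subset, here is a small sketch that sticks to constructs from the list above. The function and type names are made up for this example, and it assumes the compiler accepts `(*p).x` field access and plain `i = i + 1` increments. It deliberately avoids things listed as missing further down (signed values, short-circuiting `&&`/`||`, init literals, passing structs by value), and it has no `#include` because there is no preprocessor.

```c
/* Hypothetical example: only uses features from the supported list. */
typedef struct Point {
    unsigned long x;
    unsigned long y;
} Point;

/* Structs are taken by pointer, since pass-by-value is listed as missing. */
unsigned long manhattan(Point *p) {
    return (*p).x + (*p).y;
}

/* Sum the squares of the first n elements of an array. */
unsigned long sum_squares(unsigned long *values, unsigned long n) {
    unsigned long total = 0;
    unsigned long i;
    for (i = 0; i < n; i = i + 1) {
        total = total + values[i] * values[i];
    }
    return total;
}

/* Count how many numbers in 1..limit are divisible by d. */
unsigned long count_multiples(unsigned long limit, unsigned long d) {
    unsigned long count = 0;
    unsigned long i = 1;
    while (i <= limit) {
        if (i % d == 0) {
            count = count + 1;
        }
        i = i + 1;
    }
    return count;
}
```

To try it, add a `main` that calls one of these functions and returns the result, then pass the file to the CLI.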
- Pointers are assumed to be 64 bits.
- Many features are implemented as syntax de-sugaring to other features which may have unexpected runtime performance characteristics.
- No debug info is emitted.
- This does not include a preprocessor, so it depends on something else for macros, includes, etc.
- Cranelift backend does not support calling variadic functions (like printf).
- Can only target macOS, because it uses a slightly different calling convention from other platforms.
- Outputs text, not binary executables, so it still depends on an assembler and linker.
- Not optimised. Function calls and phi nodes generate unnecessary move instructions.
- Missing features: arrays.
switch, arrays in struct fields, nested arrays, goto, enums, pass structs by value, signed values, correct char size, bitwise ops, and/or short-circuiting, struct/array init literals, write variadic functions, global/static variables, function pointers, const, unions.
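To make the de-sugaring limitation above a bit more concrete, here is a sketch of what such a rewrite can look like at the source level. The two rewrites shown (`for` into `while`, and `values[i]` into `*(values + i)`) are standard C equivalences chosen for illustration. Which rewrites this compiler actually performs is an assumption here, not something taken from its source, and the function names are hypothetical.

```c
/* Written the way you would normally write it. */
unsigned long sum_indexed(unsigned long *values, unsigned long n) {
    unsigned long total = 0;
    unsigned long i;
    for (i = 0; i < n; i = i + 1) {
        total = total + values[i];
    }
    return total;
}

/* The same function after a hypothetical de-sugaring pass:
   the for loop becomes init + while, and indexing becomes
   pointer arithmetic plus a dereference. */
unsigned long sum_desugared(unsigned long *values, unsigned long n) {
    unsigned long total = 0;
    unsigned long i;
    i = 0;
    while (i < n) {
        total = total + *(values + i);
        i = i + 1;
    }
    return total;
}
```

Both functions compute the same result. The point is that the backend only ever sees the second form, so any cost of the lowered version (extra loads, moves, or branches) shows up even when the source looks simple.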
- brainfuck backend (no libc, just `puts` & `gets`)
- CLI
- better test system
- use real linker instead of inline asm macro
- print which backend each works on separately
- steal tests from a real c compiler
- compile to wasm and make demo ui
- run in vm
- speed up the vm. it's super cringe if it's slower than python.
- run tests in github actions
- The lexer (converting source code to a stream of tokens) is generated by Logos.
  - all backends
- LLVM (with the llvm-sys Rust bindings).
  - llvm backend only
This gives an easy way to opt out of the complicated dependencies that enable more sophisticated backends. All backends are enabled by default.
- `llvm`: Enable the llvm backend.
- `logging`: For debugging the compiler. Log ast, ir, etc.