A (WIP) C compiler targeting LLVM IR or aarch64 assembly.

LukeGrahamLandry/seesea

seesea

A (partial) C compiler with a few backends to choose from.

  • LLVM: A modular and reusable compiler toolchain (also used by clang, rustc, swift, etc).
  • Cranelift: A newer compiler backend toolchain prioritising compilation time over optimization (used by wasmtime and maybe rustc debug builds eventually).
  • Aarch64 ASM: Emitting the assembly myself. Very very poor performance compared to real backends.
  • Custom VM: Directly execute my IR. Useful for narrowing down whether bugs are in parsing or use of backend apis. Performance is not a priority.

This is my first self-directed foray into writing compilers after reading Crafting Interpreters. It's not an optimising compiler, but it's certainly a compiling compiler, and that ought to be enough for anybody. Besides, LLVM can probably fix anything pathological on its own.

There's a simple CLI for compiling a single source file. See the bottom of bin/cli.rs for usage details.

Supported Features

  • float/double, unsigned int/long
    • + - * / % >= <= > < ==
  • variables, pointers (&a, *b, c[2]), arrays
  • if/else, loops (for, while, do while), break/continue
  • functions
  • structs, typedefs

See examples in tests.rs.
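
As a rough illustration of the kind of source the feature list above describes, here is a small program fragment that stays inside it: unsigned and double arithmetic, pointers, an array, all three kinds of control flow, a function call, and a typedef'd struct. The names `Point`, `squared_length`, `sum_skipping_threes` and `demo` are made up for this sketch, and whether seesea accepts every construct exactly as written (for example `(void)` parameter lists) is an assumption, not something checked against tests.rs. It deliberately avoids items from the TODO list such as signed values, `&&`/`||`, initialiser literals and passing structs by value.

```c
typedef struct Point {
    double x;
    double y;
} Point;

/* The struct is passed by pointer because pass-by-value is still a TODO. */
double squared_length(Point *p) {
    return (*p).x * (*p).x + (*p).y * (*p).y;
}

/* Sum an array through a pointer, skipping multiples of three. */
unsigned long sum_skipping_threes(unsigned long *values, unsigned long count) {
    unsigned long total = 0;
    unsigned long i;
    for (i = 0; i < count; i = i + 1) {
        if (values[i] % 3 == 0) {
            continue;
        }
        total = total + values[i];
    }
    return total;
}

unsigned long demo(void) {
    unsigned long values[5];
    unsigned long i = 0;
    while (i < 5) {
        values[i] = i + 1; /* fills the array with 1, 2, 3, 4, 5 */
        i = i + 1;
    }

    Point p;
    p.x = 3.0;
    p.y = 4.0;

    if (squared_length(&p) == 25.0) {
        return sum_skipping_threes(values, 5); /* 1 + 2 + 4 + 5 */
    } else {
        return 0;
    }
}
```

With a standard C compiler, `demo()` returns 12. There is no `#include` anywhere, which matters because seesea has no preprocessor.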

Limitations

  • Pointers are assumed to be 64 bits.
  • Many features are implemented by de-sugaring their syntax into other features, which may give them unexpected runtime performance characteristics.
  • No debug info is emitted.
  • This does not include a preprocessor, so it depends on something else for macros, includes, etc.
  • Cranelift backend does not support calling variadic functions (like printf).
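
To make the de-sugaring point concrete, the sketch below shows two functions that standard C treats as equivalent: one written with a `for` loop and array indexing, and one rewritten by hand into a `while` loop with explicit pointer arithmetic. Which rewrites seesea actually performs is not documented here, so this particular pairing is only an assumption about the general idea, and the function names are hypothetical.

```c
/* Direct form: a for loop with array indexing. */
unsigned long sum_for(unsigned long *values, unsigned long count) {
    unsigned long total = 0;
    unsigned long i;
    for (i = 0; i < count; i = i + 1) {
        total = total + values[i];
    }
    return total;
}

/* Hand-desugared form: a while loop and explicit pointer arithmetic.
   values[i] is defined by the C standard as *(values + i). */
unsigned long sum_desugared(unsigned long *values, unsigned long count) {
    unsigned long total = 0;
    unsigned long i = 0;
    while (i < count) {
        total = total + *(values + i);
        i = i + 1;
    }
    return total;
}
```

A compiler that lowers the first form to the second only has to implement the simpler constructs, but the generated code reflects the rewritten form rather than what was typed, which is where surprising performance can come from.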

Aarch64 Backend Specific

  • Can only target macOS, because Apple platforms use a slightly different calling convention from other aarch64 systems and only that variant is implemented.
  • Outputs text, not binary executables, so it still depends on an assembler and linker.
  • Not optimised. Function calls and phi nodes generate unnecessary move instructions.
  • Missing features: arrays.

TODO

switch, arrays in struct fields, nested arrays, goto, enums, pass structs by value, signed values, correct char size, bitwise ops, and/or short-circuiting, struct/array init literals, write variadic functions, global/static variables, function pointers, const, unions.

  • brainfuck backend (no libc, just puts & gets)
  • CLI
  • better test system
    • use real linker instead of inline asm macro
    • print which backend each works on separately
    • steal tests from a real c compiler
  • compile to wasm and make demo ui
    • run in vm
  • speed up the vm. it's super cringe if it's slower than python.
  • run tests in github actions

Libraries

  • The lexer (converting source code to a stream of tokens) is generated by Logos.
    • Used by all backends.
  • LLVM (with the llvm-sys Rust bindings).
    • Used by the llvm backend only.

Cargo feature flags give an easy way to opt out of the complicated dependencies that enable the more sophisticated backends. All backends are enabled by default.

  • llvm: Enable the llvm backend.
  • logging: For debugging the compiler. Log ast, ir, etc.
