A compiler written in Rust targeting a subset of features in the C language

Jakersnell/micro-c

Micro-C

A small handwritten compiler focusing on a subset of the C language.


Supported language features include, but are not limited to:

  • Primitive types (int, char, float, double, long)
  • Arrays, pointers, and structs
  • Functions (not function pointers, yet!)
  • Control flow (if, else, while, for)
  • Expressions (arithmetic, logical, bitwise, and relational)
  • Variable declarations and assignments
  • Comments (single-line and multi-line)
  • Built-in library functions (printf, malloc, free, etc.)
  • Casting (implicit and explicit)

Notice: This compiler only supports macOS arm64 systems.

I plan on creating a web playground that will allow access to the compiler in the future. For now, the only way to use this compiler is to run it natively.


Directory


About the Project

Phases of the compiler

  1. Lexical analysis (Reading literals, keywords, operators, etc) {done}
  2. Parsing (Reading the logical structure of the code) {done}
  3. Semantic analysis (Type checking, variable validation, and the like) {in progress}
  4. Codegen (Creating the IR, linking the object files, creating a binary)
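To make the first phase concrete, here is a minimal sketch of what lexical analysis does. It is not the code used in this repository: the `Token` enum, the `KEYWORDS` list, and the `tokenize` function are hypothetical names chosen for illustration, and only a handful of keywords and single-character symbols are handled.

```rust
/// Hypothetical token type for illustration; a real lexer tracks far more
/// (source positions, floats, strings, multi-character operators, ...).
#[derive(Debug, PartialEq, Clone)]
pub enum Token {
    Keyword(String),
    Identifier(String),
    IntLiteral(i64),
    Symbol(char),
}

/// A small, assumed subset of keywords.
const KEYWORDS: [&str; 6] = ["int", "char", "if", "else", "while", "return"];

/// Turns source text into a flat list of tokens, or reports the first
/// character that does not belong to this tiny language subset.
pub fn tokenize(source: &str) -> Result<Vec<Token>, String> {
    let mut tokens = Vec::new();
    let mut chars = source.chars().peekable();

    while let Some(&c) = chars.peek() {
        if c.is_whitespace() {
            // Whitespace only separates tokens.
            chars.next();
        } else if c.is_ascii_digit() {
            // Collect a run of digits into one integer literal.
            let mut text = String::new();
            while let Some(&d) = chars.peek() {
                if !d.is_ascii_digit() {
                    break;
                }
                text.push(d);
                chars.next();
            }
            let value = text
                .parse::<i64>()
                .map_err(|e| format!("bad integer literal `{}`: {}", text, e))?;
            tokens.push(Token::IntLiteral(value));
        } else if c.is_ascii_alphabetic() || c == '_' {
            // Collect a word, then decide whether it is a keyword.
            let mut text = String::new();
            while let Some(&d) = chars.peek() {
                if !(d.is_ascii_alphanumeric() || d == '_') {
                    break;
                }
                text.push(d);
                chars.next();
            }
            if KEYWORDS.contains(&text.as_str()) {
                tokens.push(Token::Keyword(text));
            } else {
                tokens.push(Token::Identifier(text));
            }
        } else if "+-*/=;(){}<>".contains(c) {
            tokens.push(Token::Symbol(c));
            chars.next();
        } else {
            return Err(format!("unexpected character `{}`", c));
        }
    }
    Ok(tokens)
}
```

Calling `tokenize("int x = 42;")` on this sketch yields a keyword, an identifier, the `=` symbol, an integer literal, and the `;` symbol, which is the kind of stream the parsing phase then consumes.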

Goal

The goal of this project is to create a simple yet powerful compiler that can take in a file written in Micro-C and output an executable file. I chose to implement this subset of C in order to focus on core features such as loops and functions, rather than implementing every C language feature, so that the result would still be achievable.

Why

I have been programming for a few years now, and compilers always felt like an elusive black box of magic. I was extremely curious about how they work, as going from source code to a running program seemed so beyond me. I stand by the belief that anyone can achieve nearly anything so long as they work hard at it, so I began doing research. I would like to give special thanks to Immo Landwerth on YouTube: I watched his livestream series where he writes Minsk, and that gave me the inspiration to feel that it was possible to write my own compiler. I first wrote simple things like the .bf interpreter and .bf compiler on my GitHub. Then, after doing more research, I decided to start writing this compiler. I chose Rust because I wanted more experience in the language, and I really enjoy it.

How - more detailed descriptions can be found via the directory below

Compilers can be broken up into two main phases: the frontend and the backend. The frontend deals with high-level language concepts such as the rules of the language. It is implemented from scratch, as its functionality is highly specific to this language. It first tokenizes the source text, then represents the program as a tree and performs syntactic validation. The program is then lowered into a lower-level tree form, the mid-level intermediate representation, or MLIR. This form is used to perform semantic analysis on the program, both to verify its integrity and to process it further. After this, the program is briefly represented as a graph in order to perform control flow analysis. If the program makes it to this point without rejection, it is passed to the backend.

The backend works at a much lower level and deals with things like optimization and producing binary output. There are many frameworks that make the backend much easier on the developer. The Rust compiler uses LLVM. The one I am using is Cranelift, a compiler backend framework used by the Wasmtime runtime. I initially learned about it while studying the Saltwater compiler project. I chose it as my backend framework because LLVM is extremely complex and overkill for this project; Cranelift is much easier to pick up and still provides the necessary features. After the backend finishes its work, it emits a binary executable that can be run by the computer.
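As an illustration of the control flow analysis step, here is a minimal sketch of one check such an analysis commonly performs: verifying that every path through a function ends in a `return`. It is an assumption about the kind of check involved, not this compiler's implementation; the `Block` struct and the `all_paths_return` function are hypothetical names.

```rust
/// Hypothetical basic block: a straight-line run of statements plus its exits.
pub struct Block {
    /// True if the block ends in a `return` statement.
    pub returns: bool,
    /// Indices of the blocks that control may transfer to next.
    pub successors: Vec<usize>,
}

/// Returns true if every path starting at `entry` ends in a returning block.
/// A reachable block that neither returns nor has successors means control
/// can fall off the end of the function. Assumes all successor indices are
/// valid; a loop with no exit is accepted because it never falls off the end.
pub fn all_paths_return(blocks: &[Block], entry: usize) -> bool {
    let mut visited = vec![false; blocks.len()];
    let mut stack = vec![entry];

    // Depth-first walk over the blocks reachable from the entry.
    while let Some(index) = stack.pop() {
        if visited[index] {
            continue;
        }
        visited[index] = true;

        let block = &blocks[index];
        if block.returns {
            // This path is finished; nothing after a return is followed.
            continue;
        }
        if block.successors.is_empty() {
            // Dead end without a return.
            return false;
        }
        stack.extend(block.successors.iter().copied());
    }
    true
}
```

For an `if`/`else` where both branches return, the entry block has two returning successors and the check passes. If one branch lacks a `return` and has nowhere else to go, the check fails and the program would be rejected before reaching the backend.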

Testing

The compiler uses Rust's built-in test framework, with the #[cfg(test)] attribute for conditional test compilation and #[test] to mark test functions. As per Rust conventions, unit tests live in the same file as the items they test. There is an array of unit and integration tests throughout the compiler to verify the integrity of the system. Only code that passes all of these tests is pushed to this repository. In src/_c_test_files you will find the integration tests, grouped into directories by test type.
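The convention described above looks roughly like the following sketch. The function `is_identifier_start` is a hypothetical stand-in for an item in the compiler; only the `#[cfg(test)]` and `#[test]` attributes and the in-file `tests` module layout reflect the convention itself.

```rust
/// Hypothetical helper standing in for an item defined in the compiler.
pub fn is_identifier_start(c: char) -> bool {
    c.is_ascii_alphabetic() || c == '_'
}

// This module is only compiled when running `cargo test`,
// so test code never ends up in the release binary.
#[cfg(test)]
mod tests {
    // Bring the items of the enclosing file into scope.
    use super::*;

    #[test]
    fn accepts_letters_and_underscore() {
        assert!(is_identifier_start('a'));
        assert!(is_identifier_start('_'));
    }

    #[test]
    fn rejects_digits() {
        assert!(!is_identifier_start('9'));
    }
}
```

Running `cargo test` compiles the `tests` module and executes each function marked with `#[test]`; a plain `cargo build` skips the module entirely.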


Resources I have found helpful
