Skip to content

agnarrarendelle/Racket-Compiler

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

74 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Racket Compiler in Rust

This project is a work in progress and is not yet complete.

This is an implementation of a compiler for a subset of the Racket programming language, written in Rust. The compiler is structured into a frontend, a middle-end with several transformation passes, and a backend for code generation.

Compiler Passes

The compilation process involves several passes to transform the source code from a high-level Racket representation to low-level machine code.

Frontend

Lexer

The lexer takes the raw Racket source code as a string and breaks it down into a stream of tokens. For example, the code (let ([x 10]) (+ x 5)) would be tokenized into a sequence like LPAREN, LET, LPAREN, LBRACKET, ID(x), INT(10), RBRACKET, RPAREN, LPAREN, PLUS, ID(x), INT(5), RPAREN, RPAREN.

Parser

The parser takes the stream of tokens from the lexer and builds an Abstract Syntax Tree (AST). The AST is a hierarchical representation of the code's structure.

Middle-end Passes

Uniquifier

This pass ensures that all variable names in the program are unique. This avoids naming conflicts in nested scopes.

Example:

; Before
(let ([x 1])
  (let ([x 2])
    x))
; After
(let ([x.1 1])
  (let ([x.2 2])
    x.2))

Shrinker

The shrinker simplifies complex expressions by breaking them down into a series of simpler assignments to temporary variables.

Example:

; Before
(+ 1 (+ 2 3))
; After
(let ([tmp 5])
  (+ 1 tmp))

Complex Operator Remover

This pass ensures that the arguments to operators are atomic (i.e., variables or immediate values). It does this by creating temporary variables for any complex expressions found as arguments.

Example 1: Binary Operation

; Before
(+ (+ 42 10) (- 10))
; After
(let ([tmp_0 (+ 42 10)])
  (let ([tmp_1 (- 10)])
    (+ tmp_0 tmp_1)))

Example 2: If Condition

; Before
(if (< (+ 1 2) 5) 10 20)
; After
(if (let ([tmp_0 (+ 1 2)]) (< tmp_0 5)) 10 20)

Explicate Control Converter

This pass makes the control flow of the program explicit. It converts the program into a control-flow graph consisting of basic blocks. This is a form of "Continuation-Passing Style" (CPS).

Example:

; Before
(let ([x 1])
  (if (eq? x 0)
      42
      777))
; After (in C0 representation)
start:
  x := 1;
  if (eq? x 0) goto block_0 else goto block_1

block_0:
  return 42

block_1:
  return 777

Instruction Selector

This pass translates the intermediate representation of the program into machine-specific instructions (in this case, for x86-64).

Example:

; IR
(movq $10 %rax)
; x86
movq $10, %rax

Home Assigner

This pass assigns a "home" location (a register or a stack slot) for each variable. It replaces abstract variable references with concrete memory locations on the stack, relative to the base pointer (%rbp).

Example:

; Before
movq $42, x
; After
movq $42, -8(%rbp)

Instruction Patcher

This pass ensures that each instruction adheres to the x86 constraint that at most one argument can be a memory reference. It does this by introducing temporary register moves.

Example:

; Before
movq -8(%rbp), -16(%rbp)
; After
movq -8(%rbp), %rax
movq %rax, -16(%rbp)

Register Allocator

This pass assigns machine registers to the variables in the program to minimize memory access and improve performance.

Prelude and Conclusion

This pass adds the necessary boilerplate code at the beginning and end of the program, such as setting up the stack frame and returning a value.

Backend

x86 Code Generation

The final stage of the compiler, where the x86-64 assembly code is generated from the transformed and optimized intermediate representation.

About

A mimimum Racket to x86 compiler in Rust

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors