A complete compiler for the Fun functional programming language, implemented in Scala 3. Features lexical analysis, parser combinators, static type checking, CPS transformation, and LLVM IR code generation. Includes example programs (factorial, Towers of Hanoi, Mandelbrot set) and generates executable native code.

codingkabs/FunCompiler

Fun Language Compiler Project

A complete compiler implementation for the Fun programming language, written in Scala 3. This project demonstrates the full compilation pipeline from source code to executable LLVM IR.

Project Overview

This is a functional programming language compiler that translates Fun source code into LLVM Intermediate Representation (IR), which can then be compiled to native machine code using LLVM tools like clang.

Key Features

  • Lexical Analysis: Custom tokenizer using regular expressions
  • Parsing: Parser combinator-based recursive descent parser
  • Type System: Static type checking with support for Int, Double, and Void types
  • CPS Transformation: Continuation-Passing Style transformation that makes evaluation order and intermediate results explicit before code generation
  • LLVM Code Generation: Direct compilation to LLVM IR
  • Built-in Functions: Support for I/O operations (print_int, print_char, print_string, etc.)

Language Features

The Fun language supports:

Data Types

  • Integers (Int): 32-bit signed integers
  • Floating-point (Double): Double-precision floating-point numbers
  • Void: For functions with no return value
  • Character constants: Single characters in single quotes
  • String literals: String constants in double quotes

Language Constructs

  • Function definitions with typed parameters and return types
  • Conditional expressions (if-then-else)
  • Arithmetic operations: +, -, *, /, %
  • Comparison operations: ==, !=, <, <=, >, >=
  • Function calls with argument passing
  • Variable bindings with val declarations
  • Sequential execution with semicolon (;) operator

Built-in Functions

  • print_int(x: Int): Print an integer
  • print_char(c: Int): Print a character
  • print_string(s: String): Print a string
  • print_star(): Print an asterisk
  • print_space(): Print a space
  • new_line(): Print a newline
  • skip(): No-op function

Project Structure

cflresit2025-kabir505-main/
├── README.md                 # This file
├── resit/
│   ├── resit.sc             # Main compiler implementation
│   ├── fun_parser.sc        # Parser combinator implementation
│   ├── fun_tokens.sc        # Lexical analysis and tokenization
│   ├── examples/            # Example Fun programs
│   │   ├── fact.fun         # Factorial computation
│   │   ├── hanoi.fun        # Towers of Hanoi
│   │   ├── mand.fun         # Mandelbrot set (basic)
│   │   ├── mand2.fun        # Mandelbrot set (with chars)
│   │   └── sqr.fun          # Square numbers
│   └── *.ll                 # Generated LLVM IR files

Architecture

1. Lexical Analysis (fun_tokens.sc)

  • Uses regular expressions to define language tokens
  • Implements Brzozowski derivatives for efficient tokenization
  • Supports keywords, identifiers, numbers, operators, and punctuation
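
The derivative idea behind the tokenizer can be sketched in a few lines. This is a minimal matcher only, not the code in fun_tokens.sc: the names Rexp, nullable, der, matches and the sample pattern bits are assumptions made for illustration, and a real lexer would also record which token matched and simplify intermediate expressions.

```scala
// Regular expressions as an algebraic data type (illustrative names).
sealed trait Rexp
case object ZERO extends Rexp                      // matches nothing
case object ONE extends Rexp                       // matches only the empty string
case class CHAR(c: Char) extends Rexp              // matches a single character
case class ALT(r1: Rexp, r2: Rexp) extends Rexp    // r1 or r2
case class SEQ(r1: Rexp, r2: Rexp) extends Rexp    // r1 followed by r2
case class STAR(r: Rexp) extends Rexp              // zero or more repetitions of r

// Can the regular expression match the empty string?
def nullable(r: Rexp): Boolean = r match {
  case ZERO => false
  case ONE => true
  case CHAR(_) => false
  case ALT(r1, r2) => nullable(r1) || nullable(r2)
  case SEQ(r1, r2) => nullable(r1) && nullable(r2)
  case STAR(_) => true
}

// Brzozowski derivative: what is left of r after consuming the character c.
def der(c: Char, r: Rexp): Rexp = r match {
  case ZERO => ZERO
  case ONE => ZERO
  case CHAR(d) => if (c == d) ONE else ZERO
  case ALT(r1, r2) => ALT(der(c, r1), der(c, r2))
  case SEQ(r1, r2) =>
    if (nullable(r1)) ALT(SEQ(der(c, r1), r2), der(c, r2))
    else SEQ(der(c, r1), r2)
  case STAR(r1) => SEQ(der(c, r1), STAR(r1))
}

// A string matches if the derivative over all its characters is nullable.
def matches(r: Rexp, s: String): Boolean =
  nullable(s.foldLeft(r)((acc, c) => der(c, acc)))

// Hypothetical token pattern: one or more binary digits.
val bit: Rexp = ALT(CHAR('0'), CHAR('1'))
val bits: Rexp = SEQ(bit, STAR(bit))
```

With these definitions, matches(bits, "0110") holds, while the empty string and "012" are rejected. No automaton is built; matching is just repeated rewriting of the expression.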

2. Parsing (fun_parser.sc)

  • Parser combinator library for building recursive descent parsers
  • Grammar rules for expressions, statements, and declarations
  • Handles operator precedence and associativity
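
As a rough illustration of the combinator style, the sketch below parses arithmetic with the usual precedence. It is not the library in fun_parser.sc: the Parser class, the helpers char and number, and the three grammar rules are hypothetical, the input is a raw string rather than a token list, and the rules compute a value directly instead of building a syntax tree.

```scala
// A parser consumes a prefix of the input and returns a result plus the rest.
case class Parser[A](run: String => Option[(A, String)]) {
  // Transform the result of a successful parse.
  def map[B](f: A => B): Parser[B] =
    Parser(in => run(in).map { case (a, rest) => (f(a), rest) })
  // Sequencing: run this parser, then the parser chosen by f on the rest.
  def flatMap[B](f: A => Parser[B]): Parser[B] =
    Parser(in => run(in).flatMap { case (a, rest) => f(a).run(rest) })
  // Alternative: try this parser and fall back to `other` if it fails.
  def |(other: => Parser[A]): Parser[A] =
    Parser(in => run(in).orElse(other.run(in)))
}

// Primitive parsers (hypothetical helpers).
def char(c: Char): Parser[Char] =
  Parser(in => if (in.nonEmpty && in.head == c) Some((c, in.tail)) else None)

val number: Parser[Int] = Parser { in =>
  val digits = in.takeWhile(_.isDigit)
  if (digits.isEmpty) None else Some((digits.toInt, in.drop(digits.length)))
}

// Precedence is encoded in the layering of the rules:
//   expr   ::= term '+' expr | term
//   term   ::= factor '*' term | factor
//   factor ::= number | '(' expr ')'
def expr: Parser[Int] =
  (for { t <- term; _ <- char('+'); e <- expr } yield t + e) | term
def term: Parser[Int] =
  (for { f <- factor; _ <- char('*'); t <- term } yield f * t) | factor
def factor: Parser[Int] =
  number | (for { _ <- char('('); e <- expr; _ <- char(')') } yield e)
```

Because `*` is handled one level below `+`, expr.run("2+3*4") evaluates the product first, and parentheses re-enter the grammar at the top level. The right-hand side of `|` is passed by name, which keeps the mutually recursive rules from looping while they are being constructed.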

3. Type System (resit.sc)

  • Static type checking with type inference
  • Type environment management
  • Support for function types and variable bindings
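
A checker along these lines could look roughly as follows. This is a sketch under assumptions, not the code in resit.sc: the AST (IntLit, DoubleLit, Ref, BinOp, Apply), the FunTy signature record and the rule that both operands of an arithmetic operator must share one non-Void type are all invented here for illustration.

```scala
// Types of the language as listed above.
sealed trait Ty
case object IntTy extends Ty
case object DoubleTy extends Ty
case object VoidTy extends Ty

// A tiny expression language (hypothetical, not the project's AST).
sealed trait Expr
case class IntLit(i: Int) extends Expr
case class DoubleLit(d: Double) extends Expr
case class Ref(name: String) extends Expr
case class BinOp(op: String, l: Expr, r: Expr) extends Expr
case class Apply(fn: String, args: List[Expr]) extends Expr

// Function signature: parameter types and return type.
case class FunTy(params: List[Ty], ret: Ty)

// Infers the type of an expression or returns an error message.
// `vars` and `funs` play the role of the type environment.
def typeOf(e: Expr, vars: Map[String, Ty], funs: Map[String, FunTy]): Either[String, Ty] =
  e match {
    case IntLit(_) => Right(IntTy)
    case DoubleLit(_) => Right(DoubleTy)
    case Ref(x) => vars.get(x).toRight(s"undefined variable $x")
    case BinOp(op, l, r) =>
      (typeOf(l, vars, funs), typeOf(r, vars, funs)) match {
        case (Left(msg), _) => Left(msg)
        case (_, Left(msg)) => Left(msg)
        case (Right(tl), Right(tr)) =>
          // Assumed rule: operands must agree and must not be Void.
          if (tl == tr && tl != VoidTy) Right(tl)
          else Left(s"type mismatch in $op: $tl vs $tr")
      }
    case Apply(fn, args) =>
      funs.get(fn) match {
        case None => Left(s"undefined function $fn")
        case Some(FunTy(params, ret)) =>
          val argTys = args.map(a => typeOf(a, vars, funs))
          argTys.collectFirst { case Left(msg) => msg } match {
            case Some(msg) => Left(msg)
            case None =>
              val actual = argTys.collect { case Right(t) => t }
              if (actual == params) Right(ret)
              else Left(s"wrong argument types for $fn")
          }
      }
  }
```

Returning Either keeps error reporting explicit: the first problem found is passed up unchanged, and a successful result carries the inferred type.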

4. CPS Transformation

  • Converts direct-style expressions to Continuation-Passing Style
  • Makes evaluation order explicit, so nested expressions and control flow can be emitted as a linear sequence of instructions
  • Handles function calls, conditionals, and sequencing
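
One common way to write such a transformation is shown below. It is a sketch of the general technique, not the code in resit.sc: the source AST (Num, Var, Aop, Call), the let-based target form (KLet, KReturn and friends) and the helper fresh are assumed names, and conditionals and sequencing are left out to keep it short.

```scala
// Direct-style source expressions (hypothetical AST).
sealed trait Exp
case class Num(i: Int) extends Exp
case class Var(s: String) extends Exp
case class Aop(op: String, a1: Exp, a2: Exp) extends Exp
case class Call(name: String, args: List[Exp]) extends Exp

// Target form: operands are simple values and every result gets a name.
sealed trait KVal
case class KNum(i: Int) extends KVal
case class KVar(s: String) extends KVal
case class Kop(op: String, v1: KVal, v2: KVal) extends KVal
case class KCall(name: String, args: List[KVal]) extends KVal

sealed trait KExp
case class KLet(x: String, v: KVal, body: KExp) extends KExp
case class KReturn(v: KVal) extends KExp

// Fresh temporary names: tmp_1, tmp_2, ...
var counter = 0
def fresh(prefix: String): String = { counter += 1; s"${prefix}_$counter" }

// The continuation k says what to do with the value of e once it is available.
def cps(e: Exp)(k: KVal => KExp): KExp = e match {
  case Num(i) => k(KNum(i))
  case Var(s) => k(KVar(s))
  case Aop(op, e1, e2) =>
    cps(e1)(v1 => cps(e2)(v2 => {
      val z = fresh("tmp")
      KLet(z, Kop(op, v1, v2), k(KVar(z)))
    }))
  case Call(name, args) =>
    // Evaluate the arguments left to right, then bind the call result.
    def loop(rest: List[Exp], done: List[KVal]): KExp = rest match {
      case Nil =>
        val z = fresh("tmp")
        KLet(z, KCall(name, done), k(KVar(z)))
      case a :: tail => cps(a)(v => loop(tail, done :+ v))
    }
    loop(args, Nil)
}

// Top-level entry point: the final continuation returns the value.
def cpsTop(e: Exp): KExp = cps(e)(v => KReturn(v))
```

Applied to n * fact(n - 1), this produces three nested bindings: one for the subtraction, one for the call and one for the multiplication, followed by a return of the last temporary. The nesting order is the evaluation order.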

5. LLVM Code Generation

  • Direct translation from CPS intermediate representation to LLVM IR
  • Type-aware code generation for integers and floating-point
  • Proper handling of function calls and control flow
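
Once every intermediate result has a name, emitting LLVM IR is mostly string formatting. The sketch below assumes a simplified let-based input (Atom, LetOp, LetCall, Ret) and handles only i32 arithmetic and calls; these names, the instr table and emitFunction are illustrative and are not taken from resit.sc.

```scala
// Operands that need no further computation.
sealed trait Atom
case class Reg(name: String) extends Atom   // a named value such as %n
case class Const(i: Int) extends Atom       // an integer literal

// Let-based statements: each intermediate result already has a name.
sealed trait Stmt
case class LetOp(x: String, op: String, a: Atom, b: Atom, rest: Stmt) extends Stmt
case class LetCall(x: String, fn: String, args: List[Atom], rest: Stmt) extends Stmt
case class Ret(a: Atom) extends Stmt

def atom(a: Atom): String = a match {
  case Reg(n) => s"%$n"
  case Const(i) => i.toString
}

// Source operator to LLVM instruction on i32 (assumed subset).
def instr(op: String): String = op match {
  case "+" => "add i32"
  case "-" => "sub i32"
  case "*" => "mul i32"
  case "/" => "sdiv i32"
  case "%" => "srem i32"
  case other => sys.error(s"unsupported operator $other")
}

// One line of LLVM IR per statement.
def emit(s: Stmt): List[String] = s match {
  case LetOp(x, op, a, b, rest) =>
    s"   %$x = ${instr(op)} ${atom(a)}, ${atom(b)}" :: emit(rest)
  case LetCall(x, fn, args, rest) =>
    val argList = args.map(a => s"i32 ${atom(a)}").mkString(", ")
    s"   %$x = call i32 @$fn($argList)" :: emit(rest)
  case Ret(a) => List(s"   ret i32 ${atom(a)}")
}

// Wrap a body in a function definition with i32 parameters and result.
def emitFunction(name: String, params: List[String], body: Stmt): String = {
  val paramList = params.map(p => s"i32 %$p").mkString(", ")
  (s"define i32 @$name($paramList) {" :: emit(body) ::: List("}")).mkString("\n")
}
```

A Double-aware generator would additionally pick fadd, fsub, fmul and fdiv and use the double type, which is where the type information from the checker comes in.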

Example Programs

Factorial (fact.fun)

def fact(n: Int) : Int =
  if n == 0 then 1 else n * fact(n - 1);

def facT(n: Int, acc: Int) : Int =
  if n == 0 then acc else facT(n - 1, n * acc);

def top() : Void = {
  print_int(fact(6));
  print_char('\n')
};

top()

Towers of Hanoi (hanoi.fun)

def hanoi(n: Int, a: Int, b: Int, c: Int) : Void =
  if n != 0 then {
    hanoi(n - 1, a, c, b);
    print_int(a);
    print_char('-'); print_char('>');
    print_int(b);
    print_char('\n');
    hanoi(n - 1, c, b, a)
  } else skip();

hanoi(4,1,2,3)

Mandelbrot Set (mand.fun)

This excerpt refers to Maxiters and y_iter, whose definitions are not shown here.

val Ymin: Double = -1.3;
val Ymax: Double =  1.3;
val Ystep: Double = 0.05;

def m_iter(m: Int, x: Double, y: Double, zr: Double, zi: Double) : Void = {
  if Maxiters <= m
  then print_star() 
  else {
    if 4.0 <= zi*zi+zr*zr then print_space() 
    else m_iter(m + 1, x, y, x+zr*zr-zi*zi, 2.0*zr*zi+y) 
  }
};

y_iter(Ymin)

Usage

Prerequisites

  • Scala 3.x (run via scala-cli) or Ammonite (amm)
  • LLVM/Clang for compiling generated IR to executable code

Compilation Process

  1. Compile Fun source to LLVM IR:

    amm resit.sc -- fact.fun
    # or
    scala-cli resit.sc -- fact.fun

  2. Compile LLVM IR to executable:

    clang fact.ll -o fact

  3. Run the executable:

    ./fact

One-step Compilation and Execution

amm resit.sc -- run fact.fun

This will:

  1. Parse the Fun source code
  2. Generate LLVM IR
  3. Compile to native executable
  4. Execute the program

Technical Details

Type System

  • Type Inference: Automatic type inference for expressions
  • Type Checking: Static type checking with error reporting
  • Type Environment: Maintains variable and function type information

CPS Transformation

The compiler uses Continuation-Passing Style transformation to:

  • Convert direct-style function calls to continuation-based calls
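  • Make the order of evaluation in conditionals and sequences explicit
  • Give each intermediate result a name (such as %tmp_1 in the generated IR), which simplifies the mapping to LLVM instructions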

LLVM Code Generation

  • Type-aware compilation: Different code generation for Int vs Double
  • Function calls: Proper parameter passing and return value handling
  • Control flow: Conditional branches and function returns
  • Built-in functions: Integration with C standard library via printf

Error Handling

  • Parse errors: Detailed error messages with location information
  • Type errors: Type mismatch detection and reporting
  • Undefined names: Graceful handling of undefined variables/functions

Generated LLVM IR Example

For the factorial function, the compiler generates:

define i32 @fact(i32 %n) {
   %tmp_1 = icmp eq i32 %n, 0
   br i1 %tmp_1, label %then_5, label %else_6
then_5:
   ret i32 1
else_6:
   %tmp_2 = sub i32 %n, 1
   %tmp_3 = call i32 @fact(i32 %tmp_2)
   %tmp_4 = mul i32 %n, %tmp_3
   ret i32 %tmp_4
}

Development Notes

This compiler demonstrates several advanced compiler techniques:

  • Parser combinators for clean, composable parsing
  • Regular expression derivatives for efficient lexical analysis
  • CPS transformation for proper compilation of functional programs
  • Type inference and static type checking
  • LLVM code generation directly from the CPS form, with no further intermediate representations

The implementation is modular and extensible, making it easy to add new language features or optimize the generated code.
