tennix/mml

MML Compiler

A compiler for MML (Modern ML), a strictly-typed, functional-first systems language with ML-style syntax.

Features

Implemented

  • Hindley-Milner Type Inference with mandatory top-level annotations
  • Algebraic Data Types (ADTs) with pattern matching
  • Exhaustiveness Checking for match expressions
  • First-Class Functions and closures with environment capture
  • Module System using structs as first-class modules
  • Privacy Control with pub keyword
  • Records with field access and sorted field ordering
  • Tuples with projection
  • External Function Interface (FFI) for C interop
  • LLVM Backend generating optimized native code
  • Executable Programs with C runtime linking

Type System

  • Hindley-Milner type inference with let-generalization
  • Polymorphic type schemes
  • ADTs with nullary and unary constructors
  • Record types with structural typing
  • Function types with automatic currying
  • Type annotations required at top level
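Let-generalization can be illustrated with a minimal Rust sketch. It is not taken from src/typechecker.rs; the names `Type`, `Scheme`, `free_vars`, and `generalize` are hypothetical, and the type language is cut down to variables, nullary constructors, and functions.

```rust
use std::collections::BTreeSet;

/// A tiny type language (hypothetical, for illustration only).
#[derive(Debug, Clone, PartialEq)]
enum Type {
    Var(u32),                  // a type variable such as `t`
    Con(&'static str),         // a nullary constructor such as `I32`
    Fun(Box<Type>, Box<Type>), // a function type `t -> u`
}

/// A polymorphic type scheme: `forall vars. body`.
#[derive(Debug, Clone, PartialEq)]
struct Scheme {
    vars: Vec<u32>,
    body: Type,
}

/// Collect every type variable occurring in `ty`.
fn free_vars(ty: &Type, out: &mut BTreeSet<u32>) {
    match ty {
        Type::Var(v) => {
            out.insert(*v);
        }
        Type::Con(_) => {}
        Type::Fun(a, b) => {
            free_vars(a, out);
            free_vars(b, out);
        }
    }
}

/// Generalize `ty` at a `let`: quantify over the variables of `ty`
/// that are not free in the surrounding environment (`env_free`).
fn generalize(env_free: &BTreeSet<u32>, ty: &Type) -> Scheme {
    let mut fv = BTreeSet::new();
    free_vars(ty, &mut fv);
    let vars = fv.difference(env_free).copied().collect();
    Scheme { vars, body: ty.clone() }
}
```

Under these assumptions, generalizing `t0 -> t0` in an empty environment quantifies over `t0`, while the same type stays monomorphic if `t0` is already free in the environment.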

Language Features

Algebraic Data Types

type Option t = | None | Some t;
type Result t e = | Ok t | Err e;

val unwrap_or : Option I32 -> I32 -> I32 = fn opt => fn default =>
  match opt with
    | None => default
    | Some(x) => x;

Pattern Matching

val classify : I32 -> String = fn n =>
  match n with
    | 0 => "zero"
    | 1 => "one"
    | _ => "many";

Closures and Higher-Order Functions

val compose : (I32 -> I32) -> (I32 -> I32) -> I32 -> I32 =
  fn f => fn g => fn x => f(g(x));

val double : I32 -> I32 = fn x => x * 2;
val triple : I32 -> I32 = fn x => x * 3;
val times6 : I32 -> I32 = compose(double)(triple);

Module System

struct Math = {
  pub val pi = 3.14;
  pub val double = fn (x : I32) => x * 2;
  pub val add = fn (x : I32) => fn (y : I32) => x + y;
};

val result : I32 = Math.double(21);

External Functions

external puts : String -> I32 = "puts" [];
external printf : String -> I32 = "printf" [];

val main : I32 = puts("Hello, World!");

Usage

Building the Compiler

cargo build --release

Running Programs

# Build to LLVM IR
cargo run -- build input.mml

# Build and run
cargo run -- run input.mml

CLI Commands

  • build <input> - Compile to LLVM IR
    • -o, --output <file> - Specify output file (default: out.ll)
  • run <input> - Compile and execute
  • test - Run built-in tests
  • repl - Interactive REPL (TODO)
  • fmt <paths> - Format source code (TODO)

Examples

See the examples/ directory for complete programs:

  • hello_world.mml - Basic "Hello, World!" program
  • closures.mml - Higher-order functions and closures
  • option_type.mml - Option type with utility functions
  • modules.mml - Module system with structs

Implementation

Architecture

MML Source → Lexer → Parser → Desugar → Type Checker → Code Generator → LLVM IR → Native Binary

Compiler Phases

  1. Lexing (src/lexer.rs)

    • Token-based lexer using logos
    • Handles keywords, identifiers, literals, operators
  2. Parsing (src/parser.rs)

    • Parser built from chumsky parser combinators
    • Produces surface AST (src/ast.rs)
  3. Desugaring (src/desugar.rs)

    • Converts surface AST to core AST
    • Simplifies complex patterns
    • Expands syntactic sugar into core forms
  4. Type Checking (src/typechecker.rs)

    • Hindley-Milner type inference
    • Exhaustiveness checking for patterns
    • Module and signature conformance checking
  5. Code Generation (src/codegen.rs)

    • LLVM IR generation using inkwell
    • Closure conversion with environment capture
    • ADT representation as tagged unions
    • C-compatible main function generation
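As a rough illustration of the exhaustiveness check in phase 4, the sketch below handles only top-level constructor and wildcard patterns. It is an assumption about how such a check could look, not the code in src/typechecker.rs; `Pattern` and `missing_constructors` are hypothetical names, and literal or nested patterns are left out.

```rust
use std::collections::BTreeSet;

/// A top-level pattern in a match arm (hypothetical, simplified).
#[derive(Debug, Clone)]
enum Pattern {
    Wildcard,                  // `_`, matches anything
    Constructor(&'static str), // `None`, `Some(x)`, ...
}

/// Return the constructors of the matched type that no arm covers,
/// in declaration order. An empty result means the match is exhaustive.
fn missing_constructors(all: &[&'static str], arms: &[Pattern]) -> Vec<&'static str> {
    let mut covered = BTreeSet::new();
    for arm in arms {
        match arm {
            // A wildcard arm covers every remaining constructor.
            Pattern::Wildcard => return Vec::new(),
            Pattern::Constructor(name) => {
                covered.insert(*name);
            }
        }
    }
    all.iter()
        .copied()
        .filter(|c| !covered.contains(c))
        .collect()
}
```

For an `Option`-like type with constructors `None` and `Some`, a match with only a `None` arm would report `Some` as missing, which a compiler could turn into an error message.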

Data Representations

  • Closures: {fn_ptr: ptr, env_ptr: ptr} - 16 bytes on 64-bit targets
  • ADTs: {tag: i64, data: i64} - 16 bytes
  • Records: Sorted field array of i64 values
  • Tuples: Array of i64 values
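The layouts above can be mirrored in Rust to make the sizes concrete. This is a sketch under assumptions, not the compiler's codegen: the struct and function names are hypothetical, and it assumes record fields are ordered by field name, which the list above does not state explicitly.

```rust
use std::mem::size_of;

/// Closure value: a code pointer plus a pointer to the captured environment.
#[repr(C)]
struct Closure {
    fn_ptr: *const u8,
    env_ptr: *const u8,
}

/// ADT value: a constructor tag plus one payload word.
#[repr(C)]
#[derive(Debug, Clone, Copy, PartialEq)]
struct Adt {
    tag: i64,
    data: i64,
}

/// Lay out record fields as i64 slots, ordered by field name (assumed).
fn layout_record(mut fields: Vec<(&'static str, i64)>) -> Vec<i64> {
    fields.sort_by_key(|(name, _)| *name);
    fields.into_iter().map(|(_, value)| value).collect()
}

/// Slot index of `field` once the field names are sorted.
fn field_index(names: &[&str], field: &str) -> Option<usize> {
    let mut sorted = names.to_vec();
    sorted.sort();
    sorted.iter().position(|n| *n == field)
}
```

With this ordering, the records `{y: 2, x: 1}` and `{x: 1, y: 2}` share one layout, so a field access can compile to a fixed slot index regardless of how the record was written.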

Type Schemes

The type checker represents types internally using the following forms:

  • I32, I64 - Integer types
  • F64 - Float type
  • Bool - Boolean type
  • String - String type
  • t -> u - Function types
  • (t1, t2, ...) - Tuple types
  • {field1: t1, field2: t2, ...} - Record types
  • Con args - Type constructors
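One way to model these forms is a Rust enum with a printer that follows the notation in the list. The sketch is illustrative only: the variant names, the `curried` helper, and the printing rules are assumptions, and the definitions in src/typechecker.rs may differ.

```rust
use std::fmt;

/// Internal type representation (hypothetical variant names).
#[derive(Debug, Clone, PartialEq)]
enum Type {
    I32,
    I64,
    F64,
    Bool,
    Str,
    Fun(Box<Type>, Box<Type>),   // t -> u
    Tuple(Vec<Type>),            // (t1, t2, ...)
    Record(Vec<(String, Type)>), // {field1: t1, ...}
    Con(String, Vec<Type>),      // Con args, e.g. Option I32
}

impl fmt::Display for Type {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            Type::I32 => write!(f, "I32"),
            Type::I64 => write!(f, "I64"),
            Type::F64 => write!(f, "F64"),
            Type::Bool => write!(f, "Bool"),
            Type::Str => write!(f, "String"),
            // Arrows associate to the right, so only a function on the
            // left-hand side needs parentheses.
            Type::Fun(a, b) => match **a {
                Type::Fun(..) => write!(f, "({}) -> {}", a, b),
                _ => write!(f, "{} -> {}", a, b),
            },
            Type::Tuple(ts) => {
                let parts: Vec<String> = ts.iter().map(|t| t.to_string()).collect();
                write!(f, "({})", parts.join(", "))
            }
            Type::Record(fields) => {
                let parts: Vec<String> =
                    fields.iter().map(|(n, t)| format!("{}: {}", n, t)).collect();
                write!(f, "{{{}}}", parts.join(", "))
            }
            // Simplification: constructor arguments are never parenthesized.
            Type::Con(name, args) => {
                write!(f, "{}", name)?;
                for a in args {
                    write!(f, " {}", a)?;
                }
                Ok(())
            }
        }
    }
}

/// Build the curried function type `a1 -> a2 -> ... -> ret`.
fn curried(args: Vec<Type>, ret: Type) -> Type {
    args.into_iter()
        .rev()
        .fold(ret, |acc, a| Type::Fun(Box::new(a), Box::new(acc)))
}
```

Calling `curried` with the arguments `Option I32` and `I32` and the result `I32` builds a type that prints in the same shape as the `unwrap_or` annotation earlier in this README.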

Testing

Test Organization

Tests are organized in the tests/ directory:

tests/
├── phase1/          # Core language features
├── phase2/          # Type system and error handling
├── phase3/          # Module system
├── phase4/          # Executable programs
├── integration/     # Full integration tests
└── bugs/            # Bug reproduction and regression tests

Running Tests

# Run Rust tests
cargo test

# Run all phase tests
for f in tests/phase*/*.mml; do
    cargo run -- run "$f"
done

# Run integration tests
for f in tests/integration/*.mml; do
    cargo run -- run "$f"
done

# Run specific test
cargo run -- run tests/phase1/test_phase1.mml

See tests/README.md for detailed test documentation.

Known Limitations

  • Constructor applications in function bodies may cause forward reference issues in some cases
  • No proper string operations beyond literal strings
  • No standard library beyond basic C FFI
  • No garbage collection: memory is allocated with malloc and never freed
  • Limited error messages

Future Work

  • REPL implementation
  • Code formatter
  • Comprehensive standard library
  • Better error messages with suggestions
  • Implicit structs and type classes
  • Task/concurrency primitives
  • Garbage collection or ownership system

License

MIT License

References

  • MML Language Specification - Version 4.0, 2026 Edition
  • Based on ML family languages (OCaml, SML, F#)
  • Type system: Hindley-Milner with extensions
