Skip to content

Design and implementation of a divergence check algorithm for recursive Match Types in the Scala 3 compiler

Notifications You must be signed in to change notification settings

Lluc24/divergence

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

54 Commits
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Static Divergence Checking for Scala 3 Match Types

A termination guarantee for recursive match type reduction

Scala 3 PR Issue

Overview

This repository contains the research, implementation, and documentation for a static divergence checking algorithm for Scala 3 match types. The work eliminates the compiler's unsafe reliance on catching StackOverflowError exceptions by providing formal termination guarantees through compile-time divergence detection.

Quick Links:

Project Context:

I undertook this advanced master-level project as an undergraduate bachelor student during my exchange at EPFL. Exchange students were allowed to enroll in any course regardless of level, which gave me the unique opportunity to work on this challenging compiler research project.


The sections below provide a summary of the work accomplished during this project.

The Problem

Match types in Scala 3 enable type-level pattern matching and computation:

type Elem[X] = X match
  case String      => Char
  case Array[t]    => Elem[t]
  case Iterable[t] => Elem[t]
  case _           => X

However, not all recursive match types have proper terminating conditions:

// Cyclic reduction
type Loop[X] = X match
  case Int    => Loop[Float]
  case Float  => Loop[Double]
  case Double => Loop[Int]

// Growing complexity
type Wrap[X] = X match
  case AnyVal => Wrap[List[X]]
  case AnyRef => Wrap[List[X]]

Current Approach: The compiler allows all reductions to proceed and eventually catches StackOverflowError exceptions.

Critical Issues (documented in OpenJDK bugs JDK-8177802, JDK-8318888, JDK-8217475):

  • Finally blocks bypassed
  • Locks remain held after recovery
  • Platform-dependent behavior
  • Undefined behavior per JVM specification

The Solution

This project implements a static divergence checking algorithm that detects potentially infinite reductions before stack overflow occurs. The algorithm is adapted from SIP-31's implicit resolution divergence checking, extending it from single types to argument lists.

Core Algorithm

The divergence checker tracks reduction history and detects two forms of divergence:

  1. Equivalence Domination (cycle detection): Arguments repeat exactly
  2. Strict Growth Domination: Covering sets remain constant while complexity increases

Key Definitions:

  • Complexity vector c(T): Structural sizes (AST node counts) of each argument
  • Covering set cs(T): Union of all type symbols mentioned in arguments

Arguments T dominate U when:

  • They are type-equivalent (cycle), OR
  • cs(T) = cs(U) AND c(Ti) ≥ c(Ui) for all i AND ∃j: c(Tj) > c(Uj) (strict growth)

Algorithm Evolution

The implementation went through three major iterations during the semester:

  1. Strict size-decreasing (November): Required ∑c(Ti) < ∑c(Ui). Too restrictive.
  2. Lexicographical comparison (December): Allowed same-size steps if non-cyclic. Better but still limited.
  3. Pointwise domination with covering sets (January): Current algorithm with compiletime.ops exemptions. Allows all reasonable recursive patterns including Peano arithmetic and accumulators.

Termination Proof

The algorithm is proven to terminate using three lemmas:

  1. Lemma 1 (Finite Covering Sets): Only finitely many covering sets can appear (≤ 2^|S_prog|)
  2. Lemma 2 (Non-Decreasing Subsequence): Every infinite sequence in ℕⁿ contains a non-decreasing subsequence (generalized Dickson's Lemma)
  3. Lemma 3 (Finite Constructible Types): Fixed covering set + fixed complexity = finitely many distinct types (Pigeonhole Principle)

Proof: By contradiction—any infinite sequence would either trigger growth detection (Lemma 2 + Lemma 1) or cycle detection (Lemma 3).

See report/Report.md for the complete formal proof.

Implementation

The implementation is available in Pull Request #24661, addressing Issue #22587.

Key Features

  • Zero false positives: All existing Scala 3 tests pass unchanged
  • Zero false negatives: Correctly detects all divergent patterns
  • Clear error messages: Includes reduction traces, complexity vectors, and covering sets
  • Special handling: Exempts compiletime.ops patterns (Peano arithmetic, counters) that would be false positives
  • Performance: Minimal overhead (linear in recursion depth × number of type parameters)

Where It Works

The divergence check is implemented in Type.tryNormalize via a new guardedReduce method. See reduction-workflow.md for a detailed explanation of how match types are reduced in the compiler.

Additional Issues Resolved

During the development of this project, two related compiler issues were also tackled and closed:

  • Issue #24120: Separate MatchReducer from TypeComparer. Refactored the match type reduction code for better maintainability and clarity.
  • Issue #24730: Fix typeSize mismatch on tuples. Corrected complexity calculation for tuple types to ensure accurate divergence detection.

Repository Structure

.
├── report/
│   ├── report.pdf          # Compiled PDF report (for quick reading)
│   ├── report.tex          # Full LaTeX report with formal proofs
│   ├── references.bib      # Bibliography
│   └── Report.md           # Markdown version (recommended)
├── slides/
│   ├── slides.pdf          # Compiled PDF slides
│   └── slides.tex          # Presentation slides (LaTeX Beamer)
├── reduction-workflow.md   # Match type reduction workflow in Scala 3 compiler
└── Readme.md               # This file

Building the Documents

Report

cd report
pdflatex report.tex
bibtex report
pdflatex report.tex
pdflatex report.tex

Or use your preferred LaTeX editor (TeXShop, Overleaf, VS Code with LaTeX Workshop, etc.).

Slides

cd slides
pdflatex slides.tex
pdflatex slides.tex  # Run twice for TOC and references

Key Contributions

  1. Formal divergence checking algorithm extending SIP-31 from single types to argument lists
  2. Rigorous termination proof using Pigeonhole Principle and inductive reasoning on complexity vectors
  3. Production implementation in the Scala 3 compiler eliminating unsafe stack overflow catching
  4. Comprehensive validation through test cases demonstrating correct divergence detection with no false positives or negatives

Understanding Match Types

Match types are a powerful Scala 3 feature enabling type-level computation through pattern matching. They can express:

  • Type transformations: Elem[List[String]]Char
  • Compile-time computation: Peano arithmetic, binary counters
  • Generic derivation: Automatic instance generation for type classes

Why they diverge: Like any recursive computation, match types can enter infinite loops when:

  • Reduction cycles through a closed set of types
  • Complexity grows unboundedly with constant type constructors
  • There's no terminating base case

For a deep dive into the match type reduction workflow in the Scala 3 compiler, see reduction-workflow.md.

Related Work

Acknowledgements

I am deeply grateful to:

  • Martin Odersky for trusting me with such an amazing yet ambitious project and providing invaluable mentorship throughout
  • Matthieu Bovel for weekly meetings, detailed code reviews, and continuous feedback that shaped both the theoretical framework and practical implementation
  • The Scala Team and LAMP for their welcoming research environment

This project has been an incredible learning journey into type systems, compiler implementation, and formal verification.

License

This research project and its documentation are part of academic work at EPFL. The implementation in the Scala 3 compiler follows the Apache 2.0 license.

About

Design and implementation of a divergence check algorithm for recursive Match Types in the Scala 3 compiler

Topics

Resources

Stars

Watchers

Forks

Releases

No releases published

Packages

 
 
 

Contributors

Languages