Types and Programming Languages Chapter 3 Untyped Arithmetic Expressions

Tom Stuart edited this page Feb 13, 2017 · 12 revisions
Clone this wiki locally

Summary

The big picture

Programming languages are an engineering invention, but they can be studied mathematically. If we use mathematical language to describe the syntax and execution of a simple programming language, we can use mathematical techniques to investigate and analyse it. By applying operational semantics, we can detect certain classes of errors in programs without running them.

Notation

Familiarity with the following set notation is required for the chapter:

  • {foo, bar}
    • the set containing two elements, foo and bar
  • t1T
    • t1 is an element of the set T
  • {foo, bar} ⊆ T
    • the set containing foo and bar is a subset of the set T (i.e. foo and bar are both members of T)
  • {foo, bar} ∪ T
    • the union of the set containing foo and bar and the set T

Syntax

There are several mathematical ways to describe the possible expressions (“terms”) of a programming language. Unless the language is very simple, the set of all possible terms is infinitely large. That infinite set can either be indirectly characterised as containing all the terms that satisfy certain rules, or more directly constructed as the limit of an infinite sequence of explicit finite sets of terms.

Because the number of terms is usually infinite, we can’t prove anything about “all terms” by actually checking every term. Instead we need mathematical techniques, like proof by induction, which allow us to prove things about all possible terms simultaneously.

Execution

There are several mathematical ways to describe the execution behaviour of a programming language. TAPL chooses a technique called “operational semantics” which treats a term as the state of an imaginary machine; a single step of the machine turns the term into some new term, and it may take many steps of the machine for it to reach a final state (if it ever does). Operational semantics mathematically captures the execution of a programming language by defining the rules of this imaginary machine.

The machine’s rules are written as two-dimensional “inference rules” which describe both the prerequisites and the result of a particular machine step. We can arrange these rules into trees to show how a specific term is affected by a single step of the machine.

We can nominate certain kinds of terms (e.g. true and false) as “values”, and say that it’s an error when the machine has a non-value as its current term but can’t take any more steps. (We will see later that the point of a type system is to allow us to predict these errors without actually running the machine.)


3.2.3, page 27

For each natural number i, define a set Si as follows…

  • @tomstuart: Here’s a little Ruby interpretation of definition 3.2.3:

    require 'set'
    
    terms = Set.new
    i = 0
    
    loop do
      puts "iteration #{i}"
      puts "terms = #{terms.entries}"
    
      gets # hit enter to continue
    
      terms =
        Set.new(['true', 'false', '0']) +
    
        terms.flat_map { |t|
          ["succ (#{t})", "pred (#{t})", "iszero (#{t})"]
        } +
    
        terms.flat_map { |t1|
          terms.flat_map { |t2|
            terms.flat_map { |t3|
              "if #{t1} then #{t2} else #{t3}"
            }
          }
        }
    
      i += 1
    end

    In which context, exercise 3.2.5 means “prove that this :point_up: program has the same output if you change line 12 to terms +=”.

3.5.8, page 38

If t is in normal form, then t is a value.

  • @tomstuart: Notice that this is only true because the language has booleans and nothing else, so it happens to be impossible to “make a mistake” when you write a term.

    The condition in an if … then … else … term inevitably evaluates to a boolean because that’s the only kind of thing there is, so either E-IfTrue or E-IfFalse must eventually apply, and both rules dismantle the if … then … else … and replace it with a smaller expression.

    If the language supported other kinds of thing (e.g. numbers) then this wouldn’t be true any more: a badly-written if … then … else … term with a non-boolean condition doesn’t have an evaluation rule, so it’s in normal form, but it isn’t a value.

3.5.12, page 39

First, we choose some well-founded set S and give a function f mapping “machine states” (here, terms) into S.

  • @tomstuart: “well-founded set” means “you always run out of smaller elements” (see 2.2.10 on page 18).

    For example, the natural numbers are a well-founded set because zero is smaller than every other natural number. (An example of a non–well-founded set is the integers, because there is no smallest integer: -1 is smaller than 0, but -2 is even smaller, -3 is smaller again, and so on.)

    The termination proof assumes you realise that a) the size of a term is a natural number, and b) the natural numbers are well founded. In other words: if you can prove that evaluation always reduces the size of a term, then evaluation must eventually finish because the size can’t go below zero.

  • @tuzz: I watched this video which helped me understand this idea, although I'm not sure if it's technically the same thing.