In [1]:
import { requireCytoscape, requireCarbon } from "./lib/draw";

requireCarbon();
requireCytoscape();

# Lexing and Parsing

## Where Were We?

1. Language primitives (i.e., building blocks of languages)
2. Language paradigms (i.e., combinations of language primitives)
    - Last time: laziness in TypeScript via thunks and generators
    - This time: **first-class laziness** in Haskell
3. **Building a language** (i.e., designing your own language)
    - Last time: laziness
    - This time: **lexing** and **parsing**

## LambdaTS

- Very simple language that contains expressions only (100% pure).
- This language is **Turing-complete**: can compute anything that any other language can.

### Construct 1: Math expressions with integers

```ts
5 + 6
```

```ts
(5 + 6)*2
```

```ts
(2 + 3) / (5 - 3)
```

### Construct 2: Conditional expressions

```ts
5 ? 2 : 3                // true if condition is non-zero, false if zero
```

### Construct 3: One-parameter anonymous function

```ts
λx => x + 1              // (x) => x + 1
```

```ts
λx => x ? 2 : 3          // (x) => x ? 2 : 3
```

- Functions are first-class and always pure.
- Can be parameter or return value of other functions.
- These are higher-order functions.
- Side note: we can encode a two-parameter anonymous function as
```ts
(λarg1 => λarg2 => arg1 + arg2)
```
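
The same encoding works in plain TypeScript. The following is a minimal sketch; the names `add` and `addTwo` are illustrative and not part of LambdaTS:

```ts
// Curried addition: a one-parameter function that returns another one-parameter function.
const add = (arg1: number) => (arg2: number) => arg1 + arg2;

// Supplying only the first argument yields a new function (a higher-order result).
const addTwo = add(2);

console.log(add(2)(3));   // 5
console.log(addTwo(10));  // 12
```

Each call consumes one argument, so two chained applications play the role of a two-parameter call.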

### Construct 4: Function application

```ts
(λx => x + 1)(5)           // (results in 6)
```

### That's it!

1. Math expressions on integers
2. Conditional expressions
3. One-parameter anonymous function
4. Function application

### Side Note: Lambda Calculus

- This language is known as *lambda calculus* with natural numbers.
- You might also notice that TypeScript is a superset of this language.

### Question: What data structure can we use to encode an entire programming language?
    
- For starters, we could always use the string representation.
- But is there a more structured representation that is amenable to algorithms?

In [2]:
import { draw, treeLayout } from "./lib/draw";
import * as T from "./lib/lambdats/token";
import * as E from "./lib/lambdats/expr";
import * as Parser from "./lib/lambdats/parser";

// Ignore me for now
function drawProg(prog: string|E.Expr): void {
    if (typeof prog === 'string') {
        draw(E.cytoscapify(Parser.parse(prog)), 800, 350, treeLayout);
    } else {
        draw(E.cytoscapify(prog), 800, 350, treeLayout);
    }
}

In [3]:
drawProg("5 + 6");

In [4]:
drawProg("(5 + 6) * 2");

In [5]:
drawProg("(2 + 3) / (5 - 3)");

In [6]:
drawProg("5 ? 2 : 3");

In [7]:
drawProg("(5 - 3) ? 2 : 3");

In [8]:
drawProg("λx => x + 1");

In [9]:
drawProg("λx => x ? 2 : 3");

In [10]:
drawProg("(λx => x + 1)(5)");

### Answer: We can use a tree encoded as an algebraic data type!

- This tree is called an *abstract syntax tree* or *AST*.
- Together, *lexing* and *parsing* are the two steps that convert a string into an AST.
- *Lexing* turns a string into a stream of tokens.
- *Parsing* turns a stream of tokens into an AST.
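
As a rough illustration of "a tree encoded as an algebraic data type", here is a hand-built AST for `(5 + 6) * 2`. The types `Num`, `Bin`, and `Tree` and the helper `countNodes` are simplified assumptions for this sketch; the full LambdaTS AST is defined later in these notes.

```ts
// Simplified node types for illustration only (numbers and binary operators).
type Num = { tag: "NUMBER"; value: number };
type Bin = { tag: "BINARY"; operator: "+" | "-" | "*" | "/"; left: Tree; right: Tree };
type Tree = Num | Bin;

// The AST for "(5 + 6) * 2": the parentheses become nesting in the tree.
const ast: Tree = {
    tag: "BINARY",
    operator: "*",
    left: {
        tag: "BINARY",
        operator: "+",
        left: { tag: "NUMBER", value: 5 },
        right: { tag: "NUMBER", value: 6 },
    },
    right: { tag: "NUMBER", value: 2 },
};

// A structured representation is easy to process recursively, e.g. counting nodes.
function countNodes(tree: Tree): number {
    return tree.tag === "NUMBER" ? 1 : 1 + countNodes(tree.left) + countNodes(tree.right);
}

console.log(countNodes(ast)); // 5
```

The `tag` field lets TypeScript narrow the union, which is what makes recursive algorithms over the tree straightforward to write.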

## Outline of interpreter / compiler

* Given **string of source code**.
* Convert that to **stream of tokens**. This is called **lexing**.
* Convert that to **abstract syntax tree (AST)**. This is called **parsing**.
* Then you can:
    * **Interpret** the AST to get the answer (easy to implement, runs slowly; e.g., `node`).
    * **Transpile** the AST to another language (medium to implement, runs more quickly; e.g., `tsc`).
    * **Compile** the AST to machine code executable (hard to implement, runs fastest).

## A Look Ahead

1. Lexing and parsing (today).
2. Interpreting.
3. Transpiling.
4. Introspection (look at AST of TypeScript).

## Lexing

- Short for **lexical analysis**.
- "Lexical" means "related to words".
- Converts string of characters to stream of words, which are called **tokens**.
- Typically skips whitespace and comments.
- Important to separate lexing and parsing because the parser shouldn't concern itself with the details of the string.
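
The lexer in these notes only skips spaces, and LambdaTS as presented has no comment syntax. As a hedged sketch of what "skips whitespace and comments" could look like, the hypothetical helper `skipTrivia` below assumes `//` line comments:

```ts
/**
 * Return the index of the first character at or after "start" that is neither
 * whitespace nor part of a "//" line comment. (Hypothetical helper.)
 */
function skipTrivia(input: string, start: number): number {
    let i = start;
    while (i < input.length) {
        const ch = input[i];
        if (ch === " " || ch === "\t" || ch === "\n" || ch === "\r") {
            i += 1;
        } else if (ch === "/" && input[i + 1] === "/") {
            // Skip to the end of the line (or the end of the input).
            while (i < input.length && input[i] !== "\n") {
                i += 1;
            }
        } else {
            break;
        }
    }
    return i;
}

const source = "  // a comment\n  42";
console.log(source.slice(skipTrivia(source, 0))); // 42
```

A single `/` is left alone, so the division operator would still reach the token-matching code.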

### Tokens

- identifier: string of letters (upper case or lower case).
- number: string of digits (integer).
- binary operators: `+ - * /`
- other symbols: `( ) ? : λ =>`

In [11]:
type BinaryOperator = "+" | "-" | "*" | "/";
type Symbol = "(" | ")" | "=>" | "?" | ":" | "λ";
type Identifier = {
    tag: "IDENTIFIER";
    name: string;  // typescript variable names
};
type NumericConstant = {
    tag: "NUMBER";
    value: number; // 42, 200
}
type Error = {
    tag: "ERROR";
    ch: string;
}
type Token = Identifier | BinaryOperator | Symbol | NumericConstant | Error;

### Optional: Lexing Code

In [12]:
const CODE_POINT_ZERO = "0".codePointAt(0) as number;
const CODE_POINT_NINE = "9".codePointAt(0) as number;

/**
 * If "ch" is a digit, return its value. Otherwise return undefined.
 */
function getDigit(ch: string): number | undefined {
    const code = ch.codePointAt(0);
    if (code === undefined || code < CODE_POINT_ZERO || code > CODE_POINT_NINE) {
        return undefined;
    } else {
        return code - CODE_POINT_ZERO;
    }
}

/**
 * Return whether "ch" is a letter (A-Z in either case).
 */
function isLetter(ch: string): boolean {
    return ch.match(/^[A-Za-z]/) !== null;
}

In [13]:
/**
 * Generate tokens from the input file.
 * Question: What are we using here?
 */
function* getTokens(input: string): Generator<Token> {
    let i = 0;

    while (i < input.length) {
        const ch = input[i];
        if (ch === " ") {
            // Skip whitespace.
            i += 1;
        } else if (ch === "+" || ch === "-" || ch === "*" || ch === "/" ||
            ch === "(" || ch === ")" || ch === "?" || ch === ":" || ch === "λ") {

            yield ch;
            i += 1;
        } else if (ch === "=" && i + 1 < input.length && input[i + 1] === ">") {  // arrow token
            yield "=>";
            i += 2;
        } else if (getDigit(ch) !== undefined) {
            let value = 0;
            while (i < input.length) {
                const digitValue = getDigit(input[i]);
                if (digitValue === undefined) {
                    break;
                } else {
                    value = value*10 + digitValue;
                    i += 1;
                }
            }

            yield {
                tag: "NUMBER",
                value: value,
            };
        } else if (isLetter(ch)) {
            let name = "";
            while (i < input.length && isLetter(input[i])) {
                name += input[i];
                i += 1;
            }
            yield {
                tag: "IDENTIFIER",
                name: name,
            };
        } else {
            yield {
                tag: "ERROR",
                ch: ch,
            }
            // End lexing.
            break;
        }
    }
}

### Examples

In [14]:
Array.from(getTokens("+ -   *    asdfasdfasdf  /"));

[ '+', '-', '*', { tag: 'IDENTIFIER', name: 'asdfasdfasdf' }, '/' ]


Notice that spaces were skipped. Tokens with associated values:

In [15]:
Array.from(getTokens("x + 5"));

[ { tag: 'IDENTIFIER', name: 'x' }, '+', { tag: 'NUMBER', value: 5 } ]


In [16]:
Array.from(getTokens("(λx => x + 1)(5)"));

[
  '(',
  'λ',
  { tag: 'IDENTIFIER', name: 'x' },
  '=>',
  { tag: 'IDENTIFIER', name: 'x' },
  '+',
  { tag: 'NUMBER', value: 1 },
  ')',
  '(',
  { tag: 'NUMBER', value: 5 },
  ')'
]


- Note that `=>` is a single token, not two.
- Parser doesn't want to see the individual characters.
- We can also change `=>` to `->` or `.` or a Unicode `→` symbol, or anything on [this page](https://unicode-table.com/en/sets/arrow-symbols/#right-arrows) without changing the parser.
- In fact the lexer could accept _all_ of those and just generate the `=>` token for the parser.
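
One way to picture a lexer that accepts several arrow spellings is sketched below. The set `ARROW_SPELLINGS` and the helpers `matchArrow` and `normalizeArrows` are assumptions for illustration, not part of the lexer above; in a real lexer the matching branch would `yield "=>"` and advance by the matched length instead of rewriting the string.

```ts
// Spellings accepted for the arrow; the set itself is an assumption for illustration.
const ARROW_SPELLINGS = ["=>", "->", "→"];

/**
 * If an arrow spelling starts at index "i", return how many characters it spans.
 * Otherwise return 0.
 */
function matchArrow(input: string, i: number): number {
    for (const spelling of ARROW_SPELLINGS) {
        if (input.startsWith(spelling, i)) {
            return spelling.length;
        }
    }
    return 0;
}

/** Rewrite every accepted arrow spelling to the canonical "=>". */
function normalizeArrows(input: string): string {
    let output = "";
    let i = 0;
    while (i < input.length) {
        const matched = matchArrow(input, i);
        if (matched > 0) {
            output += "=>";
            i += matched;
        } else {
            output += input[i];
            i += 1;
        }
    }
    return output;
}

console.log(normalizeArrows("λx -> x + 1")); // λx => x + 1
console.log(normalizeArrows("λx → x + 1"));  // λx => x + 1
```

Because the parser only ever sees the `=>` token, none of the parsing code needs to change when a spelling is added.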

### Optional: Helper Class

Helper class to make it possible to peek ahead at the token stream:

In [17]:
/**
 * Converts a generated stream of tokens into one that can be peeked ahead.
 */
class TokenStream {
    public readonly stream: Generator<Token>;
    public token: Token | undefined;

    constructor(stream: Generator<Token>) {
        this.stream = stream;
        this.token = this.getNext();
    }

    /**
     * Fetch the next token. Does not update the "token" field.
     */
    private getNext(): Token | undefined {
        const next = this.stream.next();
        return next.done ? undefined : next.value;
    }

    /**
     * Peeks at the next token. Does not advance the stream.
     */
    public peek(): Token | undefined {
        return this.token;
    }

    /**
     * Gets the next token and advances the stream.
     */
    public next(): Token | undefined {
        const oldToken = this.token;
        this.token = this.getNext();
        return oldToken;
    }
}

In [18]:
const lexer = new TokenStream(getTokens("2 + 3"));
console.log(Array.from(getTokens("2 + 3")));

console.log(lexer.peek());
console.log(lexer.peek());
console.log(lexer.peek());
console.log(lexer.next());

console.log(lexer.peek());
console.log(lexer.next());

console.log(lexer.peek());
console.log(lexer.next());

console.log(lexer.peek());
console.log(lexer.next());


[ { tag: 'NUMBER', value: 2 }, '+', { tag: 'NUMBER', value: 3 } ]
{ tag: 'NUMBER', value: 2 }
{ tag: 'NUMBER', value: 2 }
{ tag: 'NUMBER', value: 2 }
{ tag: 'NUMBER', value: 2 }
+
+
{ tag: 'NUMBER', value: 3 }
{ tag: 'NUMBER', value: 3 }
undefined
undefined


## Parsing

* Takes a stream of tokens and generates an **abstract syntax tree (AST)**.
* Many ways to do this:
    * Top-down parser
        * **Recursive descent**   -- what we'll do today
        * LL(1)
    * Bottom-up parser
        * LR
            * LR(0)
            * SLR(1)
            * LALR(1)  -- YACC
            * CLR(1)
        * Operator precedence
        

### Abstract Syntax Tree

- We'll now give the abstract syntax tree (AST) for LambdaTS.
- `drawProg` draws the data structure defined below.

In [19]:
type BinaryExpr = { // Binary expression on numbers
    tag: "BINARY";
    operator: BinaryOperator;
    left: Expr;
    right: Expr;
};

type ConditionalExpr = { // Ternary conditional expression.
    tag: "CONDITIONAL";
    condExpr: Expr;
    thenExpr: Expr;
    elseExpr: Expr;
};

type FunctionExpr = { // A function expression. (Not a call, just the function.)
    tag: "FUNCTION";
    parameter: string;
    body: Expr;
};

type CallExpr = { // A function call expression
    tag: "CALL";
    func: Expr;
    argument: Expr;
};

// Any expression
type Expr = BinaryExpr | ConditionalExpr | FunctionExpr | CallExpr | Identifier | NumericConstant;

### Parsing atoms

Let's start by just parsing only atoms:

In [20]:
function parse(input: string): Expr {
    return parseAtom(new TokenStream(getTokens(input)));
}

// Note: mutually-recursive function with parseAtom
function parseExpr(lexer: TokenStream): Expr {
    return parseAtom(lexer);
}

/**
 * Parse an atom:
 *
 *     n
 *     x
 *     (e)
 */
function parseAtom(lexer: TokenStream): Expr {
    const token = lexer.peek();
    if (typeof token === "object") {
        if (token.tag === "NUMBER") {  // n
            lexer.next();
            return token;
        }
        if (token.tag === "IDENTIFIER") {  // x
            lexer.next();
            return token;
        }
    }

    if (token === "(") { // (e)
        lexer.next();
        const expr = parseExpr(lexer);
        if (lexer.next() !== ")") {
            throw new Error("Missing close parenthesis");
        }

        return expr;
    }

    throw new Error("Can't parse token: " + JSON.stringify(token));
}

In [21]:
parse("523");

{ tag: 'NUMBER', value: 523 }


In [22]:
drawProg("523");

In [23]:
parse("asdfasdf");

{ tag: 'IDENTIFIER', name: 'asdfasdf' }


In [24]:
drawProg("asdfasdf");

In [25]:
parse("(((asdfasdf)))");

{ tag: 'IDENTIFIER', name: 'asdfasdf' }


In [26]:
drawProg("(((asdfasdf)))");

- Note parentheses are gone.
- The grouping role of parentheses is replaced with the hierarchy of the AST.

### Parsing sums

- Sums are left-associative.
- Easiest to parse them in a loop. (Only `parseSum()` is new here, but we have to replace all other functions.)
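
To see why the grouping matters, the sketch below evaluates both possible trees for `1 - 2 - 3`. The `Sub` type and `evalSub` function are simplified assumptions for illustration (subtraction on numbers only), not the LambdaTS AST.

```ts
// Simplified tree type for illustration: a number, or a subtraction of two subtrees.
type Sub = number | { left: Sub; right: Sub };

function evalSub(e: Sub): number {
    return typeof e === "number" ? e : evalSub(e.left) - evalSub(e.right);
}

// Left-associative grouping: (1 - 2) - 3
const leftGrouped: Sub = { left: { left: 1, right: 2 }, right: 3 };
// Right-associative grouping: 1 - (2 - 3)
const rightGrouped: Sub = { left: 1, right: { left: 2, right: 3 } };

console.log(evalSub(leftGrouped));  // -4
console.log(evalSub(rightGrouped)); // 2
```

Only the left-grouped tree agrees with ordinary arithmetic, so the parser has to build that shape.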

In [27]:
1 - 2 - 3

-4


In [28]:
function parse(input: string): Expr {
    return parseExpr(new TokenStream(getTokens(input)));
}

function parseExpr(lexer: TokenStream): Expr {
    return parseSum(lexer); // Note: changed, which changes what parseAtom refers to
}

function parseSum(lexer: TokenStream): Expr {
    let left = parseAtom(lexer);

    // Sums are left-associative, we can't recurse on the right. Just keep getting more
    // sum expressions and grouping them on the left.
    while (true) {
        const token = lexer.peek();
        if (token === "+" || token === "-") {
            lexer.next();
            const right = parseAtom(lexer);
            left = {
                tag: "BINARY",
                operator: token,
                left: left,
                right: right,
            };
        } else {
            break;
        }
    }

    return left;
}

function parseAtom(lexer: TokenStream): Expr {
    const token = lexer.peek();
    if (typeof token === "object") {
        if (token.tag === "NUMBER") {
            lexer.next();
            return token;
        }
        if (token.tag === "IDENTIFIER") {
            lexer.next();
            return token;
        }
    }

    if (token === "(") {
        lexer.next();
        const expr = parseExpr(lexer);
        if (lexer.next() !== ")") {
            throw new Error("Missing close parenthesis");
        }

        return expr;
    }

    throw new Error("Can't parse token: " + JSON.stringify(token));
}

In [29]:
parse("5 + 6");

{
  tag: 'BINARY',
  operator: '+',
  left: { tag: 'NUMBER', value: 5 },
  right: { tag: 'NUMBER', value: 6 }
}


In [30]:
drawProg("5 + 6");

In [31]:
parse("5 - (6 + 2)");

{
  tag: 'BINARY',
  operator: '-',
  left: { tag: 'NUMBER', value: 5 },
  right: {
    tag: 'BINARY',
    operator: '+',
    left: { tag: 'NUMBER', value: 6 },
    right: { tag: 'NUMBER', value: 2 }
  }
}


In [32]:
drawProg("5 - (6 + 2)");

In [33]:
parse("a - (b + c)");

{
  tag: 'BINARY',
  operator: '-',
  left: { tag: 'IDENTIFIER', name: 'a' },
  right: {
    tag: 'BINARY',
    operator: '+',
    left: { tag: 'IDENTIFIER', name: 'b' },
    right: { tag: 'IDENTIFIER', name: 'c' }
  }
}


In [34]:
drawProg("a - (b + c)");

### Parsing products and conditionals

- These are just like sums.
- They're broken out into their own function to enforce precedence.
- They are implemented internally either as:
    * a loop for left-associative (sum, product).
    * recursion for right-associative (conditional).
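
The loop-versus-recursion distinction can be sketched in isolation. The helpers `groupLeft` and `groupRight` below are hypothetical; they work on operand strings rather than tokens and only show which side the accumulated result ends up on.

```ts
// Loop: the result built so far becomes the LEFT operand of the next operator.
function groupLeft(first: string, rest: string[], op: string): string {
    let result = first;
    for (const operand of rest) {
        result = "(" + result + op + operand + ")";
    }
    return result;
}

// Recursion: everything after the first operand becomes the RIGHT operand.
function groupRight(first: string, rest: string[], op: string): string {
    if (rest.length === 0) {
        return first;
    }
    const [next, ...remaining] = rest;
    return "(" + first + op + groupRight(next, remaining, op) + ")";
}

console.log(groupLeft("1", ["2", "3"], "+"));  // ((1+2)+3)
console.log(groupRight("a", ["b", "c"], "-")); // (a-(b-c))
```

The parsing functions follow the same pattern, except that they consume tokens and build AST nodes instead of strings.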

In [35]:
function parse(input: string): Expr {
    return parseExpr(new TokenStream(getTokens(input)));
}

function parseExpr(lexer: TokenStream): Expr {
    return parseConditional(lexer);
}

/**
 * Parse a ternary conditional expression:
 *
 *     e ? e : e
 *
 * The "then" clause is used if the "conditional" clause is non-zero. Otherwise the "else" clause is used.
 */
function parseConditional(lexer: TokenStream): Expr {
    const condExpr = parseSum(lexer);

    if (lexer.peek() === "?") {
        lexer.next();

        const thenExpr = parseConditional(lexer);

        if (lexer.next() !== ":") {
            throw new Error("Colon not found in conditional");
        }

        const elseExpr = parseConditional(lexer);

        return {
            tag: "CONDITIONAL",
            condExpr: condExpr,
            thenExpr: thenExpr,
            elseExpr: elseExpr,
        };
    }

    return condExpr;
}

function parseSum(lexer: TokenStream): Expr {
    let left = parseProduct(lexer);

    // Sums are left-associative, we can't recurse on the right. Just keep getting more
    // sum expressions and grouping them on the left.
    while (true) {
        const token = lexer.peek();
        if (token === "+" || token === "-") {
            lexer.next();
            const right = parseProduct(lexer);
            left = {
                tag: "BINARY",
                operator: token,
                left: left,
                right: right,
            };
        } else {
            break;
        }
    }

    return left;
}

/**
 * Parse a product expression:
 *
 *     e * e
 *     e / e
 *
 * The division is truncated toward zero.
 */
function parseProduct(lexer: TokenStream): Expr {
    let left = parseAtom(lexer);

    // Products are left-associative, we can't recurse on the right. Just keep getting more
    // product expressions and grouping them on the left.
    while (true) {
        const token = lexer.peek();
        if (token === "*" || token === "/") {
            lexer.next();
            const right = parseAtom(lexer);
            left = {
                tag: "BINARY",
                operator: token,
                left: left,
                right: right,
            };
        } else {
            break;
        }
    }

    return left;
}

function parseAtom(lexer: TokenStream): Expr {
    const token = lexer.next();
    if (typeof token === "object") {
        if (token.tag === "NUMBER") {
            return token;
        }
        if (token.tag === "IDENTIFIER") {
            return token;
        }
    }

    if (token === "(") {
        const expr = parseExpr(lexer);
        if (lexer.next() !== ")") {
            throw new Error("Missing close parenthesis");
        }

        return expr;
    }

    throw new Error("Can't parse token: " + JSON.stringify(token));
}

In [36]:
parse("(2 * 3) + (a * b)");

{
  tag: 'BINARY',
  operator: '+',
  left: {
    tag: 'BINARY',
    operator: '*',
    left: { tag: 'NUMBER', value: 2 },
    right: { tag: 'NUMBER', value: 3 }
  },
  right: {
    tag: 'BINARY',
    operator: '*',
    left: { tag: 'IDENTIFIER', name: 'a' },
    right: { tag: 'IDENTIFIER', name: 'b' }
  }
}


In [37]:
drawProg("(2 * 3) + (a * b)");

In [38]:
parse("x ? 2 : 3");

{
  tag: 'CONDITIONAL',
  condExpr: { tag: 'IDENTIFIER', name: 'x' },
  thenExpr: { tag: 'NUMBER', value: 2 },
  elseExpr: { tag: 'NUMBER', value: 3 }
}


In [39]:
drawProg("x ? 2 : 3");

Note the difference between **right-associative** (2 and 3 are grouped):

In [40]:
parse("x ? 1 : y ? 2 : 3");

{
  tag: 'CONDITIONAL',
  condExpr: { tag: 'IDENTIFIER', name: 'x' },
  thenExpr: { tag: 'NUMBER', value: 1 },
  elseExpr: {
    tag: 'CONDITIONAL',
    condExpr: { tag: 'IDENTIFIER', name: 'y' },
    thenExpr: { tag: 'NUMBER', value: 2 },
    elseExpr: { tag: 'NUMBER', value: 3 }
  }
}


In [41]:
drawProg("x ? 1 : y ? 2 : 3");

and **left-associative** (1 and 2 are grouped):

In [42]:
parse("1 + 2 + 3");

{
  tag: 'BINARY',
  operator: '+',
  left: {
    tag: 'BINARY',
    operator: '+',
    left: { tag: 'NUMBER', value: 1 },
    right: { tag: 'NUMBER', value: 2 }
  },
  right: { tag: 'NUMBER', value: 3 }
}


In [43]:
drawProg("1 + 2 + 3");

### Parsing functions and calls

In [44]:
function parse(input: string): Expr {
    const lexer = new TokenStream(getTokens(input));
    return parseExpr(lexer);
}

function parseExpr(lexer: TokenStream): Expr {
    return parseFunction(lexer);
}

/**
 * Parse a function definition:
 *
 *     λx => e
 */
function parseFunction(lexer: TokenStream): Expr {
    const token = lexer.peek();

    if (token === "λ") {
        lexer.next();
        const parameter = lexer.next();
        if (typeof parameter === "object" && parameter.tag === "IDENTIFIER") {
            if (lexer.next() !== "=>") {
                throw new Error("Missing arrow");
            }

            const body = parseExpr(lexer);

            return {
                tag: "FUNCTION",
                parameter: parameter.name,
                body: body,
            };
        } else {
            throw new Error("Parameter must be an identifier: " + JSON.stringify(parameter));
        }
    } else {
        return parseConditional(lexer);
    }
}

function parseConditional(lexer: TokenStream): Expr {
    const condExpr = parseSum(lexer);

    if (lexer.peek() === "?") {
        lexer.next();

        const thenExpr = parseExpr(lexer);

        if (lexer.next() !== ":") {
            throw new Error("Colon not found in conditional");
        }

        const elseExpr = parseExpr(lexer);

        return {
            tag: "CONDITIONAL",
            condExpr: condExpr,
            thenExpr: thenExpr,
            elseExpr: elseExpr,
        };
    }

    return condExpr;
}

function parseSum(lexer: TokenStream): Expr {
    let left = parseProduct(lexer);

    // Sums are left-associative, we can't recurse on the right. Just keep getting more
    // sum expressions and grouping them on the left.
    while (true) {
        const token = lexer.peek();
        if (token === "+" || token === "-") {
            lexer.next();
            const right = parseProduct(lexer);
            left = {
                tag: "BINARY",
                operator: token,
                left: left,
                right: right,
            };
        } else {
            break;
        }
    }

    return left;
}

function parseProduct(lexer: TokenStream): Expr {
    let left = parseCall(lexer);

    // Products are left-associative, we can't recurse on the right. Just keep getting more
    // product expressions and grouping them on the left.
    while (true) {
        const token = lexer.peek();
        if (token === "*" || token === "/") {
            lexer.next();
            const right = parseCall(lexer);
            left = {
                tag: "BINARY",
                operator: token,
                left: left,
                right: right,
            };
        } else {
            break;
        }
    }

    return left;
}

function parseCall(lexer: TokenStream): Expr {
    let func = parseAtom(lexer);

    // Calls are left-associative, we can't recurse on the right. Just keep getting more
    // call expressions and grouping them on the left.
    while (true) {
        if (lexer.peek() === "(") {
            lexer.next();
            const argument = parseExpr(lexer);
            if (lexer.next() !== ")") {
                throw new Error("Missing close parenthesis");
            }

            func = {
                tag: "CALL",
                func: func,
                argument: argument,
            };
        } else {
            break;
        }
    }

    return func;
}

function parseAtom(lexer: TokenStream): Expr {
    const token = lexer.next();
    if (typeof token === "object") {
        if (token.tag === "NUMBER") {
            return token;
        }
        if (token.tag === "IDENTIFIER") {
            return token;
        }
    }

    if (token === "(") {
        const expr = parseExpr(lexer);
        if (lexer.next() !== ")") {
            throw new Error("Missing close parenthesis");
        }

        return expr;
    }

    throw new Error("Can't parse token: " + JSON.stringify(token));
}


In [45]:
parse("(λx => x + 1)(5)");

{
  tag: 'CALL',
  func: {
    tag: 'FUNCTION',
    parameter: 'x',
    body: { tag: 'BINARY', operator: '+', left: [Object], right: [Object] }
  },
  argument: { tag: 'NUMBER', value: 5 }
}


In [46]:
drawProg("(λx => x + 1)(5)");

Two-parameter function:

In [47]:
parse("λa => λb => a + b");

{
  tag: 'FUNCTION',
  parameter: 'a',
  body: {
    tag: 'FUNCTION',
    parameter: 'b',
    body: { tag: 'BINARY', operator: '+', left: [Object], right: [Object] }
  }
}


In [48]:
drawProg("λa => λb => a + b");

## Traversing AST

Let's convert an AST to a string.

In [49]:
/**
 * Returns the string representation of the expression, for debugging.
 */
function exprToString(expr: Expr): string {
    switch (expr.tag) {
        case "IDENTIFIER":
            return expr.name;
        case "BINARY":
            return "(" + exprToString(expr.left) + expr.operator + exprToString(expr.right) + ")";
        case "CONDITIONAL":
            return "(" + exprToString(expr.condExpr) + "?" + exprToString(expr.thenExpr) + ":" + exprToString(expr.elseExpr) + ")";
        case "FUNCTION":
            return "(λ" + expr.parameter + "=>" + exprToString(expr.body) + ")";
        case "CALL":
            return exprToString(expr.func) + "(" + exprToString(expr.argument) + ")";
        case "NUMBER":
            return expr.value.toString();
    }
}

In [50]:
exprToString(parse("(λx=>(x+1))(5)"));

(λx=>(x+1))(5)


In [51]:
exprToString(parse("(λf => (λx => λy => f(x(x))(y)) (λa => λb => f(a(a))(b)))(λfact => λn => n ? n*fact(n - 1) : 1)(5)"));

(λf=>(λx=>(λy=>f(x(x))(y)))((λa=>(λb=>f(a(a))(b)))))((λfact=>(λn=>(n?(n*fact((n-1))):1))))(5)


## Summary

* **Lexing** turns a string into a stream of **tokens**.
* **Parsing** turns those tokens into an **abstract syntax tree (AST)**.
* The AST is given to the interpreter, transpiler, or compiler.
* Next time: **Interpreting** the AST to get the value of the expression.