In [1]:
import { display } from "tslab";
import { readFileSync } from "fs";

const css = readFileSync("../style.css", "utf8");
display.html(`<style>${css}</style>`);

## Imports and Setup

In [2]:
import { buildParser } from '@lezer/generator';
import { LRParser } from '@lezer/lr';
import { TreeCursor } from '@lezer/common';
import { Tuple, RecursiveSet } from "recursive-set";

# Dealing with Conflicts in Lezer

This notebook demonstrates how **Lezer** deals with *shift-reduce* conflicts.

The following grammar is **ambiguous** because it does not specify the precedence of the arithmetic operators:

```lezer
Expr {
  Expr "+" Expr |
  Expr "*" Expr |
  Number
}
```

In Lezer such ambiguity results in a build-time error. A precedence marker such as `!ambig` resolves a conflict only if its `@precedence` declaration supplies an associativity or ranks it against another precedence. A single precedence without associativity, as used below, leaves the conflict unresolved, so parser generation still fails.
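To make the ambiguity concrete, the following sketch writes down the two trees that the grammar above permits for the input `1 + 2 * 3` and evaluates both. It does not use Lezer; the `Reading` type and the `evaluate` helper are illustrative names introduced only for this example.

```typescript
// Plain-TypeScript stand-in for a parse tree: a number or [operator, left, right].
type Reading = number | ["+" | "*", Reading, Reading];

// Evaluate a tree bottom-up.
function evaluate(t: Reading): number {
    if (typeof t === "number") return t;
    const [op, left, right] = t;
    const l = evaluate(left);
    const r = evaluate(right);
    return op === "+" ? l + r : l * r;
}

// Two trees that the ambiguous grammar derives for the same input "1 + 2 * 3".
const plusAtRoot: Reading = ["+", 1, ["*", 2, 3]];   // 1 + (2 * 3)
const timesAtRoot: Reading = ["*", ["+", 1, 2], 3];  // (1 + 2) * 3

console.log(evaluate(plusAtRoot));   // 7
console.log(evaluate(timesAtRoot));  // 9
```

Both trees are valid derivations, yet they evaluate to different numbers, which is why the parser generator needs precedence information before it can commit to one of them.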

## Grammar Definition
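We declare a single precedence with `@precedence { ambig }` and attach the `!ambig` marker to both operator rules. As the output below shows, this is not enough to suppress the conflict error.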

In [3]:
const grammarDefinition = `
  @precedence { ambig }
  @top Program { Expr }
  @tokens {
    Number { "0" | $[1-9] $[0-9]* }
    OpPlus { "+" }
    OpMult { "*" }
    space { $[ \t\n\r]+ }
  }
  @skip { space }
  Expr {
    Expr !ambig OpPlus Expr |
    Expr !ambig OpMult Expr |
    Number
  }
`;

let parser: LRParser; 
try {
    parser = buildParser(grammarDefinition);
    console.log("SUCCESS: Parser generated.");
} catch (e: any) {
    console.error("CRITICAL ERROR: Parser generation failed.", e.message);
}

CRITICAL ERROR: Parser generation failed. shift/reduce conflict between
  Expr -> Expr · OpPlus Expr
and
  Expr -> Expr OpPlus Expr
With input:
  Expr OpPlus Expr · OpPlus …
The reduction of Expr is allowed before OpPlus because of this rule:
  Expr -> Expr · OpPlus Expr
Shared origin: @top -> · Expr
  via Expr -> Expr OpPlus · Expr
    Expr -> Expr · OpPlus Expr
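
The reported conflict is between shifting a second `OpPlus` and reducing `Expr OpPlus Expr`, a choice that the `ambig` precedence does not settle because it carries no associativity. For comparison, the fragment below sketches the conventional way to resolve such conflicts in a Lezer grammar: one precedence per operator, each marked `@left`. It has not been run in this notebook, and the precedence names `times` and `plus` are illustrative choices, not part of the grammar above.

```lezer
@precedence { times @left, plus @left }

Expr {
  Expr !times OpMult Expr |
  Expr !plus OpPlus Expr |
  Number
}
```

Listing `times` before `plus` gives multiplication the higher precedence, and `@left` makes each operator group to the left.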


## Specification of the AST Builder
### Helper Functions and Type Definitions

Before we traverse the parse tree, we define the structure of our AST nodes and some utility functions.

1.  **`AstNode` Type**:
    We define `AstNode` as a recursive type using a union. An AST node is either:
    *   A simple `number` (leaf node).
    *   A `Tuple` containing an operator string and two child `AstNode`s (internal node). This structure allows us to build a tree representation of the arithmetic expression.

2.  **`getNodeText`**:
    A simple helper that extracts the raw text from the input string corresponding to a specific node's position (`from` and `to`).

3.  **`isBinaryExpr`**:
    This function inspects a `TreeCursor` pointing to an `Expr` node to determine if it represents a **binary operation** (like `1 + 2`) or a simple wrapper/passthrough.
    *   It iterates through the node's children.
    *   If it finds a child named `OpPlus` or `OpMult`, it confirms that the node is a binary expression.
    *   This distinction is crucial for the AST builder to decide whether to create a new `Tuple` or recursively descend to the next child.

In [4]:
type AstNode = number | Tuple<[string, AstNode, AstNode]>;

function getNodeText(from: number, to: number, input: string): string {
    return input.slice(from, to);
}

function isBinaryExpr(cursor: TreeCursor): boolean {
    if (!cursor.firstChild()) return false;
    let hasOp = false;
    do {
        if (cursor.name === "OpPlus" || cursor.name === "OpMult") hasOp = true;
    } while (cursor.nextSibling());
    cursor.parent();
    return hasOp;
}
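
The cursor protocol that `isBinaryExpr` relies on (`firstChild`, `nextSibling`, `parent`, and `name`) can be tried without a generated parser. The sketch below uses a hypothetical `MiniCursor` over a hand-built tree; it mimics only those four members and is not Lezer's `TreeCursor`. The names `MiniNode`, `MiniCursor`, and `hasOperatorChild` are assumptions made for this illustration.

```typescript
// Hypothetical stand-in for a syntax tree node (not Lezer's own data structure).
interface MiniNode { name: string; children: MiniNode[]; }

// Hypothetical cursor that mimics only the TreeCursor members used above.
class MiniCursor {
    // Path from the root to the current node; each entry stores a node and
    // its index among its parent's children.
    private path: { node: MiniNode; index: number }[];
    constructor(root: MiniNode) { this.path = [{ node: root, index: 0 }]; }
    get name(): string { return this.path[this.path.length - 1].node.name; }
    firstChild(): boolean {
        const cur = this.path[this.path.length - 1].node;
        if (cur.children.length === 0) return false;
        this.path.push({ node: cur.children[0], index: 0 });
        return true;
    }
    nextSibling(): boolean {
        if (this.path.length < 2) return false;
        const { index } = this.path[this.path.length - 1];
        const parent = this.path[this.path.length - 2].node;
        if (index + 1 >= parent.children.length) return false;
        this.path[this.path.length - 1] = { node: parent.children[index + 1], index: index + 1 };
        return true;
    }
    parent(): boolean {
        if (this.path.length < 2) return false;
        this.path.pop();
        return true;
    }
}

// Same scan as isBinaryExpr, written against the stand-in cursor.
function hasOperatorChild(cursor: MiniCursor): boolean {
    if (!cursor.firstChild()) return false;
    let found = false;
    do {
        if (cursor.name === "OpPlus" || cursor.name === "OpMult") found = true;
    } while (cursor.nextSibling());
    cursor.parent(); // restore the position the caller handed in
    return found;
}

const leaf = (name: string): MiniNode => ({ name, children: [] });
const wrapper: MiniNode = { name: "Expr", children: [leaf("Number")] };
const binary: MiniNode = {
    name: "Expr",
    children: [{ name: "Expr", children: [leaf("Number")] }, leaf("OpPlus"), wrapper],
};

const binaryCursor = new MiniCursor(binary);
console.log(hasOperatorChild(binaryCursor), binaryCursor.name); // true Expr
console.log(hasOperatorChild(new MiniCursor(wrapper)));         // false
```

The call to `cursor.parent()` at the end of the scan matters: the cursor is mutable, so the helper has to leave it on the node it was given before the caller continues traversing.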

We implement a function `buildAst` that traverses the **Concrete Syntax Tree (CST)** generated by Lezer and transforms it into an **Abstract Syntax Tree (AST)** composed of `Tuple` objects.

Since the grammar is ambiguous, the shape of the resulting tree depends on how each shift/reduce conflict is resolved: preferring the shift yields right-associative trees, while preferring the reduction yields left-associative ones.
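
The effect of the two possible resolutions on the tree shape can be sketched in plain TypeScript, independent of Lezer. The helpers `groupLeft`, `groupRight`, and `show` are illustrative names introduced here; both grouping functions assume a non-empty operand list.

```typescript
type AssocTree = number | [string, AssocTree, AssocTree];

// Reduce as early as possible: operands group to the left.
function groupLeft(operands: number[], op: string): AssocTree {
    return operands.slice(1).reduce<AssocTree>((acc, n) => [op, acc, n], operands[0]);
}

// Shift as long as possible: operands group to the right.
function groupRight(operands: number[], op: string): AssocTree {
    const last = operands[operands.length - 1];
    return operands.slice(0, -1).reduceRight<AssocTree>((acc, n) => [op, n, acc], last);
}

// Render a tree with explicit parentheses.
function show(t: AssocTree): string {
    return typeof t === "number" ? String(t) : `(${show(t[1])} ${t[0]} ${show(t[2])})`;
}

console.log(show(groupLeft([1, 2, 3], "+")));   // ((1 + 2) + 3)
console.log(show(groupRight([1, 2, 3], "+")));  // (1 + (2 + 3))
```

For `+` and `*` both shapes evaluate to the same number, so the difference shows up only in the structure of the AST.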

In [5]:
function buildAst(cursor: TreeCursor, input: string): AstNode {
    const name = cursor.name;
    switch (name) {
        case "Number": {
            const val = Number(getNodeText(cursor.from, cursor.to, input));
            if (!Number.isFinite(val)) throw new Error("Invalid number");
            return val;
        }
        case "Program": {
            if (!cursor.firstChild()) throw new Error("Program empty");
            const res = buildAst(cursor, input);
            cursor.parent();
            return res;
        }
        case "Expr": {
            if (!isBinaryExpr(cursor)) {
                if (cursor.firstChild()) {
                    const res = buildAst(cursor, input);
                    cursor.parent();
                    return res;
                }
                throw new Error("Empty Expr node");
            }
            let left: AstNode | undefined;
            let right: AstNode | undefined;
            let op: string | undefined;
            let childIdx = 0;
            if (cursor.firstChild()) {
                do {
                    const childName = cursor.name;
                    switch (childName) {
                        case "OpPlus":
                        case "OpMult":
                            op = getNodeText(cursor.from, cursor.to, input);
                            break;
                        default: {
                            // Any other child is an operand: the first one is the left side, the next the right.
                            const node = buildAst(cursor, input);
                            if (childIdx === 0) left = node; else right = node;
                            childIdx++;
                            break;
                        }
                    }
                } while (cursor.nextSibling());
                cursor.parent();
            }
            if (left !== undefined && right !== undefined && op !== undefined) {
                return new Tuple(op, left, right);
            }
            throw new Error("Invalid binary expression structure");
        }
        default: throw new Error(`Unexpected node type: ${name}`);
    }
}

## Testing the Parser

The function `test(input)` takes a string `input`, parses it, prints the resulting AST, and returns it.

We use `RecursiveSet` to store the results. This ensures that structurally identical ASTs are stored only once (Value Semantics).
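
To illustrate what value semantics buys us, the sketch below compares a built-in `Set`, which compares objects by reference, with a map keyed on a canonical serialization. Plain nested arrays stand in for `Tuple` here, and the `JSON.stringify` key is only an illustration of structural comparison; it is not a description of how `RecursiveSet` is implemented.

```typescript
// Plain nested arrays as a stand-in for Tuple-based AST nodes.
type PlainAst = number | [string, PlainAst, PlainAst];

// Two separately constructed but structurally identical ASTs for "1*2+3".
const first: PlainAst = ["+", ["*", 1, 2], 3];
const second: PlainAst = ["+", ["*", 1, 2], 3];

// A built-in Set compares objects by reference, so both copies are kept.
const byReference = new Set<PlainAst>([first, second]);
console.log(byReference.size); // 2

// Keying on a canonical serialization compares by structure instead.
const byValue = new Map<string, PlainAst>();
for (const ast of [first, second]) byValue.set(JSON.stringify(ast), ast);
console.log(byValue.size); // 1
```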

In [6]:
const test = (input: string): AstNode => {
    try {
        const tree = parser.parse(input);
        const ast = buildAst(tree.cursor(), input);
        console.log(`Input: "${input}"`);
        console.log(`AST:   ${ast.toString()}`);
        return ast;
    } catch (e) {
        console.error(`Error parsing "${input}":`, e);
        throw e;
    }
};

const s = new RecursiveSet<AstNode>();

### Executing Test Case 1: Ambiguity Check

When Lezer encounters a grammar with unresolved *shift/reduce* conflicts (like our ambiguous `Expr` rules), it throws a **Generation Error** (`GenError`) and **refuses to create the parser**.

Because the `!ambig` markers in the grammar above do not resolve the conflict, parser generation failed and `parser` was never assigned. The cell below therefore checks `parser` before running the test. **We expect it to report that the parser is not initialized.**

In [10]:
if (!parser) {
    console.log("Parser is not initialized!");
}else{
    console.log("--- Test Case 1 ---");
    s.add(test("1*2+3")); 
}

Parser is not initialized!
