First introduced in 1965 by Knuth [1], LR is a bottom-up parsing technique that works by constructing an automaton, then traversing it while consuming the input one token at a time and maintaining a stack. A more detailed description of LR can be found [here](https://rahul.gopinath.org/post/2024/07/01/lr-parsing/).

Writing a grammar that is LR(1), or even LR(k), can be difficult. Ideally, you want your grammar to be intuitive, easily understood, and readable. This is important because you will make mistakes or change your mind in the future, and modifying a large grammar while making sure that it stays LR(1) is a painful process. Luckily, a more powerful technique called Generalised LR (GLR), which can handle all context-free grammars, is available. This post outlines the implementation details of the RNGLR and BRNGLR algorithms, both of which are presented in Economopoulos's PhD dissertation [2].
### Generalised LR
#### Eliminating Nondeterminism
If you are familiar with LR, you probably know about *shift/reduce conflicts* (choices between shifting and reducing) and *reduce/reduce conflicts* (choices between reducing by different rules). A normal LR parser cannot handle conflicts because it does not know which choice to make. What we can do is incorporate a bit of breadth-first search so that the parser tries all options; that is the main idea behind GLR.

For example, consider the following ambiguous grammar:
$$\begin{split} S &\rightarrow a \ B \ c & \hspace{1cm} (1)\\
S &\rightarrow a\ D \ c &\hspace{1cm} (2) \\
B &\rightarrow b &\hspace{1cm} (3) \\
D & \rightarrow b &\hspace{1cm} (4)\end{split}$$
For this grammar, the LR(1) automaton is shown below:
![LR(1) automaton for the example grammar](images/lr1_gram.png)
And the LR(1) parse table is:

| state | a   | b   | c               | $       | S   | B   | D   |
| ----- | --- | --- | --------------- | ------- | --- | --- | --- |
| 0     | p2  |     |                 |         | p1  |     |     |
| 1     |     |     |                 | acc     |     |     |     |
| 2     |     | p4  |                 |         |     | p5  | p3  |
| 3     |     |     | p7              |         |     |     |     |
| 4     |     |     | r(B, 3)/r(D, 4) |         |     |     |     |
| 5     |     |     | p6              |         |     |     |     |
| 6     |     |     |                 | r(S, 1) |     |     |     |
| 7     |     |     |                 | r(S, 2) |     |     |     |

In this table, $pk$ denotes a shift (or, in a non-terminal column, a goto) to state $k$, and $r(X, m)$ denotes a reduction to the symbol $X$ using rule number $m$. The symbol $\$$ denotes the end of the string. There is a reduce/reduce conflict in state 4. Let's see what happens when we try to parse the string "abc".

| Step | Input | State | Stack                     | Next operation  |
| ---- | ----- | ----- | ------------------------- | --------------- |
| 0    | ""    | 0     | $\$, S_0$                 | p2              |
| 1    | "a"   | 2     | $\$, S_0, a, S_2$         | p4              |
| 2    | "ab"  | 4     | $\$, S_0, a, S_2, b, S_4$ | r(B, 3)/r(D, 4) |

A conventional LR parser would now have to choose between two possible reductions ($B \rightarrow b$ and $D \rightarrow b$). A GLR parser can try both, but how? The simplest solution is to duplicate the stack and treat each copy as a separate process. After performing r(B, 3), the stack is $\{\$, S_0, a, S_2, B, S_5\}$; similarly, applying r(D, 4) gives $\{\$, S_0, a, S_2, D, S_3\}$. However, this approach is not ideal: the number of stacks can grow exponentially, so we need something more efficient.
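
To make the cost of this naive approach concrete, here is a minimal sketch of a recogniser that copies the whole stack whenever a cell offers more than one action. The `TABLE` and `RULES` dictionaries are my own hypothetical encoding of the parse table above (stacks hold state numbers only), and `recognise_by_copying` is an illustrative name, not part of the implementation described in this post.

```python
# Rule number -> (left-hand side, right-hand side), numbered as in the grammar.
RULES = {1: ("S", ("a", "B", "c")), 2: ("S", ("a", "D", "c")),
         3: ("B", ("b",)), 4: ("D", ("b",))}

# (state, symbol) -> list of actions: ("p", k) pushes state k, ("r", m) reduces
# by rule m, ("acc",) accepts. A list with two entries is a conflict.
TABLE = {
    (0, "a"): [("p", 2)], (0, "S"): [("p", 1)],
    (1, "$"): [("acc",)],
    (2, "b"): [("p", 4)], (2, "B"): [("p", 5)], (2, "D"): [("p", 3)],
    (3, "c"): [("p", 7)],
    (4, "c"): [("r", 3), ("r", 4)],
    (5, "c"): [("p", 6)],
    (6, "$"): [("r", 1)],
    (7, "$"): [("r", 2)],
}


def recognise_by_copying(text):
    """Return True if `text` is accepted, keeping one full stack per choice."""
    stacks = [[0]]
    for tok in list(text) + ["$"]:
        pending, shifted = stacks, []
        while pending:
            stack = pending.pop()
            for act in TABLE.get((stack[-1], tok), []):
                if act[0] == "acc":
                    return True
                if act[0] == "p":
                    # Shift: this copy moves on to the next input symbol.
                    shifted.append(stack + [act[1]])
                else:
                    # Reduce: drop one state per right-hand-side symbol, then
                    # push the goto state. Every reduction creates a new copy.
                    lhs, rhs = RULES[act[1]]
                    base = stack[:len(stack) - len(rhs)]
                    goto_state = TABLE[(base[-1], lhs)][0][1]
                    pending.append(base + [goto_state])
        stacks = shifted
    return False


print(recognise_by_copying("abc"))
print(recognise_by_copying("ab"))
```

Each conflict multiplies the number of live stacks, and all copies repeat the same prefix `[0, 2]`. That repeated prefix is exactly what the next section shares.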
#### Graph-Structured Stack (GSS)
In the above example, notice that the first four elements are the same in both stacks, so we can "share" them in a unified data structure. This is the "Graph-Structured Stack", or GSS, proposed by Tomita in his book [3].
![GSS sharing the common prefix of the two stacks](images/GSS_2.png)

This image illustrates how nodes $S_0$ and $S_2$ are shared between the two stacks. As the name suggests, our stacks are now a single graph, and each stack element is a node. In Tomita's original approach, the symbols $a$, $D$ and $B$ are individual nodes, but here we simplify by making them the labels of the edges between states.

The nodes are divided into $n+1$ *levels*, where $n$ is the length of the input string. The GSS is constructed level by level, and a new level is created by each *shift* action. The `GSSNode` data structure is as follows:

In [1]:
class GSSNode:
    '''
        Represent a node in the GSS structure; nodes are identified by id
    '''
    def __init__(self, level: int, id: int, label):
        self.level = level
        self.id = id
        self.label = label
        self.children: list[tuple['GSSNode', 'SPPFNode']] = []

    def __repr__(self):
        return f"Node({self.label})"

    def __eq__(self, other):
        return self.id == other.id

    def __hash__(self):
        return hash(self.id)

    def add_child(self, child: 'GSSNode', edge):
        self.children.append((child, edge))

The GSS is a bit unusual: it does not perform the "pop" operation of an ordinary stack. Once a node is created, it is never removed. Instead of popping $m$ nodes off the stack, we follow every path of length $m$ from the original node. For example, instead of popping 2 elements from node $S_3$, we follow the path of length 2 down the graph and find that node $S_0$ is our target. We define a method to perform this operation:

In [2]:
class GSSNode(GSSNode):
    def find_paths_with_length(self, m: int) -> set[tuple['GSSNode',...]]:
        '''
            Find all paths of length m that start at this node.
            Return a set of tuples; each tuple holds the m edge labels along a path, followed by the destination node
        '''

        res: set[tuple] = set()

        def dfs(node: GSSNode, path: list):
            # `path` holds the edge labels collected so far
            if len(path) >= m:
                res.add(tuple(path + [node]))
                return

            for child, edge in node.children:
                dfs(child, path + [edge])

        dfs(self, [])
        return res
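
As a quick check of the traversal, the sketch below rebuilds the small GSS from the figure with a stripped-down stand-in for `GSSNode`. The `MiniNode` class and the module-level `find_paths` function are hypothetical simplifications that use plain strings as edge labels; they are only meant to show what the returned tuples look like.

```python
class MiniNode:
    """Hypothetical stand-in for GSSNode: a label plus (child, edge) pairs."""

    def __init__(self, label):
        self.label = label
        self.children = []

    def __repr__(self):
        return f"Node({self.label})"


def find_paths(origin, m):
    """Collect every path of length m from `origin` as (edge, ..., target)."""
    res = set()

    def dfs(node, path):
        if len(path) >= m:
            res.add(tuple(path + [node]))
            return
        for child, edge in node.children:
            dfs(child, path + [edge])

    dfs(origin, [])
    return res


# States S0, S2, S3 and S5 from the example; edges point towards the bottom
# of the stack and carry the grammar symbol as their label.
s0, s2, s3, s5 = (MiniNode(k) for k in (0, 2, 3, 5))
s2.children.append((s0, "a"))
s3.children.append((s2, "D"))
s5.children.append((s2, "B"))

print(find_paths(s3, 2))
print(find_paths(s5, 1))
```

A path that is shorter than `m` contributes nothing, so asking for paths of length 3 from `s3` yields an empty set.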

Finally, we can define our GSS class:

In [3]:
class GSS:
    '''
        A Graph Structured Stack (GSS)
    '''
    def __init__(self):
        '''
            Initialize the graph. In RNGLR, a GSS has n + 1 levels, where n is the length of the input string

            Each level is a set of GSSNodes, levels are stored in a list
        '''
        self.level: list[set[GSSNode]] = []
        self.count = 0

    def resize(self, n: int):
        '''
            Resize the GSS to include n levels
        '''
        self.level = [set() for i in range(n)]

    def create_node(self, label, level: int):
        '''
            Create a new node with label in a specific level
        '''
        new_node = GSSNode(level, self.count, label)
        self.count += 1
        self.level[level].add(new_node)
        return new_node
    
    def find_node(self, label, level: int) -> 'GSSNode | None':
        '''
            Find a node with label and in a specific level

            return
                GSSNode object if found, else None is returned
        '''
        # Can be optimized further
        for node in self.level[level]:
            if (node.label == label):
                return node
        return None
    
    def __repr__(self):
        '''
            Print the GSS structure
        '''
        repr = "GSS:\n"
        for idx, level in enumerate(self.level):
            repr += f"Level {idx}:\n"
            for node in level:
                repr += f"    {node}\n"
                for child, edge in node.children:
                    repr += f"        {child} - {edge}\n"
        return repr

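
As a rough illustration of how the levels fill up, the following sketch recognises strings of the example grammar with a graph-structured stack. It is deliberately simplified and is not RNGLR: it assumes a grammar without $\epsilon$-rules, builds no SPPF, identifies a node by the pair (level, state), and simply repeats the reductions of a level until nothing changes. The `ACTION` and `GOTO` dictionaries are my own hypothetical encoding of the parse table above.

```python
# (state, lookahead) -> actions. A reduction carries the left-hand side of the
# rule and the length of its right-hand side.
ACTION = {
    (0, "a"): [("shift", 2)],
    (2, "b"): [("shift", 4)],
    (3, "c"): [("shift", 7)],
    (5, "c"): [("shift", 6)],
    (4, "c"): [("reduce", "B", 1), ("reduce", "D", 1)],
    (6, "$"): [("reduce", "S", 3)],
    (7, "$"): [("reduce", "S", 3)],
    (1, "$"): [("accept",)],
}
GOTO = {(0, "S"): 1, (2, "B"): 5, (2, "D"): 3}


def ends_of_paths(edges, node, m):
    """Return the nodes reachable from `node` by following exactly m edges."""
    frontier = {node}
    for _ in range(m):
        frontier = {child for n in frontier for child in edges.get(n, ())}
    return frontier


def build_gss(text):
    """Return (accepted, levels), where levels[i] is the set of states at level i."""
    tokens = list(text) + ["$"]
    levels = [set() for _ in tokens]   # n + 1 levels for n input symbols
    edges = {}                         # (level, state) -> set of (level, state)
    levels[0].add(0)
    for i, tok in enumerate(tokens):
        # Apply reductions at level i until no new edge appears.
        changed = True
        while changed:
            changed = False
            for state in list(levels[i]):
                for act in ACTION.get((state, tok), []):
                    if act[0] != "reduce":
                        continue
                    _, lhs, m = act
                    for lvl, st in ends_of_paths(edges, (i, state), m):
                        target = GOTO[(st, lhs)]
                        out = edges.setdefault((i, target), set())
                        if (lvl, st) not in out:
                            out.add((lvl, st))
                            levels[i].add(target)
                            changed = True
        if i == len(tokens) - 1:
            accepted = any(("accept",) in ACTION.get((s, tok), [])
                           for s in levels[i])
            return accepted, levels
        # Shifts create the nodes of the next level.
        for state in levels[i]:
            for act in ACTION.get((state, tok), []):
                if act[0] == "shift":
                    levels[i + 1].add(act[1])
                    edges.setdefault((i + 1, act[1]), set()).add((i, state))


accepted, levels = build_gss("abc")
print(accepted, levels)
```

For "abc", level 2 ends up holding states 3, 4 and 5: the shifted state 4 plus one node for each of the two conflicting reductions, both attached to the single shared node for state 2.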
#### Shared Packed Parse Forest (SPPF)
For practical use, we are more interested in a full parser than in a recogniser. While a recogniser's output is simply a yes/no answer, a parser has to provide a full derivation (usually in the form of a parse tree). However, a single parse tree is insufficient because we are dealing with all context-free grammars, including ambiguous ones; multiple derivations (even infinitely many) are possible. Instead of a parse tree, a data structure called a *Shared Packed Parse Forest* (SPPF) is used.

Consider the string "abc" in the above example. It has two possible derivations, resulting in two parse trees:
![parse tree](images/Parse_tree.png)
In an SPPF, we combine them into a single graph. The final result looks like this:
![sppf](images/SPPF.png)

Nodes like "S", "a", "b" and "c" are shared to reduce space. Since $S$ can be derived in two ways (either $S\rightarrow a\ B\ c$ or $S\rightarrow a\ D\ c$), two new black nodes are created to represent different choices. These are called **packing nodes**.

In [4]:
class PackingNode:
    def __init__(self):
        self.edges = []

    def add_edge(self, node):
        self.edges.append(node)

    def __repr__(self):
        return f"PackingNode({self.edges})"

In the RNGLR algorithm, SPPF nodes are identified by the pair (label, start position). We will discuss the start position in more detail later.

In [5]:
class SPPFNode:
    def __init__(self, id: int, label: str, start_pos:int = -1):
        '''
            start_pos = -1 means the node is in epsilon-SPPF
        '''
        self.id = id
        self.label = label
        self.start_pos = start_pos
        self.children: list['SPPFNode | PackingNode'] = []
    
    def add_child(self, node):
        self.children.append(node)
    
    def check_sequence_exists(self, nodes: list['SPPFNode']) -> bool:
        '''
            Check if a sequence of nodes already exists in the current node
        '''
        
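        # If packing nodes exist, compare against each packed alternative
        if any(isinstance(child, PackingNode) for child in self.children):
            for child in self.children:
                # Skip plain SPPF nodes, which have no `edges` attribute
                if isinstance(child, PackingNode) and child.edges == nodes:
                    return True
            return False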
        
        # No packing nodes case
        return self.children == nodes
    
    def add_children(self, nodes: list['SPPFNode']):
        '''
            Add a list of nodes to the current node
        '''
        if len(self.children) == 0:
            for node in nodes:
                self.add_child(node)
            return
        
        # If already exists, we skip
        if self.check_sequence_exists(nodes):
            return
        
        # No packing node yet
        if not isinstance(self.children[0], PackingNode):
            z = PackingNode()
            for child in self.children:
                z.add_edge(child)
            self.children = [z]
        
        t = PackingNode()
        for node in nodes:
            t.add_edge(node)
        self.children.append(t)
    
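    # Nodes are identified by (label, start_pos)
    def __hash__(self):
        return hash((self.label, self.start_pos))

    def __eq__(self, other):
        # Comparing against anything that is not an SPPFNode is left to Python
        if not isinstance(other, SPPFNode):
            return NotImplemented
        return (self.label == other.label and self.start_pos == other.start_pos)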

    def __repr__(self):
        return f"SPPF Node:({self.label}, {self.start_pos})"
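
To see how packing nodes come into being, here is a small self-contained sketch that replays the two derivations of "abc". `MiniNode` and `MiniPacking` are hypothetical, trimmed-down versions of the classes above (nodes are compared by identity rather than by label and start position), so treat it as an illustration of the idea rather than the actual implementation.

```python
class MiniPacking:
    """One alternative derivation: an ordered list of child nodes."""

    def __init__(self, edges):
        self.edges = list(edges)


class MiniNode:
    def __init__(self, label, start_pos):
        self.label = label
        self.start_pos = start_pos
        self.children = []

    def add_children(self, nodes):
        nodes = list(nodes)
        # First derivation: store the children directly.
        if not self.children:
            self.children = nodes
            return
        packed = [c for c in self.children if isinstance(c, MiniPacking)]
        known = [p.edges for p in packed] if packed else [self.children]
        # A derivation that is already recorded is not added twice.
        if nodes in known:
            return
        # Second derivation: wrap the existing children in a packing node.
        if not packed:
            self.children = [MiniPacking(self.children)]
        self.children.append(MiniPacking(nodes))


a, b, c = MiniNode("a", 0), MiniNode("b", 1), MiniNode("c", 2)
B, D, S = MiniNode("B", 1), MiniNode("D", 1), MiniNode("S", 0)

B.add_children([b])          # B -> b
D.add_children([b])          # D -> b, sharing the same "b" node
S.add_children([a, B, c])    # S -> a B c
S.add_children([a, D, c])    # S -> a D c creates the two packing nodes
S.add_children([a, B, c])    # repeated derivation, ignored

for alternative in S.children:
    print([node.label for node in alternative.edges])
```

After the second call, `S` no longer points at its children directly; it holds one packing node per derivation, while the terminal nodes remain shared.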

The `add_children()` method is more involved than usual. It makes sure that all derivations are unique, and it creates and manages packing nodes when necessary. Finally, we have the SPPF class:

In [6]:
class SPPF:
    def __init__(self, grammar: Grammar):
        self.grammar = grammar

        # Two dictionaries: node_id -> node and node_label -> node_id
        self.epsilon_sppf, self.I = self.build_epsilon_sppf()

        self.nodes: list[SPPFNode] = []
        self.counter = 0
    
    def create_node(self, label: str, start_pos: int) -> SPPFNode:
        node = SPPFNode(self.counter, label, start_pos)
        self.counter += 1
        self.nodes.append(node)
        return node

    def __repr__(self):
        text = "SPPF:\n"
        for node in self.nodes:
            text += f"    {node.label}-{node.start_pos}\n"
            for child in node.children:
                if isinstance(child, PackingNode):
                    text += "        PackingNode\n"
                    for edge in child.edges:
                        text += f"            {edge}\n"
                else:
                    text += f"        {child}\n"
        return text

##### Epsilon SPPF
We precompute SPPF trees for nullable non-terminals ($A \overset{*}\rightarrow \epsilon$), also called $\epsilon$-SPPF trees. This step is needed by the parser later. In addition to non-terminals, we also build an $\epsilon$-SPPF tree for every string $\beta$ such that $\beta\overset{*}\rightarrow \epsilon$ and the grammar contains a rule $A \rightarrow \alpha \beta$.

In [None]:
class SPPF(SPPF):
    def build_epsilon_sppf(self) -> tuple[dict[int, SPPFNode], dict[str, int]]:
        '''
            Build an epsilon-SPPF tree

            return
                A tuple that contains
                - All SPPFNodes created, stored in a dict
                - The I function dictionary
        '''
        # key: node_id, value: SPPF Node
        epsilon_sppf: dict[int, SPPFNode] = {}

        # Create epsilon node
        eps_node = SPPFNode(0, "epsilon")
        epsilon_sppf[0] = eps_node
        counter = 1

        # Find a given node with label
        node_with_label: dict[str, SPPFNode] = {}

        nullable = RNGLRTableGenerator.get_nullable(self.grammar)

        # Step 1, add all nullable symbols
        # Sorted to guarantee determinism
        for nt in sorted(nullable):
            node = SPPFNode(counter, nt)
            epsilon_sppf[counter] = node
            node_with_label[nt] = node
            counter += 1
        
        for lhs in self.grammar:
            for rhs in self.grammar[lhs]:
                # Epsilon rule
                if len(rhs) == 0:
                    node_with_label[lhs].add_child(eps_node)
                # Total nullable
                elif all(x in nullable for x in rhs):
                    node = PackingNode()
                    for nt in rhs:
                        node.add_edge(node_with_label[nt])
                    node_with_label[lhs].add_child(node)
                # Partial nullable
                else:
                    for i in range(1, len(rhs)):
                        # rhs[i:] is never empty here because i < len(rhs)
                        partial_rhs = rhs[i:]

                        if all(x in nullable for x in partial_rhs):
                            label = ''.join(partial_rhs)
                            if label in node_with_label:
                                continue
                            node = SPPFNode(counter, label)
                            for x in partial_rhs:
                                node.add_child(node_with_label[x])
                            node_with_label[label] = node
                            epsilon_sppf[counter] = node
                            counter += 1

        # Construct the I indexing map label -> node_id
        I: dict[str, int] = {}
        for label, node in node_with_label.items():
            I[label] = node.id
        
        return (epsilon_sppf, I)
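
The code above relies on knowing which non-terminals are nullable. The post does not show `RNGLRTableGenerator.get_nullable`, so the following is only a sketch of one common way to compute such a set, by fixed-point iteration. The function name `compute_nullable` is hypothetical, and the grammar layout (a dict from each non-terminal to a list of right-hand sides, with `[]` for an $\epsilon$-rule) is inferred from how `build_epsilon_sppf` reads the grammar.

```python
def compute_nullable(grammar):
    """Return the set of non-terminals that can derive the empty string."""
    nullable = set()
    changed = True
    while changed:
        changed = False
        for lhs, alternatives in grammar.items():
            if lhs in nullable:
                continue
            # lhs is nullable if some right-hand side consists only of
            # nullable symbols; an empty right-hand side qualifies trivially.
            if any(all(sym in nullable for sym in rhs) for rhs in alternatives):
                nullable.add(lhs)
                changed = True
    return nullable


# Illustrative grammar: S -> a A B, A -> a | epsilon, B -> A A | b
example = {
    "S": [["a", "A", "B"]],
    "A": [["a"], []],
    "B": [["A", "A"], ["b"]],
}
print(sorted(compute_nullable(example)))
```

Here `A` is nullable through its $\epsilon$-rule and `B` becomes nullable on a later pass through $B \rightarrow A\ A$, while `S` is not because every derivation starts with the terminal `a`.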

#### Right-Nulled GLR (RNGLR)
In his book [3], Tomita introduced four algorithms. The first only works for grammars without $\epsilon$-rules. Algorithms 2 and 3 were intended to handle $\epsilon$-rules but fail on grammars with hidden left recursion. Algorithm 4 (the full parser) inherits the same problem from algorithms 2 and 3. RNGLR extends algorithm 1 to handle grammars with $\epsilon$-rules.

With the GSS and SPPF defined, we are now ready to build the RNGLR parser. Our algorithm uses a slightly modified parse table, which is neither LR(1) nor LALR(1). The RNGLR table is built upon the usual LR table, with additional reductions for "*right-nullable*" rules. A *right-nullable* rule has the form $A\rightarrow \alpha\beta$, where $\beta$ can derive $\epsilon$.
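
To make the definition concrete, the sketch below lists every way of splitting a rule into $\alpha\beta$ with a non-empty, nullable $\beta$. It only finds the candidate rules; it does not build the table. The function name, the tuple layout `(lhs, rhs, m)` with `m` the length of $\alpha$, and the example grammar are all assumptions made for illustration, and the nullable set is passed in ready-made.

```python
def right_nulled_splits(grammar, nullable):
    """List (lhs, rhs, m) where rhs[m:] is non-empty and entirely nullable."""
    found = []
    for lhs, alternatives in grammar.items():
        for rhs in alternatives:
            # m is the length of alpha; m = 0 means the whole rule is nullable.
            for m in range(len(rhs)):
                if all(sym in nullable for sym in rhs[m:]):
                    found.append((lhs, tuple(rhs), m))
    return found


# Illustrative grammar: S -> a A B, A -> a | epsilon, B -> A A | b
grammar = {
    "S": [["a", "A", "B"]],
    "A": [["a"], []],
    "B": [["A", "A"], ["b"]],
}
nullable = {"A", "B"}

for lhs, rhs, m in right_nulled_splits(grammar, nullable):
    print(lhs, "->", " ".join(rhs[:m]), ".", " ".join(rhs[m:]))
```

For $S \rightarrow a\ A\ B$ this reports the splits after `a` and after `a A`: in both cases the remaining suffix can vanish, so a reduction may be applied before the whole right-hand side has been seen.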

To accommodate the SPPF structure, we use the following format in 

For this specific implementation, we use an LR(1) parse table as the base and add the right-nulled reductions afterwards.

### References
[1] D. E. Knuth, “On the translation of languages from left to right,” _Information and Control_, vol. 8, no. 6, pp. 607–639, Dec. 1965, doi: [10.1016/S0019-9958(65)90426-2](https://doi.org/10.1016/S0019-9958\(65\)90426-2).

[2] G. R. Economopoulos, “Generalised LR parsing algorithms,” PhD dissertation. Retrieved from https://core.ac.uk/download/pdf/301667613.pdf

[3] M. Tomita, _Efficient Parsing for Natural Language_. Boston, MA: Springer US, 1986. doi: [10.1007/978-1-4757-1885-0](https://doi.org/10.1007/978-1-4757-1885-0).