#### Recursive Descent Parsing
CS 236 <br>
Fall 2024

Brigham Young University <br>
September 2024
***

Consider the following LL(1) Grammar
* $E \rightarrow N \ | \ OEE$
* $O \rightarrow +\ |\ *$ 
* $N \rightarrow 0\ |\ 1\ |\ 2\ |\ 3$

The starting non-terminal is $E$

The first sets are
* $FIRST(OEE) = FIRST(O) = \{+,*\}$
* $FIRST(N) = \{0,1,2,3\}$
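
The FIRST sets above can be checked mechanically. The following is a minimal sketch, assuming the grammar is stored in a dictionary; the names `GRAMMAR` and `first_of` are illustrative and not part of the course code. It relies on the fact that this grammar has no empty productions, so the FIRST set of a right-hand side is just the FIRST set of its leading symbol.

```python
# Grammar from the text, written as nonterminal -> list of right-hand sides.
# The dictionary layout and the helper name `first_of` are assumptions.
GRAMMAR: dict[str, list[str]] = {
    "E": ["N", "OEE"],
    "O": ["+", "*"],
    "N": ["0", "1", "2", "3"],
}


def first_of(rhs: str) -> set[str]:
    """Return FIRST(rhs) for a grammar with no empty productions."""
    head = rhs[0]  # only the leading symbol matters here
    if head not in GRAMMAR:
        return {head}  # a terminal is its own FIRST set
    result: set[str] = set()
    for alternative in GRAMMAR[head]:
        result |= first_of(alternative)
    return result


print(sorted(first_of("OEE")))  # ['*', '+']
print(sorted(first_of("N")))    # ['0', '1', '2', '3']
```

The recursion terminates because this grammar is not left recursive; a grammar with empty productions or left recursion would need a more careful fixed-point computation.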

***

The slides from class showed how to construct the following parse table for this grammar.

|       | **+** | *     | **0** | **1** | **2** | **3** |  **#** |
|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:-----:|:------:|
| **E** |  OEE  |  OEE  |   N   |   N   |   N   |   N   |        |
| **O** |   +   |   *   |       |       |       |       |        |
| **N** |       |       |   0   |   1   |   2   |   3   |        |
| **+** | AdPop |       |       |       |       |       |        |
| *     |       | AdPop |       |       |       |       |        |
| **0** |       |       | AdPop |       |       |       |        |
| **1** |       |       |       | AdPop |       |       |        |
| **2** |       |       |       |       | AdPop |       |        |
| **3** |       |       |       |       |       | AdPop |        |
| **#** |       |       |       |       |       |       | accept |

Recall that the leftmost column represents what is on the stack and the topmost row represents the current input.

The other parts of the table give instructions on what to do with the stack and the input. For example, when the top of the stack is **E** and the input is **+** the parse table says "pop **E** from the stack, push **OEE** onto the stack, and do nothing with the input." For another example, when the top of the stack is **+** and the input is **+** the table says "pop **+** from the stack, don't push anything onto the stack, and advance the input."
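
To make the table's instructions concrete, here is a minimal sketch of a table-driven parser that manages an explicit stack. The names `TABLE` and `table_parse` are assumptions made for illustration, the digit nonterminal is written `N` as in the grammar, and the "advance and pop" rows are folded into a single equality check rather than stored in the table.

```python
# (stack top, current input) -> right-hand side to push.
# Only the nonterminal rows are stored; terminal rows are handled by
# the equality check inside the loop.
TABLE: dict[tuple[str, str], str] = {
    ("E", "+"): "OEE", ("E", "*"): "OEE",
    ("E", "0"): "N", ("E", "1"): "N", ("E", "2"): "N", ("E", "3"): "N",
    ("O", "+"): "+", ("O", "*"): "*",
    ("N", "0"): "0", ("N", "1"): "1", ("N", "2"): "2", ("N", "3"): "3",
}


def table_parse(text: str) -> bool:
    """Return True when `text` (ending in '#') can be derived from E."""
    stack = ["#", "E"]  # the top of the stack is the end of the list
    position = 0
    while stack:
        if position >= len(text):
            return False  # input ran out while symbols were still stacked
        top, current = stack.pop(), text[position]
        if top == current:
            position += 1  # terminal (or '#') matched: advance the input
        elif (top, current) in TABLE:
            # Push the right-hand side so its first symbol ends up on top.
            stack.extend(reversed(TABLE[(top, current)]))
        else:
            return False  # empty table cell: parse error
    return position == len(text)


print(table_parse("+23#"))   # True
print(table_parse("+2#"))    # False
```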

Managing the stack by hand is unnecessary, and a table-driven parser can require a huge parse table. There is a better way: _Recursive Descent Parsing_ (RDP). RDP uses recursion, so the programming language's call stack does the stack management for us. The result is elegant, easily understood code. The trick is to handle the stack and the input so that every instruction in the table is still carried out.

---

Let's construct a class for the recursive descent parser (RDP). The RDP will need to manage both the stack and the input.

- **Stack** The key idea is to create a function for each nonterminal. Calling the function _pushes_ the function onto the stack, and returning from the function _pops_ the function from the stack. 
- **Input** The proper way to handle an input depends on what is on top of the stack. For example, we don't advance a **3** input if an **E** is at the top of the stack, but we do advance if a **3** is at the top of the stack. 

We'll show two ways for managing when an input is advanced.

---


#### Method 1: Define a function for each nonterminal and each terminal

The first method isn't the most efficient, but it does match up very closely with the parse table above. The key idea is that we'll define a function for each thing that can be on the stack. Looking at the parse table above, we see that both terminals and nonterminals can appear on the stack.

Let's define a class called Grammar that defines a function for each nonterminal and terminal. We'll first sketch the class, meaning we'll look at the structure before we implement any of the functions. Recall that a grammar is a tuple made up of terminals, nonterminals, a starting nonterminal, and a set of productions. Look for each of these elements in the class below.

In [23]:
from typing import Callable as function  # gives the name `function` used in the annotation below

class Grammar:
    def __init__(self) -> None:
        ###################################
        # Tuple defining an LL(1) grammar #
        # • set of nonterminals           #
        # • set of terminals              #
        # • starting nonterminal          #
        # • set of productions            #
        ###################################
        self.nonterminals: set[str] = {'E', 'O', 'N'}  # set of nonterminals. Each nonterminal will have its own function
        self.terminals: set[str] = {'+', '*', '0', '1', '2', '3'}  # set of terminals
        self.starting_nonterminal: function[[], None] = self.E_nonterminal  # Starting nonterminal
        # Productions                               # Defined in the functions

    ################################################################
    # Each nonterminal gets its own function.                      #
    # The function knows which productions have the nonterminal on #
    # the left hand side of the production. The correct right      #
    # hand side of the production is chosen by looking at the      #
    # current input and the FIRST set of the right hand side       #
    ################################################################
    def E_nonterminal(self) -> None:
        """ Production E--> N | OEE """
        raise NotImplementedError
    
    def N_nonterminal(self) -> None:
        """ Production N --> 0 | 1 | 2 | 3 """
        raise NotImplementedError
    
    def O_nonterminal(self) -> None:
        """ Production O --> + | * """
        raise NotImplementedError
    
    def plus_terminal(self) -> None:
        """ Called when the + should be pushed onto the stack """
        raise NotImplementedError
    
    def star_terminal(self) -> None:
        """ Called when the * should be pushed onto the stack """
        raise NotImplementedError
    
    def zero_terminal(self) -> None:
        """ Called when the 0 should be pushed onto the stack """
        raise NotImplementedError
    
    def one_terminal(self) -> None:
        """ Called when the 1 should be pushed onto the stack """
        raise NotImplementedError
    
    def two_terminal(self) -> None:
        """ Called when the 2 should be pushed onto the stack """
        raise NotImplementedError
    
    def three_terminal(self) -> None:
        """ Called when the 3 should be pushed onto the stack """
        raise NotImplementedError


The important thing to notice is that there is a function for each terminal and non-terminal. We now need to define these functions. 

Recall that the instructions in the parse table came from FIRST sets. Specifically, if the input character is in the FIRST set of the right-hand side of a production, then that production is the one to apply. We'll now explicitly define FIRST sets for each production.

Since FIRST sets are something used in parsing and are not part of the definition of a grammar per se, we'll change the name of the class to _Parser_.

In [44]:
from typing import Callable as function
class Parser:
    def __init__(self) -> None:
        ###################################
        # Tuple defining an LL(1) grammar #
        # • set of nonterminals           #
        # • set of terminals              #
        # • starting nonterminal          #
        # • set of productions            #
        ###################################
        self.nonterminals: set[str] = {'E', 'O', 'N'}  # set of nonterminals. Each nonterminal will have its own function
        self.terminals: set[str] = {'+', '*', '0', '1', '2', '3'}  # set of terminals
        self.starting_nonterminal: function[[], None] = self.E_nonterminal  # Starting nonterminal
        # Productions                               # Defined in the functions

    ################################################################
    # Each nonterminal gets its own function.                      #
    # The function knows which productions have the nonterminal on #
    # the left hand side of the production. The correct right      #
    # hand side of the production is chosen by looking at the      #
    # current input and the FIRST set of the right hand side       #
    ################################################################
    def E_nonterminal(self) -> None:
        """ Production E--> N | OEE """
        ################################################################
        # Define FIRST sets for the right hand side of each production #
        ################################################################
        first: dict[str, set[str]] = dict()
        first['OEE'] = {'+', '*'}
        first['N'] = {'0', '1', '2', '3'}
        
        pass # I'm using this to indicate that the function is not complete 
    
    def N_nonterminal(self) -> None:
        """ Production N --> 0 | 1 | 2 | 3 """
        first: dict[str, set[str]] = dict()
        first['0'] = {'0'}
        first['1'] = {'1'}
        first['2'] = {'2'}
        first['3'] = {'3'}
        
        pass # I'm using this to indicate that the function is not complete 
    
    def O_nonterminal(self) -> None:
        """ Production O --> + | * """
        first: dict[str, set[str]] = dict()
        first['+'] = {'+'}
        first['*'] = {'*'}
        
        pass # I'm using this to indicate that the function is not complete 

    
    def plus_terminal(self) -> None:
        """ Called when the + should be pushed onto the stack """
        raise NotImplementedError
    
    def star_terminal(self) -> None:
        """ Called when the * should be pushed onto the stack """
        raise NotImplementedError
    
    def zero_terminal(self) -> None:
        """ Called when the 0 should be pushed onto the stack """
        raise NotImplementedError
    
    def one_terminal(self) -> None:
        """ Called when the 1 should be pushed onto the stack """
        raise NotImplementedError
    
    def two_terminal(self) -> None:
        """ Called when the 2 should be pushed onto the stack """
        raise NotImplementedError
    
    def three_terminal(self) -> None:
        """ Called when the 3 should be pushed onto the stack """
        raise NotImplementedError

Note that the FIRST set of the RHS of a production is just the terminal if the RHS is a terminal. Note further that the functions for the terminals still don't do anything.

We've been setting up the Parser class to know how to handle the stack and the input but we haven't defined the input. In the next cell, we'll do three things: 
- we'll define a _parse_ function that calls the starting nonterminal and lets the recursion happen
- we'll pass an input to this parse function so that it can read and advance the input stream, and
- we'll push things onto the stack or pop things from the stack using the instructions in the Parse table

We're used to seeing the input as a string, but the starter code uses Python _iterators_. Using iterators isn't necessary when the input is a string, but when the input is a list of tokens, iterators are very convenient. A good prompt for a Google search is "Python iterators tutorial" and a good prompt for an LLM query is "Teach me how to use iterators in Python". The important thing to keep in mind is that iterators work with a built-in function called _next_. When _next_ is called, two things happen: first, the first unprocessed value is returned; second, the iterator advances and points to the next unprocessed value. If we try to advance to the next input and every value has been processed, a _StopIteration_ exception is raised.
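
As a quick illustration of this behavior, the short sketch below walks an iterator over a two-character string. The variable names `stream`, `first`, and `second` are chosen only for this example.

```python
stream = iter("+2")       # an iterator over two characters

first = next(stream)      # returns '+', then points at '2'
second = next(stream)     # returns '2', nothing is left afterwards
print(first, second)      # + 2

try:
    next(stream)          # every value has already been processed
except StopIteration:
    print("no more input")
```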

In [25]:
from typing import Iterator
from typing import Callable as function
del Parser
class Parser:
    def __init__(self) -> None:
        ###################################
        # Tuple defining an LL(1) grammar #
        # • set of nonterminals           #
        # • set of terminals              #
        # • starting nonterminal          #
        # • set of productions            #
        ###################################
        self.nonterminals: set[str] = {'E', 'O', 'N'}  # set of nonterminals. Each nonterminal will have its own function
        self.terminals: set[str] = {'+', '*', '0', '1', '2', '3'}  # set of terminals
        self._starting_nonterminal: function[[], None] = self.E_nonterminal  # Starting nonterminal
        
    def parse(self, input: Iterator) -> None:
        self._input = input
        self._current_input = next(self._input) # Read the first char and advance the iterator
        self._starting_nonterminal() # Start the recursion by calling the starting nonterminal
        print("If no errors appeared the parse was successful")
 
    ################################################################
    # Each nonterminal gets its own function.                      #
    # The function knows which productions have the nonterminal on #
    # the left hand side of the production. The correct right      #
    # hand side of the production is chosen by looking at the      #
    # current input and the FIRST set of the right hand side       #
    ################################################################
    def E_nonterminal(self) -> None:
        """ Production E--> N | OEE """
        ################################################################
        # Define FIRST sets for the right hand side of each production #
        ################################################################
        first: dict[str, set[str]] = dict()
        first['OEE'] = {'+', '*'}
        first['N'] = {'0', '1', '2', '3'}

        if self._current_input in first['OEE']:
            # Push OEE onto the stack
            self.O_nonterminal()
            self.E_nonterminal()
            self.E_nonterminal()
        elif self._current_input in first['N']:
            # Push N onto the stack
            self.N_nonterminal()
        else: 
            print("Something illegal has happened")

        return # Pops this nonterminal from the stack
        
    def N_nonterminal(self) -> None:
        """ Production N --> 0 | 1 | 2 | 3 """
        first: dict[str, set[str]] = dict()
        first['0'] = {'0'}
        first['1'] = {'1'}
        first['2'] = {'2'}
        first['3'] = {'3'}

        if self._current_input in first['0']:
            # push zero onto the stack
            self.zero_terminal()
        elif self._current_input in first['1']:
            # push one onto the stack
            self.one_terminal()
        elif self._current_input in first['2']:
            # push two onto the stack
            self.two_terminal()
        elif self._current_input in first['3']:
            # push three onto the stack
            self.three_terminal()
        else: 
            print("Something illegal has happened")

        return # Pops this nonterminal from the stack
    
    def O_nonterminal(self) -> None:
        """ Production O --> + | * """
        first: dict[str, set[str]] = dict()
        first['+'] = {'+'}
        first['*'] = {'*'}
        
        if self._current_input in first['+']:
            # push + onto the stack
            self.plus_terminal()
        elif self._current_input in first['*']:
            # push * onto the stack
            self.star_terminal()
        else: 
            print("Something illegal has happened")

        return # Pops this nonterminal from the stack

    
    def plus_terminal(self) -> None:
        """ Called when the + should be pushed onto the stack """
        # This function is only called when the current input is +
        # Thus, this function is at the top of the stack when the current input is +
        # The parse table says we should advance the input and pop the stack
        # We advance the input using the next operator
        # We pop the stack by returning
        self._current_input = next(self._input) # Advance
        return # pop the stack
    
    def star_terminal(self) -> None:
        """ Called when the * should be pushed onto the stack """
        # This function is only called when the current input is *
        # Thus, this function is at the top of the stack when the current input is *
        # The parse table says we should advance the input and pop the stack
        # We advance the input using the next operator
        # We pop the stack by returning
        self._current_input = next(self._input) # Advance
        return # pop the stack
    
    def zero_terminal(self) -> None:
        """ Called when the 0 should be pushed onto the stack """
        # This function is only called when the current input is 0
        # Thus, this function is at the top of the stack when the current input is 0
        # The parse table says we should advance the input and pop the stack
        # We advance the input using the next operator
        # We pop the stack by returning
        self._current_input = next(self._input) # Advance
        return # pop the stack
    
    def one_terminal(self) -> None:
        """ Called when the 1 should be pushed onto the stack """
        # This function is only called when the current input is 1
        # Thus, this function is at the top of the stack when the current input is 1
        # The parse table says we should advance the input and pop the stack
        # We advance the input using the next operator
        # We pop the stack by returning
        self._current_input = next(self._input) # Advance
        return # pop the stack
    
    def two_terminal(self) -> None:
        """ Called when the 2 should be pushed onto the stack """
        # This function is only called when the current input is 2
        # Thus, this function is at the top of the stack when the current input is 2
        # The parse table says we should advance the input and pop the stack
        # We advance the input using the next operator
        # We pop the stack by returning
        self._current_input = next(self._input) # Advance
        return # pop the stack
    
    def three_terminal(self) -> None:
        """ Called when the 3 should be pushed onto the stack """
        # This function is only called when the current input is 3
        # Thus, this function is at the top of the stack when the current input is 3
        # The parse table says we should advance the input and pop the stack
        # We advance the input using the next operator
        # We pop the stack by returning
        self._current_input = next(self._input) # Advance
        return # pop the stack

Notice how the function for each nonterminal does the same kind of check. Let $X$ denote the nonterminal. When the nonterminal function is called, its job is to see which production involving $X$ should be applied. Assume that there are two productions with $X$ on the LHS:

- $X \rightarrow Y$
- $X \rightarrow Z$

Each nonterminal function checks whether the current input is in the FIRST set of the right-hand side of one of the productions. If so, it pushes that right-hand side ($Y$ or $Z$) onto the stack by calling the corresponding function. If the current input is not in any FIRST set, it reports an error.

Notice how the function for each terminal does the same thing: it advances the input and pops the stack.

If we've implemented this correctly, we should not see any errors when we parse the input "+23". At first glance, we might also expect an error when we parse the input "2".

In [26]:
parser = Parser()
input = iter("+23#")
parser.parse(input)

If no errors appeared the parse was successful


Try a different input

In [28]:
input = iter("2#")
parser.parse(input)

If no errors appeared the parse was successful


What happened? The parse was successful because you can derive "2" from the starting nonterminal. Here's the derivation: $E \Rightarrow N \Rightarrow 2$

Let's try an input that should fail.

In [30]:
input = iter("+2#")
parser.parse(input)

Something illegal has happened
If no errors appeared the parse was successful


And another input that should fail.

In [31]:
input = iter("+012#")
parser.parse(input)

If no errors appeared the parse was successful


That is not right. There were still inputs to be read. 

Let's spruce up the class to handle these errors. We'll do so with a useful Python feature: the ability to define our own exceptions. We'll begin by defining the exception, which is done by inheriting from Python's built-in `Exception` class.

In [32]:
class ParseErrorException(Exception):
    """ Class for parsing errors.
    
    A parse error is when any of the following errors occur:
       1. the current input is not in the FIRST set of any production
       2. we expect to see another input but all the inputs have been processed
       3. we have processed everything on the stack but still have inputs. """
    
    def __init__(self, message: str = "A parse error occurred ") -> None:
        super().__init__(message)
        self._message = message

This class allows us to exit the parser whenever we want and report what caused the error. The default error message is not very informative since it simply says that a parse error has occurred. We'll throw _ParseErrorException_ whenever we detect a parse error.  Look for it in two types of places in the code:
- in the nonterminal functions when an input isn't in any first set (replacing the simple print statement when there is an error)
- in the parse function where it checks to see if there are any more inputs to read
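
The second kind of error listed in the docstring, running out of input while more is expected, shows up in Python as a _StopIteration_ raised by _next_. One possible way to report it as a parse error is sketched below. The helper name `advance` is hypothetical and is not part of the Parser class in this notebook; the exception class is repeated in minimal form only so that the sketch runs on its own.

```python
from typing import Iterator


class ParseErrorException(Exception):
    """Minimal copy of the exception class so this sketch is self-contained."""


def advance(tokens: Iterator[str]) -> str:
    """Hypothetical helper: return the next input or raise a parse error."""
    try:
        return next(tokens)
    except StopIteration:
        # Translate the iterator's signal into the parser's own exception.
        raise ParseErrorException("Expected another input but none remain") from None


tokens = iter("+")
print(advance(tokens))  # +
try:
    advance(tokens)     # the iterator is exhausted
except ParseErrorException as error:
    print(error)        # Expected another input but none remain
```

A parser written this way could call `advance(self._input)` wherever it currently calls `next(self._input)`, so that an input missing its end marker is reported as a parse error.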


In [34]:
from typing import Iterator
from typing import Callable as function
class Parser:
    def __init__(self) -> None:
        ###################################
        # Tuple defining an LL(1) grammar #
        # • set of nonterminals           #
        # • set of terminals              #
        # • starting nonterminal          #
        # • set of productions            #
        ###################################
        self.nonterminals: set[str] = {'E', 'O', 'N'}  # set of nonterminals. Each nonterminal will have its own function
        self.terminals: set[str] = {'+', '*', '0', '1', '2', '3'}  # set of terminals
        self._starting_nonterminal: function[[], None] = self.E_nonterminal  # Starting nonterminal
        
    def parse(self, input: Iterator) -> None:
        self._input = input
        self._current_input = next(self._input) # Read the first char and advance the iterator
        self._starting_nonterminal() # Start the recursion by calling the starting nonterminal
        
        if self._current_input == '#':
            print("Parse successful")
        else:
            raise ParseErrorException("At least one input was unprocessed")
        
        return
    
 
    ################################################################
    # Each nonterminal gets its own function.                      #
    # The function knows which productions have the nonterminal on #
    # the left hand side of the production. The correct right      #
    # hand side of the production is chosen by looking at the      #
    # current input and the FIRST set of the right hand side       #
    ################################################################
    def E_nonterminal(self) -> None:
        """ Production E--> N | OEE """
        ################################################################
        # Define FIRST sets for the right hand side of each production #
        ################################################################
        first: dict[str, set[str]] = dict()
        first['OEE'] = {'+', '*'}
        first['N'] = {'0', '1', '2', '3'}

        if self._current_input in first['OEE']:
            # Push OEE onto the stack
            self.O_nonterminal()
            self.E_nonterminal()
            self.E_nonterminal()
        elif self._current_input in first['N']:
            # Push N onto the stack
            self.N_nonterminal()
        else: 
            error_message = f"Input {self._current_input} is not in the first set of nonterminal E"
            raise ParseErrorException(error_message)

        return # Pops this nonterminal from the stack
        
    def N_nonterminal(self) -> None:
        """ Production N --> 0 | 1 | 2 | 3 """
        first: dict[str, set[str]] = dict()
        first['0'] = {'0'}
        first['1'] = {'1'}
        first['2'] = {'2'}
        first['3'] = {'3'}

        if self._current_input in first['0']:
            # push zero onto the stack
            self.zero_terminal()
        elif self._current_input in first['1']:
            # push one onto the stack
            self.one_terminal()
        elif self._current_input in first['2']:
            # push two onto the stack
            self.two_terminal()
        elif self._current_input in first['3']:
            # push three onto the stack
            self.three_terminal()
        else: 
            error_message = f"Input {self._current_input} is not in the first set of nonterminal N"
            raise ParseErrorException(error_message)

        return # Pops this nonterminal from the stack
    
    def O_nonterminal(self) -> None:
        """ Production O --> + | * """
        first: dict[str, set[str]] = dict()
        first['+'] = {'+'}
        first['*'] = {'*'}
        
        if self._current_input in first['+']:
            # push + onto the stack
            self.plus_terminal()
        elif self._current_input in first['*']:
            # push * onto the stack
            self.star_terminal()
        else: 
            error_message = f"Input {self._current_input} is not in the first set of nonterminal O"
            raise ParseErrorException(error_message)

        return # Pops this nonterminal from the stack

    def plus_terminal(self) -> None:
        """ Called when the + should be pushed onto the stack """
        # This function is only called when the current input is +
        # Thus, this function is at the top of the stack when the current input is +
        # The parse table says we should advance the input and pop the stack
        # We advance the input using the next operator
        # We pop the stack by returning
        self._current_input = next(self._input) # Advance
        return # pop the stack
    
    def star_terminal(self) -> None:
        """ Called when the * should be pushed onto the stack """
        # This function is only called when the current input is *
        # Thus, this function is at the top of the stack when the current input is *
        # The parse table says we should advance the input and pop the stack
        # We advance the input using the next operator
        # We pop the stack by returning
        self._current_input = next(self._input) # Advance
        return # pop the stack
    
    def zero_terminal(self) -> None:
        """ Called when the 0 should be pushed onto the stack """
        # This function is only called when the current input is 0
        # Thus, this function is at the top of the stack when the current input is 0
        # The parse table says we should advance the input and pop the stack
        # We advance the input using the next operator
        # We pop the stack by returning
        self._current_input = next(self._input) # Advance
        return # pop the stack
    
    def one_terminal(self) -> None:
        """ Called when the 1 should be pushed onto the stack """
        # This function is only called when the current input is 1
        # Thus, this function is at the top of the stack when the current input is 1
        # The parse table says we should advance the input and pop the stack
        # We advance the input using the next operator
        # We pop the stack by returning
        self._current_input = next(self._input) # Advance
        return # pop the stack
    
    def two_terminal(self) -> None:
        """ Called when the 2 should be pushed onto the stack """
        # This function is only called when the current input is 2
        # Thus, this function is at the top of the stack when the current input is 2
        # The parse table says we should advance the input and pop the stack
        # We advance the input using the next operator
        # We pop the stack by returning
        self._current_input = next(self._input) # Advance
        return # pop the stack
    
    def three_terminal(self) -> None:
        """ Called when the 3 should be pushed onto the stack """
        # This function is only called when the current input is 3
        # Thus, this function is at the top of the stack when the current input is 3
        # The parse table says we should advance the input and pop the stack
        # We advance the input using the next operator
        # We pop the stack by returning
        self._current_input = next(self._input) # Advance
        return # pop the stack

Let's test four inputs again. We'll first let the exceptions propagate without handling them neatly.

In [35]:
del parser
parser = Parser()
input = iter("+23#")
parser.parse(input)

Parse successful


In [36]:
input = iter("2#")
parser.parse(input)

Parse successful


In [37]:
input = iter("+2#")
parser.parse(input)

ParseErrorException: Input # is not in the first set of nonterminal E

In [38]:
input = iter("+=12#")
parser.parse(input)

ParseErrorException: Input = is not in the first set of nonterminal E

Each exception was raised, and its message reports what caused the error. If we don't want to see the full trace, we can use a neat Python feature to catch exceptions that have been raised. A good LLM prompt is "Teach me how to use try and except in Python".

In [39]:
del parser
parser = Parser()
input = iter("+23#")
try:
    parser.parse(input)
except ParseErrorException as message:
    print(message)

Parse successful


In [40]:
input = iter("+#")
try:
    parser.parse(input)
except ParseErrorException as message:
    print(message)

Input # is not in the first set of nonterminal E


In [41]:
input = iter("+2#")
try:
    parser.parse(input)
except ParseErrorException as message:
    print(message)

Input # is not in the first set of nonterminal E


In [42]:
input = iter("+123#")
try:
    parser.parse(input)
except ParseErrorException as message:
    print(message)

At least one input was unprocessed


---

We can see that the code tells us whether a string can be derived from a grammar or not, but it might not be clear how the call stack is being used. Let's repeat the first test case "+23" but do it in debug mode in the browser. In the following cell, click on the left of the `parser.parse(input)` line. Then, mouse over the left arrow and select "Debug cell". Go to the top of the window and click the down arrow. That will start stepping you through the code one line at a time.

When you run the debugger, the pane on the left will show the call stack. Notice the following:
- how each call to a function pushes the function on the stack and each return from the function pops the function from the stack
- how the call stack automatically keeps track of which nonterminals have been processed and which have yet to be processed
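
If a debugger isn't available, the same push and pop behavior can be observed by logging it. The following is a rough sketch, not the notebook's Parser: `TracingParser`, `_push`, `_pop`, and `_terminal` are names invented for this illustration, the terminal functions are merged into one helper, and `ValueError` stands in for the parse error.

```python
from typing import Iterator


class TracingParser:
    """Compact stand-in for the Parser class that records pushes and pops."""

    def parse(self, tokens: Iterator[str]) -> list[str]:
        self._input = tokens
        self._current_input = next(self._input)
        self._depth = 0
        self.trace: list[str] = []
        self.E()
        return self.trace

    def _push(self, name: str) -> None:
        # Entering a function corresponds to pushing onto the call stack.
        self.trace.append("  " * self._depth + f"push {name}")
        self._depth += 1

    def _pop(self, name: str) -> None:
        # Returning from a function corresponds to popping the call stack.
        self._depth -= 1
        self.trace.append("  " * self._depth + f"pop  {name}")

    def E(self) -> None:
        self._push("E")
        if self._current_input in {"+", "*"}:
            self.O()
            self.E()
            self.E()
        elif self._current_input in {"0", "1", "2", "3"}:
            self.N()
        else:
            raise ValueError(f"unexpected input {self._current_input!r}")
        self._pop("E")

    def O(self) -> None:
        self._push("O")
        self._terminal()
        self._pop("O")

    def N(self) -> None:
        self._push("N")
        self._terminal()
        self._pop("N")

    def _terminal(self) -> None:
        symbol = self._current_input
        self._push(symbol)
        self._current_input = next(self._input)  # advance the input
        self._pop(symbol)


print("\n".join(TracingParser().parse(iter("+23#"))))
```

The indentation of each printed line is the depth of the call stack at that moment, so the nesting of the output mirrors what the debugger's call stack pane shows.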


In [43]:
input = iter("+23#")
try:
    parser.parse(input)
except ParseErrorException as message:
    print(message)

Parse successful


---

#### Method 2: Matching inputs

You've probably noticed that there is a lot of redundancy across the terminal functions. Each function does the same thing: advances the input and pops the stack. We can do this more efficiently.

You've probably also noticed that the FIRST set for each terminal contains a single element. Testing membership in a one-element set is a roundabout way of checking equality, so we can compare the input with the terminal directly.

Finally, you've probably also noticed that we don't actually ever use the explicit list of terminals and non-terminals created in the `__init__` function. We'll just remove that part of the code. It was only included so that you could see how the code matched the mathematical tuple.

Let's make the code more efficient by directly checking whether an input matches what we expect to see. If it does, we'll advance the input without first pushing a terminal onto the stack just to pop it off again. If it does not, we'll raise an error.
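
Before looking at the class itself, here is one way the matching idea might look in isolation. This is only a sketch under stated assumptions: the class name `MatchingParser` and the helper name `match` are invented here, and `ValueError` stands in for the parse error so that the block runs on its own.

```python
from typing import Iterator


class MatchingParser:
    """Sketch of the matching idea for the same grammar; names are illustrative."""

    def parse(self, tokens: Iterator[str]) -> None:
        self._input = tokens
        self._current_input = next(self._input)
        self.E_nonterminal()
        if self._current_input != "#":
            raise ValueError("At least one input was unprocessed")

    def match(self, expected: str) -> None:
        """Advance the input if it equals `expected`; otherwise raise."""
        if self._current_input != expected:
            raise ValueError(f"Expected {expected!r}, saw {self._current_input!r}")
        self._current_input = next(self._input)

    def E_nonterminal(self) -> None:
        """ Production E --> N | OEE """
        if self._current_input in {"+", "*"}:
            self.O_nonterminal()
            self.E_nonterminal()
            self.E_nonterminal()
        elif self._current_input in {"0", "1", "2", "3"}:
            self.N_nonterminal()
        else:
            raise ValueError(f"Input {self._current_input!r} cannot start E")

    def O_nonterminal(self) -> None:
        """ Production O --> + | * """
        if self._current_input == "+":
            self.match("+")
        elif self._current_input == "*":
            self.match("*")
        else:
            raise ValueError(f"Input {self._current_input!r} cannot start O")

    def N_nonterminal(self) -> None:
        """ Production N --> 0 | 1 | 2 | 3 """
        for digit in "0123":
            if self._current_input == digit:
                self.match(digit)
                return
        raise ValueError(f"Input {self._current_input!r} cannot start N")


MatchingParser().parse(iter("+23#"))  # completes without raising
```

The six terminal functions collapse into the single `match` helper, which is the redundancy this method aims to remove.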

In [47]:
from typing import Iterator
from typing import Callable as function
class Parser:
    def __init__(self) -> None:
        self._starting_nonterminal: function[[], None] = self.E_nonterminal  # Starting nonterminal
        
    def parse(self, input: Iterator) -> None:
        self._input = input
        self._current_input = next(self._input) # Read the first char and advance the iterator
        self._starting_nonterminal() # Start the recursion by calling the starting nonterminal
        
        if self._current_input == '#':
            print("Parse successful")
        else:
            raise ParseErrorException("At least one input was unprocessed")
        
        return
    
 
    ################################################################
    # Each nonterminal gets its own function.                      #
    # The function knows which productions have the nonterminal on #
    # the left hand side of the production. The correct right      #
    # hand side of the production is chosen by looking at the      #
    # current input and the FIRST set of the right hand side       #
    ################################################################
    def E_nonterminal(self) -> None:
        """ Production E--> N | OEE """
        ################################################################
        # Define FIRST sets for the right hand side of each production #
        ################################################################
        first: dict[str, set[str]] = dict()
        first['OEE'] = {'+', '*'}  # FIRST(OEE) = FIRST(O) = {+, *}
        first['N'] = {'0', '1', '2', '3'}

        if self._current_input in first['OEE']:
            # Push OEE onto the stack
            self.O_nonterminal()
            self.E_nonterminal()
            self.E_nonterminal()
        elif self._current_input in first['N']:
            # Push N onto the stack
            self.N_nonterminal()
        else: 
            error_message = f"Input {self._current_input} is not in the first set of nonterminal E"
            raise ParseErrorException(error_message)

        return # Pops this nonterminal from the stack
        
    def N_nonterminal(self) -> None:
        """ Production N --> 0 | 1 | 2 | 3 """
        ## Replace the following: 
        # first: dict[str, set[str]] = dict()
        # first['0'] = {'0'}
        # first['1'] = {'1'}
        # first['2'] = {'2'}
        # first['3'] = {'3'}

        if self._match('0') or self._match('1') or self._match('2') or self._match('3'):
            self._current_input = next(self._input)
        else: 
            error_message = f"Input {self._current_input} cannot be produced by nonterminal N"
            raise ParseErrorException(error_message)

        return # Pops this nonterminal from the stack
    
    def O_nonterminal(self) -> None:
        """ Production O --> + | * """
        ## Replace the following 
        # first: dict[str, set[str]] = dict()
        # first['+'] = {'+'}
        # first['*'] = {'*'}
        
        if self._match('+') or self._match('*'):
            self._current_input = next(self._input)
        else: 
            error_message = f"Input {self._current_input} cannot be produced by nonterminal O"
            raise ParseErrorException(error_message)

        return # Pops this nonterminal from the stack
   
    def _match(self, input:str) -> bool:
        return self._current_input == input

Notice how all the terminal functions are gone.

Notice the new `_match` function, which compares its argument to the current input and reports whether they match.

Notice how the input is advanced whenever the current input matches one of the terminals that can be produced by the nonterminal.
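
One possible further simplification, shown here only as a sketch, is to fold the check and the advance into a single helper so that each nonterminal function shrinks to one line. `CompactParser` and `_consume` are hypothetical names that do not appear in the notebook, and `ParseErrorException` is redefined as a stand-in so the block runs on its own. As with the code above, the input is assumed to end with `#`.

```python
from typing import Iterator


class ParseErrorException(Exception):
    """Stand-in for the notebook's exception class so this sketch is self-contained."""


class CompactParser:
    """Hypothetical variant of the Method 2 parser for E -> N | OEE."""

    def parse(self, input: Iterator[str]) -> bool:
        self._input = input
        self._current_input = next(self._input)  # Read the first char
        self._E()                                # Start at the starting nonterminal
        if self._current_input != '#':
            raise ParseErrorException("At least one input was unprocessed")
        return True

    def _consume(self, expected: set[str], nonterminal: str) -> None:
        """Advance the input if it is one of the expected terminals; otherwise raise."""
        if self._current_input not in expected:
            error_message = f"Input {self._current_input} cannot be produced by nonterminal {nonterminal}"
            raise ParseErrorException(error_message)
        self._current_input = next(self._input)

    def _E(self) -> None:
        """ Production E --> N | OEE """
        if self._current_input in {'+', '*'}:        # FIRST(OEE)
            self._O()
            self._E()
            self._E()
        elif self._current_input in {'0', '1', '2', '3'}:  # FIRST(N)
            self._N()
        else:
            error_message = f"Input {self._current_input} is not in the first set of nonterminal E"
            raise ParseErrorException(error_message)

    def _O(self) -> None:
        """ Production O --> + | * """
        self._consume({'+', '*'}, 'O')

    def _N(self) -> None:
        """ Production N --> 0 | 1 | 2 | 3 """
        self._consume({'0', '1', '2', '3'}, 'N')


results = {}
for text in ["+23#", "*2+13#", "+123#", "+3#"]:
    try:
        CompactParser().parse(iter(text))
        results[text] = "Parse successful"
    except ParseErrorException as message:
        results[text] = str(message)
    print(text, "->", results[text])
```

The tradeoff is that `_consume` goes back to a set membership test, so this version favors brevity over the direct equality checks used by `_match`.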

Let's test.

In [48]:
del parser
parser = Parser()
input = iter("+23#")
try:
    parser.parse(input)
except ParseErrorException as message:
    print(message)

Parse successful


In [49]:
input = iter("2#")
try:
    parser.parse(input)
except ParseErrorException as message:
    print(message)

Parse successful


In [50]:
input = iter("+123#")
try:
    parser.parse(input)
except ParseErrorException as message:
    print(message)

At least one input was unprocessed


In [51]:
input = iter("+3#")
try:
    parser.parse(input)
except ParseErrorException as message:
    print(message)

Input # is not in the first set of nonterminal E
