# Parsing

## Bottom-Up Parsing
- Starting with a string, we build the parse tree on top of it
 - Produces a rightmost derivation in reverse
- Can handle left recursion
- At each step we need to find the handle
 - The handle is the portion of the current mix of terminals and non-terminals that can be reduced to a non-terminal
 - Reducing the handle must be a step in a valid derivation
 - For the string __id__ + id \* id, __id__ is the handle because we can reduce it to get __F__ + id \* id
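
To make "rightmost derivation in reverse" concrete, here is a worked trace for id + id \* id. It assumes the usual expression grammar ($E \to E + T \mid T$, $T \to T * F \mid F$, $F \to (E) \mid id$); the rule shown beside each line is the one used to reduce the handle in the line above it. Read from the bottom up, the lines form a rightmost derivation.

$$
\begin{aligned}
& id + id * id \\
& F + id * id && (F \to id) \\
& T + id * id && (T \to F) \\
& E + id * id && (E \to T) \\
& E + F * id && (F \to id) \\
& E + T * id && (T \to F) \\
& E + T * F && (F \to id) \\
& E + T && (T \to T * F) \\
& E && (E \to E + T)
\end{aligned}
$$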

## Sentential Form and Phrases
- A sentential form is any line of a derivation, that is, a mix of terminals and non-terminals derived from the start symbol
- A phrase is any sequence of symbols that will eventually be reduced to a single non-terminal
    - All the leaves below one internal node of the parse tree
- A simple phrase is a phrase that can be reduced to one non-terminal in one step, that is, a subtree with a depth of 1
- A handle is the left-most simple phrase of a right sentential form
    - We are pruning the parse tree as we go up it
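
These definitions can be checked mechanically on a small parse tree. The sketch below is an illustration only: the tuple-based tree encoding and the helper names `leaves`, `phrases`, and `handle` are assumptions made for this example, and the tree is for the right sentential form `T + T * id` under the usual expression grammar ($E \to E + T \mid T$, $T \to T * F \mid F$, $F \to (E) \mid id$).

```python
# A node is (label, children); a leaf of the sentential form is a plain string.
# Leaves may be terminals (id, +, *) or non-terminals that are not yet expanded.

def leaves(node):
    """Return the symbols at the bottom of this subtree, left to right."""
    if isinstance(node, str):
        return [node]
    _, children = node
    result = []
    for child in children:
        result.extend(leaves(child))
    return result

def phrases(node):
    """Return (phrase, is_simple) pairs, one per internal node, in pre-order."""
    if isinstance(node, str):
        return []
    _, children = node
    # A simple phrase comes from a subtree of depth 1: every child is a leaf.
    is_simple = all(isinstance(child, str) for child in children)
    found = [(" ".join(leaves(node)), is_simple)]
    for child in children:
        found.extend(phrases(child))
    return found

def handle(tree):
    """Return the left-most simple phrase, or None if the tree is one leaf."""
    for phrase, is_simple in phrases(tree):
        if is_simple:
            return phrase
    return None

# Parse tree for T + T * id:  E -> E + T,  E -> T,  T -> T * F,  F -> id
tree = ("E", [("E", ["T"]), "+", ("T", ["T", "*", ("F", ["id"])])])

for phrase, is_simple in phrases(tree):
    print(phrase, "(simple)" if is_simple else "")
print("handle:", handle(tree))
```

Because simple phrases sit in disjoint subtrees, a left-to-right pre-order walk meets them in left-to-right order, so the first simple phrase found is the handle.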

## Phrases and Handles Practice
- Draw the parse tree and find the phrases and handles for each of the following right sentential forms, given the grammar
- S $\to$ AbB | bAc
- A $\to$ Ab | aBB
- B $\to$ Ac | cBb | c

- Exercises
    - aAcccbbc (as a class)
    - AbcaBccb
    

## Shift-Reduce

- The general algorithm used for bottom-up parsing
- Uses the LR parsing strategy
 - Scans the input from left to right
 - Builds a rightmost derivation in reverse
- Implemented using a parsing table and a stack
- Shift pushes the next token onto the stack, while reduce uses a rule of the grammar to replace the handle on top of the stack with a non-terminal
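
The two operations can be sketched as plain stack manipulations. This is a minimal illustration, assuming the usual expression grammar ($E \to E + T \mid T$, $T \to T * F \mid F$, $F \to id$); the function names `shift` and `reduce` are made up for the example, and the sequence of actions is chosen by hand here, whereas a real parser reads each action from the parsing table.

```python
def shift(stack, tokens):
    """Move the next input token onto the stack."""
    stack.append(tokens.pop(0))

def reduce(stack, lhs, rhs):
    """Replace the handle `rhs` on top of the stack with the non-terminal `lhs`."""
    if stack[-len(rhs):] != rhs:
        raise ValueError(f"{rhs} is not on top of the stack {stack}")
    del stack[-len(rhs):]
    stack.append(lhs)

tokens = ["id", "+", "id"]
stack = []

shift(stack, tokens)                 # stack: id
reduce(stack, "F", ["id"])           # stack: F
reduce(stack, "T", ["F"])            # stack: T
reduce(stack, "E", ["T"])            # stack: E
shift(stack, tokens)                 # stack: E +
shift(stack, tokens)                 # stack: E + id
reduce(stack, "F", ["id"])           # stack: E + F
reduce(stack, "T", ["F"])            # stack: E + T
reduce(stack, "E", ["E", "+", "T"])  # stack: E
```

The parse succeeds when the input is used up and only the start symbol is left on the stack.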

## Parse Tables

<div style="width:20%;float:left">
<p> For the grammar: </p>
<ol>
<li>$E \to E \, + \, T$</li>
<li>$E \to T$</li>
<li>$T \to T \, * \, F$</li>
<li>$T \to F$</li>
<li>$F \to (\, E \,)$</li>
<li>$F \to id$</li>
</ol>
</div>
<div style="width:80%;float:left">
<p>The parsing table is: </p>
<img style="width:60%" src="parsetable.jpg" />
</div>

## Shift-Reduce Algorithm

- Initialize the stack with state 0
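- Repeat until the action is accept or error
 - Using the state on top of the stack and the next input symbol, find the appropriate action in the table
  - If the action is shift, push that symbol and the new state onto the stack
  - If the action is reduce:
   - Pop the handle (the symbols of the rule's right-hand side, each with its state) as indicated by the rule in the table
   - Using the state now on top of the stack and the rule's left-hand-side non-terminal, look up the new state in the goto table
   - Push the non-terminal and the state from the goto table onto the stack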
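
The loop above can be sketched in Python as follows. The notes give the parsing table only as an image, so the `ACTION` and `GOTO` dictionaries below are an assumption: they are the standard SLR(1) table for this expression grammar (with rule 5 taken to be $F \to (E)$), and the state numbering is assumed to match the table shown in the notes. The names `RULES`, `ACTION`, `GOTO`, and `parse` are chosen for this example only.

```python
# Rule number -> (left-hand side, number of symbols on the right-hand side)
RULES = {
    1: ("E", 3),  # E -> E + T
    2: ("E", 1),  # E -> T
    3: ("T", 3),  # T -> T * F
    4: ("T", 1),  # T -> F
    5: ("F", 3),  # F -> ( E )
    6: ("F", 1),  # F -> id
}

def reduce_on(rule, terminals):
    """Build table entries that reduce by `rule` on each listed terminal."""
    return {t: f"r{rule}" for t in terminals}

# ACTION[state][terminal]: "s5" = shift and go to state 5, "r6" = reduce by rule 6
ACTION = {
    0: {"id": "s5", "(": "s4"},
    1: {"+": "s6", "$": "acc"},
    2: {"*": "s7", **reduce_on(2, "+)$")},
    3: reduce_on(4, "+*)$"),
    4: {"id": "s5", "(": "s4"},
    5: reduce_on(6, "+*)$"),
    6: {"id": "s5", "(": "s4"},
    7: {"id": "s5", "(": "s4"},
    8: {"+": "s6", ")": "s11"},
    9: {"*": "s7", **reduce_on(1, "+)$")},
    10: reduce_on(3, "+*)$"),
    11: reduce_on(5, "+*)$"),
}

# GOTO[state][non-terminal] -> new state
GOTO = {
    0: {"E": 1, "T": 2, "F": 3},
    4: {"E": 8, "T": 2, "F": 3},
    6: {"T": 9, "F": 3},
    7: {"F": 10},
}

def parse(tokens):
    """Return the list of actions taken; raise SyntaxError on an error entry."""
    tokens = list(tokens) + ["$"]  # end-of-input marker
    stack = [0]                    # initialize the stack with state 0
    pos = 0
    actions = []
    while True:
        state, symbol = stack[-1], tokens[pos]
        action = ACTION[state].get(symbol)
        if action is None:
            raise SyntaxError(f"unexpected {symbol!r} in state {state}")
        actions.append(action)
        if action == "acc":
            return actions
        if action[0] == "s":
            # Shift: push the symbol and the new state, then advance the input.
            stack += [symbol, int(action[1:])]
            pos += 1
        else:
            # Reduce: pop one (symbol, state) pair per right-hand-side symbol,
            # then push the left-hand side and the state from the goto table.
            lhs, rhs_len = RULES[int(action[1:])]
            del stack[-2 * rhs_len:]
            exposed_state = stack[-1]
            stack += [lhs, GOTO[exposed_state][lhs]]

print(parse(["id", "+", "id"]))
```

Keeping the symbols on the stack is optional, since the states alone determine every action, but it mirrors the stack layout used in these notes.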


## The Stack
- In shift-reduce parsing, the stack is always of the form

### $S_n$, $SYMBOL$, $S_m$, $SYMBOL$, $S_o$, ..... $S_x$

- Where $S_n$ is a state and `SYMBOL` is a terminal or non-terminal from the grammar
- This is sometimes written as 

###  $n$, $SYMBOL$, $m$, $SYMBOL$, $o$, ..... $x$

## Shift-Reduce Algorithm Practice

<div style="width:30%;float:left">
<p> Parse </p>
<table>
<thead>
<tr>
<th>Stack</th>
<th>Input</th>
<th>Action</th>
</tr>
</thead>
<tbody>
<tr>
<td>0</td>
<td>id \* id + id \$ </td>
<td></td>
</tr>
<!--<tr>
<td>0 id 5</td>
<td> \* id + id \$ </td>
<td>R6</td>
</tr>
<tr>
<td>0 F 3</td>
<td>\* id + id \$ </td>
<td>R4</td>
</tr>
<tr>
<td>0 T 2</td>
<td>\* id + id \$ </td>
<td>S7</td>
</tr>
<tr>
<td>0 T 2 \* 7</td>
<td>id + id \$ </td>
<td>S5</td>
</tr>
<tr>
<td>0 T 2 \* 7 id 5</td>
<td>+ id \$ </td>
<td>R6</td>
</tr>
<tr>
<td>0 T 2 \* 7 F 10</td>
<td>+ id \$ </td>
<td>R3</td>
</tr>
<tr>
<td>0 T 2 </td>
<td>+ id \$ </td>
<td>R2</td>
</tr>
<tr>
<td>0 E 1 </td>
<td>+ id \$ </td>
<td>S6</td>
</tr>
<tr>
<td>0 E 1  + 6</td>
<td>id \$ </td>
<td>S5</td>
</tr>
<tr>
<td>0 E 1  + 6 id 5</td>
<td>\$ </td>
<td>R6</td>
</tr>
<tr>
<td>0 E 1  + 6 F 3</td>
<td>\$ </td>
<td>R4</td>
</tr>
<tr>
<td>0 E 1  + 6 T 9</td>
<td>\$ </td>
<td>R1</td>
</tr>
<tr>
<td>0 E 1</td>
<td>\$ </td>
<td>accept</td>
</tr>-->


</tbody>
</table>
</div>
<div style="width:70%;float:left">
<p>The parsing table is: </p>
<img style="width:65%" src="parsetable.jpg" />
</div>

## Shift-Reduce Practice

Show the parse, including the stack, for __id \* ( id + id )__


<div style="width:40%;float:left">
<p> Grammar: </p>
<ol>
<li>$E \to E \, + \, T$</li>
<li>$E \to T$</li>
<li>$T \to T \, * \, F$</li>
<li>$T \to F$</li>
<li>$F \to (\, E \,) $</li>
<li>$F \to id$</li>
</ol>
</div>
<div style="width:60%;float:left">
<p>The parsing table is: </p>
<img style="width:90%" src="parsetable.jpg" />
</div>
<div style="width:35%;float:left">
<p>Parse:</p>
<table>
<thead>
<tr>
<th>Stack</th>
<th>Input</th>
<th>Action</th>
</tr>
</thead>
<tbody>
<!--<tr>
<td>0</td>
<td>id \* (id + id) \$ </td>
<td>S5</td>
</tr>
<tr>
<td>0 id 5</td>
<td> \* (id + id) \$ </td>
<td>R6</td>
</tr>
<tr>
<td>0 F 3</td>
<td>\* (id + id) \$ </td>
<td>R4</td>
</tr>
<tr>
<td>0 T 2</td>
<td>\* (id + id) \$ </td>
<td>S7</td>
</tr>
<tr>
<td>0 T 2 \* 7</td>
<td>(id + id) \$ </td>
<td>S4</td>
</tr>
<tr>
<td>0 T 2 \* 7 ( 4</td>
<td>id + id ) \$ </td>
<td>S5</td>
</tr>
<tr>
<td>0 T 2 \* 7 ( 4 id 5</td>
<td>+ id) \$ </td>
<td>R6</td>
</tr>
<tr>
<td>0 T 2 \* 7 ( 4 F 3 </td>
<td>+ id) \$ </td>
<td>R4</td>
</tr>
<tr>
<td>0 T 2 \* 7 ( 4 T 2  </td>
<td>+ id )\$ </td>
<td>R2</td>
</tr>
<tr>
<td>0 T 2 \* 7 ( 4 E 8 </td>
<td>+ id ) \$ </td>
<td>S6</td>
</tr>
<tr>
<td>0 T 2 \* 7 ( 4 E 8 + 6</td>
<td> id )\$ </td>
<td>S5</td>
</tr>
<tr>
<td>0 T 2 \* 7 ( 4 E 8 + 6 id 5</td>
<td> ) \$ </td>
<td>R6</td>
</tr>
<tr>
<td>0 T 2 \* 7 ( 4 E 8 + 6 F 3</td>
<td> ) \$ </td>
<td>R4</td>
</tr>
<tr>
<td>0 T 2 \* 7 ( 4 E 8 + 6 T 9 </td>
<td> ) \$ </td>
<td>R1</td>
</tr>
<tr>
<td>0 T 2 \* 7 ( 4 E 8</td>
<td> ) \$ </td>
<td>S11</td>
</tr>
<tr>
<td>0 T 2 \* 7 ( 4 E 8 ) 11 </td>
<td>\$ </td>
<td>R5</td>
</tr>
<tr>
<td>0 T 2 \* 7 F 10 </td>
<td>\$ </td>
<td>R3</td>
</tr>
<tr>
<td>0 T 2 </td>
<td>\$ </td>
<td>R2</td>
</tr>
<tr>
<td>0 E 1 </td>
<td>\$ </td>
<td>accept</td>
</tr>
-->

</tbody>
</table>
</div>

## Yacc
- Yacc stands for Yet Another Compiler-Compiler; it generates the tables needed to perform shift-reduce parsing
- The original Yacc was proprietary, but there is a free, largely compatible version known as `bison`
- The input is a modified BNF grammar with rules of the form
```c
    LHS:
        RHS
      | RHS
```

- The setup is a bit more involved than for `lex` and `flex`