# Chapter 4-B Bottom-Up Parsing

## Table of contents:
1. [Bottom-up parsing](#bottom-up-parsing)
1. [LR(k) grammars](#lr-k-grammars)
1. [Extended PDA](#extended-pda)
    1. [CFG → Extended PDA](#cfg-to-extended-pda)
    1. [Two sources of nondeterminism](#two-sources-of-nondeterminism)
    1. [Actions](#extended-pda-actions)
    1. [Example](#extended-pda-example)
1. [Deterministic LR(k) parser](#deterministic-lrk-parser)
    1. [Definition](#deterministic-lrk-parser-def)
    1. [Types of Conflicts](#deterministic-lrk-parser-conflicts)

## Bottom-up parsing <a class="anchor" id="bottom-up-parsing"></a>

Given a CFG $G = (N, \Sigma, P, S)$ and an input string $w$, construct a parse tree for $w$ in $G$ bottom-up, i.e., start with $w$ and build the tree upward in order to reduce the partially constructed subtrees to the start symbol $S$ of $G$.

<img src="./res/04b/4b-1.png" width="300px" alt="Reduce w to S by using rules of G" title="Reduce w to S by using rules of G"/>

## LR(k) grammars <a class="anchor" id="lr-k-grammars"></a>
A (proper) subclass of CFGs that permits deterministic, backtracking-free, bottom-up construction of a parse tree in $O(n)$ time by using:
* **L**eft-to-right scan of input symbols,
* **R**ightmost derivations in reverse, and
* **$k$** lookahead symbols.

#### Example: A rightmost derivation and its reverse

$
\begin{eqnarray}
S &\rightarrow& AB \\
A &\rightarrow& AB \mid a \\
B &\rightarrow& Bb \mid b
\end{eqnarray}
$

$w=abbb$


##### Rightmost derivation

$
\begin{eqnarray}
S &\Rightarrow_{rm}& AB \\
&\Rightarrow_{rm}& ABb \\
&\Rightarrow_{rm}& ABbb \\
&\Rightarrow_{rm}& Abbb \\
&\Rightarrow_{rm}& abbb
\end{eqnarray}
$

<img src="./res/04b/4b-2.png" width="500px" alt="Rightmost derivation" title="Rightmost derivation"/>

##### Reverse rightmost derivation

There exists a way to construct a parse tree top-down by using a rightmost derivation if and only if there exists a way to construct the same tree bottom-up by using the reverse of such a rightmost derivation.
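As a rough illustration, the sketch below searches for one rightmost derivation of $w = abbb$ in the example grammar and then reads it backwards, which is the order in which a bottom-up parser performs its reductions. The function name `rightmost_derivation` and the breadth-first strategy are assumptions made for this example, not part of the chapter. The grammar is ambiguous, so other rightmost derivations of $w$ exist; the search returns a shortest one.

```python
from collections import deque

# Example grammar: S -> AB, A -> AB | a, B -> Bb | b (one character per symbol).
RULES = [("S", "AB"), ("A", "AB"), ("A", "a"), ("B", "Bb"), ("B", "b")]
NONTERMINALS = {"S", "A", "B"}

def rightmost_derivation(w, start="S"):
    """Breadth-first search for one rightmost derivation of w (None if there is none)."""
    queue = deque([[start]])
    while queue:
        forms = queue.popleft()
        form = forms[-1]
        if form == w:
            return forms
        # Index of the rightmost nonterminal, if any remains.
        idx = max((i for i, c in enumerate(form) if c in NONTERMINALS), default=None)
        if idx is None:
            continue  # a terminal string other than w: dead end
        for lhs, rhs in RULES:
            if lhs != form[idx]:
                continue
            new = form[:idx] + rhs + form[idx + 1:]
            # No rule of this grammar shrinks a form, so longer forms cannot yield w.
            if len(new) <= len(w):
                queue.append(forms + [new])
    return None

forms = rightmost_derivation("abbb")
# Reading the derivation backwards gives the bottom-up reduction order.
for lower, upper in zip(forms[::-1], forms[-2::-1]):
    print(f"{lower}  is reduced to  {upper}")
```

The length check relies on the grammar having no $\varepsilon$-rules; a grammar with $\varepsilon$-rules would need a different pruning criterion.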

<img src="./res/04b/4b-3.png" width="500px" alt="Reverse rightmost derivation" title="Reverse rightmost derivation"/>

## Extended PDA <a class="anchor" id="extended-pda"></a>

An extended PDA is a PDA $M = (Q, \Sigma, \Gamma, \delta, q_0, Z_0, F)$ that may scan and replace zero or more stack symbols in a single move. Its transition function therefore has the form $\delta$: $Q \times (\Sigma \cup \{\varepsilon\}) \times \Gamma^* \rightarrow 2^{Q\times\Gamma^*}$.

### CFG → Extended PDA <a class="anchor" id="cfg-to-extended-pda"></a>

Given a CFG $G = (N, \Sigma, P, S)$, we construct an extended PDA M such that $L(M) = L(G)$. For any input $w \in \Sigma^*$, $M$ will simulate the reverse of a rightmost derivation of $w$ in $G$ and accept $w$ if and only if $w$ can be generated by $G$.

(1) $M$ checks whether the input string $w$ can be reduced to the start symbol $S$. This is verified if $S$ alone remains on the stack (above $Z_0$) once all symbols of $w$ have been consumed.

<img src="./res/04b/4b-4.png" width="900px" alt="Depiction of point 1" title="Depiction of point 1"/>

(2) Suppose that $w = xy$ and $S \Rightarrow_{rm}^* \alpha Ay \Rightarrow_{rm} \alpha \beta y \Rightarrow_{rm}^* xy$. If $M$ has successfully simulated the reverse of the part of this rightmost derivation that reduces $x$ to $\alpha \beta$, then the corresponding configuration of $M$ is $(q, x{\uparrow}y, Z_0\alpha\beta)$, where the rightmost symbol of $\beta$ is the top stack symbol. $M$ still needs to verify that $\alpha \beta y$ can be reduced to $S$.

<img src="./res/04b/4b-5.png" width="900px" alt="Depiction of point 2" title="Depiction of point 2"/>

(3) If $A\rightarrow\beta$ is a rule and $\beta$ is on top of the stack, then reduce $\beta$ to $A$. This string $\beta$ in stack that is ready to be reduced is called a **handle** (RHS of a rule that is exposed at the top of the stack, ready to be reduced).

(4) Note that this REDUCE action takes place in the stack. Move symbols of $y$ to the stack (this is called the SHIFT action) as necessary so that they can become a part of the RHS of a rule.

### Two sources of nondeterminism <a class="anchor" id="two-sources-of-nondeterminism"></a>

* Identifying the handle – If $\gamma = \alpha_1 \beta_1 = \alpha_2 \beta_2 = \cdots = \alpha_k \beta_k$ is the stack content and each $\beta_i$ is the RHS of a rule, which $\beta_i$ is the handle?
* If $\beta$ is the handle and $A_1 \rightarrow \beta, A_2 \rightarrow \beta, \dots, A_m \rightarrow \beta$ are rules of $G$, which $A_j$ must replace $\beta$ in the stack?
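Both sources of nondeterminism can be seen with a few lines of code. The sketch below, with the hypothetical helper name `candidate_reductions`, lists every rule whose RHS is a suffix of the current stack content. It uses the grammar $S \rightarrow AB$, $A \rightarrow AB \mid a$, $B \rightarrow Bb \mid b$ from the LR(k) example, with one character per grammar symbol and the stack top at the right end of the string.

```python
RULES = [("S", "AB"), ("A", "AB"), ("A", "a"), ("B", "Bb"), ("B", "b")]

def candidate_reductions(stack, rules=RULES):
    """All rules (A, beta) such that beta is a suffix of the stack content."""
    return [(lhs, rhs) for lhs, rhs in rules if stack.endswith(rhs)]

# First source: two different suffixes (Bb and b) could be the handle.
print(candidate_reductions("ABb"))  # [('B', 'Bb'), ('B', 'b')]

# Second source: one handle (AB), but two nonterminals could replace it.
print(candidate_reductions("AB"))   # [('S', 'AB'), ('A', 'AB')]
```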

### Actions <a class="anchor" id="extended-pda-actions"></a>

For now, our nondeterministic extended PDA is $M = (\{q_0,q_1\}, \Sigma, N \cup \Sigma \cup \{Z_0\}, \delta, q_0, Z_0, \{q_1\})$, where $\delta$ consists of the following four types of actions:
1. SHIFT action: $\forall a \in \Sigma, \delta(q_0, a, \varepsilon) = \{(q_0, a)\}$
2. REDUCE action: $\forall A \rightarrow \beta \in P, (q_0, A) \in \delta(q_0, \varepsilon, \beta)$
3. ACCEPT action: $\delta(q_0, \varepsilon, Z_0 S) = \{(q_1, \varepsilon)\}$
4. REJECT action: if there is no possible action in a non-accepting configuration
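The construction of $\delta$ is mechanical, which a short sketch can make concrete. The function name `build_delta` and the encoding (stack strings as tuples of symbols, `""` for $\varepsilon$ input) are assumptions chosen for this example. REJECT needs no entry, since it is simply the absence of an applicable transition.

```python
def build_delta(terminals, rules, start):
    """Transition table of the shift-reduce extended PDA.

    Keys are (state, input symbol or "" for epsilon, stack string popped);
    values are sets of (state, stack string pushed).
    """
    delta = {}
    for a in terminals:  # SHIFT: pop nothing, push the scanned terminal
        delta.setdefault(("q0", a, ()), set()).add(("q0", (a,)))
    for lhs, rhs in rules:  # REDUCE: pop the RHS, push the LHS
        delta.setdefault(("q0", "", tuple(rhs)), set()).add(("q0", (lhs,)))
    # ACCEPT: only Z0 S is left on the stack
    delta.setdefault(("q0", "", ("Z0", start)), set()).add(("q1", ()))
    return delta

# Grammar S -> AB, A -> AB | a, B -> Bb | b, one character per symbol.
RULES = [("S", "AB"), ("A", "AB"), ("A", "a"), ("B", "Bb"), ("B", "b")]
delta = build_delta("ab", RULES, "S")
for key in sorted(delta):
    print(key, "->", sorted(delta[key]))
```

The entry for the stack string `("A", "B")` contains two targets, one for $S \rightarrow AB$ and one for $A \rightarrow AB$, so the resulting PDA is nondeterministic.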

### Example <a class="anchor" id="extended-pda-example"></a>

$
\begin{eqnarray}
S &\rightarrow& aSbS \mid abA \mid a \\
A &\rightarrow& aA \mid \varepsilon
\end{eqnarray}
$

$M = (\{q_0,q_1\}, \{a, b\}, \{S, A, a, b, Z_0\}, \delta, q_0, Z_0, \{q_1\})$, where

$
\begin{eqnarray}
\delta(q_0,a,\varepsilon) &=& \{(q_0,a)\} \text{ // SHIFT terminal symbol a onto stack} \\
\delta(q_0,b,\varepsilon) &=& \{(q_0,b)\} \text{ // SHIFT terminal symbol b onto stack} \\
\delta(q_0,\varepsilon,aSbS) &=& \{(q_0, S)\} \text{ // REDUCE for S rules} \\
\delta(q_0,\varepsilon,abA) &=& \{(q_0, S)\} \text{ // REDUCE for S rules} \\
\delta(q_0,\varepsilon,a) &=& \{(q_0, S)\} \text{ // REDUCE for S rules} \\
\delta(q_0,\varepsilon,aA) &=& \{(q_0, A)\} \text{ // REDUCE for A rules} \\
\delta(q_0,\varepsilon,\varepsilon) &=& \{(q_0, A)\} \text{ // REDUCE for A rules} \\
\delta(q_0,\varepsilon,Z_0S) &=& \{(q_1, \varepsilon)\} \text{ // ACCEPT} \\
\end{eqnarray}
$

#### A rightmost derivation <a class="anchor" id="rightmost-derivation"></a>
$
\begin{eqnarray}
S &\Rightarrow_{rm}& aSbS \\
&\Rightarrow_{rm}& aSba \\
&\Rightarrow_{rm}& aabAba \\
&\Rightarrow_{rm}& aabaAba \\
&\Rightarrow_{rm}& aababa
\end{eqnarray}
$

#### A rightmost derivation tree <a class="anchor" id="rightmost-derivation-tree"></a>

The numbers in the diagram below each correspond to a step in obtaining the rightmost derivation of the string $w=aababa$.

<img src="./res/04b/4b-6.png" width="500px" alt="Rightmost Derivation Tree" title="Rightmost Derivation Tree"/>

#### A bottom-up parsing for $aababa$:

$
\begin{eqnarray}
(q_0, {\uparrow}aababa, Z_0) &\vdash& (q_0, a{\uparrow}ababa, Z_0a) &\text{ // SHIFT (part of the handle aSbS created in the stack)}& \\
&\vdash& (q_0, aa{\uparrow}baba, Z_0aa) &\text{ // SHIFT (part of the handle abA created in the stack)}& \\
&\vdash& (q_0, aab{\uparrow}aba, Z_0aab) &\text{ // SHIFT (part of the handle abA created in the stack)}& \\
&\vdash& (q_0, aaba{\uparrow}ba, Z_0aaba) &\text{ // SHIFT (part of the handle aA created in the stack)}& \\
&\vdash& (q_0, aaba{\uparrow}ba, Z_0aabaA) &\text{ // REDUCE (handle } \varepsilon \text{ reduced to A)}& \\
&\vdash& (q_0, aaba{\uparrow}ba, Z_0aabA) &\text{ // REDUCE (handle aA reduced to A)}& \\
&\vdash& (q_0, aaba{\uparrow}ba, Z_0aS) &\text{ // REDUCE (handle abA reduced to S)}& \\
&\vdash& (q_0, aabab{\uparrow}a, Z_0aSb) &\text{ // SHIFT (part of the handle aSbS created in the stack)}& \\
&\vdash& (q_0, aababa{\uparrow}, Z_0aSba) &\text{ // SHIFT (part of the handle a created in the stack)}& \\
&\vdash& (q_0, aababa{\uparrow}, Z_0aSbS) &\text{ // REDUCE (handle a reduced to S)}& \\
&\vdash& (q_0, aababa{\uparrow}, Z_0S) &\text{ // REDUCE (handle aSbS reduced to S)}& \\
&\vdash& (q_1, aababa{\uparrow}, \varepsilon) &\text{ // ACCEPT}&
\end{eqnarray}
$
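A trace like the one above can also be found mechanically by trying every SHIFT and REDUCE choice of $M$. The following is a minimal sketch under stated assumptions: the function name `parse` and the `max_stack` cap are hypothetical, and the cap is only there to stop the $\varepsilon$-reduction $A \rightarrow \varepsilon$ from growing the stack forever. The grammar is ambiguous for $aababa$, so the sequence found need not match the trace line by line.

```python
# Example grammar: S -> aSbS | abA | a, A -> aA | epsilon (one character per symbol).
RULES = [("S", "aSbS"), ("S", "abA"), ("S", "a"), ("A", "aA"), ("A", "")]

def parse(w, rules=RULES, start="S", max_stack=None):
    """Depth-first search over the nondeterministic choices of the extended PDA.

    Returns one accepting list of actions, or None if no sequence within
    the stack cap accepts w.
    """
    if max_stack is None:
        max_stack = len(w) + 2      # assumed cap, not part of the PDA definition
    seen = set()
    todo = [(0, "", [])]            # (symbols consumed, stack above Z0, actions so far)
    while todo:
        pos, stack, actions = todo.pop()
        if (pos, stack) in seen or len(stack) > max_stack:
            continue
        seen.add((pos, stack))
        if pos == len(w) and stack == start:      # ACCEPT: Z0 S, input consumed
            return actions + ["ACCEPT"]
        if pos < len(w):                          # SHIFT the next input symbol
            todo.append((pos + 1, stack + w[pos], actions + [f"SHIFT {w[pos]}"]))
        for lhs, rhs in rules:                    # REDUCE any RHS on top of the stack
            if stack.endswith(rhs):
                reduced = stack[:len(stack) - len(rhs)] + lhs
                todo.append((pos, reduced, actions + [f"REDUCE {lhs} -> {rhs or 'ε'}"]))
    return None                                   # REJECT: no choice leads to acceptance

for action in parse("aababa"):
    print(action)
```

Exhaustive search is exponential in general; the point of the LR(k) machinery in the next section is to replace this guessing with a table lookup.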

#### Example stack content

Because we know our target parse tree (see the image above), we SHIFT the first $a$ onto the stack. This is the beginning of the handle $aSbS$. We want to eventually REDUCE $aSbS$ to $S$.

The PDA technically allows other actions at this point, such as $\delta(q_0,\varepsilon,a) = \{(q_0, S)\}$ and $\delta(q_0,\varepsilon,\varepsilon) = \{(q_0, A)\}$, but these transitions would not follow the parse tree for the rightmost derivation.

<img src="./res/04b/4b-7.png" width="500px" alt="Example Stack Content" title="Example Stack Content"/>

## Deterministic LR(k) parser <a class="anchor" id="deterministic-lrk-parser"></a>

An extended PDA that can always decide, by using _$k$ lookahead symbols_ and the _LR(k) parse table_, whether to SHIFT, REDUCE, ACCEPT or REJECT. When the action is REDUCE, it also knows which portion of the stack content is the handle and which nonterminal must replace this handle in the stack.

### Def. A pair of the form $[A \rightarrow \beta{.}\gamma, x]$ is an _LR(k) item_ if $A \rightarrow \beta\gamma$ is a rule and $x \in \Sigma^*, |x| \le k$ <a class="anchor" id="deterministic-lrk-parser-def"></a>

* It is valid if there is a rightmost derivation $S \Rightarrow_{rm}^* \alpha{A}{w} \Rightarrow_{rm} \alpha\beta\gamma{w}$ and $x \in FIRST_k(w)$ (which implies $x \in FOLLOW_k(A)$).
* A valid item $[A \rightarrow \beta{.}\gamma, x]$ indicates that, with $\beta$ on top of the stack, we expect to add $\gamma$ into the stack and then reduce $\beta\gamma$ to $A$ upon looking ahead $x$ from the input tape.

#### Parse Tree Visualization

<img src="./res/04b/4b-8.png" width="600px" alt="Parse Tree Visualization" title="Parse Tree Visualization"/>

#### Corresponding PDA State

<img src="./res/04b/4b-9.png" width="600px" alt="Corresponding PDA State" title="Corresponding PDA State"/>

### Note. To remove the nondeterministic behavior of our extended PDA, we need only rule out two types of conflict: <a class="anchor" id="deterministic-lrk-parser-conflicts"></a>

#### SHIFT/REDUCE conflict

There should not be two items, $[A \rightarrow \beta{.}, x]$ indicating a valid REDUCE action and $[B \rightarrow \beta'{.}\gamma, y]$ with $\gamma\ne\varepsilon$ and $x \in FIRST_k(\gamma y)$ indicating a valid SHIFT action, at any point of parsing.

#### REDUCE/REDUCE conflict

There should not be two distinct items, $[A \rightarrow \beta{.}, x]$ and $[B \rightarrow \beta'{.}, x]$, indicating valid REDUCE actions, at any point of parsing.
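Both conditions can be checked mechanically on a set of items. The sketch below does so for $k = 1$ only. The names `Item`, `first1` and `conflicts` are hypothetical, the lookahead `""` stands for the end of the input, and the items in the usage lines are hand-picked for illustration and assumed to be valid at the same point of parsing.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Item:
    lhs: str   # A
    rhs: str   # beta gamma, one character per symbol
    dot: int   # beta = rhs[:dot], gamma = rhs[dot:]
    look: str  # lookahead x with |x| <= 1; "" means end of input

def first1(seq, rules, nonterminals):
    """FIRST_1 of a symbol string; "" in the result means seq can derive epsilon."""
    first = {A: set() for A in nonterminals}

    def of(s):
        out = set()
        for sym in s:
            if sym not in nonterminals:
                return out | {sym}
            out |= first[sym] - {""}
            if "" not in first[sym]:
                return out
        return out | {""}

    changed = True
    while changed:  # iterate to a fixpoint over the rules
        changed = False
        for lhs, rhs in rules:
            new = of(rhs) - first[lhs]
            if new:
                first[lhs] |= new
                changed = True
    return of(seq)

def conflicts(items, rules, nonterminals):
    """List SHIFT/REDUCE and REDUCE/REDUCE conflicts among the given items (k = 1)."""
    found = []
    for i, r in enumerate(items):
        if r.dot < len(r.rhs):
            continue  # r must be a REDUCE item [A -> beta., x]
        for j, s in enumerate(items):
            if s.dot < len(s.rhs):  # SHIFT item [B -> beta'.gamma, y], gamma nonempty
                if r.look in first1(s.rhs[s.dot:] + s.look, rules, nonterminals):
                    found.append(("SHIFT/REDUCE", r, s))
            elif j > i and s.look == r.look and (s.lhs, s.rhs) != (r.lhs, r.rhs):
                found.append(("REDUCE/REDUCE", r, s))
    return found

# Grammar S -> aSbS | abA | a, A -> aA | epsilon.
RULES = [("S", "aSbS"), ("S", "abA"), ("S", "a"), ("A", "aA"), ("A", "")]
items = [Item("S", "a", 1, "b"), Item("S", "abA", 1, "b"), Item("S", "aSbS", 1, "b")]
for kind, reduce_item, other in conflicts(items, RULES, {"S", "A"}):
    print(kind, reduce_item, other)
```

With these three items, the lookahead $b$ permits both reducing $a$ to $S$ and shifting $b$ toward $abA$, so one SHIFT/REDUCE conflict is reported. The item $[S \rightarrow a{.}SbS, b]$ does not conflict, because $FIRST_1(SbSb) = \{a\}$.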