# Exercises

## Topics to review
1. Construction of a parse tree given a CFG $G$ and input string $w$.
2. Scanning (tokenizing) a given input source program into tokens (i.e., `<token (lexeme), token type>`, e.g., `<i, [identifier]>`).
3. Showing the symbol table contents after parsing a given input source program.
4. Transforming a DFA into a scanner (using a block structure).
5. Three-address code, code optimization, and code generation.
6. Ambiguity.
7. CYK parse table construction.
8. Construction of a PDA simulating the leftmost derivation of a CFG $G$.
9. Construction of the LL(1) parse table for a CFG $G$ containing $FIRST_1$, $FOLLOW_1$ and $LOOKAHEAD_1$ sets.
10. Construction of the LL(2) parse table for a CFG $G$ containing $FIRST_2$, $FOLLOW_2$ and $LOOKAHEAD_2$ sets.
11. Running the LL(k) parser for a CFG $G$ on the input string $w$ to find an accepting computation. For non-LL(k) grammars, use rules leading to acceptance when there are multiple choices as a nondeterministic PDA does.

## Homework 1

1. Show how $w=aa{+}a{*}$ can be generated by the following CFG $G$:

$
\begin{align}
S &\rightarrow& SS+ \\
&\mid& SS* \\
&\mid& a
\end{align}
$
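
One possible answer, written as a leftmost derivation (a sketch; any derivation that yields $w$ is acceptable, and the rule applied at each step is noted on the right):

$
\begin{align}
S &\Rightarrow& SS* \text{ // apply } S \rightarrow SS* \\
&\Rightarrow& SS{+}S* \text{ // apply } S \rightarrow SS+ \text{ to the leftmost } S \\
&\Rightarrow& aS{+}S* \text{ // apply } S \rightarrow a \\
&\Rightarrow& aa{+}S* \text{ // apply } S \rightarrow a \\
&\Rightarrow& aa{+}a* \text{ // apply } S \rightarrow a
\end{align}
$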

2. Print each token of the following input program:
```
float limitedSquare(x) float x; {
/* returns x-squared, but never more than 100 */
return (x<=-10.0||x>=10.0)?100:x*x;
}
```

3. Show the symbol table contents of the previous program.
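
A minimal Python sketch of how the tokens asked for in question 2 could be printed, assuming a regex-based scanner. The token type names (`keyword`, `identifier`, `number`, `operator`, `separator`) and the choice to skip comments and whitespace are assumptions made for illustration; the course may expect different categories.

```python
import re

PROGRAM = """float limitedSquare(x) float x; {
/* returns x-squared, but never more than 100 */
return (x<=-10.0||x>=10.0)?100:x*x;
}
"""

# Hypothetical token classes, tried in this order at each position.
TOKEN_SPEC = [
    ("comment",    r"/\*.*?\*/"),
    ("keyword",    r"\b(?:float|return)\b"),
    ("identifier", r"[A-Za-z_]\w*"),
    ("number",     r"\d+(?:\.\d+)?"),
    ("operator",   r"<=|>=|\|\||[-+*/<>?:]"),
    ("separator",  r"[(){};]"),
    ("whitespace", r"\s+"),
]
MASTER = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC),
    re.DOTALL,
)

def tokenize(source):
    """Return a list of (lexeme, token type) pairs, skipping comments and whitespace."""
    tokens = []
    pos = 0
    while pos < len(source):
        match = MASTER.match(source, pos)
        if match is None:
            raise SyntaxError(f"unexpected character {source[pos]!r} at offset {pos}")
        if match.lastgroup not in ("comment", "whitespace"):
            tokens.append((match.group(), match.lastgroup))
        pos = match.end()
    return tokens

TOKENS = tokenize(PROGRAM)
for lexeme, kind in TOKENS:
    print(f"<{lexeme}, [{kind}]>")
```

The identifiers collected this way (`limitedSquare`, `x`) are the natural candidates for symbol table entries in question 3; which attributes to record for each entry depends on the conventions used in class.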

## Homework 2

1. Construct the CYK parse table for the following CFG and input string $w = baabab$:

$
\begin{align}
S &\rightarrow& AB \mid AS \\
A &\rightarrow& BA \mid SA \mid a \\
B &\rightarrow& AB \mid b
\end{align}
$

2. Construct a PDA simulating the leftmost derivation of the following CFG:

$
\begin{align}
S &\rightarrow& aSbB \\
S &\rightarrow& Aa \\
A &\rightarrow& aA \\
A &\rightarrow& SA \\
A &\rightarrow& a \\
B &\rightarrow& AB \\
B &\rightarrow& bBC \\
B &\rightarrow& b \\
C &\rightarrow& abCS \\
C &\rightarrow& \varepsilon
\end{align}
$

3. Construct the LL(1) parse table for the CFG given in (2) containing $FIRST_1$, $FOLLOW_1$ and $LOOKAHEAD_1$ sets. Is this an LL(1) grammar? Justify your answer.

4. Construct the LL(2) parse table for the CFG given in (2) containing $FIRST_2$, $FOLLOW_2$ and $LOOKAHEAD_2$ sets. Is this an LL(2) grammar? Justify your answer.

5. Run the LL(2) parser on the input string aaaabbb to find an accepting computation. If the grammar in (2) is not LL(2), use a rule leading to acceptance when there are multiple choices as a nondeterministic PDA does.

## More exercises in the style of Homework 2

### 1. Consider the following CFG $G$.

$
\begin{align}
S &\rightarrow& AB \mid XB \mid \varepsilon \\
T &\rightarrow& AB \mid XB \\
X &\rightarrow& AT \\
A &\rightarrow& a \\
B &\rightarrow& b
\end{align}
$

a. Is $w = aaabb$ in $L(G)$? Construct a CYK parse table to answer this question.

### No
| Length | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ |
|--------|-------|-------|-------|-------|-------|
| 5      | X     |       |       |       |       |
| 4      |       | S,T   |       |       |       |
| 3      |       | X     |       |       |       |
| 2      |       |       | S,T   |       |       |
| 1      | A     | A     | A     | B     | B     |
|        | a     | a     | a     | b     | b     |

Observe that $S \notin T[1,5]$ (top-left cell).
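
The table-filling procedure can be sketched in Python as follows. This is an illustrative sketch, not the course's reference implementation: the function names and the rule encoding are assumptions, positions are 0-based (so the top-left cell is `table[n][0]`), and the rule $S \rightarrow \varepsilon$ is handled as a special case for the empty string only.

```python
def cyk_table(word, binary_rules, terminal_rules):
    """table[length][start] = nonterminals deriving word[start:start+length]."""
    n = len(word)
    table = {length: [set() for _ in range(n - length + 1)]
             for length in range(1, n + 1)}
    # Row 1: rules of the form A -> a.
    for start, symbol in enumerate(word):
        table[1][start] = {head for head, terminal in terminal_rules
                           if terminal == symbol}
    # Longer substrings: try every split point and every rule A -> BC.
    for length in range(2, n + 1):
        for start in range(n - length + 1):
            for split in range(1, length):
                left = table[split][start]
                right = table[length - split][start + split]
                for head, (first, second) in binary_rules:
                    if first in left and second in right:
                        table[length][start].add(head)
    return table

# Grammar of this exercise (S -> epsilon is handled in accepts()).
BINARY = [("S", ("A", "B")), ("S", ("X", "B")),
          ("T", ("A", "B")), ("T", ("X", "B")),
          ("X", ("A", "T"))]
TERMINAL = [("A", "a"), ("B", "b")]

def accepts(word):
    if word == "":
        return True  # by S -> epsilon
    return "S" in cyk_table(word, BINARY, TERMINAL)[len(word)][0]

print(accepts("aaabb"))   # False
print(accepts("aaabbb"))  # True
```

For `aaabb` the top-left cell `cyk_table("aaabb", BINARY, TERMINAL)[5][0]` contains only `X`, which matches the table above.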

b. Is $w = aaabbb$ in $L(G)$? Construct a CYK parse table to answer this question.

### Yes

| Length | $a_1$ | $a_2$ | $a_3$ | $a_4$ | $a_5$ | $a_6$ |
|--------|-------|-------|-------|-------|-------|-------|
| 6      | S,T   |       |       |       |       |       |
| 5      | X     |       |       |       |       |       |
| 4      |       | S,T   |       |       |       |       |
| 3      |       | X     |       |       |       |       |
| 2      |       |       | S,T   |       |       |       |
| 1      | A     | A     | A     | B     | B     | B     |
|        | a     | a     | a     | b     | b     | b     |

Observe that $S \in T[1,6]$ (top-left cell).

c. Draw a parse tree for $w = aaabbb$ (you can use the CYK parse table as reference).

<img src="./res/ex/ex-1.png" width="300px" alt="Solution"/>

<br />

### 2. Consider the following CFG $G$. 

$
\begin{align}
E &\rightarrow& TE' \\
E' &\rightarrow& {+}TE' \\
&\mid& \varepsilon \\
T &\rightarrow& FT' \\
T' &\rightarrow& {*}FT' \\
&\mid& \varepsilon \\
F &\rightarrow& (E) \\
&\mid& [id]
\end{align}
$

a. Construct a PDA simulating the leftmost derivation of $G$.

$M = (\{q_0,q_1\}, \{[id],+,*,(,)\}, \{E,E',T,T',F,[id],+,*,(,),Z_0\}, \delta, q_0, Z_0, \{q_1\})$, where

$
\begin{align}
\delta(q_0,\varepsilon,Z_0) = \{(q_1, EZ_0)\}\ \\
\delta(q_1,\varepsilon,E) = \{(q_1, TE')\} \\
\delta(q_1,\varepsilon,E') = \{(q_1, +TE'), (q_1, \varepsilon)\} \\
\delta(q_1,\varepsilon,T) = \{(q_1, FT')\} \\
\delta(q_1,\varepsilon,T') = \{(q_1, *FT'), (q_1, \varepsilon)\} \\
\delta(q_1,\varepsilon,F) = \{(q_1, (E)), (q_1, [id])\} \\
\delta(q_1,[id],[id]) = \{(q_1, \varepsilon)\} \\
\delta(q_1,+,+) = \{(q_1, \varepsilon)\} \\
\delta(q_1,*,*) = \{(q_1, \varepsilon)\} \\
\delta(q_1,(,() = \{(q_1, \varepsilon)\} \\
\delta(q_1,),)) = \{(q_1, \varepsilon)\} \\
\delta(q_1,\varepsilon,Z_0) = \{(q_1, \varepsilon)\}
\end{align}
$

b. Construct the LL(1) parse table for $G$ containing $FIRST_1$, $FOLLOW_1$ and $LOOKAHEAD_1$ sets. Is this an LL(1) grammar? Justify your answer.

| N    | Rule | $FIRST_1$ | $FOLLOW_1$ | $LOOKAHEAD_1$ |
|------|------|-----------|------------|---------------|
| $E$  | (1)  | (, [id]   | ε, )       | (, [id]       |
| $E'$ | (2)  | +         | ε, )       | +             |
|      | (3)  | ε         |            | ε, )          |
| $T$  | (4)  | (, [id]   | ε, +, )    | (, [id]       |
| $T'$ | (5)  | *         | ε, +, )    | *             |
|      | (6)  | ε         |            | ε, +, )       |
| $F$  | (7)  | (         | ε, +, ), * | (             |
|      | (8)  | [id]      |            | [id]          |

c. Run the LL(1) parser on the input string `([id] + [id]) * [id]` to find an accepting computation. If the grammar is not LL(1), use a rule leading to acceptance when there are multiple choices as a nondeterministic PDA does.

$
\begin{align}
(q_0, {\uparrow}([id]+[id])*[id], Z_0) &\vdash& (q_1, {\uparrow}([id]+[id])*[id], EZ_0) \text{ // Push E onto stack} \\
&\vdash& (q_1, {\uparrow}([id]+[id])*[id], TE'Z_0) \\
&\vdash& (q_1, {\uparrow}([id]+[id])*[id], FT'E'Z_0) \\
&\vdash& (q_1, {\uparrow}([id]+[id])*[id], (E)T'E'Z_0) \\
&\vdash& (q_1, ({\uparrow}[id]+[id])*[id], E)T'E'Z_0) \\
&\vdash& (q_1, ({\uparrow}[id]+[id])*[id], TE')T'E'Z_0) \\
&\vdash& (q_1, ({\uparrow}[id]+[id])*[id], FT'E')T'E'Z_0) \\
&\vdash& (q_1, ({\uparrow}[id]+[id])*[id], [id]T'E')T'E'Z_0) \\
&\vdash& (q_1, ([id]{\uparrow}+[id])*[id], T'E')T'E'Z_0) \\
&\vdash& (q_1, ([id]{\uparrow}+[id])*[id], E')T'E'Z_0) \\
&\vdash& (q_1, ([id]{\uparrow}+[id])*[id], +TE')T'E'Z_0) \\
&\vdash& (q_1, ([id]+{\uparrow}[id])*[id], TE')T'E'Z_0) \\
&\vdash& (q_1, ([id]+{\uparrow}[id])*[id], FT'E')T'E'Z_0) \\
&\vdash& (q_1, ([id]+{\uparrow}[id])*[id], [id]T'E')T'E'Z_0) \\
&\vdash& (q_1, ([id]+[id]{\uparrow})*[id], T'E')T'E'Z_0) \\
&\vdash& (q_1, ([id]+[id]{\uparrow})*[id], E')T'E'Z_0) \\
&\vdash& (q_1, ([id]+[id]{\uparrow})*[id], )T'E'Z_0) \\
&\vdash& (q_1, ([id]+[id]){\uparrow}*[id], T'E'Z_0) \\
&\vdash& (q_1, ([id]+[id]){\uparrow}*[id], *FT'E'Z_0) \\
&\vdash& (q_1, ([id]+[id])*{\uparrow}[id], FT'E'Z_0) \\
&\vdash& (q_1, ([id]+[id])*{\uparrow}[id], [id]T'E'Z_0) \\
&\vdash& (q_1, ([id]+[id])*[id]{\uparrow}, T'E'Z_0) \\
&\vdash& (q_1, ([id]+[id])*[id]{\uparrow}, E'Z_0) \\
&\vdash& (q_1, ([id]+[id])*[id]{\uparrow}, Z_0) \\
&\vdash& (q_1, ([id]+[id])*[id]{\uparrow}, \varepsilon) \\
\end{align}
$
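
The computation above can be reproduced with a table-driven parser. The following Python sketch hard-codes the parse table from the $LOOKAHEAD_1$ column; the names `TABLE` and `ll1_parse`, the list-based stack, and the use of `""` for the ε lookahead (no input left) are assumptions made for illustration.

```python
END = ""  # epsilon as lookahead: no input left

# (nonterminal, lookahead) -> right-hand side, copied from the LOOKAHEAD_1 column.
TABLE = {
    ("E", "("): ["T", "E'"], ("E", "[id]"): ["T", "E'"],
    ("E'", "+"): ["+", "T", "E'"], ("E'", END): [], ("E'", ")"): [],
    ("T", "("): ["F", "T'"], ("T", "[id]"): ["F", "T'"],
    ("T'", "*"): ["*", "F", "T'"],
    ("T'", END): [], ("T'", "+"): [], ("T'", ")"): [],
    ("F", "("): ["(", "E", ")"], ("F", "[id]"): ["[id]"],
}
NONTERMINALS = {"E", "E'", "T", "T'", "F"}

def ll1_parse(tokens):
    """Return the (remaining input, stack) configurations of the run.

    Raises SyntaxError if the input is rejected. The stack top is index 0.
    """
    stack = ["E", "Z0"]
    pos = 0
    trace = [(tokens[pos:], list(stack))]
    while stack != ["Z0"]:
        top = stack[0]
        lookahead = tokens[pos] if pos < len(tokens) else END
        if top in NONTERMINALS:
            if (top, lookahead) not in TABLE:
                raise SyntaxError(f"no rule for {top} on lookahead {lookahead!r}")
            stack[0:1] = TABLE[(top, lookahead)]  # expand the nonterminal
        elif top == lookahead:
            stack.pop(0)                          # match a terminal
            pos += 1
        else:
            raise SyntaxError(f"expected {top!r}, found {lookahead!r}")
        trace.append((tokens[pos:], list(stack)))
    if pos != len(tokens):
        raise SyntaxError("input left over after the stack was emptied")
    return trace

TRACE = ll1_parse(["(", "[id]", "+", "[id]", ")", "*", "[id]"])
for remaining, stack in TRACE:
    print("".join(remaining) or "ε", "|", "".join(stack))
```

Each printed line corresponds to one configuration in state $q_1$, from stack $EZ_0$ down to $Z_0$; the final pop of $Z_0$ is left out of the sketch.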

<br />

### 3. Consider the following CFG $G$. 

$
\begin{align}
S &\rightarrow& aaB \\
&\mid& aaC \\
B &\rightarrow& b \\
C &\rightarrow& c
\end{align}
$

a. Construct a PDA simulating the leftmost derivation of $G$.

$M = (\{q_0,q_1\}, \{a,b,c\}, \{S,B,C,a,b,c,Z_0\}, \delta, q_0, Z_0, \{q_1\})$, where

$
\begin{align}
\delta(q_0,\varepsilon,Z_0) = \{(q_1, SZ_0)\}\ \\
\delta(q_1,\varepsilon,S) = \{(q_1, aaB), (q_1, aaC)\} \\
\delta(q_1,\varepsilon,B) = \{(q_1, b)\} \\
\delta(q_1,\varepsilon,C) = \{(q_1, c)\} \\
\delta(q_1,a,a) = \{(q_1, \varepsilon)\} \\
\delta(q_1,b,b) = \{(q_1, \varepsilon)\} \\
\delta(q_1,c,c) = \{(q_1, \varepsilon)\} \\
\delta(q_1,\varepsilon,Z_0) = \{(q_1, \varepsilon)\}
\end{align}
$

b. Construct the LL(1), LL(2), and LL(3) parse tables for $G$, containing the $FIRST_k$, $FOLLOW_k$ and $LOOKAHEAD_k$ sets for $k = 1, 2, 3$. What is the minimum value of $k$ for which the grammar is LL($k$)?

### LL(1) parse table

| N    | Rule | $FIRST_1$ | $FOLLOW_1$ | $LOOKAHEAD_1$ |
|------|------|-----------|------------|---------------|
| $S$  | (1)  | a         | ε          | a             |
|      | (2)  | a         | ε          | a             |
| $B$  | (3)  | b         | ε          | b             |
| $C$  | (4)  | c         | ε          | c             |

Not LL(1) since the $LOOKAHEAD_1$ sets of the two $S$ rules are not disjoint (both contain $a$).
  
### LL(2) parse table

| N    | Rule | $FIRST_2$ | $FOLLOW_2$ | $LOOKAHEAD_2$ |
|------|------|-----------|------------|---------------|
| $S$  | (1)  | aa        | ε          | aa            |
|      | (2)  | aa        | ε          | aa            |
| $B$  | (3)  | b         | ε          | b             |
| $C$  | (4)  | c         | ε          | c             |

Not LL(2) since the $LOOKAHEAD_2$ sets of the two $S$ rules are not disjoint (both contain $aa$).

### LL(3) parse table

| N    | Rule | $FIRST_3$ | $FOLLOW_3$ | $LOOKAHEAD_3$ |
|------|------|-----------|------------|---------------|
| $S$  | (1)  | aab       | ε          | aab           |
|      | (2)  | aac       | ε          | aac           |
| $B$  | (3)  | b         | ε          | b             |
| $C$  | (4)  | c         | ε          | c             |

The grammar is LL(3) since, for each nonterminal, the $LOOKAHEAD_3$ sets of its rules are pairwise disjoint. It is neither LL(1) nor LL(2), so the minimum value of $k$ is 3.
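
Because this grammar is non-recursive, the $LOOKAHEAD_k$ sets can be checked by brute force: enumerate every terminal string a rule can derive and keep its first $k$ symbols. The Python sketch below does that; it assumes $FOLLOW_k = \{\varepsilon\}$ for every nonterminal (true here, as the tables show), and the function names are made up for illustration. It does not work for recursive grammars.

```python
from itertools import product

RULES = {
    "S": [["a", "a", "B"], ["a", "a", "C"]],  # rules (1) and (2)
    "B": [["b"]],                             # rule (3)
    "C": [["c"]],                             # rule (4)
}

def strings_of(symbols):
    """All terminal strings derivable from a symbol sequence (non-recursive grammars only)."""
    options = []
    for symbol in symbols:
        if symbol in RULES:
            options.append([s for body in RULES[symbol] for s in strings_of(body)])
        else:
            options.append([symbol])
    return {"".join(parts) for parts in product(*options)}

def lookahead_sets(nonterminal, k):
    """LOOKAHEAD_k of each rule of `nonterminal`, assuming FOLLOW_k = {epsilon}."""
    return [{w[:k] for w in strings_of(body)} for body in RULES[nonterminal]]

def is_ll(k):
    """True if rules with the same left-hand side have pairwise disjoint LOOKAHEAD_k sets."""
    for nonterminal in RULES:
        sets = lookahead_sets(nonterminal, k)
        for i in range(len(sets)):
            for j in range(i + 1, len(sets)):
                if sets[i] & sets[j]:
                    return False
    return True

for k in (1, 2, 3):
    print(k, lookahead_sets("S", k), is_ll(k))
```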

c. Run the LL(3) parser on the input string `aab` to find an accepting computation. If the grammar is not LL(3), use a rule leading to acceptance when there are multiple choices as a nondeterministic PDA does.

$
\begin{align}
(q_0, {\uparrow}aab, Z_0) &\vdash& (q_1, {\uparrow}aab, SZ_0) \text{ // Push S onto stack} \\
&\vdash& (q_1, {\uparrow}aab, aaBZ_0) \\
&\vdash& (q_1, a{\uparrow}ab, aBZ_0) \\
&\vdash& (q_1, aa{\uparrow}b, BZ_0) \\
&\vdash& (q_1, aa{\uparrow}b, bZ_0) \\
&\vdash& (q_1, aab{\uparrow}, Z_0) \\
&\vdash& (q_1, aab{\uparrow}, \varepsilon) \\
\end{align}
$

## Other potential questions

### 1. Transform the given statement into intermediate (three-address) code/assembly code/machine code. Do not perform any optimizations.
```
A := B1 + C * 21
```
    
```
R1 := C * 21
R2 := B1 + R1
A  := R2
```
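
The translation can be sketched mechanically by walking the expression tree bottom-up and emitting one instruction per operator. The Python sketch below borrows Python's own `ast` module as a stand-in expression parser, which is an assumption made for brevity (a compiler would use its own parse tree); `three_address` and the `R1`, `R2`, ... naming scheme are likewise illustrative.

```python
import ast

OPERATORS = {ast.Add: "+", ast.Sub: "-", ast.Mult: "*", ast.Div: "/"}

def three_address(target, expression):
    """Emit unoptimized three-address code for `target := expression`."""
    code = []

    def emit(node):
        # Leaves are used directly as operands.
        if isinstance(node, ast.Name):
            return node.id
        if isinstance(node, ast.Constant):
            return str(node.value)
        # Inner nodes: evaluate both operands first, then store into a fresh temporary.
        if isinstance(node, ast.BinOp) and type(node.op) in OPERATORS:
            left = emit(node.left)
            right = emit(node.right)
            temp = f"R{len(code) + 1}"
            code.append(f"{temp} := {left} {OPERATORS[type(node.op)]} {right}")
            return temp
        raise ValueError("unsupported expression")

    result = emit(ast.parse(expression, mode="eval").body)
    code.append(f"{target} := {result}")
    return code

CODE = three_address("A", "B1 + C * 21")
print("\n".join(CODE))
```

Operator precedence comes from the parse tree: `C * 21` sits below `+`, so it is emitted first.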

### 2. Perform a local optimization on the code from the previous question.

Since `R2` is used only to copy its value into `A`, we can drop `R2` and assign `B1 + R1` directly to `A`.
    
```
R1 := C * 21
A := B1 + R1
```
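
This kind of rewrite can be sketched as a small peephole pass over the instruction list. The Python sketch below is illustrative only: `fold_copy` is a made-up name, instructions are plain `X := expr` strings, and it assumes a temporary is dead once the basic block ends, so it only checks for later uses inside the list it is given.

```python
import re

def fold_copy(code):
    """Merge `T := expr` followed by `X := T` into `X := expr` if T is not used later."""
    result = []
    i = 0
    while i < len(code):
        target, expr = [part.strip() for part in code[i].split(":=")]
        if i + 1 < len(code):
            next_target, next_expr = [part.strip() for part in code[i + 1].split(":=")]
            # Is the temporary read by any instruction after the copy?
            used_later = any(
                re.search(rf"\b{re.escape(target)}\b", line.split(":=")[1])
                for line in code[i + 2:]
            )
            if next_expr == target and not used_later:
                result.append(f"{next_target} := {expr}")
                i += 2
                continue
        result.append(f"{target} := {expr}")
        i += 1
    return result

OPTIMIZED = fold_copy(["R1 := C * 21", "R2 := B1 + R1", "A  := R2"])
print("\n".join(OPTIMIZED))
```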

### 3. Is the following CFG unambiguous? Why or why not?
* Strategy 1: If asked, it'll probably be an ambiguous grammar, so provide a string that has two parse trees.
> A context-free grammar is **ambiguous** if there are two or more parse trees (or leftmost/rightmost derivations) for some sentence. Otherwise, it is **unambiguous**.
* Strategy 2: If the grammar is unambiguous, maybe follow the steps described at https://cs.stackexchange.com/questions/2320/how-to-prove-that-a-grammar-is-unambiguous? We did not go over this (and I don't understand half of what they're talking about), so I don't believe this type of question (with an unambiguous grammar) will be asked.

### 4. Transform the given DFA to a scanner:

<img src="./res/03/3_4.png" width="400px" alt="DFA Example"/>
    
```
State-A:
    read(c)
    if (c == 'a') then goto State-B
    else if (c == 'b') then goto State-B
    else ERROR()
State-B:
    read(c)
    if (c == 'a') then goto State-D
    else if (c == 'b') then goto State-B
    else if (c == '$') then ACCEPT()
    else ERROR()
State-C:
    read(c)
    if (c == 'a') then goto State-A
    else if (c == 'b') then goto State-D
    else ERROR()
State-D:
    read(c)
    if (c == 'a') then goto State-B
    else if (c == 'b') then goto State-A
    else ERROR()
```
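
The same scanner can also be written in table-driven form. The Python sketch below copies the transitions from the block structure above; it assumes `State-A` is the start state (the figure is not reproduced here) and treats the end of the string as the end marker `$`, so only `State-B` accepts. The names `TRANSITIONS`, `ACCEPTING` and `scan` are made up for illustration.

```python
# (state, input symbol) -> next state, copied from the goto-style scanner.
TRANSITIONS = {
    ("A", "a"): "B", ("A", "b"): "B",
    ("B", "a"): "D", ("B", "b"): "B",
    ("C", "a"): "A", ("C", "b"): "D",
    ("D", "a"): "B", ("D", "b"): "A",
}
ACCEPTING = {"B"}  # State-B is the only state that accepts on '$'

def scan(text, start="A"):
    """Return True if the DFA accepts `text` (end of string plays the role of '$')."""
    state = start
    for c in text:
        if (state, c) not in TRANSITIONS:
            return False  # corresponds to ERROR()
        state = TRANSITIONS[(state, c)]
    return state in ACCEPTING

for sample in ["a", "ab", "aa", "aaa", ""]:
    print(repr(sample), scan(sample))
```

The goto version hard-codes each state as a block of code, while the table version keeps the control flow fixed and moves the automaton into data; both follow the same transitions.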