# Daily Blog #83 - Parse Trees and Derivations in Context-Free Grammars
### July 22, 2025 

---

### **1. Overview**

In the study of **Context-Free Grammars (CFGs)**, two fundamental tools used to analyze the structure of generated strings are:

* **Derivations**: Step-by-step applications of production rules to generate strings.
* **Parse Trees** (also called **Derivation Trees**): Hierarchical, tree-like visual representations of derivations.

Both are used to check whether a string belongs to the language generated by a grammar, and they help reveal ambiguity, nested structure, and the underlying syntax of expressions.

---

### **2. Derivations**

#### **Definition:**

A **derivation** is a sequence of rule applications that starts from the start symbol and ends in a string of terminals. Each step replaces one non-terminal with the right-hand side of one of its productions.

There are two standard types:

#### **a. Leftmost Derivation (LMD)**:

Expands the **leftmost** non-terminal at every step.

#### **b. Rightmost Derivation (RMD)**:

Expands the **rightmost** non-terminal at every step.

---

#### **Example Grammar:**

Let G be defined as:

* **S → aSb | ab**

This grammar generates the language { aⁿbⁿ | n ≥ 1 }: `ab`, `aabb`, `aaabbb`, and so on.
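To make the generated language concrete, here is a minimal Python sketch that derives the n-th string by applying S → aSb exactly n - 1 times and then S → ab once. The function name `generate` is a hypothetical helper chosen for illustration, not part of any library.

```python
def generate(n: int) -> str:
    """Derive the string with n a's followed by n b's from S -> aSb | ab."""
    if n < 1:
        raise ValueError("the grammar generates no string for n < 1")
    form = "S"
    # Apply S -> aSb (n - 1) times to build the nesting.
    for _ in range(n - 1):
        form = form.replace("S", "aSb", 1)
    # Finish with S -> ab, which removes the last non-terminal.
    return form.replace("S", "ab", 1)


print([generate(n) for n in range(1, 4)])  # ['ab', 'aabb', 'aaabbb']
```

Every string produced this way has equally many `a`s and `b`s, with all `a`s first.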

---

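#### **Example Derivations for `aabb`:**

**Leftmost Derivation:**

1. S
2. → aSb (S → aSb)
3. → aabb (S → ab)

**Rightmost Derivation:**

1. S
2. → aSb (S → aSb)
3. → aabb (S → ab)

Both derivations produce the same string (`aabb`), and for this grammar they are identical step for step: every intermediate string contains at most one non-terminal, so the leftmost and the rightmost non-terminal are the same symbol. The two orders diverge only when an intermediate string holds two or more non-terminals, as happens with a rule such as E → E + E.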
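The derivation steps can also be traced mechanically. The sketch below is one possible way to do it, assuming every symbol is a single character and non-terminals are exactly the keys of a hypothetical `GRAMMAR` dictionary. `derive` takes a list of alternative indices and rewrites either the leftmost or the rightmost non-terminal at each step.

```python
GRAMMAR = {"S": ["aSb", "ab"]}  # S -> aSb | ab (assumed encoding)


def derive(start, choices, leftmost=True):
    """Return every intermediate string of a derivation.

    choices[i] is the index of the alternative applied at step i.
    """
    form = start
    steps = [form]
    for choice in choices:
        # Positions of all non-terminals in the current string.
        positions = [i for i, sym in enumerate(form) if sym in GRAMMAR]
        if not positions:
            raise ValueError("no non-terminal left to expand")
        i = positions[0] if leftmost else positions[-1]
        form = form[:i] + GRAMMAR[form[i]][choice] + form[i + 1:]
        steps.append(form)
    return steps


lmd = derive("S", [0, 1], leftmost=True)
rmd = derive("S", [0, 1], leftmost=False)
print(" => ".join(lmd))  # S => aSb => aabb
print(lmd == rmd)        # True
```

For this grammar the two modes give the same steps, because each intermediate string contains only one `S`. A grammar whose right-hand sides contain two non-terminals would make `positions[0]` and `positions[-1]` differ.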

---

### **3. Parse Trees**

#### **Definition:**

A **parse tree** is a tree diagram that represents the **syntactic structure** of a string derived from a CFG. Each **interior node** is a non-terminal, and each **leaf node** is a terminal (or ε if applicable).

#### **Key Properties:**

* Root node is the **start symbol**.
* Internal nodes are **non-terminals**.
* Leaves are **terminals** or **ε**.
* Children of a node correspond to symbols on the right-hand side of a production used for that non-terminal.
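These properties map naturally onto a small tree data structure. The following is a minimal sketch, with `Node` and `tree_yield` as hypothetical names: a node with no children is a leaf, and reading the leaves from left to right (the tree's yield) gives back the derived string.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    symbol: str                      # a non-terminal, a terminal, or "ε"
    children: List["Node"] = field(default_factory=list)


def tree_yield(node: Node) -> str:
    """Concatenate the leaves from left to right; an ε leaf adds nothing."""
    if not node.children:
        return "" if node.symbol == "ε" else node.symbol
    return "".join(tree_yield(child) for child in node.children)


# Tree for S -> aSb, with the inner S expanded by S -> ab.
tree = Node("S", [Node("a"), Node("S", [Node("a"), Node("b")]), Node("b")])
print(tree_yield(tree))  # aabb
```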

---

#### **Parse Tree for `aabb` Using the Grammar Above:**

```
        S
      / | \
     a  S  b
       / \
      a   b
```

This corresponds to the derivation:

* S → aSb
* → aabb

---

### **4. Ambiguity in Grammars**

A CFG is **ambiguous** if there exists a string in its language that has **more than one distinct parse tree** (equivalently, more than one leftmost derivation, or more than one rightmost derivation).

#### **Ambiguous Example:**

Grammar:

* E → E + E | E \* E | (E) | id

String: `id + id * id`

This string has **two parse trees**:

1. *(id + id) \* id*: `*` is at the root, so the addition is grouped (and evaluated) first
2. *id + (id \* id)*: `+` is at the root, so the multiplication is grouped first

This ambiguity is problematic in compilers because the two trees can evaluate to different results. It is usually resolved by rewriting the grammar or by declaring operator precedence and associativity.
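The number of parse trees for a token sequence can be counted directly. The sketch below is one way to do it for this specific grammar, using a hypothetical helper `count_parse_trees`: it tries every `+` or `*` as the root operator, handles the `(E)` rule, and memoizes sub-results. It assumes the input is already split into tokens.

```python
from functools import lru_cache


def count_parse_trees(tokens):
    """Count parse trees for tokens under E -> E + E | E * E | (E) | id."""
    tokens = tuple(tokens)

    @lru_cache(maxsize=None)
    def count(i, j):
        # Number of parse trees deriving tokens[i:j] from E.
        if j - i <= 0:
            return 0
        if j - i == 1:
            return 1 if tokens[i] == "id" else 0
        total = 0
        # E -> ( E )
        if tokens[i] == "(" and tokens[j - 1] == ")":
            total += count(i + 1, j - 1)
        # E -> E + E and E -> E * E: each operator position is a candidate root.
        for k in range(i + 1, j - 1):
            if tokens[k] in ("+", "*"):
                total += count(i, k) * count(k + 1, j)
        return total

    return count(0, len(tokens))


print(count_parse_trees("id + id * id".split()))      # 2
print(count_parse_trees("( id + id ) * id".split()))  # 1
```

A count above 1 for any string is enough to show that the grammar is ambiguous. Adding parentheses pins down the grouping, which is why the second call reports a single tree.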

---

### **5. Parse Tree vs Derivation: Comparison Table**

| Feature             | Parse Tree                         | Derivation                      |
| ------------------- | ---------------------------------- | ------------------------------- |
| Structure           | Hierarchical, tree-like            | Linear, sequential              |
| Visual Clarity      | Easy to see nesting and structure  | Better for tracing rule order   |
| Used in             | Syntax analysis, compilers         | Theoretical proofs, tracing     |
| Ambiguity Detection | Directly shows multiple tree forms | Indirect, requires comparing leftmost (or rightmost) derivations |

---

### **6. Parse Tree Construction: Algorithmic Steps**

Given a CFG and an input string:

1. **Start from the start symbol (S)**, which becomes the root of the tree.
2. **Apply production rules** according to a derivation (usually leftmost or rightmost).
3. **Each rule application adds a new branch**: the symbols on the right-hand side become the children of the expanded node.
4. **Continue until all leaves are terminals** that, read left to right, spell the input string.
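The steps above can be sketched as a small top-down parser for the example grammar S → aSb | ab. This is only one possible construction strategy (recursive descent with backtracking), hard-coded for this grammar. The names `parse_S` and `build_parse_tree` are hypothetical, and the tree is encoded as nested `(symbol, children)` tuples for brevity.

```python
def parse_S(s, pos=0):
    """Try to derive a prefix of s[pos:] from S.

    Returns (tree, next_pos) on success, or None on failure.
    """
    if pos < len(s) and s[pos] == "a":
        # First try S -> aSb: parse a nested S, then expect a closing 'b'.
        inner = parse_S(s, pos + 1)
        if inner is not None:
            subtree, nxt = inner
            if nxt < len(s) and s[nxt] == "b":
                return ("S", ["a", subtree, "b"]), nxt + 1
        # Otherwise fall back to S -> ab.
        if pos + 1 < len(s) and s[pos + 1] == "b":
            return ("S", ["a", "b"]), pos + 2
    return None


def build_parse_tree(s):
    """Return the parse tree for s, or None if s is not in the language."""
    result = parse_S(s)
    if result is None or result[1] != len(s):
        return None  # no derivation, or leftover input after the tree
    return result[0]


print(build_parse_tree("aabb"))  # ('S', ['a', ('S', ['a', 'b']), 'b'])
print(build_parse_tree("aab"))   # None
```

The final length check corresponds to step 4: the tree is accepted only if its leaves cover the whole input string.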

---

### **7. Summary**

| Concept        | Description                                                 |
| -------------- | ----------------------------------------------------------- |
| **Derivation** | Sequence of rule applications to produce a string           |
| **Leftmost**   | Always replace the leftmost non-terminal                    |
| **Rightmost**  | Always replace the rightmost non-terminal                   |
| **Parse Tree** | Hierarchical tree showing how a string is derived           |
| **Ambiguity**  | A grammar is ambiguous if some string has multiple parse trees |