# Daily Blog #79 - Converting Regular Expressions to NFA
### July 18, 2025 


#### **(RE ➝ ε-NFA using Thompson’s Construction)**

This is a core skill in automata theory: converting a **regular expression** into an **ε-NFA (Nondeterministic Finite Automaton with ε-transitions)**.

This is one classic way a regex gets compiled under the hood: the engine builds a state machine that accepts exactly the strings described by the RE.

---

### **Key Idea:**

Thompson’s Construction builds an ε-NFA **recursively**, based on the structure of the regular expression.
Each operation (`|`, `.`, `*`) has a standard construction template. We combine them to build more complex machines.

---

## **Basic Building Blocks**

Start by defining the ε-NFA for each of the **basic elements**:

### 1. **Single Symbol `a`**

**RE**: `a`
**NFA**: two states, one transition

```
(q0) --a--> (q1)
```

* q0: start
* q1: accepting

---

### 2. **Empty String `ε`**

**RE**: `ε`

```
(q0) --ε--> (q1)
```

As before, q0 is the start state and q1 is accepting, so this machine accepts only the empty string.
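
The two base cases can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the tuple layout `(start, end, edges)`, the names `symbol`, `epsilon`, `fresh`, and the use of `None` as the ε label are all assumptions made for this sketch.

```python
from itertools import count

EPS = None        # label marking an ε-transition (a convention of this sketch)
fresh = count()   # hands out unused state numbers: 0, 1, 2, ...

def symbol(ch):
    """ε-NFA for a single symbol: (start) --ch--> (end)."""
    start, end = next(fresh), next(fresh)
    return (start, end, [(start, ch, end)])

def epsilon():
    """ε-NFA for the empty string: (start) --ε--> (end)."""
    start, end = next(fresh), next(fresh)
    return (start, end, [(start, EPS, end)])

print(symbol("a"))   # (0, 1, [(0, 'a', 1)])
print(epsilon())     # (2, 3, [(2, None, 3)])
```

Every fragment has exactly one start and one accepting state, which is what makes the composite templates easy to wire together.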

---

## **Composite Operations**

These are recursive patterns you apply when combining REs.

---

### 3. **Concatenation `AB`**

If:

* A’s ε-NFA: `(start_A) ... (end_A)`
* B’s ε-NFA: `(start_B) ... (end_B)`

Then connect:
`end_A --ε--> start_B`

Result:

```
(start_A) ... --ε--> (start_B) ... (end_B)
```

end_A stops being accepting and instead leads into B: the combined machine starts at start_A and accepts only at end_B.
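
As a rough sketch, concatenation is a single extra ε-edge. The fragment layout `(start, end, edges)` and the helper names `symbol`, `concat`, and `fresh` are assumptions of this example, with `None` standing in for ε.

```python
from itertools import count

EPS = None        # label marking an ε-transition (a convention of this sketch)
fresh = count()   # hands out unused state numbers

def symbol(ch):
    """ε-NFA for a single symbol: (start) --ch--> (end)."""
    start, end = next(fresh), next(fresh)
    return (start, end, [(start, ch, end)])

def concat(a, b):
    """ε-NFA for AB: link end_A to start_B with one ε-transition."""
    a_start, a_end, a_edges = a
    b_start, b_end, b_edges = b
    link = [(a_end, EPS, b_start)]
    # New start is A's start; new accepting state is B's end.
    return (a_start, b_end, a_edges + b_edges + link)

print(concat(symbol("a"), symbol("b")))
# (0, 3, [(0, 'a', 1), (2, 'b', 3), (1, None, 2)])
```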

---

### 4. **Union (OR) `A|B`**

* Create a new start state and new final state
* Add ε-transitions:

  * From new start → start\_A and start\_B
  * From end\_A and end\_B → new end

```
           ε               ε
       -->[start_A]-->...-->[end_A]
     /                                \
(qs)                                  (qe)
     \                                /
       -->[start_B]-->...-->[end_B]
           ε               ε
```

This lets the NFA choose either path nondeterministically.
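
The split-and-merge template might look like this in Python. It is only a sketch: the `(start, end, edges)` tuple layout and the names `symbol`, `union`, and `fresh` are assumptions of this example, and `None` marks an ε-transition.

```python
from itertools import count

EPS = None        # label marking an ε-transition (a convention of this sketch)
fresh = count()   # hands out unused state numbers

def symbol(ch):
    """ε-NFA for a single symbol: (start) --ch--> (end)."""
    start, end = next(fresh), next(fresh)
    return (start, end, [(start, ch, end)])

def union(a, b):
    """ε-NFA for A|B: new start splits into A and B, both merge into new end."""
    a_start, a_end, a_edges = a
    b_start, b_end, b_edges = b
    qs, qe = next(fresh), next(fresh)
    split = [(qs, EPS, a_start), (qs, EPS, b_start)]
    merge = [(a_end, EPS, qe), (b_end, EPS, qe)]
    return (qs, qe, a_edges + b_edges + split + merge)

a_or_b = union(symbol("a"), symbol("b"))
print(len(a_or_b[2]))   # 6 edges: 2 symbol edges + 4 ε-edges
```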

---

### 5. **Kleene Star `A*`**

* Create new start and end states
* ε-transitions:

  * From new start → old start
  * From new start → new end (to allow 0 repetitions)
  * From old end → old start (loop)
  * From old end → new end

```
                +--------ε-------+
                v                |
(qs) --ε--> [start_A]-->...-->[end_A] --ε--> (qe)
  |                                            ^
  +----------------------ε---------------------+
```

Allows repeating A any number of times (including 0).
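
The four ε-transitions listed above translate almost line for line into code. This is a hedged sketch: the `(start, end, edges)` layout and the names `symbol`, `star`, and `fresh` are assumptions of this example, with `None` as the ε label.

```python
from itertools import count

EPS = None        # label marking an ε-transition (a convention of this sketch)
fresh = count()   # hands out unused state numbers

def symbol(ch):
    """ε-NFA for a single symbol: (start) --ch--> (end)."""
    start, end = next(fresh), next(fresh)
    return (start, end, [(start, ch, end)])

def star(a):
    """ε-NFA for A*: enter, skip, loop back, or exit via ε-transitions."""
    a_start, a_end, a_edges = a
    qs, qe = next(fresh), next(fresh)
    new_edges = [
        (qs, EPS, a_start),      # new start -> old start
        (qs, EPS, qe),           # new start -> new end (0 repetitions)
        (a_end, EPS, a_start),   # old end -> old start (loop)
        (a_end, EPS, qe),        # old end -> new end
    ]
    return (qs, qe, a_edges + new_edges)

a_star = star(symbol("a"))
print(len(a_star[2]))   # 5 edges: 1 symbol edge + 4 ε-edges
```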

---

## Thompson’s Construction: Strategy Recap

| Regex Symbol | NFA Structure                           |
| ------------ | --------------------------------------- |
| `a`          | `q0 --a--> q1`                          |
| `ε`          | `q0 --ε--> q1`                          |
| `A.B`        | Connect end of A to start of B          |
| `A\|B`       | New start and end; split and merge      |
| `A*`         | Loop from end back to start; allow skip |

---

### Practice Example: Convert `(a|b)*abb` to ε-NFA

Step-by-step breakdown:

#### Step 1: Build the ε-NFA for `a|b`

This creates:

```
        ε      a      ε
     -->[q1]------>[q2]--\
   /                        \
(q0)                          (q5)
   \                        /
     -->[q3]------>[q4]--/
        ε      b      ε
```

#### Step 2: Apply Kleene Star: `(a|b)*`

* Add a new start (q\_start) and a new end (q\_end)
* Enter the loop: q\_start → q0 (original start)
* Loop back: q5 → q0
* Exit: q5 → q\_end
* Add skip path (q\_start → q\_end)

#### Step 3: Add concatenation with `a`, then `b`, then another `b`

Each symbol gets its own two-state machine, and the three are chained with ε-transitions. Schematically:

```
--a-->--b-->--b-->
```

Connect q\_end, the final state of `(a|b)*`, to the start of this sequence with an ε-transition.

The only accepting state of the finished machine is the end state of the last `b`.
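
To check the result, the whole example can be assembled and run in Python. This is a sketch under stated assumptions: fragments are `(start, end, edges)` tuples, `None` is the ε label, and all helper names (`symbol`, `concat`, `union`, `star`, `closure`, `accepts`) are made up for this illustration. The simulation tracks the set of reachable states and follows ε-transitions between input symbols.

```python
from itertools import count

EPS = None        # label marking an ε-transition (a convention of this sketch)
fresh = count()   # hands out unused state numbers

def symbol(ch):
    s, e = next(fresh), next(fresh)
    return (s, e, [(s, ch, e)])

def concat(a, b):
    return (a[0], b[1], a[2] + b[2] + [(a[1], EPS, b[0])])

def union(a, b):
    s, e = next(fresh), next(fresh)
    new = [(s, EPS, a[0]), (s, EPS, b[0]), (a[1], EPS, e), (b[1], EPS, e)]
    return (s, e, a[2] + b[2] + new)

def star(a):
    s, e = next(fresh), next(fresh)
    new = [(s, EPS, a[0]), (s, EPS, e), (a[1], EPS, a[0]), (a[1], EPS, e)]
    return (s, e, a[2] + new)

def closure(states, edges):
    """All states reachable from `states` using only ε-transitions."""
    stack, seen = list(states), set(states)
    while stack:
        q = stack.pop()
        for src, label, dst in edges:
            if src == q and label is EPS and dst not in seen:
                seen.add(dst)
                stack.append(dst)
    return seen

def accepts(nfa, text):
    """Simulate the ε-NFA on `text` by tracking the set of current states."""
    start, end, edges = nfa
    current = closure({start}, edges)
    for ch in text:
        moved = {dst for src, label, dst in edges
                 if src in current and label == ch}
        current = closure(moved, edges)
    return end in current

# (a|b)*abb, assembled in the same order as Steps 1-3
nfa = star(union(symbol("a"), symbol("b")))
for ch in "abb":
    nfa = concat(nfa, symbol(ch))

print(accepts(nfa, "abb"))       # True
print(accepts(nfa, "babaabb"))   # True
print(accepts(nfa, "ab"))        # False
```

The edge list is scanned linearly on every step, which keeps the sketch short; a real implementation would more likely index transitions by source state.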

