# Daily Blog #78 - Regular Expressions (RE)
### July 17, 2025 

---

#### **What is a Regular Expression (RE)?**

A **regular expression** is a pattern that defines a **regular language**, that is, a set of strings built from single symbols using only union, concatenation, and repetition.

In theoretical computer science (not just Python), REs are a formal way to define patterns **recognized by finite automata** (DFA/NFA).

---

### **1. Regular Expressions Define Regular Languages**

Regular expressions are **equivalent in power** to DFAs and NFAs.
For every RE there is a DFA that accepts exactly the same language, and for every DFA there is a RE that describes its language.
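
To make the equivalence a bit more concrete, here is a small sketch that compares a hand-built DFA with a RE for the same language. The language ("even number of `a`s over `{a, b}`"), the state names, and the helper functions are my own choices for illustration, and Python's `re.fullmatch` stands in for the theoretical RE.

```python
import re
from itertools import product

# Hand-built DFA: the state tracks whether we have seen an even or odd number of a's.
DELTA = {
    ("even", "a"): "odd",
    ("even", "b"): "even",
    ("odd", "a"): "even",
    ("odd", "b"): "odd",
}

def dfa_accepts(s: str) -> bool:
    state = "even"                      # start state
    for ch in s:
        state = DELTA[(state, ch)]
    return state == "even"              # "even" is the only accepting state

# A RE for the same language: each block is a lone b, or a pair of a's with b's between.
EVEN_AS = re.compile(r"(b|ab*a)*")

def re_accepts(s: str) -> bool:
    return EVEN_AS.fullmatch(s) is not None

def all_strings(alphabet: str, max_len: int):
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)

# Compare both recognizers on every string up to length 8.
agree = all(dfa_accepts(s) == re_accepts(s) for s in all_strings("ab", 8))
print(agree)
```

Agreement on short strings is only a sanity check, not a proof. The real proof is the pair of constructions (RE to automaton, automaton to RE).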

---

### **2. Syntax: The Building Blocks**

| Symbol | Meaning                                    | Example | Matches                            |
| ------ | ------------------------------------------ | ------- | ---------------------------------- |
| `a`    | Literal character                          | `a`     | `a`                                |
| `ε`    | Empty string (zero characters)             | `ε`     | Only the empty string `""`         |
| `\|`   | Union (OR)                                 | `a\|b`  | `a` or `b`                         |
| `.`    | Concatenation (usually written implicitly) | `ab`    | `a` followed by `b`                |
| `*`    | Kleene star (zero or more repetitions)     | `a*`    | `""`, `a`, `aa`, `aaa`, etc.       |
| `+`    | One or more repetitions                    | `a+`    | `a`, `aa`, `aaa`, etc. (not `""`)  |
| `?`    | Zero or one occurrence (optional)          | `a?`    | `""`, `a`                          |
| `()`   | Grouping                                   | `(ab)*` | `""`, `ab`, `abab`, etc.           |

> Note: In theoretical RE (automata theory), we mostly use `ε`, `|`, `.`, and `*`. The operators `+` and `?` are shorthand: `a+` means `aa*`, and `a?` means `ε|a`.
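
Here is a quick way to poke at these operators in Python. This is only a sketch: the `matches` helper is my own name, and Python's `re` is a practical engine, not the theoretical notation. In particular, concatenation is implicit, `.` in Python means "any character", and I am using the empty pattern in place of `ε`.

```python
import re

def matches(pattern: str, s: str) -> bool:
    """True if the WHOLE string s is in the language of pattern."""
    return re.fullmatch(pattern, s) is not None

assert matches(r"a", "a")                                   # literal
assert matches(r"", "") and not matches(r"", "a")           # empty pattern plays the role of ε
assert matches(r"a|b", "a") and matches(r"a|b", "b")        # union
assert matches(r"ab", "ab") and not matches(r"ab", "ba")    # concatenation
assert matches(r"a*", "") and matches(r"a*", "aaa")         # Kleene star
assert matches(r"a+", "aa") and not matches(r"a+", "")      # one or more
assert matches(r"a?", "") and not matches(r"a?", "aa")      # zero or one
assert matches(r"(ab)*", "abab") and not matches(r"(ab)*", "aba")  # grouping

# + and ? add no power: a+ behaves like aa*, and a? like (ε|a).
for s in ["", "a", "aa", "aaa"]:
    assert matches(r"a+", s) == matches(r"aa*", s)
    assert matches(r"a?", s) == matches(r"(|a)", s)
```

Using `fullmatch` matters here: `re.search` or `re.match` would accept a string as soon as some part of it fits, which is not how language membership works in theory.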

---

### **3. Precedence Rules (Order of Operations)**

From highest to lowest:

1. **Kleene Star (`*`)**
2. **Concatenation (`.`)**
3. **Union (`|`)**

Example:
`a|bc*` is parsed as `a | (b (c*))`, **not** `(a|b)(c*)`
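
The same precedence holds in Python's `re`, so the parse can be checked directly. A small sketch (the `matches` helper and the sample strings are mine):

```python
import re

def matches(pattern: str, s: str) -> bool:
    return re.fullmatch(pattern, s) is not None

implicit = r"a|bc*"        # what we write
explicit = r"a|(b(c*))"    # how it is parsed
wrong = r"(a|b)(c*)"       # the tempting misreading

# The implicit and fully parenthesized forms agree on every sample.
for s in ["", "a", "b", "bc", "bccc", "ac", "acc"]:
    assert matches(implicit, s) == matches(explicit, s)

# "ac" separates the two readings: only the misreading accepts it.
print(matches(implicit, "ac"), matches(wrong, "ac"))  # False True
```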

---

### **4. Examples: What These Patterns Mean**

| Regular Expression | Matches                                                                   |
| ------------------ | ------------------------------------------------------------------------- |
| `a*`               | Any number of `a`s, including none                                        |
| `(ab)*`            | `""`, `ab`, `abab`, `ababab`, etc.                                        |
| `a\|b`             | Either `a` or `b`                                                         |
| `a(b\|c)`          | `ab` or `ac`                                                              |
| `(a\|b)*`          | Any string of `a`s and `b`s, including `""`                               |
| `a(a\|b)*b`        | Starts with `a`, ends with `b`, with any mix of `a`s and `b`s in between  |
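
One way to get a feel for a pattern is to list the shortest strings in its language. The sketch below does that by brute force; the `language` helper is my own, and it uses Python's `re.fullmatch` as a stand-in for the theoretical RE.

```python
import re
from itertools import product

def language(pattern: str, alphabet: str, max_len: int) -> list[str]:
    """All strings over alphabet, up to max_len, that the pattern fully matches."""
    rx = re.compile(pattern)
    found = []
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            s = "".join(chars)
            if rx.fullmatch(s):
                found.append(s)
    return found

print(language(r"(ab)*", "ab", 4))      # ['', 'ab', 'abab']
print(language(r"a(b|c)", "abc", 2))    # ['ab', 'ac']
print(language(r"a(a|b)*b", "ab", 3))   # ['ab', 'aab', 'abb']
```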

---

### **5. From RE to NFA to DFA**

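Theoretical REs are **converted to NFAs** (with ε-transitions) using **Thompson’s construction**, then (if needed) into a DFA via the subset construction. The DFA can be much larger: an NFA with n states may need up to 2^n DFA states.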
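
Below is a minimal sketch of the idea, not a full implementation. There is no parser, so the RE is assembled by calling one method per operator, and the `Builder` class, its method names, and the `(start, accept)` fragment tuples are all my own choices. Concatenation links fragments with an ε-edge, which is one common variant of Thompson's construction. The `accepts` method tracks a *set* of NFA states, which is the subset construction done on the fly.

```python
from collections import defaultdict

EPS = None  # label used for ε-transitions

class Builder:
    """Builds NFA fragments; each fragment is a (start, accept) pair of state ids."""

    def __init__(self):
        self.edges = defaultdict(list)   # state -> [(label, target), ...]
        self.count = 0

    def new_state(self) -> int:
        self.count += 1
        return self.count - 1

    def literal(self, ch):
        s, t = self.new_state(), self.new_state()
        self.edges[s].append((ch, t))
        return (s, t)

    def concat(self, f, g):
        self.edges[f[1]].append((EPS, g[0]))      # f's accept flows into g's start
        return (f[0], g[1])

    def union(self, f, g):
        s, t = self.new_state(), self.new_state()
        self.edges[s] += [(EPS, f[0]), (EPS, g[0])]
        self.edges[f[1]].append((EPS, t))
        self.edges[g[1]].append((EPS, t))
        return (s, t)

    def star(self, f):
        s, t = self.new_state(), self.new_state()
        self.edges[s] += [(EPS, f[0]), (EPS, t)]        # enter the loop, or skip it
        self.edges[f[1]] += [(EPS, f[0]), (EPS, t)]     # repeat, or leave
        return (s, t)

    def closure(self, states):
        """All states reachable from `states` using only ε-edges."""
        seen, stack = set(states), list(states)
        while stack:
            q = stack.pop()
            for label, dst in self.edges[q]:
                if label is EPS and dst not in seen:
                    seen.add(dst)
                    stack.append(dst)
        return seen

    def accepts(self, frag, s: str) -> bool:
        current = self.closure({frag[0]})
        for ch in s:
            moved = {dst for q in current for label, dst in self.edges[q] if label == ch}
            current = self.closure(moved)
        return frag[1] in current

# a(a|b)*b, assembled by hand
nfa = Builder()
middle = nfa.star(nfa.union(nfa.literal("a"), nfa.literal("b")))
frag = nfa.concat(nfa.concat(nfa.literal("a"), middle), nfa.literal("b"))

print(nfa.accepts(frag, "aabab"))  # True
print(nfa.accepts(frag, "ba"))     # False
```

Each fragment should be used only once when composing, since fragments share the builder's edge table.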

---

### **6. Closure Properties of Regular Languages**

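Regular languages are closed under:

* **Union** (using `|`)
* **Concatenation** (using `.`, implicit)
* **Kleene star** (`*`)
* **Intersection**, **complement**, **difference**: RE syntax has no operator for these, but the result is still regular. The usual proofs go through DFAs (product construction for intersection, swapping accepting states for complement).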
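
The automata-based closure arguments can be sketched in a few lines. This assumes complete DFAs over a shared alphabet; the `DFA` class and the two example machines ("ends in `01`" and "even number of `1`s") are my own illustrations.

```python
from dataclasses import dataclass
from itertools import product

@dataclass
class DFA:
    states: frozenset
    alphabet: frozenset
    delta: dict            # (state, symbol) -> state, defined for every pair
    start: object
    accepting: frozenset

    def accepts(self, s: str) -> bool:
        q = self.start
        for ch in s:
            q = self.delta[(q, ch)]
        return q in self.accepting

def complement(m: DFA) -> DFA:
    # Same machine, with accepting and non-accepting states swapped.
    return DFA(m.states, m.alphabet, m.delta, m.start, m.states - m.accepting)

def intersection(m1: DFA, m2: DFA) -> DFA:
    # Product construction: run both machines in lockstep on pairs of states.
    states = frozenset(product(m1.states, m2.states))
    delta = {((p, q), a): (m1.delta[(p, a)], m2.delta[(q, a)])
             for (p, q) in states for a in m1.alphabet}
    accepting = frozenset((p, q) for (p, q) in states
                          if p in m1.accepting and q in m2.accepting)
    return DFA(states, m1.alphabet, delta, (m1.start, m2.start), accepting)

def difference(m1: DFA, m2: DFA) -> DFA:
    return intersection(m1, complement(m2))

BITS = frozenset("01")
ends01 = DFA(frozenset({"q0", "q1", "q2"}), BITS,
             {("q0", "0"): "q1", ("q0", "1"): "q0",
              ("q1", "0"): "q1", ("q1", "1"): "q2",
              ("q2", "0"): "q1", ("q2", "1"): "q0"},
             "q0", frozenset({"q2"}))
even1 = DFA(frozenset({"e", "o"}), BITS,
            {("e", "0"): "e", ("e", "1"): "o",
             ("o", "0"): "o", ("o", "1"): "e"},
            "e", frozenset({"e"}))

both = intersection(ends01, even1)
print(both.accepts("1001"))               # True: ends in 01 and has two 1s
print(both.accepts("01"))                 # False: only one 1
print(complement(ends01).accepts("10"))   # True: does not end in 01
```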

---

### **7. Practice Problems**

**Q1:** Give a RE for all strings over `{0,1}` that contain exactly two 1s.

**A:**
`0*10*10*`
Explanation: Any number of `0`s, the first `1`, any number of `0`s, the second `1`, and finally any number of `0`s. No other `1` can appear anywhere.

---

**Q2:** RE for strings over `{a, b}` where every `a` is immediately followed by a `b`.

**A:**
`(b|ab)*`
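Explanation: The string is a sequence of blocks, each either a single `b` or the pair `ab`, so an `a` can never appear without a `b` right after it. Blocks can repeat any number of times, including zero.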

---

**Q3:** RE for binary strings ending in "01".

**A:**
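`(0|1)*01`
Explanation: `(0|1)*` allows any binary prefix (including none), and the fixed suffix `01` forces the ending.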
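
The three answers above can be sanity-checked by brute force: enumerate every short string and compare the RE against a plain-Python version of the problem statement. This is a sketch under my own naming (`all_strings`, `counterexamples`, `CASES`), and passing up to length 8 is evidence, not a proof.

```python
import re
from itertools import product

def all_strings(alphabet: str, max_len: int):
    for n in range(max_len + 1):
        for chars in product(alphabet, repeat=n):
            yield "".join(chars)

def every_a_followed_by_b(s: str) -> bool:
    # s[i+1:i+2] is "" when the a is the last character, so that case fails too.
    return all(s[i + 1:i + 2] == "b" for i, ch in enumerate(s) if ch == "a")

# (RE answer, alphabet, specification written directly in Python)
CASES = [
    (r"0*10*10*", "01", lambda s: s.count("1") == 2),   # Q1
    (r"(b|ab)*", "ab", every_a_followed_by_b),          # Q2
    (r"(0|1)*01", "01", lambda s: s.endswith("01")),    # Q3
]

def counterexamples(pattern: str, alphabet: str, spec, max_len: int = 8) -> list[str]:
    """Strings where the RE and the specification disagree."""
    rx = re.compile(pattern)
    return [s for s in all_strings(alphabet, max_len)
            if (rx.fullmatch(s) is not None) != spec(s)]

for pattern, alphabet, spec in CASES:
    print(pattern, counterexamples(pattern, alphabet, spec))  # an empty list means no mismatch
```

The same helper is handy for catching near-misses: dropping the trailing `0*` from the Q1 answer, for example, makes strings like `110` show up as counterexamples.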
