# Lecture Notes: Nelson-Oppen – Purification Phase

## Goal of Purification

Given a formula $F$ in the **combination theory** $T_1 \cup T_2$, the goal is to produce:
- Two **pure formulas**:
  - $F_1$ in $T_1$
  - $F_2$ in $T_2$
- Such that $F_1 \land F_2$ is **equisatisfiable** with $F$

> Note: We allow the introduction of **new variables**, so the resulting formulas may not be equivalent to $F$, but **equisatisfiability** is sufficient for deciding satisfiability.

---

## How Purification Works

We **recursively rewrite** impure terms using **fresh variables**.

### Rules:
1. If a function $f$ from $T_1$ is applied to a non-variable term $t$ whose top-level symbol is not from $T_1$:
   - Introduce a fresh variable $z$
   - Replace $t$ with $z$ in $f(t)$
   - Conjoin $z = t$
2. If a predicate $p$ from $T_1$ is applied to a non-variable term $t$ whose top-level symbol is not from $T_1$:
   - Same strategy: replace $t$ with a fresh variable $z$ and conjoin $z = t$

> The same rules apply with the roles of $T_1$ and $T_2$ swapped. Repeat this process until no impure terms remain.
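
The rules above can be sketched as a small recursive rewriter. This is only an illustrative sketch under stated assumptions, not a reference implementation: terms are encoded as nested tuples `(symbol, arg1, ...)`, variables are strings, numerals are Python integers, and the symbol set `ARITH` and the names `theory_of` and `purify` are hypothetical. It also assumes that the generated names `z1`, `z2`, ... do not already occur in the input and that the top-level predicate belongs to exactly one theory.

```python
from itertools import count

# Assumed signature: these symbols belong to arithmetic, everything else
# is treated as an uninterpreted symbol of the theory of equality.
ARITH = {"+", "-", "<="}


def theory_of(symbol):
    """Return the (hypothetical) name of the theory that owns a symbol."""
    return "arith" if symbol in ARITH else "euf"


def purify(atom):
    """Purify one atom; return (pure_atom, definitions).

    definitions is a list of (fresh_variable, term) pairs, each standing
    for the conjoined equality fresh_variable = term.
    """
    fresh = count(1)
    definitions = []

    def walk(term, parent_theory):
        # Variables and numerals are pure in any context.
        if not isinstance(term, tuple):
            return term
        symbol, *args = term
        own = theory_of(symbol)
        # Purify the arguments first, relative to this symbol's theory.
        new_term = (symbol, *[walk(a, own) for a in args])
        if own == parent_theory:
            return new_term
        # Alien subterm: name it with a fresh variable, record z = term.
        z = f"z{next(fresh)}"
        definitions.append((z, new_term))
        return z

    symbol, *args = atom
    own = theory_of(symbol)
    pure_atom = (symbol, *[walk(a, own) for a in args])
    return pure_atom, definitions


# The formula x <= f(x) + 1, written as nested tuples.
pure, defs = purify(("<=", "x", ("+", ("f", "x"), 1)))
print(pure)  # ('<=', 'x', ('+', 'z1', 1))
print(defs)  # [('z1', ('f', 'x'))]
```

Because arguments are purified before their parent, every recorded definition is itself pure: its right-hand side mentions symbols from only one theory.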

---

## Simple Example

Given theory: **Equality $\cup$ Rationals**

Formula:
```text
x ≤ f(x) + 1
```

- `f(x)` is from the **theory of equality**
- `≤`, `+`, and `1` are from the **theory of rationals**
- The formula is **impure** because it mixes symbols from both theories

### Purification:

1. Introduce a fresh variable `y` together with the defining equality `y = f(x)`
2. Substitute `y` for `f(x)` in the original formula and conjoin the equality:
   ```text
   x ≤ y + 1
   ∧ y = f(x)
   ```

- First conjunct belongs to rationals
- Second conjunct belongs to equality
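
This example also shows why the result is only equisatisfiable with the original, not equivalent to it. As a worked check (the concrete values are chosen purely for illustration), take $x = 0$, $f(0) = 0$, and $y = 5$:

$$
\underbrace{0 \le f(0) + 1}_{\text{original: true}}
\qquad\text{but}\qquad
\underbrace{0 \le 5 + 1 \;\land\; 5 = f(0)}_{\text{purified: false, since } f(0) = 0}
$$

Choosing $y = f(0) = 0$ instead satisfies both formulas, so satisfiability is preserved even though this particular assignment is not.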

---

## Complex Example

Given theory: **Equality $\cup$ Integers**

Original formula:
```text
f(x + g(y)) ≤ g(a) + f(b)
```

### Step-by-step Purification:

1. `g(y)` → `z1 = g(y)`
2. `x + z1` → `z2 = x + z1`
3. `f(z2)` → `z3 = f(z2)`
4. `g(a)` → `z4 = g(a)`
5. `f(b)` → `z5 = f(b)`
6. Final purified formula:
   ```text
   z3 ≤ z4 + z5
   ∧ z1 = g(y)
   ∧ z2 = x + z1
   ∧ z3 = f(z2)
   ∧ z4 = g(a)
   ∧ z5 = f(b)
   ```

- The first and third lines (`z3 ≤ z4 + z5` and `z2 = x + z1`) are **pure integer** formulas
- The remaining lines are **pure equality** formulas
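
As a hedged sketch, the purified conjuncts can be sorted into their theories by looking at which symbols they use. The tuple encoding, the symbol sets, and the names `symbols` and `classify` are assumptions made for illustration; `=` is treated as belonging to both theories.

```python
# Assumed signatures for the two theories in this example.
ARITH_SYMBOLS = {"+", "<="}
EUF_SYMBOLS = {"f", "g"}


def symbols(term):
    """Collect the function and predicate symbols of a term, ignoring '='."""
    if not isinstance(term, tuple):
        return set()  # variables and numerals contribute no symbols
    head, *args = term
    found = set() if head == "=" else {head}
    for arg in args:
        found |= symbols(arg)
    return found


def classify(conjunct):
    """Return which theory a conjunct is pure in, or 'impure'."""
    used = symbols(conjunct)
    # An equality between two variables uses no symbols at all; this
    # sketch arbitrarily files it under 'integers'.
    if used <= ARITH_SYMBOLS:
        return "integers"
    if used <= EUF_SYMBOLS:
        return "equality"
    return "impure"


conjuncts = [
    ("<=", "z3", ("+", "z4", "z5")),
    ("=", "z1", ("g", "y")),
    ("=", "z2", ("+", "x", "z1")),
    ("=", "z3", ("f", "z2")),
    ("=", "z4", ("g", "a")),
    ("=", "z5", ("f", "b")),
]
for c in conjuncts:
    print(classify(c), c)
```

Under this encoding, `z2 = x + z1` is classified as an integer constraint because `+` is an arithmetic symbol, while the definitions of `z1`, `z3`, `z4`, and `z5` go to the theory of equality.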

---

## Shared vs Unshared Variables

After purification, some variables may appear in both $F_1$ and $F_2$.

- **Shared Variables**: occur in **both** $F_1$ and $F_2$
- **Unshared Variables**: occur in **only one** of the formulas

### Why this matters:
- Because the two theories have no symbols in common other than `=`, shared variables are the **only** link between them
- These will be used during **equality propagation** (next step)

---

## Example: Identify Shared Variables

Purified formulas:
```text
F₁ (T₁: Integers):     w1 + x = y
F₂ (T₂: Equality):      f(w1) = w2 ∧ f(x) = w2
```

- **w1**: shared (appears in both)
- **x**: shared
- **y**: unshared (only in F₁)
- **w2**: unshared (only in F₂)
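
The classification can be checked mechanically by collecting the variables of each formula and intersecting the two sets. This is a minimal sketch: the nested-tuple encoding and the names `variables` and `split_variables` are assumptions for illustration only.

```python
def variables(term):
    """Collect variable names from a term encoded as nested tuples.

    Strings are variables, tuples are (symbol, arg1, ...), and anything
    else (for example an integer) is a constant.
    """
    if isinstance(term, str):
        return {term}
    if not isinstance(term, tuple):
        return set()
    _symbol, *args = term  # the head symbol itself is not a variable
    found = set()
    for arg in args:
        found |= variables(arg)
    return found


def split_variables(f1, f2):
    """Return (shared, unshared) variables of two lists of conjuncts."""
    vars1 = set().union(*(variables(c) for c in f1))
    vars2 = set().union(*(variables(c) for c in f2))
    return vars1 & vars2, vars1 ^ vars2


# F1: w1 + x = y        F2: f(w1) = w2 and f(x) = w2
F1 = [("=", ("+", "w1", "x"), "y")]
F2 = [("=", ("f", "w1"), "w2"), ("=", ("f", "x"), "w2")]

shared, unshared = split_variables(F1, F2)
print(sorted(shared))    # ['w1', 'x']
print(sorted(unshared))  # ['w2', 'y']
```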

---

## Summary

- **Purification** transforms an impure formula $F$ into **two pure formulas** $F_1$ and $F_2$ whose conjunction is **equisatisfiable** with $F$
- Achieved by:
  - Replacing impure subterms with fresh variables
  - Conjoining appropriate equalities
- Prepares the formula for **equality propagation** across shared variables in the next phase of the Nelson-Oppen method
