### A2.1.3. Data Flow Analysis

> *Data flow analysis computes information about the flow of data values through a program's control flow graph, determining what facts hold at each program point.*

**Explanation:**

**Data Flow Analysis** propagates facts along the edges of a CFG to determine properties at each program point. A data flow problem is defined by:

1. **Domain**: the set of possible data flow values (e.g., sets of variables).
2. **Transfer function**: how each instruction transforms the data flow value.
3. **Meet operator**: how values arriving from multiple predecessors (or successors, in a backward analysis) are combined.
4. **Direction**: forward (entry → exit) or backward (exit → entry).

The analysis iterates until a **fixed point** is reached, that is, until a full pass over the blocks changes no data flow value.
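The four components and the fixed-point loop can be sketched as one small solver. This is a minimal illustration, not a reference implementation: the name `solve_dataflow`, the loop-shaped CFG, and the `loop_gen`/`loop_kill` sets are all hypothetical, and the sketch uses a single `init` value for both the boundary and the starting value, which is only adequate for union-style "may" analyses.

```python
from functools import reduce


def solve_dataflow(cfg, transfer, meet, direction="forward", init=frozenset()):
    """Iterate to a fixed point; returns (merged, result, passes)."""
    preds = {block: [] for block in cfg}
    for block, succs in cfg.items():
        for succ in succs:
            preds[succ].append(block)
    # Facts arrive from predecessors (forward) or successors (backward).
    sources = preds if direction == "forward" else cfg

    merged = {block: init for block in cfg}  # value after the meet
    result = {block: init for block in cfg}  # value after the transfer function
    passes = 0
    changed = True
    while changed:
        changed = False
        passes += 1
        for block in cfg:
            incoming = [result[src] for src in sources[block]]
            merged[block] = reduce(meet, incoming) if incoming else init
            new_result = transfer(block, merged[block])
            if new_result != result[block]:
                result[block] = new_result
                changed = True
    return merged, result, passes


# Hypothetical loop: entry -> head -> body -> head, and head -> exit.
loop_cfg = {"entry": ["head"], "head": ["body", "exit"], "body": ["head"], "exit": []}
loop_gen = {"entry": {"d1"}, "head": set(), "body": {"d2"}, "exit": set()}
loop_kill = {"entry": {"d2"}, "head": set(), "body": {"d1"}, "exit": set()}


def reaching_transfer(block, facts):
    # Reaching-definitions style transfer: GEN ∪ (facts − KILL).
    return frozenset(loop_gen[block] | (facts - loop_kill[block]))


loop_in, loop_out, passes = solve_dataflow(
    loop_cfg, reaching_transfer, meet=lambda a, b: a | b, direction="forward"
)
print(sorted(loop_in["head"]), passes)
```

With this block order the loop header only sees `d2` on the second pass, once it has travelled around the back edge, and a third pass confirms that nothing changes any more.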

**Classic Data Flow Problems:**

- **Reaching Definitions** (forward): which definitions of a variable may reach a given point.
- **Live Variables** (backward): which variables may be read on some later path before being overwritten.
- **Available Expressions** (forward): which expressions have been computed on every path to a point, with no operand reassigned since.
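To make the backward direction concrete, here is a small live-variables sketch. The four-block program, the `live_cfg` map, and the `use`/`defs` sets are hypothetical, chosen only for illustration; facts flow from successors to predecessors, so the meet is taken over successors and the transfer function maps the set at a block's exit to the set at its entry.

```python
# Hypothetical program:
#   B1: a = 1; b = 2; branch to B2 or B3
#   B2: c = a + b
#   B3: c = a
#   B4: return c
live_cfg = {"B1": ["B2", "B3"], "B2": ["B4"], "B3": ["B4"], "B4": []}
use = {"B1": set(), "B2": {"a", "b"}, "B3": {"a"}, "B4": {"c"}}   # read before any write in the block
defs = {"B1": {"a", "b"}, "B2": {"c"}, "B3": {"c"}, "B4": set()}  # written in the block

live_in = {block: set() for block in live_cfg}
live_out = {block: set() for block in live_cfg}

changed = True
while changed:
    changed = False
    # Visiting blocks in reverse order tends to converge faster for backward problems.
    for block in reversed(list(live_cfg)):
        # Meet over successors: live at exit if live at the entry of any successor.
        live_out[block] = set().union(*(live_in[succ] for succ in live_cfg[block]))
        # Transfer: live at entry if used here, or live at exit and not redefined here.
        new_in = use[block] | (live_out[block] - defs[block])
        if new_in != live_in[block]:
            live_in[block] = new_in
            changed = True

for block in live_cfg:
    print(f"{block}: LIVE_IN={sorted(live_in[block])}, LIVE_OUT={sorted(live_out[block])}")
```

In this sketch `b` is live on entry to `B2` but not `B3`, yet it is still live at the exit of `B1`, because a "may" analysis keeps a fact that holds on at least one path.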

**Example:**

For **reaching definitions**, the transfer function for a block is:

$$
\text{OUT}[B] = \text{GEN}[B] \cup (\text{IN}[B] - \text{KILL}[B])
$$

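where GEN[B] is the set of definitions generated in block B and KILL[B] is the set of definitions of the same variables that B overwrites. IN[B] is the union of OUT[P] over all predecessors P of B, because a definition reaches B if it reaches the end of any predecessor.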
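As a rough illustration of where GEN and KILL come from, the sketch below derives them from a list of labelled definitions. The `definitions` table and the helper `gen_kill` are hypothetical; the helper leaves a block's own surviving definitions out of KILL, which gives the same OUT set as including them, since GEN adds them back.

```python
# Hypothetical program: each definition is (label, block, variable assigned).
definitions = [
    ("d1", "B1", "x"),
    ("d2", "B1", "y"),
    ("d3", "B2", "x"),
    ("d4", "B3", "y"),
    ("d5", "B3", "y"),
]


def gen_kill(block):
    """Return (GEN, KILL) for one block under the assumptions above."""
    in_block = [(label, var) for label, blk, var in definitions if blk == block]
    # GEN: only the last definition of each variable survives to the block's end.
    last_def = {var: label for label, var in in_block}
    block_gen = set(last_def.values())
    # KILL: every other definition in the program of a variable this block assigns.
    block_kill = {label for label, _, var in definitions if var in last_def} - block_gen
    return block_gen, block_kill


# Apply the transfer function to B2, assuming d1 and d2 reach its entry.
in_b2 = {"d1", "d2"}
gen_b2, kill_b2 = gen_kill("B2")
out_b2 = gen_b2 | (in_b2 - kill_b2)
print(sorted(gen_b2), sorted(kill_b2), sorted(out_b2))
```

Here `B2` redefines `x`, so `d1` is removed and `d3` takes its place, while the definition of `y` passes through untouched.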

In [None]:
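# Successor map: B1 branches to B2 and B3, which rejoin at B4.
cfg = {"B1": ["B2", "B3"], "B2": ["B4"], "B3": ["B4"], "B4": []}

# Illustrative GEN and KILL sets for each block (d1..d5 are definition labels).
gen = {"B1": {"d1", "d2"}, "B2": {"d3"}, "B3": {"d4"}, "B4": {"d5"}}
kill = {"B1": set(), "B2": {"d1"}, "B3": {"d2"}, "B4": {"d3", "d4"}}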

# Invert the successor map to get each block's predecessors.
predecessors = {block: [] for block in cfg}
for block, successors in cfg.items():
    for successor in successors:
        predecessors[successor].append(block)

# Start from the empty set everywhere; the sets only grow during iteration.
in_sets = {block: set() for block in cfg}
out_sets = {block: set() for block in cfg}

blocks = list(cfg.keys())

changed = True
while changed:
    changed = False
    for block in blocks:
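        # Meet: a definition reaches this block if it leaves any predecessor.
        # With no predecessors, union() over nothing yields the empty set.
        in_sets[block] = set().union(*(out_sets[pred] for pred in predecessors[block]))
        # Transfer: OUT[B] = GEN[B] ∪ (IN[B] − KILL[B]).
        new_out = gen[block] | (in_sets[block] - kill[block])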
        if new_out != out_sets[block]:
            out_sets[block] = new_out
            changed = True

print("Reaching Definitions (fixed point):")
for block in blocks:
    print(f"  {block}: IN={sorted(in_sets[block])}, OUT={sorted(out_sets[block])}")

**References:**

[📘 Aho, A., Lam, M., Sethi, R. & Ullman, J. (2006). *Compilers: Principles, Techniques, and Tools (2nd ed.).* Addison-Wesley.](https://www.pearson.com/en-us/subject-catalog/p/compilers-principles-techniques-and-tools/P200000003155)

[📘 Cooper, K. & Torczon, L. (2011). *Engineering a Compiler (2nd ed.).* Morgan Kaufmann.](https://www.elsevier.com/books/engineering-a-compiler/cooper/978-0-12-088478-0)
