ZJIT: Proposal to convert EBB to conventional BB in HIR

TL;DR: I think converting EBB to regular BB would simplify our current code and also enable us to do more advanced compiler passes more easily

## Background

HIR currently uses extended basic blocks (EBBs). An EBB allows branches (e.g. `IfTrue`, `IfFalse`) to appear anywhere within a block, not just at the end. The LIR, by contrast, uses conventional basic blocks where only the last 0, 1, or 2 instructions can be terminators.

This mismatch creates unnecessary complexity and I'd like to propose converting HIR to use traditional basic blocks.

## Why

### 1. Unify the block model between HIR and LIR

LIR already uses traditional basic blocks. The codegen lowering pass must split every HIR EBB at each mid-block branch, creating a new fall-through LIR block.

Multiple LIR blocks end up sharing the same `hir_block_id`, which complicates any mapping between the two IRs.

If HIR used the same block model, lowering would be a straightforward 1:1 block mapping.

### 2. Reuse data flow algorithms between HIR and LIR

LIR has liveness analysis, critical edge splitting , and successor/predecessor computation that all assume traditional basic blocks. HIR has its own `ControlFlowInfo` that must work around EBBs.

With a shared block model, these algorithms could be written once and reused, or at minimum follow the same patterns and assumptions.

### 3. Simplify CFG analysis

With EBBs, computing successors requires [scanning every instruction in a block](https://github.com/ruby/ruby/blob/a897e06db526996c55451c5d7d5049013b76e1a7/zjit/src/hir.rs#L8471-L8483):

```
// Since ZJIT uses extended basic blocks, one must check all instructions
// for their ability to jump to another basic block, rather than just
// the instructions at the end of a given basic block.
```

This affects every pass that needs CFG information:
- `ControlFlowInfo::new()`: successor/predecessor maps
- `po_from()`: reverse post-order traversal
- `clean_cfg()`: CFG cleanup / dead block elimination
- `fold_constants()`: must break on mid-block terminators

With conventional basic blocks, successors are determined solely by the last 2 instructions.

### 4. Enable standard compiler optimizations

Many textbook optimizations assume traditional basic blocks:
- Dominator tree construction
- Loop detection and LICM
- GVN/GCM (Global Code Motion)
- Tail merging / block deduplication

These are simpler to implement and reason about when each block has exactly one entry point and one exit point.

## How hard is it?

I think we mostly need to:

* Change HIR construction to create new blocks as we're emitting (when we emit branching instructions like `IfTrue` / `IfFalse`).
* Eliminate the block splitting code in LIR.
* Update `ControlFlowInfo`
* Fix tests

I think everything else in HIR would stay mostly the same (BranchEdge, SSA param passing, etc).

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

ZJIT: Proposal to convert EBB to conventional BB in HIR #959

Background

Why

1. Unify the block model between HIR and LIR

2. Reuse data flow algorithms between HIR and LIR

3. Simplify CFG analysis

4. Enable standard compiler optimizations

How hard is it?

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

ZJIT: Proposal to convert EBB to conventional BB in HIR #959

Description

Background

Why

1. Unify the block model between HIR and LIR

2. Reuse data flow algorithms between HIR and LIR

3. Simplify CFG analysis

4. Enable standard compiler optimizations

How hard is it?

Metadata

Metadata

Assignees

Labels

Type

Fields

Projects

Milestone

Relationships

Development

Issue actions