Skip to content

ZJIT: Proposal to convert EBB to conventional BB in HIR #959

@tenderworks

Description

@tenderworks

TL;DR: I think converting EBB to regular BB would simplify our current code and also enable us to do more advanced compiler passes more easily

Background

HIR currently uses extended basic blocks (EBBs). An EBB allows branches (e.g. IfTrue, IfFalse) to appear anywhere within a block, not just at the end. The LIR, by contrast, uses conventional basic blocks where only the last 0, 1, or 2 instructions can be terminators.

This mismatch creates unnecessary complexity and I'd like to propose converting HIR to use traditional basic blocks.

Why

1. Unify the block model between HIR and LIR

LIR already uses traditional basic blocks. The codegen lowering pass must split every HIR EBB at each mid-block branch, creating a new fall-through LIR block.

Multiple LIR blocks end up sharing the same hir_block_id, which complicates any mapping between the two IRs.

If HIR used the same block model, lowering would be a straightforward 1:1 block mapping.

2. Reuse data flow algorithms between HIR and LIR

LIR has liveness analysis, critical edge splitting , and successor/predecessor computation that all assume traditional basic blocks. HIR has its own ControlFlowInfo that must work around EBBs.

With a shared block model, these algorithms could be written once and reused, or at minimum follow the same patterns and assumptions.

3. Simplify CFG analysis

With EBBs, computing successors requires scanning every instruction in a block:

// Since ZJIT uses extended basic blocks, one must check all instructions
// for their ability to jump to another basic block, rather than just
// the instructions at the end of a given basic block.

This affects every pass that needs CFG information:

  • ControlFlowInfo::new(): successor/predecessor maps
  • po_from(): reverse post-order traversal
  • clean_cfg(): CFG cleanup / dead block elimination
  • fold_constants(): must break on mid-block terminators

With conventional basic blocks, successors are determined solely by the last 2 instructions.

4. Enable standard compiler optimizations

Many textbook optimizations assume traditional basic blocks:

  • Dominator tree construction
  • Loop detection and LICM
  • GVN/GCM (Global Code Motion)
  • Tail merging / block deduplication

These are simpler to implement and reason about when each block has exactly one entry point and one exit point.

How hard is it?

I think we mostly need to:

  • Change HIR construction to create new blocks as we're emitting (when we emit branching instructions like IfTrue / IfFalse).
  • Eliminate the block splitting code in LIR.
  • Update ControlFlowInfo
  • Fix tests

I think everything else in HIR would stay mostly the same (BranchEdge, SSA param passing, etc).

Metadata

Metadata

Assignees

Labels

No labels
No labels

Type

No type
No fields configured for issues without a type.

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions