@jserv jserv commented Aug 31, 2025

Optimizations Added:

  • Dead Code Elimination
    • Precise liveness tracking
    • Removes stores immediately overwritten
  • Extended SSA Analysis
    • 10-instruction window for load-store forwarding
    • Full basic block redundant load elimination
    • Comprehensive algebraic simplifications
  • Architecture Cleanup
    • Moved self-operations (x-x, x^x, x&x, x|x) to peephole
    • Consolidated duplicate code into eval_algebraic()

SSA handles constants; peephole handles registers.

@jserv jserv force-pushed the improve-ssa branch 3 times, most recently from a651980 to c6ebcf0 Compare September 4, 2025 04:27

jserv commented Sep 4, 2025

@cubic-dev-ai review this PR

@sysprog21 sysprog21 deleted a comment from cubic-dev-ai bot Sep 4, 2025

@cubic-dev-ai cubic-dev-ai bot left a comment

5 issues found across 7 files

@jserv jserv force-pushed the improve-ssa branch 4 times, most recently from cdd200f to 8b2fed7 Compare September 4, 2025 09:02
Comment on lines +1755 to +1767
if (rs1 && rs1->is_const && !rs1->ptr_level && !rs1->is_global) {
    vd = require_var(parent);
    gen_name_to(vd->var_name);
    vd->is_const = true;
    vd->init_val = !rs1->init_val;
    opstack_push(vd);
    add_insn(parent, *bb, OP_load_constant, vd, NULL, NULL, 0, NULL);
} else {
    vd = require_var(parent);
    gen_name_to(vd->var_name);
    opstack_push(vd);
    add_insn(parent, *bb, OP_log_not, vd, rs1, NULL, 0, NULL);
}

require_var() and gen_name_to() can be hoisted above the if statement so that they are not duplicated in both branches. That is:

vd = require_var(parent);
gen_name_to(vd->var_name);
if (rs1 && rs1->is_const && !rs1->ptr_level && !rs1->is_global) {
    vd->is_const = true;
    ...
} else {
    opstack_push(vd);
    add_insn(...);
}

Comment on lines +1774 to +1786
if (rs1 && rs1->is_const && !rs1->ptr_level && !rs1->is_global) {
    vd = require_var(parent);
    gen_name_to(vd->var_name);
    vd->is_const = true;
    vd->init_val = ~rs1->init_val;
    opstack_push(vd);
    add_insn(parent, *bb, OP_load_constant, vd, NULL, NULL, 0, NULL);
} else {
    vd = require_var(parent);
    gen_name_to(vd->var_name);
    opstack_push(vd);
    add_insn(parent, *bb, OP_bit_not, vd, rs1, NULL, 0, NULL);
}

Ditto.

Comment on lines +2205 to +2219
/* Constant folding for negation */
if (rs1 && rs1->is_const && !rs1->ptr_level && !rs1->is_global) {
    vd = require_var(parent);
    gen_name_to(vd->var_name);
    vd->is_const = true;
    vd->init_val = -rs1->init_val;
    opstack_push(vd);
    add_insn(parent, *bb, OP_load_constant, vd, NULL, NULL, 0,
             NULL);
} else {
    vd = require_var(parent);
    gen_name_to(vd->var_name);
    opstack_push(vd);
    add_insn(parent, *bb, OP_negate, vd, rs1, NULL, 0, NULL);
}

Ditto.

@DrXiao DrXiao left a comment

Most of the introduced optimizations look good. However, I don't currently understand the "Phi node optimization" because I am not familiar with phi functions.

It would be better if the other collaborators who have the relevant knowledge could review it. Otherwise, I'll need to spend some time learning the phi function before coming back to review the phi node optimization.

The BX (Branch and Exchange) instruction is needed for proper function
returns in ARM code generation. Unlike BLX which saves the return
address, BX simply branches to the address in the register.

This fixes undefined reference errors during compilation.

Optimize unary operations on constants at parse time:
- Logical NOT (!x) when x is constant
- Bitwise NOT (~x) when x is constant
- Negation (-x) when x is constant

This reduces the number of instructions generated and enables
further optimizations in later passes.
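
As an illustration of the parse-time folding described above, here is a minimal, self-contained sketch. The `operand_t` record, the `unary_op_t` enum, and `fold_unary()` are hypothetical names made up for this example; only the field names (`is_const`, `ptr_level`, `is_global`, `init_val`) echo the snippets under review, and the project's real data structures may differ.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operand record; not the project's real definition. */
typedef struct {
    bool is_const;
    bool is_global;
    int ptr_level;
    int init_val;
} operand_t;

typedef enum { UNARY_LOG_NOT, UNARY_BIT_NOT, UNARY_NEGATE } unary_op_t;

/* Try to fold `op src` at parse time. Returns true and fills *out with the
 * folded constant when src is a plain (non-pointer, non-global) constant.
 * Returns false when a run-time instruction has to be emitted instead. */
bool fold_unary(unary_op_t op, const operand_t *src, operand_t *out)
{
    assert(out);
    if (!src || !src->is_const || src->ptr_level || src->is_global)
        return false;

    out->is_const = true;
    out->is_global = false;
    out->ptr_level = 0;
    switch (op) {
    case UNARY_LOG_NOT:
        out->init_val = !src->init_val;
        break;
    case UNARY_BIT_NOT:
        out->init_val = ~src->init_val;
        break;
    case UNARY_NEGATE:
        /* Negate via unsigned arithmetic so that folding INT_MIN does not
         * hit signed-overflow undefined behavior inside the compiler. */
        out->init_val = (int) (0u - (unsigned) src->init_val);
        break;
    }
    return true;
}
```

A caller would emit a load-constant instruction when `fold_unary()` returns true and fall back to the ordinary unary instruction otherwise, which also keeps the shared setup code in one place.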

Enhanced dead store elimination to detect stores overwritten within
a small window (3 instructions). The optimization:
- Checks for intervening uses before marking as dead
- Stops at control flow boundaries for safety
- Marks dead stores for removal by DCE sweep

This catches common patterns like consecutive assignments while
remaining conservative to avoid incorrect elimination.
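
The windowed check described above can be sketched roughly as follows. This is a toy model under stated assumptions: the `insn_t` layout, the `kind_t` values, and `mark_dead_stores()` are invented for illustration, each instruction either writes or reads a single variable, and `I_OTHER` instructions touch nothing. The real IR is richer than this.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical, heavily simplified instruction model. */
typedef enum { I_STORE, I_USE, I_BRANCH, I_OTHER } kind_t;

typedef struct {
    kind_t kind;
    int var;   /* variable written (I_STORE) or read (I_USE) */
    bool dead; /* set here, to be swept later by a DCE pass */
} insn_t;

#define DSE_WINDOW 3

/* Mark every store that is overwritten within the next DSE_WINDOW
 * instructions with no intervening use and no control-flow boundary.
 * Returns the number of stores marked dead. */
size_t mark_dead_stores(insn_t *code, size_t n)
{
    assert(code || n == 0);
    size_t marked = 0;
    for (size_t i = 0; i < n; i++) {
        if (code[i].kind != I_STORE)
            continue;
        for (size_t j = i + 1; j < n && j <= i + DSE_WINDOW; j++) {
            if (code[j].kind == I_BRANCH)
                break; /* control flow: stay conservative */
            if (code[j].kind == I_USE && code[j].var == code[i].var)
                break; /* the stored value is read, so it is live */
            if (code[j].kind == I_STORE && code[j].var == code[i].var) {
                code[i].dead = true; /* overwritten before any use */
                marked++;
                break;
            }
        }
    }
    return marked;
}
```

The pass only sets a flag; actual removal is left to a separate sweep, matching the "mark, then sweep" split in the commit message.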

Implement zero-check guards to prevent undefined behavior in:
- x / x = 1 optimization (only when x is provably non-zero)
- x % x = 0 optimization (only when x is provably non-zero)
- x / 1 = x and x % 1 = 0 (always safe)

These guards ensure correctness by only applying optimizations
when operands are compile-time constants with non-zero values,
addressing reviewer concerns about potential division by zero.
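
A small sketch of how such guards might look, assuming a hypothetical `val_t` operand descriptor and a `simplify_divmod()` helper (neither is taken from the PR). Here "provably non-zero" is approximated the same way the commit message describes it: the operand must be a compile-time constant with a non-zero value.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operand descriptor. */
typedef struct {
    bool is_const;
    int value; /* meaningful only when is_const is true */
} val_t;

typedef enum { SIMP_NONE, SIMP_CONST, SIMP_COPY_LHS } simp_t;

/* Simplify lhs / rhs (is_mod == false) or lhs % rhs (is_mod == true).
 * same_operand says both sides are the same value. On SIMP_CONST the
 * result is written to *out_const. */
simp_t simplify_divmod(bool is_mod, const val_t *lhs, const val_t *rhs,
                       bool same_operand, int *out_const)
{
    assert(lhs && rhs && out_const);

    /* x / 1 = x and x % 1 = 0 are safe for every x. */
    if (rhs->is_const && rhs->value == 1) {
        if (!is_mod)
            return SIMP_COPY_LHS;
        *out_const = 0;
        return SIMP_CONST;
    }

    /* x / x = 1 and x % x = 0 only when x is known to be non-zero. */
    if (same_operand && lhs->is_const && lhs->value != 0) {
        *out_const = is_mod ? 0 : 1;
        return SIMP_CONST;
    }

    return SIMP_NONE; /* unknown or zero divisor: leave the operation alone */
}
```

When the operand is not a constant, or is the constant zero, the function declines to rewrite, so the original division (and whatever behavior it has at run time) is preserved.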

Add comprehensive algebraic optimization patterns:
- Self-operations: x-x=0, x^x=0, x&x=x, x|x=x
- Comparisons: x==x=1, x!=x=0, x<x=0, x<=x=1
- Identity operations: x+0=x, x*1=x, x&-1=x
- Constant folding: x*0=0, x&0=0, x|-1=-1
- Special cases: x%1=0, x*-1=-x, x<<0=x

These patterns improve code generation by eliminating
redundant operations and simplifying expressions at
compile time.
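
The patterns listed above can be encoded as a pair of small lookup functions. The sketch below is illustrative only: `binop_t`, `rewrite_t`, `simplify_self()`, and `simplify_with_const()` are hypothetical names, and it covers just the cases named in the commit message rather than reproducing the PR's `eval_algebraic()`.

```c
#include <assert.h>

/* Hypothetical operator and rewrite-result enums. */
typedef enum {
    B_ADD, B_SUB, B_MUL, B_AND, B_OR, B_XOR, B_SHL,
    B_EQ, B_NE, B_LT, B_LE
} binop_t;

typedef enum { R_NONE, R_COPY_X, R_NEGATE_X, R_CONST } rewrite_t;

/* Both operands are the same value x. */
rewrite_t simplify_self(binop_t op, int *k)
{
    assert(k);
    switch (op) {
    case B_SUB: case B_XOR: case B_NE: case B_LT:
        *k = 0; /* x-x, x^x, x!=x, x<x */
        return R_CONST;
    case B_EQ: case B_LE:
        *k = 1; /* x==x, x<=x */
        return R_CONST;
    case B_AND: case B_OR:
        return R_COPY_X; /* x&x, x|x */
    default:
        return R_NONE;
    }
}

/* Right operand is the compile-time constant c. */
rewrite_t simplify_with_const(binop_t op, int c, int *k)
{
    assert(k);
    switch (op) {
    case B_ADD:
    case B_SHL:
        return c == 0 ? R_COPY_X : R_NONE; /* x+0, x<<0 */
    case B_MUL:
        if (c == 1) return R_COPY_X;
        if (c == -1) return R_NEGATE_X;
        if (c == 0) { *k = 0; return R_CONST; }
        return R_NONE;
    case B_AND:
        if (c == -1) return R_COPY_X;
        if (c == 0) { *k = 0; return R_CONST; }
        return R_NONE;
    case B_OR:
        if (c == -1) { *k = -1; return R_CONST; }
        return R_NONE;
    default:
        return R_NONE;
    }
}
```

Returning a rewrite kind instead of mutating an instruction keeps the pattern table independent of any particular IR, which is one plausible way to share it between an SSA pass and a peephole pass.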

Implement trivial phi node elimination:
- Remove phi nodes where all operands are the same variable
- Replace with simple assignment (phi(x,x,x) = x)
- Fold phi nodes with all same constant values
- Convert to load_constant for compile-time evaluation

This optimization reduces unnecessary phi operations in
SSA form, improving both compile time and generated code
quality by eliminating redundant merge points.
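
For readers less familiar with phi functions: a phi node sits at a control-flow merge point and selects one incoming value per predecessor block. If every incoming value is identical, there is nothing to select, and the phi can be replaced by that value. The sketch below classifies a phi accordingly. `phi_arg_t`, `phi_action_t`, and `classify_phi()` are hypothetical names, and the sketch does not handle phis that reference themselves, which a fuller implementation would typically also treat as trivial.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical phi operand: either an SSA variable or a constant. */
typedef struct {
    bool is_const;
    int var_id; /* valid when !is_const */
    int value;  /* valid when is_const */
} phi_arg_t;

typedef enum { PHI_KEEP, PHI_TO_COPY, PHI_TO_CONST } phi_action_t;

/* Decide whether a phi with n operands is trivial. On PHI_TO_COPY or
 * PHI_TO_CONST, *repl receives the single value that replaces the phi. */
phi_action_t classify_phi(const phi_arg_t *args, size_t n, phi_arg_t *repl)
{
    assert(repl);
    if (!args || n == 0)
        return PHI_KEEP;

    for (size_t i = 1; i < n; i++) {
        if (args[i].is_const != args[0].is_const)
            return PHI_KEEP; /* mixes constants and variables */
        if (args[0].is_const ? args[i].value != args[0].value
                             : args[i].var_id != args[0].var_id)
            return PHI_KEEP; /* at least two distinct incoming values */
    }

    *repl = args[0];
    return args[0].is_const ? PHI_TO_CONST : PHI_TO_COPY;
}
```

`PHI_TO_COPY` corresponds to the "phi(x,x,x) = x" case and `PHI_TO_CONST` to the "convert to load_constant" case from the commit message.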

Add sophisticated cross-instruction optimizations:

Store-to-Load Forwarding:
- Forward stored values directly to subsequent loads
- Eliminate unnecessary memory round-trips
- Validate no intervening calls or branches
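
A rough sketch of store-to-load forwarding over a bounded window, under simplifying assumptions that are not taken from the PR: the `ins_t` layout and `forward_stores()` are hypothetical, memory locations are identified by an integer slot with no aliasing between different slots, and `K_OTHER` instructions write neither registers nor memory. The window size of 10 follows the figure quoted in the PR description.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical instruction model. */
typedef enum { K_STORE, K_LOAD, K_COPY, K_CALL, K_BRANCH, K_OTHER } k_t;

typedef struct {
    k_t kind;
    int addr; /* memory slot for K_STORE / K_LOAD */
    int reg;  /* K_STORE: source register; K_LOAD / K_COPY: destination */
    int src;  /* K_COPY: source register */
} ins_t;

#define FWD_WINDOW 10

/* Rewrite loads that re-read a just-stored slot into register copies.
 * Returns the number of loads rewritten. */
size_t forward_stores(ins_t *code, size_t n)
{
    assert(code || n == 0);
    size_t rewritten = 0;
    for (size_t i = 0; i < n; i++) {
        if (code[i].kind != K_STORE)
            continue;
        for (size_t j = i + 1; j < n && j <= i + FWD_WINDOW; j++) {
            ins_t *p = &code[j];
            if (p->kind == K_CALL || p->kind == K_BRANCH)
                break; /* callee or other path may change memory */
            if (p->kind == K_STORE && p->addr == code[i].addr)
                break; /* a newer store supersedes this one */
            if (p->kind == K_LOAD && p->addr == code[i].addr) {
                p->kind = K_COPY; /* skip the memory round-trip */
                p->src = code[i].reg;
                rewritten++;
            }
            if ((p->kind == K_LOAD || p->kind == K_COPY) &&
                p->reg == code[i].reg)
                break; /* stored register was overwritten */
        }
    }
    return rewritten;
}
```

In strict SSA form the last check would be unnecessary, since each value is defined once. It is kept here so the toy model stays correct for non-SSA register reuse.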

Redundant Load Elimination:
- Reuse values from previous loads to same location
- Check for no intervening stores or calls
- Convert redundant loads to simple assignments

Strength Reduction:
- Convert multiply by power-of-2 to left shift
- Convert divide by power-of-2 to right shift
- Convert modulo by power-of-2 to bitwise AND

These patterns analyze instruction sequences to find
optimization opportunities that single-instruction
analysis would miss.
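
The strength-reduction rules above can be checked with a short sketch. The helper names (`is_pow2`, `log2_pow2`, `mul_pow2`, `div_pow2`, `mod_pow2`) are hypothetical. The sketch deliberately uses unsigned operands: in C, signed division truncates toward zero (`-7 / 2` is `-3`), whereas an arithmetic right shift rounds toward negative infinity, so the plain shift and mask forms are only equivalent for non-negative dividends. Whether and how the PR handles signed operands is not stated in the conversation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True when c has exactly one bit set. */
bool is_pow2(uint32_t c)
{
    return c != 0 && (c & (c - 1)) == 0;
}

/* Shift amount k such that c == 1u << k; c must be a power of two. */
unsigned log2_pow2(uint32_t c)
{
    assert(is_pow2(c));
    unsigned k = 0;
    while (c > 1) {
        c >>= 1;
        k++;
    }
    return k;
}

/* x * c  ->  x << log2(c) */
uint32_t mul_pow2(uint32_t x, uint32_t c)
{
    return x << log2_pow2(c);
}

/* x / c  ->  x >> log2(c)   (unsigned only) */
uint32_t div_pow2(uint32_t x, uint32_t c)
{
    return x >> log2_pow2(c);
}

/* x % c  ->  x & (c - 1)    (unsigned only) */
uint32_t mod_pow2(uint32_t x, uint32_t c)
{
    assert(is_pow2(c));
    return x & (c - 1);
}
```

A compiler pass would apply these rewrites only after `is_pow2()` succeeds on a constant operand, and would need an extra adjustment (or a non-negativity proof) before using the shift or mask forms on signed values.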
@jserv jserv merged commit c38ee81 into master Sep 5, 2025
12 checks passed
@jserv jserv deleted the improve-ssa branch September 5, 2025 09:40