Improve SSA optimizations #268
@cubic-dev-ai review this PR
5 issues found across 7 files
if (rs1 && rs1->is_const && !rs1->ptr_level && !rs1->is_global) {
    vd = require_var(parent);
    gen_name_to(vd->var_name);
    vd->is_const = true;
    vd->init_val = !rs1->init_val;
    opstack_push(vd);
    add_insn(parent, *bb, OP_load_constant, vd, NULL, NULL, 0, NULL);
} else {
    vd = require_var(parent);
    gen_name_to(vd->var_name);
    opstack_push(vd);
    add_insn(parent, *bb, OP_log_not, vd, rs1, NULL, 0, NULL);
}
require_var() and gen_name_to() can be moved before the if statement so they are not duplicated in both branches. That is:
vd = require_var(parent);
gen_name_to(vd->var_name);
if (rs1 && rs1->is_const && !rs1->ptr_level && !rs1->is_global) {
    vd->is_const = true;
    ...
} else {
    opstack_push(vd);
    add_insn(...);
}
if (rs1 && rs1->is_const && !rs1->ptr_level && !rs1->is_global) {
    vd = require_var(parent);
    gen_name_to(vd->var_name);
    vd->is_const = true;
    vd->init_val = ~rs1->init_val;
    opstack_push(vd);
    add_insn(parent, *bb, OP_load_constant, vd, NULL, NULL, 0, NULL);
} else {
    vd = require_var(parent);
    gen_name_to(vd->var_name);
    opstack_push(vd);
    add_insn(parent, *bb, OP_bit_not, vd, rs1, NULL, 0, NULL);
}
Ditto.
/* Constant folding for negation */
if (rs1 && rs1->is_const && !rs1->ptr_level && !rs1->is_global) {
    vd = require_var(parent);
    gen_name_to(vd->var_name);
    vd->is_const = true;
    vd->init_val = -rs1->init_val;
    opstack_push(vd);
    add_insn(parent, *bb, OP_load_constant, vd, NULL, NULL, 0,
             NULL);
} else {
    vd = require_var(parent);
    gen_name_to(vd->var_name);
    opstack_push(vd);
    add_insn(parent, *bb, OP_negate, vd, rs1, NULL, 0, NULL);
}
Ditto.
Most of the introduced optimizations look good. However, I don't yet understand the "Phi node optimization" because I lack knowledge of the phi function.
It would be better if the other collaborators who have the relevant knowledge could review it. Otherwise, I'll need to spend some time learning the phi function before coming back to review the phi node optimization.
The BX (Branch and Exchange) instruction is needed for proper function returns in ARM code generation. Unlike BLX which saves the return address, BX simply branches to the address in the register. This fixes undefined reference errors during compilation.
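As a rough illustration of the difference described above, the following sketch encodes the register forms of BX and BLX for the 32-bit ARM (A32) instruction set. The function and constant names are hypothetical and are not taken from this PR; only the bit layouts are standard.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names for illustration; not the project's own helpers. */
enum { COND_AL = 0xE, REG_LR = 14 };

/* BX Rm: cond 0001 0010 1111 1111 1111 0001 Rm
 * Branches to the address in Rm and leaves LR untouched.
 */
uint32_t encode_bx(unsigned cond, unsigned rm)
{
    assert(cond <= 0xF && rm <= 0xF);
    return ((uint32_t) cond << 28) | 0x012FFF10u | rm;
}

/* BLX Rm: same layout except bits 7:4 are 0011.
 * Branches to Rm and also writes the return address to LR.
 */
uint32_t encode_blx(unsigned cond, unsigned rm)
{
    assert(cond <= 0xF && rm <= 0xF);
    return ((uint32_t) cond << 28) | 0x012FFF30u | rm;
}
```

With these assumptions, a function return is `encode_bx(COND_AL, REG_LR)`, which jumps to the saved return address without overwriting it.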
Optimize unary operations on constants at parse time:
- Logical NOT (!x) when x is constant
- Bitwise NOT (~x) when x is constant
- Negation (-x) when x is constant

This reduces the number of instructions generated and enables further optimizations in later passes.
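The three folds share one guard, so they can be sketched as a single helper. This is a minimal standalone model, not the PR's code: `operand_t`, `unary_op_t`, and `fold_unary` are hypothetical stand-ins, with fields named after the ones used in the diff (`is_const`, `ptr_level`, `is_global`, `init_val`).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the compiler's opcode and variable record. */
typedef enum { UN_LOG_NOT, UN_BIT_NOT, UN_NEGATE } unary_op_t;

typedef struct {
    bool is_const;  /* value is known at parse time */
    bool is_global; /* globals are not folded */
    int ptr_level;  /* non-zero for pointer operands */
    int init_val;   /* the constant value when is_const is set */
} operand_t;

/* Returns true and stores the folded value in *out when the operand
 * qualifies. Returns false when the caller should emit the ordinary
 * unary instruction instead.
 */
bool fold_unary(unary_op_t op, const operand_t *rs1, int *out)
{
    assert(out);
    if (!rs1 || !rs1->is_const || rs1->ptr_level || rs1->is_global)
        return false;

    switch (op) {
    case UN_LOG_NOT:
        *out = !rs1->init_val;
        return true;
    case UN_BIT_NOT:
        *out = ~rs1->init_val;
        return true;
    case UN_NEGATE:
        *out = -rs1->init_val;
        return true;
    }
    return false;
}
```

Keeping the guard in one place means the eligibility rule cannot drift between the three operators.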
Enhanced dead store elimination to detect stores overwritten within a small window (3 instructions). The optimization:
- Checks for intervening uses before marking as dead
- Stops at control flow boundaries for safety
- Marks dead stores for removal by DCE sweep

This catches common patterns like consecutive assignments while remaining conservative to avoid incorrect elimination.
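The windowed scan can be sketched on a simplified instruction array. The types and the `mark_dead_stores` name are hypothetical, and the model treats every write to a variable as a "store"; the real pass works on the compiler's own IR.

```c
#include <assert.h>
#include <stdbool.h>

#define WINDOW 3 /* look-ahead distance, as stated in the commit message */

/* Hypothetical simplified instruction: one destination, two sources. */
typedef enum { K_ASSIGN, K_BRANCH } kind_t;

typedef struct {
    kind_t kind;
    int dest;       /* variable id written, or -1 */
    int src1, src2; /* variable ids read, or -1 */
    bool dead;      /* set here, removed later by a DCE sweep */
} insn_t;

/* Marks writes that are overwritten within WINDOW instructions with no
 * intervening read and no control flow. Returns the number marked.
 */
int mark_dead_stores(insn_t *insns, int n)
{
    assert(insns || n == 0);
    int marked = 0;

    for (int i = 0; i < n; i++) {
        if (insns[i].kind != K_ASSIGN || insns[i].dest < 0)
            continue;

        int limit = (i + WINDOW < n - 1) ? i + WINDOW : n - 1;
        for (int j = i + 1; j <= limit; j++) {
            const insn_t *next = &insns[j];
            if (next->kind == K_BRANCH)
                break; /* control flow boundary: give up */
            if (next->src1 == insns[i].dest || next->src2 == insns[i].dest)
                break; /* the stored value is used */
            if (next->dest == insns[i].dest) {
                insns[i].dead = true; /* overwritten before any use */
                marked++;
                break;
            }
        }
    }
    return marked;
}
```

Reads are checked before writes so that an instruction such as `x = x + 1` counts as a use of the earlier value rather than an overwrite.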
Implement zero-check guards to prevent undefined behavior in:
- x / x = 1 optimization (only when x is provably non-zero)
- x % x = 0 optimization (only when x is provably non-zero)
- x / 1 = x and x % 1 = 0 (always safe)

These guards ensure correctness by only applying optimizations when operands are compile-time constants with non-zero values, addressing reviewer concerns about potential division by zero.
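A guarded version of these rules might look like the sketch below. `val_t`, `fold_t`, and `simplify_div_mod` are hypothetical names, and "provably non-zero" is modelled in the narrow sense the commit message gives: the operand is a compile-time constant whose value is not zero.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical operand: an id plus optional compile-time value. */
typedef struct {
    int id;
    bool is_const;
    int value; /* meaningful only when is_const is set */
} val_t;

typedef enum { FOLD_NONE, FOLD_CONST, FOLD_COPY_LHS } fold_kind_t;

typedef struct {
    fold_kind_t kind;
    int value; /* meaningful only for FOLD_CONST */
} fold_t;

/* op is '/' or '%'. Returns FOLD_NONE when no rewrite is safe. */
fold_t simplify_div_mod(char op, const val_t *lhs, const val_t *rhs)
{
    assert(lhs && rhs && (op == '/' || op == '%'));
    fold_t none = {FOLD_NONE, 0};

    /* x / 1 = x and x % 1 = 0 hold for every x. */
    if (rhs->is_const && rhs->value == 1) {
        fold_t r = {op == '/' ? FOLD_COPY_LHS : FOLD_CONST, 0};
        return r;
    }

    /* x / x = 1 and x % x = 0 only when x is a known non-zero constant. */
    if (lhs->id == rhs->id && lhs->is_const && lhs->value != 0) {
        fold_t r = {FOLD_CONST, op == '/' ? 1 : 0};
        return r;
    }

    return none;
}
```

A non-constant `x / x` is left alone because `x` may be zero at run time, and folding it to 1 would hide the division by zero.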
Add comprehensive algebraic optimization patterns:
- Self-operations: x-x=0, x^x=0, x&x=x, x|x=x
- Comparisons: x==x=1, x!=x=0, x<x=0, x<=x=1
- Identity operations: x+0=x, x*1=x, x&-1=x
- Constant folding: x*0=0, x&0=0, x|-1=-1
- Special cases: x%1=0, x*-1=-x, x<<0=x

These patterns improve code generation by eliminating redundant operations and simplifying expressions at compile time.
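Most of the patterns listed above fit a single table-like matcher. The sketch below is an assumption about shape only: `alg_op_t`, `simp_kind_t`, and `simplify_binary` are hypothetical, operands are identified by integer ids, and the `x%1` case is omitted.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical opcode and result kinds for illustration. */
typedef enum {
    A_ADD, A_SUB, A_MUL, A_AND, A_OR, A_XOR, A_SHL,
    A_EQ, A_NE, A_LT, A_LE
} alg_op_t;

typedef enum { S_NONE, S_CONST, S_LHS, S_NEG_LHS } simp_kind_t;

/* Classifies "lhs op rhs". rhs_id is ignored when rhs_is_const is set.
 * For S_CONST the result value is written to *const_out.
 */
simp_kind_t simplify_binary(alg_op_t op, int lhs_id, int rhs_id,
                            bool rhs_is_const, int rhs_val, int *const_out)
{
    assert(const_out);

    if (rhs_is_const) {
        switch (op) {
        case A_ADD: /* x + 0 = x */
        case A_SHL: /* x << 0 = x */
            if (rhs_val == 0) return S_LHS;
            break;
        case A_MUL:
            if (rhs_val == 1) return S_LHS;      /* x * 1 = x */
            if (rhs_val == -1) return S_NEG_LHS; /* x * -1 = -x */
            if (rhs_val == 0) { *const_out = 0; return S_CONST; }
            break;
        case A_AND:
            if (rhs_val == -1) return S_LHS;     /* x & -1 = x */
            if (rhs_val == 0) { *const_out = 0; return S_CONST; }
            break;
        case A_OR:
            if (rhs_val == -1) { *const_out = -1; return S_CONST; }
            break;
        default:
            break;
        }
        return S_NONE;
    }

    if (lhs_id != rhs_id)
        return S_NONE;

    switch (op) { /* both operands are the same variable */
    case A_SUB: case A_XOR: case A_NE: case A_LT:
        *const_out = 0;
        return S_CONST;
    case A_EQ: case A_LE:
        *const_out = 1;
        return S_CONST;
    case A_AND: case A_OR:
        return S_LHS;
    default:
        return S_NONE;
    }
}
```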
Implement trivial phi node elimination:
- Remove phi nodes where all operands are the same variable
- Replace with simple assignment (phi(x,x,x) = x)
- Fold phi nodes with all same constant values
- Convert to load_constant for compile-time evaluation

This optimization reduces unnecessary phi operations in SSA form, improving both compile time and generated code quality by eliminating redundant merge points.
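For readers unfamiliar with phi functions: in SSA form, a phi node at a control-flow merge point selects whichever incoming definition reached it. When every incoming operand is identical, there is nothing to select. The check can be sketched as follows, with hypothetical types (`phi_operand_t`, `phi_action_t`) and a hypothetical `classify_phi` helper standing in for the PR's implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical phi operand: either a constant or an SSA variable id. */
typedef struct {
    bool is_const;
    int var_id; /* meaningful when is_const is false */
    int value;  /* meaningful when is_const is true */
} phi_operand_t;

typedef enum { PHI_KEEP, PHI_TO_ASSIGN, PHI_TO_CONST } phi_action_t;

static bool same_operand(const phi_operand_t *a, const phi_operand_t *b)
{
    if (a->is_const != b->is_const)
        return false;
    return a->is_const ? a->value == b->value : a->var_id == b->var_id;
}

/* Decides what a phi with n operands can be replaced by. On
 * PHI_TO_ASSIGN or PHI_TO_CONST, *replacement holds the single value
 * that every predecessor supplies.
 */
phi_action_t classify_phi(const phi_operand_t *ops, int n,
                          phi_operand_t *replacement)
{
    assert(replacement);
    if (!ops || n <= 0)
        return PHI_KEEP;

    for (int i = 1; i < n; i++) {
        if (!same_operand(&ops[0], &ops[i]))
            return PHI_KEEP; /* a real merge: the phi is needed */
    }

    *replacement = ops[0];
    return ops[0].is_const ? PHI_TO_CONST : PHI_TO_ASSIGN;
}
```

Under this model, `phi(x, x, x)` becomes an assignment from `x`, `phi(4, 4)` becomes a constant load of 4, and `phi(x, y)` is kept.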
Add sophisticated cross-instruction optimizations:

Store-to-Load Forwarding:
- Forward stored values directly to subsequent loads
- Eliminate unnecessary memory round-trips
- Validate no intervening calls or branches

Redundant Load Elimination:
- Reuse values from previous loads to same location
- Check for no intervening stores or calls
- Convert redundant loads to simple assignments

Strength Reduction:
- Convert multiply by power-of-2 to left shift
- Convert divide by power-of-2 to right shift
- Convert modulo by power-of-2 to bitwise AND

These patterns analyze instruction sequences to find optimization opportunities that single-instruction analysis would miss.
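The strength-reduction rules can be sketched as a small classifier. The names below (`sr_op_t`, `sr_result_t`, `reduce_strength`) are hypothetical, and the sketch deliberately uses unsigned operands: for signed values, a plain right shift or AND does not match C's truncating division and remainder when the dividend is negative, so a real pass needs extra handling there.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical names for illustration; unsigned operands only. */
typedef enum { SR_MUL, SR_DIV, SR_MOD } sr_op_t;
typedef enum { SR_NONE, SR_SHL, SR_SHR, SR_AND } sr_result_t;

static bool is_power_of_two(unsigned v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Returns k such that v == 1u << k. Caller guarantees a power of two. */
static unsigned log2_exact(unsigned v)
{
    unsigned k = 0;
    while (v > 1) {
        v >>= 1;
        k++;
    }
    return k;
}

/* Given "x op c" with constant c, picks the cheaper replacement and
 * writes its right-hand operand (shift amount or mask) to *operand_out.
 */
sr_result_t reduce_strength(sr_op_t op, unsigned c, unsigned *operand_out)
{
    assert(operand_out);
    if (!is_power_of_two(c))
        return SR_NONE;

    switch (op) {
    case SR_MUL:
        *operand_out = log2_exact(c); /* x * 2^k  ->  x << k */
        return SR_SHL;
    case SR_DIV:
        *operand_out = log2_exact(c); /* x / 2^k  ->  x >> k */
        return SR_SHR;
    case SR_MOD:
        *operand_out = c - 1;         /* x % 2^k  ->  x & (2^k - 1) */
        return SR_AND;
    }
    return SR_NONE;
}
```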
Optimizations Added:
[...] (x-x, x^x, x&x, x|x) to peephole
SSA handles constants, peephole handles registers.