Skip to content

ethan0150/BFAdd-merge

 
 

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

40 Commits
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

BFAdd-merge

This peephole optimization pass looks for add operations to bitfields under the same uint32 in the same sturct and merge multiple add instructions into a single add instruction in the IR, reducing the IR instruction count. For example, say we have some struct like the following:

typedef struct x{
    unsigned a:5;
    unsigned b:27;
}t;
t w = {1, 2};

We want to perform something like this, where 2 bitfields covered by the same unsigned int are both added some constant:

w.a+=5;
w.b++;

Instead of doing 2 seperate additions, this can obviously be merged into 1 addition.

How this pass works

  1. Scan instructions between pairs of load/store of the same global value. Any or disjoint instructions that has its value stored by the pair are considered a candidate for optimization.
  2. Every candidate is the root of a expression tree, which is then matched against the following grammar (%1 is the value loaded by the load/store pair):
<BFAr> ::= <BFAr> or disjoint <BFAr> 
    | <BFadd> 
    | <BFmask> 
    | <BFMA>
<BFadd> ::= %1 add ConstInt
<BFmask> ::= %1 and ConstInt
<BFMA> ::= <BFmask> add ConstInt 
    | <BFadd> and ConstInt
  1. Replace all the %1 with 0 for all the matching trees. Since all nodes of the trees should now be constant, constant folding can be performed.
  2. Create a new instruction that add the folded value to %1
  3. Insert new instructions that masks certain bits of %1 for instructions that uses the value of nodes in the matching trees.
  4. Delete the matching trees.

How to run this pass on the example C code

  1. Install clang & llvm 18.1.8
  2. compile the pass w/ this command under build/ (If it doesn't exist mkdir first. Subsequent commands should also be run under the same path):
cmake ..
make -j
  1. compile both test cases
clang -S -emit-llvm -O3 -o ../tests/struct.ll ../tests/struct.c
clang -S -emit-llvm -O3 -o ../tests/loop_struct.ll ../tests/loop_struct.c
  1. generate the optimized IR.
opt -S -load-pass-plugin=skeleton/SkeletonPass.so -passes=skeleton-pass -o ../tests/loop_struct_trans.ll ../tests/loop_struct.ll
opt -S -load-pass-plugin=skeleton/SkeletonPass.so -passes=skeleton-pass -o ../tests/struct_trans.ll ../tests/struct.ll

this will also print the instruction count like below:

b4 transforming main: 73
after transforming main: 56
b4 transforming printf: 0
after transforming printf: 0
b4 transforming main: 23
after transforming main: 11
b4 transforming printf: 0
after transforming printf: 0
  1. run the 4 .ll files with lli
lli ../tests/struct.ll

the other 3 can be run similarly and the output of the original .ll and the *trans.ll should be identical.

About

No description, website, or topics provided.

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages

  • C++ 83.4%
  • C 9.1%
  • CMake 7.5%