Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

IPSCCP pass can't reach a fixed-point when conditions have dependency #57186

Open
bcl5980 opened this issue Aug 17, 2022 · 8 comments
Open

IPSCCP pass can't reach a fixed-point when conditions have dependency #57186

bcl5980 opened this issue Aug 17, 2022 · 8 comments
Assignees

Comments

@bcl5980
Copy link
Contributor

bcl5980 commented Aug 17, 2022

https://alive2.llvm.org/ce/z/FCT5eJ

define void @src(i1 %a, i1 %b) {
entry:
  br i1 %a, label %a.true_b.false, label %a.false
a.false:                                          ; preds = %entry
  br i1 %b, label %b.true, label %if.end
b.true:                                           ; preds = %a.false
  br i1 %b, label %if.end, label %a.true_b.false
a.true_b.false:                                   ; preds = %b.true, %entry
  br i1 %a, label %if.end, label %if.then
if.then:                                          ; preds = %a.true_b.false
  call void @use()
  br label %if.end
if.end:                                           ; preds = %if.then, %a.true_b.false, %b.true, %a.false
  ret void
}

declare void @use()

First time SCCPSolver's log:

markOverdefined: i1 %a
markOverdefined: i1 %b

Popped off OI-WL: i1 %b

Popped off OI-WL: i1 %a
Marking Block Executable: a.true_b.false
Marking Block Executable: a.false

Popped off BBWL: 
a.false:                                          ; preds = %entry
  %b.0 = call i1 @llvm.ssa.copy.i1(i1 %b)
  br i1 %b, label %b.true, label %if.end

Merged constantrange<-1, 0> into   %b.0 = call i1 @llvm.ssa.copy.i1(i1 %b) : constantrange<-1, 0>
Marking Block Executable: b.true
Marking Block Executable: if.end

Popped off BBWL: 
if.end:                                           ; preds = %if.then, %a.true_b.false, %b.true, %a.false
  ret void


Popped off BBWL: 
b.true:                                           ; preds = %a.false
  br i1 %b.0, label %if.end, label %a.true_b.false

Marking Edge Executable: b.true -> if.end

Popped off BBWL: 
a.true_b.false:                                   ; preds = %b.true, %entry
  br i1 %a, label %if.end, label %if.then

Marking Edge Executable: a.true_b.false -> if.end
Marking Block Executable: if.then

Popped off BBWL: 
if.then:                                          ; preds = %a.true_b.false
  call void @use()
  br label %if.end

Marking Edge Executable: if.then -> if.end

Popped off BBWL: 
entry:
  br i1 %a, label %a.true_b.false, label %a.false


Popped off I-WL:   %b.0 = call i1 @llvm.ssa.copy.i1(i1 %b)

We need to run IPSCCP twice to eliminate the dead branch with function @use.

@bcl5980 bcl5980 self-assigned this Aug 17, 2022
@fhahn
Copy link
Contributor

fhahn commented Aug 17, 2022

The issue here is that IPSCCP is using PredicateInfo to propagate conditional information. PredicateInfo only tracks information in blocks dominated by the true & false blocks of the branch. At construction time, neither a.false nor a.true_b.false dominate a.true_b.false, so we do not catch it.

@bcl5980
Copy link
Contributor Author

bcl5980 commented Aug 17, 2022

The issue here is that IPSCCP is using PredicateInfo to propagate conditional information. PredicateInfo only tracks information in blocks dominated by the true & false blocks of the branch. At construction time, neither a.false nor a.true_b.false dominate a.true_b.false, so we do not catch it.

Can we add a function to mark edge not executable when we propagate %b.0 = call i1 @llvm.ssa.copy.i1(i1 %b) to fix the issue?
Or loop runIPSCCP until not changed?

@fhahn
Copy link
Contributor

fhahn commented Aug 17, 2022

I think the edge .true->a.true_b.false is already not marked as executable. This is the point where we could in theory connect the condition to the earlier one, but I am not sure how to do this in a scalable way.

Not sure if running IPSCCP is desirable, it is already quite expensive.

@bcl5980
Copy link
Contributor Author

bcl5980 commented Aug 17, 2022

It looks PredicateInfo doesn't support update dynamic also. No easy way to make it work.

@bcl5980
Copy link
Contributor Author

bcl5980 commented Aug 23, 2022

Trying to add some test code here https://github.com/bcl5980/llvm-project/tree/perf/ipsccp-loop
A lot of compile time increase but got a little performance gain on SPEC2017, so maybe we need to find some other way to improve.

    ipsccp-loop     base
500.perlbench_r 355.2289 4.4816169   365.9355 4.350494
502.gcc_r 247.9069 5.7118226   251.1451 5.638175
505.mcf_r 307.4327 5.2564351   314.2372 5.142612
520.omnetpp_r 489.193 2.6819681   517.4859 2.535335
523.xalancbmk_r 215.4812 4.9006594   220.6564 4.785721
525.x264_r 203.8385 8.5901359   204.6118 8.557669
531.deepsjeng_r 280.6433 4.083475   285.7085 4.011081
541.leela_r 438.2885 3.7783331   432.7863 3.826369
548.exchange2_r          
557.xz_r 314.2906 3.4363098   322.7374 3.346373
           
SPECrate2017_int_base   4.5313302     4.441562

@bcl5980
Copy link
Contributor Author

bcl5980 commented Aug 28, 2022

Try loop runipsccp to a fix point: The loop count statistcs is here:

External/S...6.blender_r/526.blender_r.test 518.00
External/S...lancbmk_r/523.xalancbmk_r.test 230.00
External/S...lancbmk_s/623.xalancbmk_s.test 230.00
External/S...510.parest_r/510.parest_r.test 176.00
MultiSourc...marks/7zip/7zip-benchmark.test 157.00
External/S...0.omnetpp_s/620.omnetpp_s.test 153.00
External/S...0.omnetpp_r/520.omnetpp_r.test 153.00
External/S...8.imagick_s/638.imagick_s.test 74.00
External/S...8.imagick_r/538.imagick_r.test 74.00
MultiSourc...lications/ClamAV/clamscan.test 71.00
External/S...rlbench_s/600.perlbench_s.test 58.00
External/S...rlbench_r/500.perlbench_r.test 58.00
External/S...511.povray_r/511.povray_r.test 58.00
MultiSourc.../Applications/SPASS/SPASS.test 46.00
MultiSourc...ications/JM/lencod/lencod.test 45.00
MultiSourc.../Benchmarks/Bullet/bullet.test 41.00
MultiSourc...-typeset/consumer-typeset.test 40.00
MultiSourc...ocBench/espresso/espresso.test 36.00
MultiSourc...chmarks/MallocBench/gs/gs.test 35.00
MultiSourc...abench/jpeg/jpeg-6a/cjpeg.test 32.00
External/S...17speed/657.xz_s/657.xz_s.test 31.00
MultiSourc...nsumer-jpeg/consumer-jpeg.test 31.00
External/S...017rate/557.xz_r/557.xz_r.test 31.00
External/S...eed/625.x264_s/625.x264_s.test 28.00
External/S...ate/525.x264_r/525.x264_r.test 28.00
MultiSourc...CI_Purple/SMG2000/smg2000.test 27.00
MultiSourc...ProxyApps-C++/CLAMR/CLAMR.test 26.00
MultiSourc...ications/JM/ldecod/ldecod.test 21.00
MultiSourc...nsumer-lame/consumer-lame.test 20.00
MultiSourc...arks/mafft/pairlocalalign.test 19.00
MultiSource/Applications/lua/lua.test 18.00
MultiSourc...s/MallocBench/cfrac/cfrac.test 18.00
External/S...d/641.leela_s/641.leela_s.test 16.00
External/S...e/541.leela_r/541.leela_r.test 16.00
External/S...epsjeng_s/631.deepsjeng_s.test 12.00
MultiSourc...Applications/kimwitu++/kc.test 12.00
External/S...epsjeng_r/531.deepsjeng_r.test 12.00
External/S...speed/644.nab_s/644.nab_s.test 12.00
External/S...7rate/544.nab_r/544.nab_r.test 12.00
MultiSourc.../Prolangs-C/bison/mybison.test 11.00
MultiSourc...yApps-C++/PENNANT/PENNANT.test 11.00
MultiSourc...plications/d/make_dparser.test 10.00
MultiSourc...telecomm-gsm/telecomm-gsm.test 9.00
MultiSourc...ks/Prolangs-C/agrep/agrep.test 9.00
MultiSourc...lications/obsequi/Obsequi.test 9.00
External/S...ate/508.namd_r/508.namd_r.test 9.00
MultiSourc...ediabench/gsm/toast/toast.test 9.00
MultiSourc.../Applications/spiff/spiff.test 8.00
MultiSource/Applications/hbd/hbd.test 7.00
MultiSourc...ProxyApps-C++/HPCCG/HPCCG.test 7.00
MultiSourc...ks/Prolangs-C++/city/city.test 7.00
MultiSourc...peg2/mpeg2dec/mpeg2decode.test 7.00
MultiSourc...encode/alacconvert-encode.test 6.00
MultiSourc...s-C/Pathfinder/PathFinder.test 6.00
MultiSourc...oxyApps-C/miniAMR/miniAMR.test 6.00
MultiSourc...ch/g721/g721encode/encode.test 6.00
MicroBench...ALambdaLoops/lcalsALambda.test 6.00
MultiSourc...oxyApps-C++/miniFE/miniFE.test 6.00
MultiSourc...decode/alacconvert-decode.test 6.00
MicroBench...CLambdaLoops/lcalsCLambda.test 6.00
MultiSourc...arks/VersaBench/dbms/dbms.test 6.00
MicroBench...SubsetCRawLoops/lcalsCRaw.test 6.00
MultiSourc...cations/hexxagon/hexxagon.test 5.00
External/S...speed/605.mcf_s/605.mcf_s.test 5.00
MicroBench...SubsetARawLoops/lcalsARaw.test 5.00
MicroBench...SubsetBRawLoops/lcalsBRaw.test 5.00
External/S...7rate/505.mcf_r/505.mcf_r.test 5.00
MicroBench...BLambdaLoops/lcalsBLambda.test 5.00
MicroBench...eProcessing/Dilate/Dilate.test 4.00
MultiSourc...pps-C/SimpleMOC/SimpleMOC.test 4.00
MultiSourc...oxyApps-C/XSBench/XSBench.test 4.00
MicroBench...eProcessing/Dither/Dither.test 4.00
MultiSourc...marks/Ptrdist/yacr2/yacr2.test 4.00
MultiSourc...DOE-ProxyApps-C/CoMD/CoMD.test 4.00
MultiSourc.../Benchmarks/Ptrdist/bc/bc.test 4.00
MultiSourc...tions/lambda-0.1.3/lambda.test 4.00
MultiSourc...ks/Prolangs-C/gnugo/gnugo.test 4.00
MultiSourc...oxyApps-C/miniGMG/miniGMG.test 3.00
MultiSourc...count/automotive-bitcount.test 3.00
MultiSourc...nch/pcompress2/pcompress2.test 3.00
MultiSourc...oxyApps-C/RSBench/rsbench.test 3.00
MultiSourc.../Benchmarks/nbench/nbench.test 3.00
MultiSourc...s/ASC_Sequoia/AMGmk/AMGmk.test 3.00
MicroBench...sion/AnisotropicDiffusion.test 3.00
MicroBench...Filtering/BilateralFilter.test 3.00
MicroBench...ImageProcessing/Blur/blur.test 3.00
MultiSourc...lications/viterbi/viterbi.test 3.00
MicroBench...terpolation/Interpolation.test 3.00
MultiSource/Applications/siod/siod.test 3.00
MultiSourc.../Applications/sgefa/sgefa.test 3.00
MultiSourc...lications/SIBsim4/SIBsim4.test 3.00
MultiSourc...adpcm/rawcaudio/rawcaudio.test 2.00
MultiSourc.../Benchmarks/Olden/mst/mst.test 2.00
MultiSourc...Rodinia/backprop/backprop.test 2.00
MultiSourc...nchmarks/McCat/05-eks/eks.test 2.00
MultiSourc...cCat/03-testtrie/testtrie.test 2.00
MultiSourc...marks/SciMark2-C/scimark2.test 2.00
MultiSourc...nia/pathfinder/pathfinder.test 2.00
MultiSourc...s/Rodinia/hotspot/hotspot.test 2.00
MultiSourc...patricia/network-patricia.test 2.00
MultiSourc...rks/FreeBench/pifft/pifft.test 2.00
MultiSourc...lications/minisat/minisat.test 2.00
MultiSourc...lications/sqlite3/sqlite3.test 2.00
MicroBench...opVectorizationBenchmarks.test 2.00
MultiSourc...s/Fhourstones/fhourstones.test 2.00

  sccp.NumIPSccpRun

run ipsccp_loop
count 225.000000
mean 13.795556
std 46.372839
min 1.000000
25% 1.000000
50% 1.000000
75% 6.000000
max 518.000000

It looks we have a lot of cases benefit from ipsccp-loop

@fhahn
Copy link
Contributor

fhahn commented Aug 30, 2022

Try loop runipsccp to a fix point: The loop count statistcs is here:

Is that the number of times we re-run IPSCCP? It may be interesting to see the difference in number of simplifications applied? Should be doable with the existing ipsccp stats.

@bcl5980
Copy link
Contributor Author

bcl5980 commented Aug 30, 2022

Try loop runipsccp to a fix point: The loop count statistcs is here:

Is that the number of times we re-run IPSCCP? It may be interesting to see the difference in number of simplifications applied? Should be doable with the existing ipsccp stats.

Yes, it is the number of times we re-run IPSCCP.
Is this stat you want?
IPNumInstRemoved.txt

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants