Description
Hi, go compiler combines many passes(e.g. prove,deadcode,opt,phielim,etc) to achieve an effect similar to conditional constant propagation, but it cannot discover some constant facts:
func tt(a int, b int, c int) int {
i := 1
done := 0
for i > 0 && done != 1 {
if i == 1 {
done = 1
} else {
i = i + 1
}
}
return i
}
Which generates an unnecessary loop
--go
00000 (4) TEXT main.tt(SB), ABIInternal
00001 (4) FUNCDATA $0, gclocals·g2BeySu+wFnoycgXfElmcg==(SB)
00002 (4) FUNCDATA $1, gclocals·g2BeySu+wFnoycgXfElmcg==(SB)
00003 (4) FUNCDATA $5, main.tt.arginfo1(SB)
00004 (4) FUNCDATA $6, main.tt.argliveinfo(SB)
b1
00005 (4) PCDATA $3, $1
v16
00006 (4) MOVL $1, AX
v17
00007 (4) XORL CX, CX
v24
00008 (+7) TESTQ AX, AX
b2
00009 (7) JLE 18
v27
00010 (7) CMPQ CX, $1
b6
00011 (7) JEQ 18
v31
00012 (+8) CMPQ AX, $1
b3
00013 (8) JNE 16
v19
00014 (8) MOVL $1, CX
b9
00015 (7) JMP 8
v21
00016 (+11) INCQ AX
b10
00017 (11) JMP 8
b5
00018 (+14) RET
00019 (?) END
sccp(int):
mov eax, 1
ret
--Java
[Verified Entry Point]
# {method} {0x00007fb9b4400878} 'test6' '()I' in 'Test'
# [sp+0x20] (sp of caller)
;; N1: # out( B1 ) <- in( B1 ) Freq: 1
;; B1: # out( N1 ) <- BLOCK HEAD IS JUNK Freq: 1
0x00007fb9f8b2c2a0: mov %eax,-0x18000(%rsp)
0x00007fb9f8b2c2a7: push %rbp
0x00007fb9f8b2c2a8: sub $0x10,%rsp
0x00007fb9f8b2c2ac: cmpl $0x0,0x20(%r15)
0x00007fb9f8b2c2b4: jne 0x00007fb9f8b2c2f2 ;*synchronization entry
; - Test::test6@-1 (line 93)
;;;;;;;ok
0x00007fb9f8b2c2ba: mov $0x1,%eax ;*goto {reexecute=0 rethrow=0 return_oop=0}
; - Test::test6@20 (line 97)
0x00007fb9f8b2c2bf: mov 0x3c0(%r15),%r10 ; ImmutableOopMap {}
;*goto {reexecute=1 rethrow=0 return_oop=0}
; - (reexecute) Test::test6@20 (line 97)
0x00007fb9f8b2c2c6: test %eax,(%r10) ;*goto {reexecute=0 rethrow=0 return_oop=0}
; - Test::test6@20 (line 97)
; {poll}
0x00007fb9f8b2c2c9: add $0x10,%rsp
0x00007fb9f8b2c2cd: pop %rbp
0x00007fb9f8b2c2ce: cmp 0x3b8(%r15),%rsp ; {poll_return}
0x00007fb9f8b2c2d5: ja 0x00007fb9f8b2c2dc
0x00007fb9f8b2c2db: retq
I propose to add sparse conditional constant propagation early in optimization pass. sccp optimistically assumes that all SSA values are initially Top and then propagates constant facts only along reachable control flow paths. In this way, sccp can discover optimization opportunities that cannot be found by just applying constant folding and constant propagation and dead code elimination separately. sccp uses three level lattice for SSA value, each value is evaluated at most twice due to lower, resulting in a fast convergence speed of the algorithm. I have prepared a preliminary POC for it. I believe many duplicate or case-oriented phases can be cleaned up after applying this optimization. Do you think the go language community would welcome such capability? Thanks for your comments.