Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: redundant conditional jump #29853

Open
marigonzes opened this issue Jan 21, 2019 · 6 comments

Comments

Projects
None yet
3 participants
@marigonzes
Copy link

commented Jan 21, 2019

What did you do?

Hello. I was writing some code and decided to inspect the assembly instructions generated from it.
https://godbolt.org/z/zFjGRF

What did you expect to see?

I was expecting to see no redundant instructions.

What did you see instead?

There is a redundant conditional jump on line 31.
It is redundant, because the jump from line 24 to 31 happens if AX != 0 (that is, err is != nil). But on line 31, the conditional jump tests if AX == 0, which is never true.

@randall77

This comment has been minimized.

Copy link
Contributor

commented Jan 21, 2019

Yeah, that looks bad.

Digging through the SSA, looks like we're not using a consistent type for itabs. Some are uintptr, some are *uintptr. That keeps the itabs from being CSEd, which keeps the branches from being unified.

Shouldn't be a hard fix. Leaving for 1.13.

@randall77 randall77 added this to the Go1.13 milestone Jan 21, 2019

@josharian

This comment has been minimized.

Copy link
Contributor

commented Jan 21, 2019

That keeps the itabs from being CSEd, which keeps the branches from being unified.

Huh. I'm seeing something different. The uintptr/*uint8 mismatch is for an itab vs type descriptor loaded from that itab. And at the end of the SSA, we have a fully empty block whose kind and control perfectly matches its predecessor. I think we simply don't have any mechanism anywhere that removes blocks in this situation. I hacked in an enhancement to the trim pass to handle only the most obvious of these cases and it triggered 800+ times during make.bash.

In case anyone wants to look at this again before I do (dunno when that might be), here's my replacement for trimmableBlock:

func trimmableBlock(b *Block) bool {
	if b == b.Func.Entry {
		return false
	}
	switch b.Kind {
	case BlockDefer, BlockRet, BlockRetJmp, BlockExit:
		return false
	case BlockPlain:
		s := b.Succs[0].b
		return s != b && (len(s.Preds) == 1 || emptyBlock(b))
	}
	if len(b.Preds) != 1 {
		return false
	}
	p := b.Preds[0].b
	if p.Kind != b.Kind || p.Control != b.Control {
		// TODO: detect inverted kind?
		return false
	}
	s0, s1 := b.Succs[0].b, b.Succs[1].b
	if s0 == b || s1 == b {
		return false
	}
	if p.Succs[0].b != b { // TODO: allow b in either position
		return false
	}
	return true
}

Things to do here include: make the code less ugly, fix a panic that this induces when generating debug info, handle inverted block kinds (i.e. NE b1 b2 === EQ b2 b1), and handle the case in which the duplicated block is not the first successor of its predecessor (which requires fixing up the calling code in trim).

cc @ysmolsky @mvdan

@randall77

This comment has been minimized.

Copy link
Contributor

commented Jan 21, 2019

For the code:

package main

var e error

func main() {
	err := e
	if err != nil {
		panic(err)
	}
}

After CSE, we have the following SSA:

b1:-
v1 (?) = InitMem <mem>
v2 (?) = SP <uintptr>
v3 (?) = SB <uintptr>
v4 (?) = Addr <*error> {"".e} v3
v5 (+6) = Load <error> v4 v1 (err[error])
v17 (?) = OffPtr <*interface {}> [0] v2
v22 (?) = ConstNil <uintptr>
v19 (?) = ConstNil <*uint8>
v13 (+7) = ITab <uintptr> v5
v6 (?) = IMake <error> v22 v19
v12 (+7) = ITab <uintptr> v6
v7 (+7) = NeqPtr <bool> v13 v12
If v7 → b2 b3 (7)
b2: ← b1-
v9 (+8) = ITab <*uintptr> v5
v11 (8) = IsNonNil <bool> v9
If v11 → b4 b5 (8)
b3: ← b1
Ret v1
b4: ← b2-
v14 (8) = NilCheck <void> v9 v1
v15 (8) = OffPtr <**uint8> [8] v9
v16 (8) = Load <*uint8> v15 v1
Plain → b5 (8)
b5: ← b2 b4-
v18 (8) = Phi <*uint8> v9 v16
v20 (8) = IData <*uint8> v5
v21 (8) = IMake <interface {}> v18 v20
v23 (8) = Store <mem> {interface {}} v17 v21 v1
v24 (8) = StaticCall <mem> {runtime.gopanic} [16] v23
Exit v24 (8)

v9 and v13 should be CSEd, I think. But they aren't as one is uintptr and the other is *uintptr.
Then the IsNonNil and NeqPtr need to be CSEd also. Not sure if that is a separate problem, or will naturally fall out once v9 and v13 are unified.

@josharian

This comment has been minimized.

Copy link
Contributor

commented Jan 22, 2019

Right. But after decompose builtin, both v9 and v13 have been eliminated in favor of v14:

b1:-
v1 (?) = InitMem <mem>
v2 (?) = SP <uintptr>
v3 (?) = SB <uintptr>
v4 (?) = Addr <*error> {"".e} v3
v17 (?) = OffPtr <*interface {}> [0] v2
v22 (?) = ConstNil <uintptr>
v19 (?) = ConstNil <*uint8>
v14 (+11) = Load <uintptr> v4 v1 (err.itab[uintptr])
v25 (+11) = OffPtr <**uint8> [8] v4
v6 (?) = IMake <error> v22 v19
v7 (+13) = NeqPtr <bool> v14 v22
v10 (+11) = Load <*uint8> v25 v1 (err.data[*uint8])
v5 (+11) = IMake <error> v14 v10
If v7 → b2 b3 (13)
b2: ← b1-
v11 (+15) = IsNonNil <bool> v14
If v11 → b4 b5 (15)
b3: ← b1
Ret v1
b4: ← b2-
v15 (15) = OffPtr <**uint8> [8] v14
v16 (15) = Load <*uint8> v15 v1
Plain → b5 (15)
b5: ← b2 b4-
v18 (15) = Phi <*uint8> v14 v16
v21 (15) = IMake <interface {}> v18 v10
v8 (15) = OffPtr <**uint8> [8] v17
v26 (15) = Store <mem> {uintptr} v17 v18 v1
v23 (15) = Store <mem> {*uint8} v8 v10 v26
v24 (15) = StaticCall <mem> {runtime.gopanic} [16] v23
Exit v24 (15)
name err.itab[uintptr]: v14
name err.data[*uint8]: v10

Then lowered CSE manages to combine the IsNonNil and NeqPtr into v19:

(This dump is from two passes later, to eliminate the dead values)

b1:-
v1 (?) = InitMem <mem>
v2 (?) = SP <uintptr>
v3 (?) = SB <uintptr>
v14 (+11) = MOVQload <uintptr> {"".e} v3 v1 (err.itab[uintptr])
v10 (+11) = MOVQload <*uint8> {"".e} [8] v3 v1 (err.data[*uint8])
v19 (+13) = TESTQ <flags> v14 v14
NE v19 → b2 b3 (13)
b2: ← b1
NE v19 → b4 b5 (+15)
b3: ← b1
Ret v1
b4: ← b2-
v16 (15) = MOVQload <*uint8> [8] v14 v1
Plain → b5 (15)
b5: ← b2 b4-
v18 (15) = Phi <*uint8> v14 v16
v26 (15) = MOVQstore <mem> v2 v18 v1
v23 (15) = MOVQstore <mem> [8] v2 v10 v26
v24 (15) = CALLstatic <mem> {runtime.gopanic} [16] v23
Exit v24 (15)
name err.itab[uintptr]: v14
name err.data[*uint8]: v10

So all is well, except that we don't eliminate the duplicate conditional block.

@randall77

This comment has been minimized.

Copy link
Contributor

commented Jan 22, 2019

I guess I was thinking that prove would remove the duplicate block if the comparisons were redundant.
That would require CSE happening early enough. You're fixing it after lowering, when the CSE currently happens, but by then it is too late for prove. So you're proposing to add a simple prove pass to trim. That would work, of course, but I think it would be in general better to catch this earlier if we can.

@josharian

This comment has been minimized.

Copy link
Contributor

commented Jan 22, 2019

Makes sense. Seems like there's some experimentation to do. FWIW, I can tell you that switching that particular *uint8 to uintptr isn't hard--I did it and discovered that it didn't fix this issue by itself, which is why I went digging further. :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.