cmd/compile: BLAS Idamax regression #14995

Open
btracey opened this Issue Mar 28, 2016 · 4 comments

@btracey
Contributor

btracey commented Mar 28, 2016

The BLAS routine Idamax shows a regression between Go 1.6 and tip (go version devel +7e88826 Mon Mar 28 14:10:21 2016 +0000 darwin/amd64).

Idamax finds the index of the element with the maximum absolute value.

go get -u -t github.com/gonum/blas/native
cd $GOPATH/src/github.com/gonum/blas/native
go test -bench=Ida -tags=noasm -count=5
name                      old time/op  new time/op  delta
IdamaxSmallUnitaryInc-8   32.4ns ± 7%  39.1ns ± 6%  +20.80%  (p=0.008 n=5+5)
IdamaxSmallPosInc-8       28.7ns ±11%  41.1ns ±11%  +43.24%  (p=0.008 n=5+5)
IdamaxMediumUnitaryInc-8  1.58µs ± 2%  2.03µs ± 2%  +27.83%  (p=0.008 n=5+5)
IdamaxMediumPosInc-8      1.86µs ± 2%  2.38µs ±11%  +27.93%  (p=0.008 n=5+5)
IdamaxLargeUnitaryInc-8    150µs ± 2%   195µs ± 1%  +30.23%  (p=0.008 n=5+5)
IdamaxLargePosInc-8        202µs ± 1%   241µs ± 2%  +19.10%  (p=0.008 n=5+5)
IdamaxHugeUnitaryInc-8    15.1ms ± 1%  21.0ms ± 2%  +39.67%  (p=0.008 n=5+5)
IdamaxHugePosInc-8        27.9ms ± 3%  30.0ms ± 2%   +7.38%  (p=0.008 n=5+5)

@randall77 @josharian

@randall77

Contributor

randall77 commented Mar 28, 2016

import "math"

func f(x []float64) int {
    max := 0.0
    idx := 0
    for i, v := range x {
        absV := math.Abs(v)
        if absV > max {
            max = absV
            idx = i
        }
    }
    return idx
}

It looks like SSA is spilling i during the loop. Kind of an unfortunate consequence of how registers are picked for phi ops. It registerizes i correctly at the top of the loop, but then it decides to also allocate (Phi idx i) to the same register. At the Phi it basically flips a coin as to whether to use the register of idx or i for the phi, and picks the wrong one. Not sure what the right fix would be yet. Some sort of lookahead might help.

@dr2chase

Contributor

dr2chase commented Mar 28, 2016

Keith, where's that coin flip? I am pretty sure I can improve it with the loop-related information. I'll go look, if not right away, then after lunch.

@randall77

Contributor

randall77 commented Mar 28, 2016

It goes back to the layout pass. We place the idx generating block after the phi, so the i-providing block is the primary predecessor.

@bradfitz bradfitz added this to the Go1.7 milestone Apr 7, 2016

@bradfitz bradfitz added the Performance label Apr 7, 2016

@randall77

Contributor

randall77 commented May 1, 2016

Didn't happen for 1.7, punting to 1.8.

@randall77 randall77 modified the milestones: Go1.8, Go1.7 May 1, 2016

@quentinmit quentinmit added the NeedsFix label Oct 10, 2016

@rsc rsc modified the milestones: Go1.9Early, Go1.8 Oct 21, 2016

@bradfitz bradfitz modified the milestones: Go1.9Early, Go1.10Early May 3, 2017

@bradfitz bradfitz modified the milestones: Go1.10Early, Go1.10 Jun 14, 2017

@bradfitz bradfitz modified the milestones: Go1.10, Go1.11 Nov 28, 2017

@bradfitz bradfitz modified the milestones: Go1.11, Unreleased Jun 13, 2018