cmd/compile: BLAS Idamax regression #14995

Open
btracey opened this issue Mar 28, 2016 · 4 comments

btracey (Contributor) commented Mar 28, 2016

The BLAS routine Idamax shows a performance regression between Go 1.6 and tip (go version devel +7e88826 Mon Mar 28 14:10:21 2016 +0000 darwin/amd64).

Idamax returns the index of the element with the largest absolute value.
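
For readers unfamiliar with the routine, here is a minimal sketch of what it computes. This is not the gonum implementation: the name `idamax`, the handling of invalid arguments (returning -1), and the length check are assumptions made for illustration. The `incX` parameter is the stride between scanned elements, which is presumably what the UnitaryInc and PosInc benchmark variants exercise.

```go
package main

import (
	"fmt"
	"math"
)

// idamax is a hypothetical stand-in for the BLAS routine, not the gonum code.
// It scans n elements of x, stepping incX slice positions at a time, and
// returns the index (counted in steps, not slice positions) of the first
// element with the largest absolute value.
// Assumption: it returns -1 when n < 1 or incX < 1.
func idamax(n int, x []float64, incX int) int {
	if n < 1 || incX < 1 {
		return -1
	}
	if len(x) <= (n-1)*incX {
		panic("idamax: x is too short for n and incX")
	}
	idx := 0
	max := math.Abs(x[0])
	for i := 1; i < n; i++ {
		if v := math.Abs(x[i*incX]); v > max {
			max = v
			idx = i
		}
	}
	return idx
}

func main() {
	x := []float64{1, -5, 3, 2}
	fmt.Println(idamax(4, x, 1)) // unit increment: scans 1, -5, 3, 2
	fmt.Println(idamax(2, x, 2)) // increment 2: scans 1, 3
}
```

Both calls print 1: with unit increment the largest magnitude is -5 at step 1, and with increment 2 it is 3, which is also step 1.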

go get -u -t github.com/gonum/blas/native
cd $GOPATH/src/github.com/gonum/blas/native
go test -bench=Ida -tags=noasm -count=5

Results (old is Go 1.6, new is tip):

name                      old time/op  new time/op  delta
IdamaxSmallUnitaryInc-8   32.4ns ± 7%  39.1ns ± 6%  +20.80%  (p=0.008 n=5+5)
IdamaxSmallPosInc-8       28.7ns ±11%  41.1ns ±11%  +43.24%  (p=0.008 n=5+5)
IdamaxMediumUnitaryInc-8  1.58µs ± 2%  2.03µs ± 2%  +27.83%  (p=0.008 n=5+5)
IdamaxMediumPosInc-8      1.86µs ± 2%  2.38µs ±11%  +27.93%  (p=0.008 n=5+5)
IdamaxLargeUnitaryInc-8    150µs ± 2%   195µs ± 1%  +30.23%  (p=0.008 n=5+5)
IdamaxLargePosInc-8        202µs ± 1%   241µs ± 2%  +19.10%  (p=0.008 n=5+5)
IdamaxHugeUnitaryInc-8    15.1ms ± 1%  21.0ms ± 2%  +39.67%  (p=0.008 n=5+5)
IdamaxHugePosInc-8        27.9ms ± 3%  30.0ms ± 2%   +7.38%  (p=0.008 n=5+5)

@randall77 @josharian

randall77 (Contributor) commented Mar 28, 2016

import "math"

func f(x []float64) int {
    max := 0.0
    idx := 0
    for i, v := range x {
        absV := math.Abs(v)
        if absV > max {
            max = absV
            idx = i
        }
    }
    return idx
}

It looks like SSA is spilling i during the loop. This is an unfortunate consequence of how registers are picked for phi ops. It registerizes i correctly at the top of the loop, but then it decides to also allocate (Phi idx i) to the same register. At the phi it basically flips a coin as to whether to use the register of idx or i, and picks the wrong one. Not sure what the right fix would be yet; some sort of lookahead might help.
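
To make the (Phi idx i) value concrete, here is a source-level sketch of the same loop with the merge after the `if` written out explicitly. The helpers `selectInt` and `selectFloat` and the name `fMerged` are hypothetical and exist only for illustration. A real phi is not a function call: it is a value in the compiler's SSA form whose inputs arrive along different control-flow edges, and the register allocator decides which register it lives in.

```go
package main

import (
	"fmt"
	"math"
)

// selectInt and selectFloat are illustrative helpers that mimic what a phi
// expresses: the result depends on which branch was taken.
func selectInt(taken bool, ifTaken, otherwise int) int {
	if taken {
		return ifTaken
	}
	return otherwise
}

func selectFloat(taken bool, ifTaken, otherwise float64) float64 {
	if taken {
		return ifTaken
	}
	return otherwise
}

// fMerged computes the same result as f above, with each merge spelled out.
func fMerged(x []float64) int {
	max := 0.0
	idx := 0
	for i, v := range x {
		absV := math.Abs(v)
		taken := absV > max
		max = selectFloat(taken, absV, max) // roughly (Phi max absV)
		idx = selectInt(taken, i, idx)      // roughly (Phi idx i)
	}
	return idx
}

func main() {
	fmt.Println(fMerged([]float64{1, -5, 3, 2}))
}
```

The point of the sketch is that i is still needed after the merge, on the next iteration. If the merged idx is given the register that held i, then i has to be saved somewhere else, which matches the spill described above.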

dr2chase (Contributor) commented Mar 28, 2016

Keith, where's that coin flip? I am pretty sure I can improve it with the loop-related information. I'll go look, if not right away, then after lunch.

randall77 (Contributor) commented Mar 28, 2016

It goes back to the layout pass. We place the idx-generating block after the phi, so the i-providing block is the primary predecessor.

bradfitz added this to the Go1.7 milestone Apr 7, 2016

randall77 (Contributor) commented May 1, 2016

Didn't happen for 1.7, punting to 1.8.
