cmd/compile: performance regression with SSA backend #14606

@dgryski

Please answer these questions before submitting your issue. Thanks!

  1. What version of Go are you using (go version)?
    go version devel +0d1a98e Wed Mar 2 13:01:44 2016 +0000 linux/amd64
    and
    go version go1.6 linux/amd64
  2. What operating system and processor architecture are you using (go env)?
    linux/amd64
  3. What did you do?
    I ran the Lookup benchmarks for a multi-probe consistent hashing implementation (code at https://github.com/dgryski/go-mpchash) with the new SSA backend and with 1.6:
    ~/work/src/cvs/go.tip/bin/go test -test.bench=Lookup -test.count=20 >choose.ssa
    go test -test.bench=Lookup -test.count=20 >choose.16
  4. What did you expect to see?

Not performance regressions.

  5. What did you see instead?

The number after the benchmark name is the number of shards. For 8 and 32 shards, the new SSA backend is about 16% faster than 1.6. At 128 shards there is no significant difference, and for 512 shards and above lookups are 7-10% slower.

<dgryski@kamek[go-mpchash] \ʕ◔ϖ◔ʔ/ > benchstat choose.16 choose.ssa 
name          old time/op  new time/op  delta
Lookup8-4      248ns ± 1%   209ns ± 3%  -15.67%  (p=0.000 n=20+17)
Lookup32-4     262ns ± 2%   221ns ± 2%  -15.77%  (p=0.000 n=19+18)
Lookup128-4    410ns ± 1%   410ns ± 1%     ~     (p=0.482 n=20+19)
Lookup512-4    458ns ± 1%   502ns ± 1%   +9.58%  (p=0.000 n=19+20)
Lookup2048-4   501ns ± 1%   548ns ± 1%   +9.41%  (p=0.000 n=20+20)
Lookup8192-4   584ns ± 1%   624ns ± 2%   +6.97%  (p=0.000 n=18+20)
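
For readers unfamiliar with the naming in this output: Go reports a function called BenchmarkLookup512 as Lookup512, and appends the GOMAXPROCS value (here -4). Below is a minimal sketch of how one benchmark per shard count could be laid out. It is not the code from go-mpchash: buildRing and lookup are hypothetical stand-ins (a plain binary search over sorted points), and the real benchmarks normally live in a _test.go file run via go test rather than through testing.Benchmark.

```go
package main

import (
	"fmt"
	"sort"
	"testing"
)

// sink keeps the compiler from discarding the lookup result.
var sink int

// buildRing returns n sorted, evenly spaced points standing in for shard
// positions. Hypothetical placeholder, not the go-mpchash data structure.
func buildRing(n int) []uint64 {
	ring := make([]uint64, n)
	step := ^uint64(0) / uint64(n)
	for i := range ring {
		ring[i] = uint64(i+1) * step
	}
	return ring
}

// lookup returns the index of the first point >= key, wrapping to 0
// when key is past the last point.
func lookup(ring []uint64, key uint64) int {
	i := sort.Search(len(ring), func(i int) bool { return ring[i] >= key })
	if i == len(ring) {
		return 0
	}
	return i
}

// benchmarkLookup is the shared body; each wrapper below fixes the shard count.
func benchmarkLookup(b *testing.B, shards int) {
	ring := buildRing(shards)
	b.ResetTimer() // exclude ring construction from the measurement
	for i := 0; i < b.N; i++ {
		// Multiply by an odd constant to spread keys over the ring.
		sink = lookup(ring, uint64(i)*0x9E3779B97F4A7C15)
	}
}

func BenchmarkLookup8(b *testing.B)    { benchmarkLookup(b, 8) }
func BenchmarkLookup8192(b *testing.B) { benchmarkLookup(b, 8192) }

func main() {
	for _, bm := range []struct {
		name string
		fn   func(*testing.B)
	}{
		{"Lookup8", BenchmarkLookup8},
		{"Lookup8192", BenchmarkLookup8192},
	} {
		r := testing.Benchmark(bm.fn)
		fmt.Printf("%-12s %d ns/op\n", bm.name, r.NsPerOp())
	}
}
```

With this layout, the shard count is the only thing that differs between benchmarks, so a change in one row of the benchstat table points at behavior that depends on ring size.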

To determine whether this came from runtime changes or the SSA backend, I also ran the benchmarks against tip with SSA off (GOSSAHASH=x). Between 1.6 and tip without SSA, Lookup8 is about 1% faster and every larger size regresses by roughly 5-8%.

<dgryski@kamek[go-mpchash] \ʕ◔ϖ◔ʔ/ > benchstat choose.16 choose.tip
name          old time/op  new time/op  delta
Lookup8-4      248ns ± 1%   246ns ± 1%  -1.13%  (p=0.000 n=20+20)
Lookup32-4     262ns ± 2%   276ns ± 2%  +5.16%  (p=0.000 n=19+19)
Lookup128-4    410ns ± 1%   431ns ± 2%  +5.31%  (p=0.000 n=20+20)
Lookup512-4    458ns ± 1%   485ns ± 1%  +5.83%  (p=0.000 n=19+20)
Lookup2048-4   501ns ± 1%   540ns ± 3%  +7.77%  (p=0.000 n=20+20)
Lookup8192-4   584ns ± 1%   624ns ± 2%  +7.00%  (p=0.000 n=18+20)

Between tip and tip-with-SSA, SSA is 15-20% faster for 8 and 32 shards and about 5% faster at 128, but 512 and 2048 shards are 1.5-3.5% slower and 8192 shows no significant change.

<dgryski@kamek[go-mpchash] \ʕ◔ϖ◔ʔ/ > benchstat choose.tip choose.ssa
name          old time/op  new time/op  delta
Lookup8-4      246ns ± 1%   209ns ± 3%  -14.71%  (p=0.000 n=20+17)
Lookup32-4     276ns ± 2%   221ns ± 2%  -19.90%  (p=0.000 n=19+18)
Lookup128-4    431ns ± 2%   410ns ± 1%   -4.95%  (p=0.000 n=20+19)
Lookup512-4    485ns ± 1%   502ns ± 1%   +3.55%  (p=0.000 n=20+20)
Lookup2048-4   540ns ± 3%   548ns ± 1%   +1.52%  (p=0.000 n=20+20)
Lookup8192-4   624ns ± 2%   624ns ± 2%     ~     (p=0.506 n=20+20)
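
As a rough cross-check of how the three comparisons fit together, the sketch below recomputes the Lookup512-4 deltas from the rounded means in the tables above. The helper pctChange is only illustrative, and because benchstat works from the unrounded samples, the results differ slightly from the reported +9.58%, +5.83% and +3.55%.

```go
package main

import "fmt"

// pctChange returns the relative change from old to cur, in percent.
func pctChange(old, cur float64) float64 {
	return (cur - old) / old * 100
}

func main() {
	// Rounded Lookup512-4 means in ns/op, copied from the benchstat tables.
	const go16, tipNoSSA, tipSSA = 458.0, 485.0, 502.0

	total := pctChange(go16, tipSSA)       // 1.6 -> tip with SSA
	nonSSA := pctChange(go16, tipNoSSA)    // 1.6 -> tip without SSA
	ssaOnly := pctChange(tipNoSSA, tipSSA) // tip without SSA -> tip with SSA

	fmt.Printf("1.6 -> tip+SSA:    %+.2f%%\n", total)
	fmt.Printf("1.6 -> tip no SSA: %+.2f%%\n", nonSSA)
	fmt.Printf("tip no SSA -> SSA: %+.2f%%\n", ssaOnly)

	// Relative changes compose multiplicatively, not additively.
	combined := ((1+nonSSA/100)*(1+ssaOnly/100) - 1) * 100
	fmt.Printf("composed:          %+.2f%%\n", combined)
}
```

On these numbers, most of the 512-shard regression relative to 1.6 is already present on tip with SSA off, and the SSA backend adds a smaller slowdown on top of it.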
