New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/compile: inefficient increment of global array #10432

Open
dvyukov opened this Issue Apr 13, 2015 · 7 comments

Comments

Projects
None yet
6 participants
@dvyukov
Member

dvyukov commented Apr 13, 2015

go version devel +a02d925 Sun Apr 12 12:14:36 2015 +0300 linux/amd64

Program is:

var tab *[256]byte

func foo(i int) {
    tab[i&255]++
}

Currently it is compiled to:

0000000000400c30 <main.foo>:
  400c30:       64 48 8b 0c 25 f8 ff    mov    %fs:0xfffffffffffffff8,%rcx
  400c37:       ff ff 
  400c39:       48 3b 61 10             cmp    0x10(%rcx),%rsp
  400c3d:       77 07                   ja     400c46 <main.foo+0x16>
  400c3f:       e8 cc 24 04 00          callq  443110 <runtime.morestack_noctxt>
  400c44:       eb ea                   jmp    400c30 <main.foo>
  400c46:       48 8b 44 24 08          mov    0x8(%rsp),%rax
  400c4b:       48 8b 1d 36 a0 0a 00    mov    0xaa036(%rip),%rbx        # 4aac88 <main.tab>
  400c52:       48 25 ff 00 00 00       and    $0xff,%rax
  400c58:       48 83 fb 00             cmp    $0x0,%rbx
  400c5c:       74 41                   je     400c9f <main.foo+0x6f>
  400c5e:       48 3d 00 01 00 00       cmp    $0x100,%rax
  400c64:       73 32                   jae    400c98 <main.foo+0x68>
  400c66:       48 8d 1c 03             lea    (%rbx,%rax,1),%rbx
  400c6a:       0f b6 2b                movzbl (%rbx),%ebp
  400c6d:       48 8b 1d 14 a0 0a 00    mov    0xaa014(%rip),%rbx        # 4aac88 <main.tab>
  400c74:       48 83 fb 00             cmp    $0x0,%rbx
  400c78:       74 1a                   je     400c94 <main.foo+0x64>
  400c7a:       48 3d 00 01 00 00       cmp    $0x100,%rax
  400c80:       73 0b                   jae    400c8d <main.foo+0x5d>
  400c82:       48 8d 1c 03             lea    (%rbx,%rax,1),%rbx
  400c86:       48 ff c5                inc    %rbp
  400c89:       40 88 2b                mov    %bpl,(%rbx)
  400c8c:       c3                      retq   
  400c8d:       e8 be a3 01 00          callq  41b050 <runtime.panicindex>
  400c92:       0f 0b                   ud2    
  400c94:       89 03                   mov    %eax,(%rbx)
  400c96:       eb e2                   jmp    400c7a <main.foo+0x4a>
  400c98:       e8 b3 a3 01 00          callq  41b050 <runtime.panicindex>
  400c9d:       0f 0b                   ud2    
  400c9f:       89 03                   mov    %eax,(%rbx)
  400ca1:       eb bb                   jmp    400c5e <main.foo+0x2e>

While it could be compiled to something along the lines of:

0000000000400c30 <main.foo>:
    mov 0x8(%rsp),%rax
    mov 0xaa036(%rip),%rbx        # 4aac88 <main.tab>
    add $1, (%rbx, %rax)
    ret

Here are several issues:

  1. Address calculation, bounds check and nil check are done twice.
  2. Nil check is not necessary here, as the increment itself will trigger paging fault if tab is nil.
  3. Bounds check is not necessary as i&255 is guaranteed to be <256.
@dvyukov

This comment has been minimized.

Member

dvyukov commented Apr 13, 2015

FTR, the full program is:

package main

func main() {
    foo(42)
}

var tab *[256]byte

func foo(i int) {
    tab[i&255]++
}

build with -l.

@dvyukov

This comment has been minimized.

Member

dvyukov commented Apr 13, 2015

@ianlancetaylor ianlancetaylor changed the title from cmd/gc: inefficient increment of global array to cmd/compile: inefficient increment of global array Jun 3, 2015

@ianlancetaylor ianlancetaylor added this to the Go1.6 milestone Jun 3, 2015

@rsc rsc modified the milestones: Unplanned, Go1.6 Nov 4, 2015

@navytux

This comment has been minimized.

Contributor

navytux commented Mar 25, 2017

For the reference, today foo compiles to this:

TEXT ·foo(SB), $0-8 // dv.go:9
        NO_LOCAL_POINTERS
        // FUNCDATA $0, gclocals·2a5305abe05176240e61b8620e19a815(SB) (args)
        // FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) (no locals)
        MOVQ       ·tab(SB), AX  // dv.go:10
        TESTB      AL, (AX)
        MOVQ       i+0(FP), CX
        MOVBLZX    CL, CX
        MOVBLZX    (AX)(CX*1), DX
        INCL       DX
        MOVB       DL, (AX)(CX*1)
        RET                       // dv.go:11

TESTB AL, (AX) seems to be not needed, and probably also MOVQ i+0(FP), CX and MOVBLZX CL, CX could be combined into one MOVBLZX from stack (not sure here), but otherwise foo looks to be much closer to optimal than it was in 2015.

@josharian

This comment has been minimized.

Contributor

josharian commented Mar 25, 2017

I'm surprised that the nil check isn't eliminated. It should be. @cherrymui?

Combining the stack load and the zero extension is #15300.

@randall77

This comment has been minimized.

Contributor

randall77 commented Mar 25, 2017

I don't think we do any nil check removal on non-constant array indexing. We could.

@navytux

This comment has been minimized.

Contributor

navytux commented May 4, 2018

For the reference: with today's tip (go version devel +8aa316c139 Fri May 4 17:01:04 2018 +0000 linux/amd64) original program compiles to essentially the same as in #10432 (comment):

TEXT ·foo(SB), NOSPLIT, $0-8 // dv.go:9
        NO_LOCAL_POINTERS
        // FUNCDATA $0, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) (args)
        // FUNCDATA $1, gclocals·33cdeccccebe80329f1fdbee7f5874cb(SB) (no locals)
        MOVQ       ·tab(SB), AX  // dv.go:10
        TESTB      AL, (AX)
        MOVQ       i+8(SP), CX    // dv.go:9
        MOVBLZX    CL, CX         // dv.go:10
        MOVBLZX    (AX)(CX*1), DX
        INCL       DX
        MOVB       DL, (AX)(CX*1)
        RET                       // dv.go:11

There is ongoing CL 91415 which teaches nilcheckelim to understand that offsets from non-nil pointers are also non-nil. However it currently works with only fixed offsets, not non-constant indexing.

/cc @mrosier-qdt

@navytux

This comment has been minimized.

Contributor

navytux commented May 4, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment