cmd/compile: Use SDIV and UDIV for ARM #19118

Closed
benshi001 opened this issue Feb 16, 2017 · 15 comments

8 participants
@benshi001
Member

commented Feb 16, 2017

The UDIV and SDIV instructions are optional on ARM, so by default ARM GCC compiles a/b into a call to __aeabi_uidiv(a, b), but it emits "udiv a, b" when -march=armv7ve is specified.

Go should also let the user choose between hardware and software division, perhaps by adding a GOARMHDIV environment variable.

@minux

Member

commented Feb 16, 2017

@benshi001

Member Author

commented Feb 16, 2017

A hardware divider is usually much faster than a software one. However, there is no proper way to let the div routine decide at run time:

  1. An ARM register shows whether a hardware divider is integrated, but it is accessible only at PL1, while normal Linux programs run at PL0.
  2. A PL0 program can read /proc/cpuinfo for the "idiva idivt" flags, but a div routine should not involve a file operation.

@benshi001

Member Author

commented Feb 16, 2017

There is no proper way to let the div routine decide at run time unless the user specifies it explicitly.

@randall77

Contributor

commented Feb 16, 2017

It is easy enough to check /proc/cpuinfo once on startup and cache the result.
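
The check-once-and-cache idea can be sketched as follows. This is only an illustration, not the code that eventually landed: the names `hasIDIVA` and `hardwareDivide` are hypothetical, and it assumes a Linux `/proc/cpuinfo` with a `Features` line listing `idiva`.

```go
package main

import (
	"bufio"
	"fmt"
	"os"
	"strings"
	"sync"
)

// hasIDIVA reports whether a cpuinfo dump lists the "idiva" flag on a
// "Features" line. (Hypothetical helper, for illustration only.)
func hasIDIVA(cpuinfo string) bool {
	sc := bufio.NewScanner(strings.NewReader(cpuinfo))
	for sc.Scan() {
		name, value, ok := strings.Cut(sc.Text(), ":")
		if !ok || strings.TrimSpace(name) != "Features" {
			continue
		}
		for _, f := range strings.Fields(value) {
			if f == "idiva" {
				return true
			}
		}
	}
	return false
}

var (
	once  sync.Once
	hwDiv bool
)

// hardwareDivide reads /proc/cpuinfo on the first call only and caches
// the answer, so later callers never touch the file system.
func hardwareDivide() bool {
	once.Do(func() {
		data, err := os.ReadFile("/proc/cpuinfo")
		if err != nil {
			return // unreadable: stay on the software path
		}
		hwDiv = hasIDIVA(string(data))
	})
	return hwDiv
}

func main() {
	fmt.Println("hardware divide available:", hardwareDivide())
}
```

The file is read at most once, which addresses the concern that a div routine should not perform file operations on every call.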

@benshi001

Member Author

commented Feb 16, 2017

Checking /proc/cpuinfo for the "idiva" flag on startup might be a way. What do the core developers think?

@cherrymui

Contributor

commented Feb 16, 2017

Checking on startup sounds good to me.

@minux

Member

commented Feb 16, 2017

@benshi001

Member Author

commented Feb 16, 2017

I would vote for calling getauxval(AT_HWCAP) at startup.
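
For illustration, the same information that getauxval(AT_HWCAP) returns can be read from user space by parsing /proc/self/auxv. The sketch below is an assumption-laden example rather than the runtime's implementation: `hwcapFromAuxv` is a hypothetical helper, it assumes a little-endian auxiliary vector, and the HWCAP_IDIVA/HWCAP_IDIVT bit positions are meaningful only on 32-bit ARM Linux.

```go
package main

import (
	"encoding/binary"
	"fmt"
	"math/bits"
	"os"
)

const (
	atHWCAP    = 16      // AT_HWCAP tag in the auxiliary vector
	hwcapIDIVA = 1 << 17 // HWCAP_IDIVA on 32-bit ARM Linux: SDIV/UDIV in ARM state
	hwcapIDIVT = 1 << 18 // HWCAP_IDIVT on 32-bit ARM Linux: SDIV/UDIV in Thumb state
)

// hwcapFromAuxv scans a raw little-endian auxiliary vector made of
// (tag, value) pairs and returns the AT_HWCAP value if present.
// wordSize must be 4 or 8. (Hypothetical helper, for illustration.)
func hwcapFromAuxv(auxv []byte, wordSize int) (uint64, bool) {
	read := func(b []byte) uint64 {
		if wordSize == 4 {
			return uint64(binary.LittleEndian.Uint32(b))
		}
		return binary.LittleEndian.Uint64(b)
	}
	for i := 0; i+2*wordSize <= len(auxv); i += 2 * wordSize {
		tag, val := read(auxv[i:]), read(auxv[i+wordSize:])
		if tag == 0 { // AT_NULL terminates the vector
			break
		}
		if tag == atHWCAP {
			return val, true
		}
	}
	return 0, false
}

func main() {
	data, err := os.ReadFile("/proc/self/auxv")
	if err != nil {
		fmt.Println("cannot read auxv:", err)
		return
	}
	hwcap, ok := hwcapFromAuxv(data, bits.UintSize/8)
	// The idiva/idivt bits below only mean "hardware divide" on 32-bit ARM.
	fmt.Printf("found=%v hwcap=%#x idiva=%v idivt=%v\n",
		ok, hwcap, hwcap&hwcapIDIVA != 0, hwcap&hwcapIDIVT != 0)
}
```

Compared with scanning /proc/cpuinfo, this avoids text parsing: the kernel hands the capability bits to every process in binary form.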

@davecheney

Contributor

commented Feb 16, 2017

@josharian changed the title from "Use SDIV and UDIV for ARM" to "cmd/compile: Use SDIV and UDIV for ARM" on Feb 17, 2017

@benshi001

Member Author

commented Feb 27, 2017

I have implemented this feature.
https://go-review.googlesource.com/#/c/37496/

A rough test shows a performance improvement of 40-50%.

@benshi001

Member Author

commented Feb 27, 2017

A rough test case:

package main

import (
	"fmt"
	"math/rand"
	"time"
)

func main() {
	var a [70000000]uint32
	var b [70000000]uint32
	var d [70000000]uint32

	r := rand.New(rand.NewSource(time.Now().UnixNano()))

	// Fill the operands with random values. The divisor is offset by 1
	// so that it can never be zero, which would panic at run time.
	for c := 0; c < len(b); c++ {
		a[c] = uint32(r.Intn(0x7ffffff0))
		b[c] = uint32(r.Intn(0x3ffffff0)) + 1
	}

	// Time ten passes of element-wise 32-bit unsigned division.
	for g := 0; g < 10; g++ {
		start := time.Now()
		for c := 0; c < len(b); c++ {
			d[c] = a[c] / b[c]
		}
		fmt.Println(time.Since(start))
	}
}

With the hardware divider, the output is:
5.4371964s
4.209324055s
4.205575531s
4.205909284s
4.205245892s
4.218614714s
4.210164271s
4.205585791s
4.205325164s
4.207941386s

And with the software divider, the output is:
8.238922377s
5.943205893s
5.911372837s
5.914151872s
5.909159482s
5.933686273s
5.909419953s
5.90930063s
5.913579159s
5.909221097s
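
To put the two runs side by side, the timings above can be averaged and compared. The sketch below does that arithmetic; dropping the first pass of each run as warm-up is an assumption made here, not something stated in the thread, and `meanSeconds` is a hypothetical helper.

```go
package main

import (
	"fmt"
	"time"
)

// meanSeconds parses duration strings such as "4.2s" and returns
// their arithmetic mean in seconds.
func meanSeconds(samples []string) (float64, error) {
	var total time.Duration
	for _, s := range samples {
		d, err := time.ParseDuration(s)
		if err != nil {
			return 0, err
		}
		total += d
	}
	return total.Seconds() / float64(len(samples)), nil
}

func main() {
	// Passes 2-10 from the thread; the slower first pass of each run
	// is left out as presumed warm-up (an assumption).
	hw := []string{"4.209324055s", "4.205575531s", "4.205909284s",
		"4.205245892s", "4.218614714s", "4.210164271s",
		"4.205585791s", "4.205325164s", "4.207941386s"}
	sw := []string{"5.943205893s", "5.911372837s", "5.914151872s",
		"5.909159482s", "5.933686273s", "5.909419953s",
		"5.90930063s", "5.913579159s", "5.909221097s"}

	h, err := meanSeconds(hw)
	if err != nil {
		panic(err)
	}
	s, err := meanSeconds(sw)
	if err != nil {
		panic(err)
	}
	fmt.Printf("hardware mean %.3fs, software mean %.3fs, software/hardware %.2f\n", h, s, s/h)
}
```

On these numbers the software path takes roughly 1.4 times as long as the hardware path, which is consistent with the lower end of the 40-50% figure quoted earlier.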

@gopherbot

commented Mar 2, 2017

CL https://golang.org/cl/37496 mentions this issue.

@bradfitz added this to the Go1.9 milestone on Mar 21, 2017

@benshi001

Member Author

commented Mar 31, 2017

How do we proceed with CL 37496? Should we keep the simulated DIV/DIVU/MOD/MODU, or remove them?

@benshi001

Member Author

commented Apr 1, 2017

In patch set 14 of CL 37496, I have:

  1. rebased onto the latest master branch
  2. kept the old DIV/DIVU/MOD/MODU while adding DIVHW/DIVUHW

@benshi001

Member Author

commented Apr 7, 2017

Is there any conclusion on this issue? Should the simulated div/mod be kept or not?

@gopherbot closed this in 69261ec on Apr 11, 2017

lparth added a commit to lparth/go that referenced this issue Apr 13, 2017

runtime: use hardware divider to improve performance
The hardware divider is an optional component of ARMv7. This patch
detects at run time whether it is available and uses it if so.

1. The hardware divider is detected at startup, and a flag is set or
   cleared according to a particular bit of runtime.hwcap.
2. Each call of runtime.udiv checks this flag and decides whether to
   use the hardware division instruction.

A rough test shows a performance improvement of 40-50% on ARMv7, and
compatibility with ARMv5/v6 is not broken.

fixes golang#19118

Change-Id: Ic586bc9659ebc169553ca2004d2bdb721df823ac
Reviewed-on: https://go-review.googlesource.com/37496
Run-TryBot: Cherry Zhang <cherryyz@google.com>
TryBot-Result: Gobot Gobot <gobot@golang.org>
Reviewed-by: Cherry Zhang <cherryyz@google.com>
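
The dispatch described in the commit message (a flag set once, then checked on every division) can be illustrated in plain Go. This is a sketch under stated assumptions, not the runtime's code: the real routine lives in the runtime and is not written this way, `hardDiv`, `softUdiv` and `udiv` are hypothetical names, and Go's `/` operator merely stands in for the UDIV instruction.

```go
package main

import "fmt"

// hardDiv stands in for the flag that would be set at startup from the
// hwcap bit. Here it is toggled by hand for demonstration.
var hardDiv bool

// softUdiv is a restoring shift-subtract division, a simple example of
// what a software divider does. The remainder is kept in 64 bits so
// that the left shift cannot overflow.
func softUdiv(n, d uint32) uint32 {
	if d == 0 {
		panic("division by zero")
	}
	var q uint32
	var r uint64
	for i := 31; i >= 0; i-- {
		r = r<<1 | uint64(n>>uint(i)&1) // bring down the next bit of n
		if r >= uint64(d) {
			r -= uint64(d)
			q |= 1 << uint(i)
		}
	}
	return q
}

// udiv checks the flag on every call and picks a path.
func udiv(n, d uint32) uint32 {
	if hardDiv {
		return n / d // stand-in for a single hardware UDIV
	}
	return softUdiv(n, d)
}

func main() {
	for _, hw := range []bool{false, true} {
		hardDiv = hw
		fmt.Printf("hardDiv=%v: 1000000007/12345 = %d\n", hw, udiv(1000000007, 12345))
	}
}
```

The per-call flag check costs one load and one branch, which is small next to the 32 loop iterations the software path needs.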

@golang locked and limited conversation to collaborators on Apr 11, 2018
