cmd/internal/obj/x86: add fma #8037

Open
gopherbot opened this Issue May 20, 2014 · 9 comments

gopherbot commented May 20, 2014

by odysseus9672:

What does 'go version' print?
go version devel +5b9ac653acf6 Mon May 19 22:57:59 2014 -0400 darwin/amd64

What steps reproduce the problem?
If possible, include a link to a program on play.golang.org.

1. Go to $GOROOT/src/cmd/6l/6.out.h
2. According to http://golang.org/doc/asm#architectures , this file contains a list of
all the assembler instructions Go recognizes.
3. No fused-multiply-add instructions are in the list.

Please provide any additional information below.

FMA enables faster and more accurate versions of several algorithms. My current main
use case is polynomial evaluation (Horner's scheme and Estrin's scheme are both built
on FMA), but another very important one is matrix multiplication. Adding at least a
subset of the FMA instructions available on newer AMD and Intel CPUs (PowerPC has had
them since at least the 600 series) would help performance-minded coders and compiler
optimization writers produce faster code.

For instance, if the starting point is Cr + i*Ci (and halfCi = 0.5 * Ci), a more
accurate version of the Mandelbrot set update is:
Zr, Zi = fma( Zr - Zi, Zr + Zi, Cr ), 2 * fma( Zr, Zi, halfCi )

The update to Zr is likely slower than what is presently done for The Computer
Language Benchmarks Game, but an FMA-based update to Zi would definitely speed things up.

Granted, the speedup would only materialize when issue #4978 (
https://golang.org/issue/4978 ) is resolved, but having code in
useful libraries in place to take advantage of that optimization would be nice instead
of having to recode everything afterward.

To that end, I would request exposing at least the following (would do it myself if I
could find an understandable listing of the opcodes and documentation on how to add them
to the compiler): 

For greatest compatibility with AMD chips, 4 operand forms ( from
http://en.wikipedia.org/wiki/X86_instruction_listings#FMA_instructions ): 
VFMADDPD, VFMADDPS, VFMADDSD, VFMADDSS.

And for Intel chips, the 3 operand forms ( from Vol. 1 14-21 of Intel's developer's
manual:
http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html?iid=tech_vt_tech+64-32_manuals
):
VFMADD231PD, VFMADD231PS, VFMADD231SD, VFMADD231SS

All of the other operations are these with different permutations of minus signs (and,
for the 3-operand versions, a different choice of which register gets clobbered; the one
I'm requesting is the one that works best with polynomial evaluation, naturally).

Solving this issue would also make issue #681 more straightforward (
https://golang.org/issue/681 ), although even having the opcodes
would be enough for that one.

Contributor

ianlancetaylor commented May 20, 2014

Comment 1:

Labels changed: added repo-main, release-go1.4.

Contributor

btracey commented Jun 11, 2014

Comment 2:

Were the compiler to add and begin to support this feature, the Daxpy function in
https://github.com/gonum/blas/blob/master/goblas/level1double.go is a good place to
start.

Contributor

rsc commented Sep 15, 2014

Comment 3:

Labels changed: added release-go1.5, removed release-go1.4.

Status changed to Accepted.

@bradfitz bradfitz modified the milestone: Go1.5 Dec 16, 2014

@bradfitz bradfitz removed the release-go1.5 label Dec 16, 2014

@rsc rsc removed accepted labels Apr 14, 2015

@rsc rsc modified the milestones: Unplanned, Go1.5 Apr 26, 2015

@rsc rsc changed the title from cmd/6a: Assembler Lacks FMA Instructions (Feature Request) to cmd/internal/obj/x86: add fma Jun 8, 2015

gopherbot commented Jan 22, 2016

CL https://golang.org/cl/18850 mentions this issue.

gopherbot pushed a commit that referenced this issue Jan 24, 2016

cmd/asm: add generated test of amd64 instruction encodings
Generated by x86test, from https://golang.org/cl/18842
(still in progress).

The commented out lines are either missing or misspelled
or incorrectly handled instructions.

For #4816, #8037, #13822, #14068, #14069.

Change-Id: If309310c97d9d2a3c71fc64c51d4a957e9076ab7
Reviewed-on: https://go-review.googlesource.com/18850
Reviewed-by: Rob Pike <r@golang.org>
Run-TryBot: Russ Cox <rsc@golang.org>
TryBot-Result: Gobot Gobot <gobot@golang.org>

odysseus9672 commented Oct 5, 2016

Is there documentation for how to add commands to the assembler?

This documentation says that the process is straightforward, and even describes how to use unsupported instructions with known opcodes.

Contributor

randall77 commented Oct 5, 2016

@odysseus9672 I don't think there is any documentation, but you can follow other CLs that added instructions (e.g. https://go-review.googlesource.com/c/14127/). git blame will give you a more comprehensive list.

Member

bradfitz commented Aug 8, 2017

@randall77, planning on doing this for Go 1.10?

Contributor

josharian commented Aug 8, 2017

If the compiler is to emit these, I think this would require a GOAMD64, since FMA is not part of the minimum supported amd64 instruction set: #19593

Contributor

Quasilyte commented Dec 26, 2017

> And for Intel chips, the 3 operand forms ( from Vol. 1 14-21 of Intel's developer's
> manual:
> http://www.intel.com/content/www/us/en/processors/architectures-software-developer-manuals.html?iid=tech_vt_tech+64-32_manuals
> ):
> VFMADD231PD, VFMADD231PS, VFMADD231SD, VFMADD231SS

Implemented in https://go-review.googlesource.com/#/c/go/+/75490/.
Included in Go1.10.
