internal/cpu: MIPS[64] feature detection #26538

Open
smasher164 opened this Issue Jul 22, 2018 · 15 comments

@smasher164
Contributor

smasher164 commented Jul 22, 2018

This might already be in the works, but certain mathematical functions can be sped up by taking advantage of built-in MIPS instructions. In my case, an FMA intrinsic (#25819) can use MADDF.D fd, fs, ft, which is available on a subset of mips64r6 processors. The cp0.Config0, cp0.Config1, and cp1.FIR registers contain the information needed to detect this feature. Unfortunately, reading from cp0 requires the privileged instruction mfc0.

That means that the following code to detect FMA support on mips64 will receive a SIGILL on the marked line:

TEXT ·isFMASupported(SB),NOSPLIT,$0
#ifndef GOMIPS64_softfloat
    // CP0.Config0[14:13] == 0b10 <==> Release 6
    MOVW M16, R8 // <-- mfc0 causes a SIGILL
    SRL $13, R8
    AND $3, R8
    MOVW $2, R9
    BNE R9, R8, nosupport
    // CP0.Config1[1:0] == 1 <==> FPU enabled
    // mfc0 8, 16, 1
    WORD $0x40088001
    AND $1, R8
    BEQ R0, R8, nosupport
    // CP1.FIR[18:17] == 1 <==> Double-precision operations supported
    // CFC1 $0, R8
    MOVW FCR0, R8
    SRL $17, R8
    AND $1, R8
    MOVB R8, ret(FP)
nosupport:
#endif
    RET

If internal/cpu exports the required Config registers as described in sections 9.42 and 9.43 in https://s3-eu-west-1.amazonaws.com/downloads-mips/documents/MD00091-2B-MIPS64PRA-AFP-05.04.pdf, runtime specialization of MIPS code will become much simpler.

@randall77

Contributor

randall77 commented Jul 22, 2018

I'm a bit confused as to how internal/cpu is going to run the privileged instruction successfully. It runs at the same privilege as your example code does.

@smasher164

Contributor

smasher164 commented Jul 22, 2018

I should clarify that internal/cpu would use the HWCAP bits to provide this info.

@randall77

Contributor

randall77 commented Jul 22, 2018

What package(s) are you intending to use this in?
The reason I ask is because if it's only a single package, it might be better for this code to live in that package. Only if multiple packages want it would it be better off in internal/cpu.

@smasher164

Contributor

smasher164 commented Jul 22, 2018

The use-case is for the math package, where I'm working on an FMA intrinsic. There's a lot of room for specialization though, since almost none of the functions use a built-in instruction. crypto can benefit as well, since SIMD also needs to be detected by accessing cp0.

@martisch

Member

martisch commented Jul 23, 2018

I think all CPU feature detection code should live in internal/cpu. It provides a single place to turn features off, which makes it possible to benchmark and test different code paths, and it consolidates the feature detection code of the standard library and runtime in one place. Starting in internal/cpu also avoids having to move code later, or overlooking that the detection is already implemented elsewhere. Some existing features, e.g. HasFMA, live in internal/cpu even though they are only used in math. Some architectures, e.g. arm, arm64 and ppc64, already use hwcap in internal/cpu to detect features, so mips and mips64 support can be added to internal/cpu in the same manner once math needs it. However, only features that are used by the runtime, the compiler, or the standard library should be added.
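To illustrate the "single point" argument, here is a rough sketch of how a consumer such as math might branch on one centrally owned flag. The `features` struct and `cpuFeatures` variable are hypothetical stand-ins (internal/cpu cannot be imported from user code), and `fmaHardware` uses `math.FMA` only as a placeholder for an assembly routine.

```go
package main

import (
	"fmt"
	"math"
)

// features is a hypothetical stand-in for a per-architecture struct in
// internal/cpu.
type features struct {
	HasFMA bool
}

// cpuFeatures would be filled in once at startup by the detection code.
var cpuFeatures features

// fmaGeneric is the portable fallback. The explicit conversion forces the
// product to be rounded before the add, so this path rounds twice.
func fmaGeneric(x, y, z float64) float64 { return float64(x*y) + z }

// fmaHardware is a placeholder for an assembly routine using a fused
// multiply-add instruction.
func fmaHardware(x, y, z float64) float64 { return math.FMA(x, y, z) }

// fma picks the implementation based on the centrally detected flag.
func fma(x, y, z float64) float64 {
	if cpuFeatures.HasFMA {
		return fmaHardware(x, y, z)
	}
	return fmaGeneric(x, y, z)
}

func main() {
	for _, enabled := range []bool{false, true} {
		cpuFeatures.HasFMA = enabled // flipping one flag switches the code path
		fmt.Println(enabled, fma(2, 3, 4))
	}
}
```

Because every caller reads the same flag, clearing it in one place exercises the fallback everywhere, which is what makes benchmarking and testing both paths practical.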

@smasher164

Contributor

smasher164 commented Jul 23, 2018

Here is a working set that can be covered by AT_HWCAP, the FIR register, and some heuristics.

// From AT_HWCAP
IsRelease6 (more portably detected via https://github.com/v8mips/v8mips/issues/97).
HasSIMD
HasCRC32

// From FIR
Has3DASE
HasPairedSingle
HasDoublePrecision
HasSinglePrecision
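
As a sketch of how the two sources above could be decoded: the HWCAP bit positions below follow the Linux MIPS header asm/hwcap.h, and the FIR positions (S=16, D=17, PS=18, 3D=19) follow the MIPS privileged architecture manual. The struct and function names are invented for this example and are not an existing internal/cpu API.

```go
package main

import "fmt"

// HWCAP bits as defined in the Linux MIPS uapi header asm/hwcap.h.
const (
	hwcapMIPSR6    = 1 << 0
	hwcapMIPSMSA   = 1 << 1
	hwcapMIPSCRC32 = 1 << 2
)

// Bits of the CP1 floating-point implementation register (FIR / FCR0).
const (
	firS  = 1 << 16 // single-precision supported
	firD  = 1 << 17 // double-precision supported
	firPS = 1 << 18 // paired-single supported
	fir3D = 1 << 19 // MIPS-3D ASE supported
)

// mipsFeatures is a hypothetical container mirroring the list above.
type mipsFeatures struct {
	IsRelease6, HasSIMD, HasCRC32                                  bool
	HasSinglePrecision, HasDoublePrecision, HasPairedSingle, Has3DASE bool
}

// decodeFeatures turns the raw AT_HWCAP and FIR values into booleans.
func decodeFeatures(hwcap uint64, fir uint32) mipsFeatures {
	return mipsFeatures{
		IsRelease6:         hwcap&hwcapMIPSR6 != 0,
		HasSIMD:            hwcap&hwcapMIPSMSA != 0,
		HasCRC32:           hwcap&hwcapMIPSCRC32 != 0,
		HasSinglePrecision: fir&firS != 0,
		HasDoublePrecision: fir&firD != 0,
		HasPairedSingle:    fir&firPS != 0,
		Has3DASE:           fir&fir3D != 0,
	}
}

func main() {
	// Made-up inputs: an R6 core with MSA and a single/double-precision FPU.
	f := decodeFeatures(hwcapMIPSR6|hwcapMIPSMSA, firS|firD)
	fmt.Printf("%+v\n", f)
}
```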

What should we return for users who have enabled an environment variable, i.e. softfloat? Does internal/cpu accommodate envvars or should standard library functions that use internal/cpu accommodate envvars?

That raises the question:

HasFP <-- Is this necessary since we assume that FP operations work?
@martisch

Member

martisch commented Jul 23, 2018

Until now, internal/cpu has tried to name features after what Linux reports as CPU flags (e.g. in /proc/cpuinfo) or after the hwcap names.

E.g. for mips and hwcap I find only these:
https://github.com/torvalds/linux/blob/master/arch/mips/include/uapi/asm/hwcap.h

There is only one env var that is relevant to internal/cpu at the moment and that is GODEBUGCPU to mask cpu feature detection for debugging and benchmarking. f045ddc
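
A rough sketch of what such masking can look like is shown here. The `name=off` syntax, the `option` type, and `applyMask` are assumptions made for illustration. The exact format GODEBUGCPU accepts is not reproduced, and internal/cpu itself cannot import `strings`.

```go
package main

import (
	"fmt"
	"strings"
)

// option ties a user-visible feature name to the variable that stores it.
type option struct {
	name    string
	feature *bool
}

// applyMask clears every feature listed as "name=off" in a comma-separated
// mask string. Unknown names and other values are ignored, so a mask can
// only turn a detected feature off, never force an undetected one on.
func applyMask(mask string, opts []option) {
	for _, field := range strings.Split(mask, ",") {
		name, value, ok := strings.Cut(field, "=")
		if !ok || value != "off" {
			continue
		}
		for _, o := range opts {
			if o.name == name {
				*o.feature = false
			}
		}
	}
}

func main() {
	hasFMA, hasSIMD := true, true // pretend both were detected
	opts := []option{{"fma", &hasFMA}, {"simd", &hasSIMD}}
	applyMask("fma=off", opts)
	fmt.Println(hasFMA, hasSIMD) // false true
}
```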

I don't think any other setting should affect internal/cpu feature variables at the moment. If a setting does not come from HWCAP or from the CPU's feature flag registers, it is likely not something that belongs in internal/cpu (e.g. whether the user has enabled softfloat as a compilation option).

Note that some feature variables are assumed to always be true but we still have to check if they are true in case we start on a cpu that does not support e.g. FP. You can assume that they are always true in code but we should have a test that makes sure HasFP is indeed detected to be true if e.g. HWCAP supplies that feature.

Could you please clarify what you meant by:
"should standard library functions that use internal/cpu accommodate envvars?"

@smasher164

Contributor

smasher164 commented Jul 23, 2018

Although HasFP is not in HWCAP, it is in the cp0.Config register that can only be read from a privileged instruction. Setting HasFP would need a different method, possibly:

  • Issuing a read instruction to FIR and catching a SIGILL.
  • Parsing the output of /proc/cpuinfo.

I meant to ask whether stdlib assembly functions should continue to check for envvars, but the question doesn't make much sense, since the macros are evaluated at compile time.

@martisch

Member

martisch commented Jul 23, 2018

I think neither parsing /proc/cpuinfo nor running into SIGILL is a good fit for internal/cpu, given the early stage at which internal/cpu is initialized during runtime startup. So far, the only features that internal/cpu can detect reliably and without privileged instructions seem to be the ones mentioned for HWCAP.

@smasher164

Contributor

smasher164 commented Jul 23, 2018

Fair enough, although given the alternative way to determine that a CPU is MIPS Release 6, we can probably hold off on MIPS feature detection. The following code should do the trick of detecting FMA on big-endian systems:

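TEXT ·isFMASupported(SB),NOSPLIT,$0
	MOVV R0, R2
#ifndef GOMIPS64_softfloat
	// Detect Release 6. The word below decodes as ADDI R2, R2, 1 before R6
	// (leaving R2 == 1) and as BOVC R2, R2 on R6 (leaving R2 == 0).
	// See https://github.com/v8mips/v8mips/issues/97#issue-44761752
	WORD $0x20420001
	BNE R0, R2, nosupport
	// Detect double-precision. CP1.FIR[18:17] == 1
	MOVW FCR0, R2
	MOVW $(1<<17), R9
	AND R9, R2, R2
	SRL $17, R2, R2
	MOVV R2, ret(FP)
	RET
nosupport:
#endif
	// Pre-R6 or softfloat: R2 may still hold 1 from the ADDI probe,
	// so store zero explicitly rather than R2.
	MOVV R0, ret(FP)
	RET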
@martisch

Member

martisch commented Jul 24, 2018

If the above isFMASupported code works in a normal Go program and on every MIPS CPU, then the two features mentioned in it should be added to and used from internal/cpu. Release 6 detection seems to be available through HWCAP. If FCR0 is readable by a normal Go program, then the features detected through it can also be added to internal/cpu once needed.

@smasher164

Contributor

smasher164 commented Jul 26, 2018

Cool! I can work on a CL for mips[64][le] that incorporates HWCAP and FCR0. I'll open a similar issue for ARM as well.

@gopherbot


gopherbot commented Jul 29, 2018

Change https://golang.org/cl/126657 mentions this issue: internal/cpu: expose mips[le][64] feature flags for FMA

@milanknezevic

Contributor

milanknezevic commented Aug 1, 2018

@smasher164 This kind of detection can be a little overkill on mips[64]r6, if not impossible. R6 is not a backward-compatible release, so binaries built for earlier releases can't be run on R6. There are some plans to make mips[64]r6 support available through the GOMIPS[64] environment variable. You can read more about this here.

@smasher164

Contributor

smasher164 commented Aug 1, 2018

@milanknezevic Looking at the latest MIPS ISA, there is a large overlap between R6 and previous releases, even if there is backwards-incompatibility. The only instruction-based detection that is done for now is to read a value from the floating-point implementation register, which has existed since release 1. If we want to specialize mips code without sub-arch flags like in the thread you mentioned, runtime detection is necessary.

That said, given the small number of Release 6 processors, holding off on R6-specific code isn't a bad option.
