MISA bit definition for half-precision floating-point extension #414

Closed
chuanhua opened this issue Jul 22, 2019 · 25 comments

@chuanhua

Could we define a MISA bit for the half-precision floating-point extension, using one of the reserved K/O/R bits in MISA?

This bit is needed to indicate that the half-precision floating-point extension is present. When it is present:

  1. If the single-precision extension (F) is present, the FCVT.S.H and FCVT.H.S instructions are added to it.
  2. If the double-precision extension (D) is present, the FCVT.H.D and FCVT.D.H instructions are added to it.
@aswaterman
Member

Perhaps this has changed, but IIRC, the plan was for half-precision support to be coupled to the vector unit. So half-precision is present if (misa.V && misa.F).

@chuanhua
Author

The above condition, "if (misa.V && misa.F)", is not sufficient to indicate that half-precision support is present. An implementation may provide both V and F without supporting half-precision, in which case a floating-point vector instruction will generate an exception when SEW is 16.
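
For reference, the check under discussion can be sketched in C over a raw misa value. This is only an illustration: misa is readable from M-mode only, so the value is passed in as a parameter, and the helper names `misa_has` and `half_inferred_from_v_and_f` are made up for this sketch.

```c
#include <stdbool.h>
#include <stdint.h>

/* misa encodes one extension letter per bit: bit 0 = 'A', ..., bit 25 = 'Z'. */
static bool misa_has(uint64_t misa, char letter) {
    return ((misa >> (letter - 'A')) & 1u) != 0;
}

/* The coupling suggested earlier in the thread: infer half-precision
 * support from V and F both being present. As noted above, this can
 * report true on a hart that still traps on SEW=16 floating-point ops. */
static bool half_inferred_from_v_and_f(uint64_t misa) {
    return misa_has(misa, 'V') && misa_has(misa, 'F');
}
```

A dedicated letter for half-precision would be tested the same way with `misa_has`, without relying on the V and F inference.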

@aswaterman
Member

That’s true.

@chuanhua
Author

chuanhua commented Aug 7, 2019

I have narrowed my choice of MISA bit for the half-precision floating-point extension to K, because the number 16 is related to a hexaKaidecagon.

@aswaterman
Member

Ha! I was not aware of that term; I have only heard hexadecagon.

I agree allocating a letter like K is a reasonable approach.

@jscheid-ventana
Contributor

jscheid-ventana commented Aug 7, 2019 via email

@jim-wilson

I did a little searching and found that kai means "and" in Greek, so a hexakaidecagon is a "six and ten" (poly)gon. Still, it looks like a reasonable mnemonic for remembering what K stands for.

@aswaterman
Member

@jscheid-ventana IEEE 754. But given interest in bfloat16, there might be some desire to support that, too, possibly with the same opcodes and an fcsr mode switch.

@eak

eak commented Aug 19, 2019

@aswaterman I would like to suggest adding two bits to the FCSR to indicate the format of 16-bit FP: 00 would be IEEE, 01 would be bfloat16, and 10 would be some sort of 16-bit posit (since there seems to be some interest in posits, according to Krste). Of course, an implementation is free to implement any subset of these if it implements 16-bit FP, trapping if the FCSR contains any unsupported value.

Comments?

@aswaterman
Member

We had been thinking along the same lines. (Note that for backwards compatibility, the field needs to be WARL, anyway, so the implementation can just refuse to write unsupported values, rather than trapping them.)

@eak

eak commented Aug 19, 2019

WARL sounds good. Bits 8-9 perhaps? Or would it be good to reserve bits for 16-bit, 32-bit, 64-bit, and 128-bit at the same time? That might suggest 16-17, 18-19, 20-21, 22-23.
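
To make the WARL idea concrete, here is a minimal sketch in C of how software could probe such a field by writing a value and reading it back. Everything in it is an assumption for illustration: the field position (bits 9:8, one of the placements floated above), the `fmt16` encodings, and the `hart_model` struct, which stands in for real hardware so the sketch can run anywhere. WARL permits an implementation to map an illegal write to any legal value; this model simply leaves the field unchanged.

```c
#include <stdint.h>

/* Hypothetical 2-bit format field at fcsr bits 9:8. */
#define FMT16_SHIFT 8u
#define FMT16_MASK  (3u << FMT16_SHIFT)

/* Hypothetical encodings, following the proposal in the thread. */
enum fmt16 { FMT16_IEEE = 0, FMT16_BFLOAT16 = 1, FMT16_POSIT = 2 };

/* Software model of a hart: `supported` is a bitmask of legal fmt16 values. */
struct hart_model {
    uint32_t fcsr;
    uint32_t supported;
};

/* WARL write: a legal value is stored; an illegal one leaves the field as is. */
static void fcsr_write_fmt16(struct hart_model *h, uint32_t fmt) {
    fmt &= 3u;
    if (h->supported & (1u << fmt)) {
        h->fcsr = (h->fcsr & ~FMT16_MASK) | (fmt << FMT16_SHIFT);
    }
}

static uint32_t fcsr_read_fmt16(const struct hart_model *h) {
    return (h->fcsr & FMT16_MASK) >> FMT16_SHIFT;
}

/* Probe by write-then-read-back, restoring the previous value afterwards. */
static int hart_supports_fmt16(struct hart_model *h, uint32_t fmt) {
    uint32_t saved = fcsr_read_fmt16(h);
    fcsr_write_fmt16(h, fmt);
    int ok = (fcsr_read_fmt16(h) == fmt);
    fcsr_write_fmt16(h, saved);
    return ok;
}
```

With a field like this, no trap is needed for discovery: an unsupported format simply fails to stick.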

@aswaterman
Member

I don't think we need to support full associativity of format: it doesn't make much sense to be able to configure the machine such that the 16-bit mode is bfloat, the 32-bit mode is posit, and the 64-bit mode is IEEE, for instance.

In any case, all of the fcsr bits are reserved for standard use, so we don't need to pre-allocate them. Along the same lines, we don't actually need to reserve space for the posit format until we specify such an extension.

@pdonahue-ventana
Contributor

There was a recent proposal to add a bit to fcsr to indicate support for half-precision FP. An argument was made that this was undesirable partially due to headaches for classical virtualization.

Is it any more desirable to expose WARL bits to U mode in fcsr without the ability to trap the fcsr read access (other than the big hammer of trapping all FP operations)? The term WARL is only in the privileged architecture and there are no WARL bits in the unprivileged ISA.

@aswaterman
Member

aswaterman commented Aug 20, 2019 via email

@pdonahue-ventana
Contributor

I just want to bring to your attention that software cannot emulate an implementation that supports non-IEEE half-precision without trapping all FP instructions of all sizes (even if single and double are natively supported or software doesn't actually use non-IEEE half-precision), solely because there is no mechanism to trap accesses to fcsr without trapping accesses to all FP.

I see that ARMv8 has the same limitation so perhaps nobody cares about this case.

@aswaterman
Member

Yep. That concern should be weighed when making this decision.

@eak

eak commented Aug 20, 2019

This could be viewed as a reason to get it into the ISA as soon as possible, so implementations can add the bit(s) and trap if it is set if they don't implement bfloat16. It is another reason to make it a 2-bit field initially (again so they can trap if a non-zero value is there).

@aswaterman
Member

aswaterman commented Aug 20, 2019 via email

@eak

eak commented Aug 20, 2019

Do we need to move this to the tech mailing list?

@aswaterman
Member

aswaterman commented Aug 20, 2019 via email

@nihui

nihui commented Jun 11, 2021

Hi, what is the status now?
Is there any way for non-privileged code to discover whether the processor supports the RISC-V Zfh extension?
We need a runtime detection function to dispatch to the proper optimized code path based on processor capability.
Thanks.

@gfavor
Collaborator

gfavor commented Jun 11, 2021

A new "unified low-level RISC-V discovery method" is in the works. This will support discovery, for all arch extensions, of the support for the extension as well as the support for any options within the extension and supported values for any key parameters within the extension.

Note that this is the "low-level" discovery method that, for example, M-mode software would use to populate info into high-level discovery structures such as Device Tree. The "runtime detector function" obviously depends on the system environment (e.g., Linux, RTOS, bare-metal, ...) and on how the toolchain, loader, and user-level discovery handle all this.

@kersten1
Collaborator

Any recent status for this one?

@kersten1 kersten1 added the Stale label Mar 29, 2024
@aswaterman
Member

I think @gfavor’s comment from a few years ago is conclusive.

@nihui

nihui commented Mar 30, 2024

Hello everyone, if you want to implement detection on an old kernel version, the ruapu project is a good reference. It catches SIGILL to detect which instruction set extensions are supported.

https://github.com/nihui/ruapu
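
The general SIGILL-probing technique can be sketched in C as follows. This is a minimal illustration of the idea on a POSIX system, not ruapu's actual code. The names `probe_instruction`, `demo_supported`, and `demo_unsupported` are made up for this sketch, and the unsupported case is simulated with `raise(SIGILL)` so the example runs on any machine. On real RISC-V hardware the callback would instead execute a single instruction from the candidate extension.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200809L
#endif
#include <setjmp.h>
#include <signal.h>
#include <stddef.h>
#include <string.h>

static sigjmp_buf probe_env;

/* On SIGILL, jump back to the probe instead of terminating the process. */
static void on_sigill(int sig) {
    (void)sig;
    siglongjmp(probe_env, 1);
}

/* Returns 1 if fn ran to completion, 0 if it raised SIGILL. */
static int probe_instruction(void (*fn)(void)) {
    struct sigaction sa, old;
    volatile int supported = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigill;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGILL, &sa, &old);

    /* savemask = 1 so the signal mask is restored after the jump. */
    if (sigsetjmp(probe_env, 1) == 0) {
        fn();
        supported = 1;
    }

    sigaction(SIGILL, &old, NULL);  /* restore the previous handler */
    return supported;
}

/* Stand-ins for "execute one instruction from the extension". */
static void demo_supported(void) { }
static void demo_unsupported(void) { raise(SIGILL); }
```

Calling `probe_instruction(demo_unsupported)` returns 0 without killing the process, and the previous SIGILL handler is restored afterwards. The probe uses a single global jump buffer, so it is not thread-safe as written.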
