Naming strategy #7

Closed
Hsiangkai opened this issue Apr 8, 2020 · 6 comments

Comments

@Hsiangkai
Collaborator

The goal of the intrinsic API should be to make all V-extension instructions accessible from C. We will provide intrinsics that map 1-to-1 to assembly mnemonics, plus additional intrinsics for semantic reasons, e.g. fma, splat, etc.

@zakk0610
Collaborator

zakk0610 commented Apr 15, 2020

Based on the current SiFive proposal, the function names would change from

vint16m2_t vwadd_vs_i16m2 (vint8m1_t op1, int8_t op2);
vuint16m2_t vwadd_vs_u16m2 (vuint8m1_t op1, uint8_t op2);
vfloat32m2_t vwadd_vs_f32m2 (vfloat16m1_t op1, float16_t op2);

to

vint16m2_t vwadd_vx_i16m2 (vint8m1_t op1, int8_t op2);
vuint16m2_t vwaddu_vx_u16m2 (vuint8m1_t op1, uint8_t op2);
vfloat32m2_t vfwadd_vf_f32m2 (vfloat16m1_t op1, float16_t op2);

@zakk0610
Collaborator

zakk0610 commented Apr 15, 2020

There are also some operations that don't care about signed/unsigned, so their intrinsic function names would look like

vint32m1_t vadd_vv_i32m1 (vint32m1_t op1, vint32m1_t op2);
vuint32m1_t vadd_vv_u32m1 (vuint32m1_t op1, vuint32m1_t op2);

Consistency may be important, but we would break it here because the asm op names themselves are inconsistent.

@nick-knight
Collaborator

My opinion is that our first intrinsics effort should match the assembly language instruction names as closely as possible. Users are free to rename them as they see fit using (e.g.) preprocessor macros, and as time goes on we might decide to standardize these "aliases".
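The renaming idea above can be sketched as follows. This is only an illustration: the struct type and the loop body are hypothetical stand-ins so the code compiles on any host (a real toolchain would supply `vint32m1_t` and `vadd_vv_i32m1` itself), and `vec_add_i32` is an invented alias name, not one proposed in this thread.

```c
#include <stdint.h>

/* Hypothetical stand-in for the vector type, so the sketch compiles anywhere.
   The fixed length of 4 lanes is an assumption made for illustration only. */
typedef struct { int32_t lane[4]; } vint32m1_t;

/* Stand-in for the mnemonic-style intrinsic (vadd.vv on real hardware). */
static inline vint32m1_t vadd_vv_i32m1(vint32m1_t op1, vint32m1_t op2) {
    vint32m1_t r;
    for (int i = 0; i < 4; ++i) {
        r.lane[i] = op1.lane[i] + op2.lane[i];
    }
    return r;
}

/* User-side alias (hypothetical name): a project-local, more descriptive
   spelling that expands to the mnemonic-style intrinsic. */
#define vec_add_i32(a, b) vadd_vv_i32m1((a), (b))
```

Code written against `vec_add_i32` keeps working if the underlying intrinsic is later renamed, since only the macro definition has to change.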

@Hsiangkai
Collaborator Author

I have provided a document covering the general naming rules and the exceptional naming rules.
https://github.com/sifive/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md
We can discuss the naming rules based on this preliminary version.

@rdolbeau
Collaborator

@knightsifive Regarding the comments in https://github.com/sifive/rvv-intrinsic-doc/issues/7#issuecomment-614420632 and https://github.com/sifive/rvv-intrinsic-doc/issues/8#issuecomment-615358742...

I don't think it's a good idea to leave it to users to work around the intrinsics to make them more usable. This will generate (bad) legacy and cause trouble down the line. I think it's best to specify everything that is going to be done eventually, even if it's much later on due to manpower availability or other constraints (e.g. an unspecified vector calling convention/ABI will prevent d) below). You want the users to know the level of support they will have from the tools (and maybe contribute some work rather than do it just for themselves ;-) ).

In my mind, there are four categories of 'intrinsics' that are eventually needed, and whose namespaces should be compatible:

a) The '1-to-1' intrinsics. Those will generate a known, single assembly instruction (unless perhaps overridden by a compiler option saying 'you can optimize some stuff away'). They are effectively equivalent to a single-instruction volatile asm statement (the compiler option merely takes away the 'volatile' part) but with a more restrictive typing system (i.e. vint32m1_t vs vint64m1_t instead of a single 'vector' inline ASM constraint).

b) The 'basic support' intrinsics. Those make the writing of intrinsics somewhat easier by removing the need to over-specify things. They include zero- or undefined-generation intrinsics, abstract versions of some instructions (e.g. FMA but without specifying which register is overwritten), reinterpret (zero instructions, naturally), missing cases that are trivially generated (i.e. comparisons that aren't in hardware because they are trivially emulated with those actually available) ... They should usually generate up to one instruction (e.g. undefined-generation is likely to be zero instructions, comparisons will be one).

c) The 'helper/orthogonalization/compatibility' intrinsics. Those are meant to ease code writing/porting by adding 'missing' features that are reasonably easy to emulate, and compatibility between (unavailable) extensions. They generate a small number of instructions. You'd find the 64-bit vslide1up in RV32 (a couple of instructions), emulation of extensions disabled at compile time (i.e. allowing the generation of replacement code for Zvlsseg '1-to-1' intrinsics, such as a pair of strided loads), etc. They let a single codebase compile and work relatively painlessly across implementations (even though the generated binary might be more specific, such as in the Zvlsseg case).

d) The 'heavy duty' not-quite-intrinsics. Things that would be useful in a vectorization context, but are too heavy-duty to be called intrinsics, such as calling the vectorized version of libm, etc. These will require a proper vector calling convention/ABI, as many will be 'true' functions and not intrinsics. This may include a 'porting layer' that offers an easy way to move from other SIMD ISAs, while warning the user that it may not be very efficient (e.g. the vi32.h and vi64.h files you can find in my Chacha20 code, which allow some data-manipulation SVE code to be ported straightforwardly, if not efficiently, due to the heavy reliance on vrgather).
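To make category b) concrete, here is a minimal sketch of a 'missing' comparison built from an available one. It assumes that vector-vector greater-than has no instruction of its own and is obtained from vmslt.vv with swapped operands; the types, the fixed length `VL`, and the function bodies are hypothetical scalar stand-ins so the code runs on any host, and the function names are illustrative rather than taken from the RFC.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define VL 4 /* fixed vector length, an assumption for this sketch only */

typedef struct { int32_t lane[VL]; } vint32m1_t; /* stand-in vector type */
typedef struct { bool lane[VL]; } vmask_t;       /* hypothetical mask type */

/* Stand-in for the '1-to-1' intrinsic behind vmslt.vv:
   mask[i] = (op1[i] < op2[i]). */
static inline vmask_t vmslt_vv_i32m1(vint32m1_t op1, vint32m1_t op2) {
    vmask_t m;
    for (size_t i = 0; i < VL; ++i) {
        m.lane[i] = op1.lane[i] < op2.lane[i];
    }
    return m;
}

/* 'Basic support' intrinsic: greater-than expressed as less-than with the
   operands swapped, so it still costs exactly one instruction. */
static inline vmask_t vmsgt_vv_i32m1(vint32m1_t op1, vint32m1_t op2) {
    return vmslt_vv_i32m1(op2, op1);
}
```

The user writes the comparison they mean, and the operand swap stays an implementation detail of the intrinsic layer.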

All those should be specified early, even if most implementations are #error "unimplemented", to tell the user what programming complexity to expect from V.
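The Zvlsseg case mentioned under c) can be modelled in plain C. The sketch below is a scalar model, not real intrinsics: `strided_load_i32` and `seg2_load_i32` are hypothetical names, and the loops stand in for what would be vector instructions. It shows a two-field segment load of interleaved data replaced by a pair of strided loads.

```c
#include <stddef.h>
#include <stdint.h>

/* Scalar model of a strided load: dst[i] = base[i * stride_elems]. */
static void strided_load_i32(int32_t *dst, const int32_t *base,
                             size_t stride_elems, size_t vl) {
    for (size_t i = 0; i < vl; ++i) {
        dst[i] = base[i * stride_elems];
    }
}

/* Model of a 2-field segment load emulated with two strided loads:
   field 0 starts at base, field 1 at base + 1, both with a stride of 2. */
static void seg2_load_i32(int32_t *field0, int32_t *field1,
                          const int32_t *base, size_t vl) {
    strided_load_i32(field0, base, 2, vl);
    strided_load_i32(field1, base + 1, 2, vl);
}
```

Calling code sees the same interface either way; only the generated instruction sequence differs between a build with the extension and one without it.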

@Hsiangkai
Collaborator Author

Hsiangkai commented May 5, 2020

I have provided 1-to-1 intrinsics, utility functions, and semantic intrinsics in the RFC document.
I have also added naming guidelines and exceptional cases for these intrinsics.
https://github.com/sifive/rvv-intrinsic-doc/blob/master/rvv-intrinsic-rfc.md
