This repository has been archived by the owner on Mar 20, 2024. It is now read-only.

K-extension vector instructions #566

Open
JamesKenneyImperas opened this issue Sep 4, 2020 · 5 comments
Labels
Resolve after v1.0 Does not need to be resolved for v1.0 draft

Comments

@JamesKenneyImperas

JamesKenneyImperas commented Sep 4, 2020

I've been looking at the proposed vector instructions in the K-extension silo. I'm wondering if some thought is required about the best way to make the requirements of those fit with this specification.

The K-extension vector instructions as specified all require SEW=128, and will raise Illegal Instruction exceptions for other settings. I think this is unattractive from an implementation perspective: if an SEW of 128 must be supported in vtype for these instructions, what is the implication for other vector instructions? Do all (non-floating-point) vector instructions then need to be supported at this width as well? If not, do any such instructions need to support this width? If so, which ones? Given the existing definition, would continual changes to vtype be required to use the instructions at all?

There might be a better way to do this. There is now a precedent with the vl/vs instructions to have EEW encoded in the instruction. Could the K-extension vector instructions instead be defined to force EEW=128 in a similar fashion to vl/vs, and be defined to execute whatever the currently enabled SEW is, even if this is smaller than 128? This would allow implementation of these instructions even on machines which only support the base vector extension with ELEN<128.

I don't know whether this works either from a hardware or encryption algorithmic perspective, but I thought it was worth raising.

@jnk0le

jnk0le commented Sep 5, 2020

Maybe instead of requiring full functionality when SEW >= 128, we could introduce a "half-operational element" state for such widths, where only a subset of base instructions is operable, like load/store/bitwise/ediv etc.

There could also be a vill-like bit in vtype, or another CSR, as a discovery mechanism.

@JamesKenneyImperas
Author

There has been some discussion of this in riscv/riscv-crypto#49. @aswaterman, could you scan through these two issues and give your thoughts? Perhaps a change is needed to section 11.1 to describe implementation constraints on non-floating-point datatypes: the expectation there is that the wider ELEN required for cryptographic instructions is not required to be supported for any base instructions, but this kind of constraint is not described in this specification.

@aswaterman
Member

aswaterman commented Sep 17, 2020

@JamesKenneyImperas Although I think it would be workable to allow SEW to be set to 128 while making most instructions illegal, I tentatively agree with you that instead baking EEW into (some?) crypto instructions might be a cleaner solution, especially for those that only make sense with exactly one EEW.

One hitch is the rule that EMUL would be (EEW/SEW)*LMUL, but that's not a problem since we have fractional LMUL. e.g., if you want EMUL=1 for an EEW=128 instruction, you could set LMUL=1/4 and SEW=32.
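The arithmetic behind that example can be checked with a small sketch. The function names `emul` and `lmul_for` are hypothetical, and the set of legal LMUL values (1/8 through 8) is an assumption based on the fractional-LMUL settings in the draft spec; nothing here comes from the K-extension proposal itself.

```python
from fractions import Fraction

# Assumed set of legal LMUL settings (fractional LMUL included).
LEGAL_LMUL = {Fraction(1, 8), Fraction(1, 4), Fraction(1, 2),
              Fraction(1), Fraction(2), Fraction(4), Fraction(8)}

def emul(eew: int, sew: int, lmul: Fraction) -> Fraction:
    """EMUL = (EEW / SEW) * LMUL, computed exactly."""
    return Fraction(eew, sew) * lmul

def lmul_for(eew: int, sew: int, target_emul: Fraction) -> Fraction:
    """LMUL that would be needed to reach a target EMUL at a given SEW."""
    return Fraction(target_emul) * Fraction(sew, eew)

# The case from the comment: EEW=128, SEW=32, LMUL=1/4 gives EMUL=1.
print(emul(128, 32, Fraction(1, 4)))

# Which SEW settings can reach EMUL=1 with a legal LMUL?
for sew in (8, 16, 32, 64):
    needed = lmul_for(128, sew, 1)
    print(sew, needed, needed in LEGAL_LMUL)
```

Under these assumptions, SEW=8 would need LMUL=1/16 to reach EMUL=1, which is outside the assumed legal set, so the fractional-LMUL workaround does not cover every SEW.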

@JamesKenneyImperas
Author

Another data point for this discussion: would any implementations that only support ELEN=32 for base instructions also want to implement the K extension? If so, that would mean SEW=32 and SEW=128 would be supported for some instructions, but not SEW=64. Ugh.

I'm also not sure how workable my original proposal is, since subsequent discussion in the K extension issue I mentioned above implies that some of those instructions are intended to be polymorphic.

Perhaps a working party from both K and V extensions should be convened to thrash this out?

@jnk0le

jnk0le commented Jan 24, 2021

Hardcoded EEW=128 is not gonna work. Section 6.2 allows vl to be set anywhere between ceil(AVL / 2) and VLMAX, which requires extra boilerplate code to avoid the AVL < (2 * VLMAX) condition. (related #182)
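To make the vl rule cited above concrete, here is a minimal sketch of the range of vl values an implementation may return for a given AVL and VLMAX. The function name `legal_vl_range` is hypothetical. The link to EEW=128 is one possible reading of the concern (an assumption, not stated in the thread): with SEW=32, a 128-bit element spans four SEW-sized elements, so a vl that is not a multiple of 4 would leave a partial 128-bit element.

```python
def legal_vl_range(avl: int, vlmax: int) -> range:
    """Range of vl values permitted for a requested AVL.

    Encodes the three cases of the vl-setting rule:
      AVL <= VLMAX        -> vl = AVL
      AVL <  2 * VLMAX    -> ceil(AVL / 2) <= vl <= VLMAX
      AVL >= 2 * VLMAX    -> vl = VLMAX
    """
    if avl <= vlmax:
        return range(avl, avl + 1)
    if avl < 2 * vlmax:
        return range((avl + 1) // 2, vlmax + 1)  # integer ceil(avl / 2)
    return range(vlmax, vlmax + 1)

# Hypothetical configuration: VLMAX = 16 elements of SEW=32, AVL = 20.
candidates = list(legal_vl_range(20, 16))
# Assumption: EEW/SEW = 128/32 = 4, so only multiples of 4 hold whole
# 128-bit elements.
partial = [vl for vl in candidates if vl % 4 != 0]
print(candidates)
print(partial)
```

In this configuration the permitted values run from 10 to 16, and several of them are not multiples of 4, which is the kind of case the extra boilerplate would have to guard against under this reading.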

I think that ediv would allow us to do the whole ctr/gcm loop with a single vsetvl.
Sub-element masking allows use of a mask instead of a splatted (increment) group for ctr updates.

@kasanovic kasanovic added the Resolve after v1.0 Does not need to be resolved for v1.0 draft label Jun 7, 2021