Clarifications and cleanups

Expanded text on which Vector extensions can be used as a base.
riscv · Dec 19, 2022 · commit 4069530
1 parent 08f578e

Showing 6 changed files with 27 additions and 34 deletions.
diff --git a/doc/vector/insns/vclmul.adoc b/doc/vector/insns/vclmul.adoc
@@ -82,9 +82,9 @@ Operation::
 function clause execute (VCLMUL(vs2, vs1, vd, suffix)) = {
 
   foreach (i from vstart to vl-1) {
-    let op1 : bits (64) = if suffix =="vv" then get_velem(vs1,i) else X(vs1); // X(vs1) is truncated / zero-extended when appropriate
+    let op1 : bits (64) = if suffix =="vv" then get_velem(vs1,i) else zext_or_truncate_to_sew(X(vs1));
     let op2 : bits (64) = get_velem(vs2,i);
-    let product : bits (64) = clmul(op1,op2,EEW);
+    let product : bits (64) = clmul(op1,op2,SEW);
     set_velem(vd, i, product);
   }
   RETIRE_SUCCESS

diff --git a/doc/vector/insns/vclmulh.adoc b/doc/vector/insns/vclmulh.adoc
@@ -74,7 +74,7 @@ function clause execute (VCLMULH(vs2, vs1, vd, suffix)) = {
   foreach (i from vstart to vl-1) {
     let op1 : bits (64) = if suffix =="vv" then get_velem(vs1,i) else zext_or_truncate_to_sew(X(vs1));
     let op2 : bits (64) = get_velem(vs2, i);
-    let product : bits (64) = clmulh(op1, op2, width);
+    let product : bits (64) = clmulh(op1, op2, SEW);
     set_velem(vd, i, product);
   }
   RETIRE_SUCCESS
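
The two hunks above pass `SEW` as the width argument to `clmul` and `clmulh`. As a rough illustration of what that width controls, here is a minimal Python sketch of a carry-less multiply that returns either the low or the high `SEW` bits of the `2*SEW`-bit product. It is not the spec's Sail code: the Python functions only borrow the Sail helper names, `clmul_full` is a hypothetical helper introduced here, and the default of 64 is an assumption taken from the `bits (64)` operands in the diff.

```python
def zext_or_truncate_to_sew(x: int, sew: int = 64) -> int:
    """Keep only the low `sew` bits of a scalar (hypothetical stand-in for the Sail helper)."""
    return x & ((1 << sew) - 1)


def clmul_full(a: int, b: int, sew: int = 64) -> int:
    """Full carry-less product of two sew-bit operands (hypothetical helper)."""
    a = zext_or_truncate_to_sew(a, sew)
    b = zext_or_truncate_to_sew(b, sew)
    result = 0
    for i in range(sew):
        if (b >> i) & 1:
            # XOR replaces addition, so no carries propagate between bit positions.
            result ^= a << i
    return result


def clmul(a: int, b: int, sew: int = 64) -> int:
    """Low `sew` bits of the carry-less product."""
    return clmul_full(a, b, sew) & ((1 << sew) - 1)


def clmulh(a: int, b: int, sew: int = 64) -> int:
    """High `sew` bits of the carry-less product."""
    return clmul_full(a, b, sew) >> sew
```

For instance, `clmul(0b11, 0b11)` gives `0b101`, because (x + 1)^2 = x^2 + 1 over GF(2), whereas an ordinary integer multiply would give 9.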

diff --git a/doc/vector/insns/vsha2c.adoc b/doc/vector/insns/vsha2c.adoc
@@ -77,11 +77,6 @@ The two forms of the instruction save code from having to swap these two words
 if there were just a single instruction.
 ====
 
-// Many vector units that are wider than 128 bits may choose to only implement one
-// 128-bit datapath for this instruction. This needs to be transparent to code in terms
-// of functionality. A vector length setting of wider than 128 bits would require some
-// sort of instruction expansion.
-
 This instruction is not masked. If any 128-bit element groups are not to be processed,
 the _vl_ must be set accordingly.
 VLMUL must be at least 1. In typical usage it is expected to be 1.
@@ -106,8 +101,7 @@ executing one of these instructions so that would be available as input to the n
 instruction for the input of _c_, _d_, _g_, and _h_. This would use up one more
 vector register and require one more instruction, without any benefit.
 
-
-The case where the `vd` register group overlap with either `vs1` or `vs2` is _reserved_.
+The case where the `vd` register group overlaps with either `vs1` or `vs2` is _reserved_.
 
 [NOTE]
 ====
@@ -132,7 +126,7 @@ Likewise, `vstart` must be a multiple of `EGS=4`.
 Operation::
 [source,sail]
 --
-function clause execute (VSHA2c(vs2, vs1, vd, vv)) = {
+function clause execute (VSHA2c(vs2, vs1, vd)) = {
 
   assert((vl%EGS)<>0)       // vl must be a multiple of EGS
   assert((vstart%EGS)<>0) //  vstart must be a multiple of EGS
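
The surrounding text requires both `vl` and `vstart` to be multiples of `EGS=4`. A small Python sketch of that check follows. It implements the requirement as stated in the comments (multiple of `EGS`), not the literal `<>0` comparison shown in the Sail excerpt. The function name and the use of `ValueError` are assumptions made for illustration only.

```python
EGS = 4  # element group size, as stated in the text


def element_groups_to_process(vl: int, vstart: int, egs: int = EGS) -> int:
    """Validate vl/vstart alignment and return vl/EGS (hypothetical helper)."""
    if vl % egs != 0:
        raise ValueError("vl must be a multiple of EGS")
    if vstart % egs != 0:
        raise ValueError("vstart must be a multiple of EGS")
    return vl // egs
```

With `vl=8` and `vstart=4` the check passes and reports two element groups, while `vl=6` is rejected.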

diff --git a/doc/vector/insns/vsha2ms.adoc b/doc/vector/insns/vsha2ms.adoc
@@ -94,7 +94,7 @@ The number of words to be processed is `vl`/`EGS`.
 therefore must be a multiple of `EGS=4`. +
 Likewise, `vstart` must be a multiple of `EGS=4`
 
-The case where the `vd` register group overlap with either `vs1` or `vs2` is _reserved_.
+The case where the `vd` register group overlaps with either `vs1` or `vs2` is _reserved_.
 
 [NOTE]
 ====
@@ -104,21 +104,11 @@ that `vd`, `vs1` and `vs2` each contain different portions of the message schedu
 ====
 
 
-// This instruction requires that `Zvl128b` be implemented (i.e `VLEN>=128`).
-
 [NOTE]
 ====
 W~13~ is not used by the instruction for producing the next 4 message schedule words.
 ====
 
-// [NOTE]
-// ====
-// Many vector units that are wider than 128 bits may choose to only implement one
-// 128-bit datapath for this instruction. This needs to be transparent to code in terms
-// of functionality. A vector length setting of wider than 128 bits would require some
-// sort of instruction expansion.
-// ====
-
 This instruction is not masked. If any element groups are not to be processed, the _vl_
 must be set accordingly. It is not possible to skip an intermediary element group.
 `VLMUL` must be at least 1. In typical usage it is expected to be 1.

diff --git a/doc/vector/insns/vsm3me.adoc b/doc/vector/insns/vsm3me.adoc
@@ -64,12 +64,13 @@ The number of element groups to be processed is `vl`/`EGS`.
 therefore must be a multiple of `EGS=4`. +
 Likewise, `vstart` must be a multiple of `EGS=4`.
 
-The case where the `vd` register group overlap with `vs2` is _reserved_.
+The case where the `vd` register group overlaps with `vs2` is _reserved_.
 
 [NOTE]
 ====
-Preventing overlap between `vd` and `vs2` simplifies implementation with `VLEN < EGW`.
-Overlap between `vs1` and `vd` is not reserved as it could be useful for larger VLEN implementation while not impacting smaller VLEN.
+Preventing overlap between `vd` and `vs2` simplifies implementations with `VLEN < EGW`.
+This restriction should not have any coding impact since the algorithm requires these
+values to be preserved for generating the next 8 words.
 ====
 
 Operation::

diff --git a/doc/vector/riscv-crypto-spec-vector.adoc b/doc/vector/riscv-crypto-spec-vector.adoc
@@ -108,11 +108,21 @@ include::./riscv-crypto-vector-scalar-instructions.adoc[]
 The section introduces all of the  extensions in the Vector Cryptography
 Instruction Set Extension Specification.
 
-Each of these extensions has a minimum ELEN, and can be built on any base vector extension that
-supports that minimum ELEN.
+These Vector Crypto Extensions can be built on any RISC-V base. However, XLEN=32 implementations
+will only be able to provide 32-bit values to the .vx vector-scalar instructions.
+
+With the exception of Zvknhb, each of these Vector Crypto Extensions can be built on _any_
+base Vector Extension, embedded (Zve*) or application ("V"). Zvknhb requires ELEN=64 and therefore cannot be implemented on a Zve32* base.
+
+While the Zvkb extension can be built on a Zve32* base, the vclmul[h] instructions will not be
+supported in such a case as they require SEW=64.
+
+While these Vector Crypto Extensions _can_ be built on implementations with `ELEN<128`, this will
+require code to be written with high `LMUL` values, which leaves too few register groups
+available to code the intended algorithms effectively.
+See <<crypto-vector-element-groups>> for more details on vector element groups and the drawbacks of
+small `ELEN` values.
 
-Most of these extensions can be implemented on XLEN=32 designs. However, `Zvkb` requires
-XLEN=64 since `vcmul[h].vx` is only defined for SEW=64.
 
 [%header,cols="^2,^2,^2"]
 |===
@@ -123,16 +133,14 @@ XLEN=64 since `vcmul[h].vx` is only defined for SEW=64.
 |Zvkns  | 32 | 32 
 |Zvknha | 32 | 32
 |Zvknhb | 64 | 32
-|Zvkb   | 64 | 64
+|Zvkb   | 32^1^ | 32^2^
 |Zvkg   | 32 | 32
 |Zvksed | 32 | 32
 |Zvksh  | 32 | 32
 |===
+1 - When ELEN=32, the vclmul[h] instructions are not supported as they are only defined for SEW=64.
 
-It is recommended that the Vector Crypto Extensions be implemented on a base with `ELEN`
-of at least 128. Smaller `ELEN` values could make it hard to efficiently code cryptographic
-algorithms. See <<crypto-vector-element-groups>> for more details on the drawbacks of
-small `ELEN` values.
+2 - When XLEN=32, scalar inputs are limited to zero/sign-extended 32-bit values.
 
 
 All _cryptography-specific_ instructions defined in this Vector Crypto specification (i.e., <<zvkns>>,