diff --git a/Zc-specification/Zc.adoc b/Zc-specification/Zc.adoc index 8f099bb..c524c9f 100644 --- a/Zc-specification/Zc.adoc +++ b/Zc-specification/Zc.adoc @@ -1,21 +1,24 @@ +:sectnums: [#Zc] -== Zc* v0.70.5 +== Zc* v1.0.0-RC4 === Change history since v0.70.1 (tagged release) .Change history [width="100%",options=header] |==================================================================================== -|Version | change -|v0.70.5 | Resolve https://github.com/riscv/riscv-code-size-reduction/issues/163 - jvt.base is WARL and fewer bits than the max can be implemented -|v0.70.4 | Clarified https://github.com/riscv/riscv-code-size-reduction/issues/159 - Need Zbb and Zba for RV64 and M/ZMmul to get _all_ of Zcb -| | Resolved https://github.com/riscv/riscv-code-size-reduction/issues/161 -| | Resolved https://github.com/riscv/riscv-code-size-reduction/issues/160 - Allocated Smstateen bit 2 and added the relevant text -|v0.70.3 | Added rule that Zcf and Zcmt imply Zca (this text was missing, this is not a spec change: https://github.com/riscv/riscv-code-size-reduction/pull/151) -| | Added that Zcf is illegal for RV64, as it contains no instructions (clarification: https://github.com/riscv/riscv-code-size-reduction/issues/149) -| | Added push/pop examples in the push/pop section -|v0.70.2 | Stylistic changes only, removing redundant text. -| | Corrected field names on JVT CSR diagram, and fixed synopsis for cm.mvsa01 +|Version | change +|v1.0.0-RC4| Release candidate +| | Remove Zcmb as benefit is low. 
Remove cm.jalt, read LSB of jump table entry to determine whether to link +|v0.70.5 | Resolve https://github.com/riscv/riscv-code-size-reduction/issues/163 - jvt.base is WARL and fewer bits than the max can be implemented +|v0.70.4 | Clarified https://github.com/riscv/riscv-code-size-reduction/issues/159 - Need Zbb and Zba for RV64 and M/ZMmul to get _all_ of Zcb +| | Resolved https://github.com/riscv/riscv-code-size-reduction/issues/161 +| | Resolved https://github.com/riscv/riscv-code-size-reduction/issues/160 - Allocated Smstateen bit 2 and added the relevant text +|v0.70.3 | Added rule that Zcf and Zcmt imply Zca (this text was missing, this is not a spec change: https://github.com/riscv/riscv-code-size-reduction/pull/151) +| | Added that Zcf is illegal for RV64, as it contains no instructions (clarification: https://github.com/riscv/riscv-code-size-reduction/issues/149) +| | Added push/pop examples in the push/pop section +|v0.70.2 | Stylistic changes only, removing redundant text. +| | Corrected field names on JVT CSR diagram, and fixed synopsis for cm.mvsa01 |==================================================================================== === Zc* Overview @@ -23,74 +26,83 @@ This document is in the Stable state. Assume anything could still change, but limited change should be expected. For more information see: https://riscv.org/spec-state -Zc* is a group of extensions which define subsets of the existing C extension (Zca, Zcf) and new extensions which only contain 16-bit encodings. +Zc* is a group of extensions which define subsets of the existing C extension (Zca, Zcd, Zcf) and new extensions which only contain 16-bit encodings. Zcm* all reuse the encodings for _c.fld_, _c.fsd_, _c.fldsp_, _c.fsdsp_. 
.Zc* extension overview [width="100%",options=header] |==================================================================================== -|Instruction |Zca|Zcf|Zcb|Zcmb|Zcmp|Zcmpe|Zcmt -8+|*Define a subset of C with the floating point load/stores removed* -|C excl. c.f* |✓| | | | | | -8+|*The single precision floating point load/stores become a separate extension* -|c.flw | |✓| | | | | -|c.flwsp | |✓| | | | | -|c.fsw | |✓| | | | | -|c.fswsp | |✓| | | | | +|Instruction |Zca |Zcf |Zcd |Zcb |Zcmp |Zcmpe |Zcmt +8+|*The Zca extension is added as a way to refer to instructions in the C extension that do not include the floating-point loads and stores* +|C excl. c.f* |✓| | | | | | +8+|*The Zcf extension is added as a way to refer to compressed single-precision floating-point load/stores* +|c.flw | |✓| | | | | +|c.flwsp | |✓| | | | | +|c.fsw | |✓| | | | | +|c.fswsp | |✓| | | | | +8+|*The Zcd extension is added as a way to refer to compressed double-precision floating-point load/stores* +|c.fld | | |✓| | | | +|c.fldsp | | |✓| | | | +|c.fsd | | |✓| | | | +|c.fsdsp | | |✓| | | | 8+|*Simple operations for use on all architectures* -|c.lbu | | |✓| | | | -|c.lh | | |✓| | | | -|c.lhu | | |✓| | | | -|c.sb | | |✓| | | | -|c.sh | | |✓| | | | -|c.zext.b | | |✓| | | | -|c.sext.b | | |✓| | | | -|c.zext.h | | |✓| | | | -|c.sext.h | | |✓| | | | -|c.zext.w | | |✓| | | | -|c.mul | | |✓| | | | -|c.not | | |✓| | | | -8+|*Load/store byte/half which overlap with _c.fld_, _c.fldsp_, _c.fsd_* -|cm.lb | | | |✓ | | | -|cm.lbu | | | |✓ | | | -|cm.lh | | | |✓ | | | -|cm.lhu | | | |✓ | | | -|cm.sb | | | |✓ | | | -|cm.sh | | | |✓ | | | +|c.lbu | | | |✓| | | +|c.lh | | | |✓| | | +|c.lhu | | | |✓| | | +|c.sb | | | |✓| | | +|c.sh | | | |✓| | | +|c.zext.b | | | |✓| | | +|c.sext.b | | | |✓| | | +|c.zext.h | | | |✓| | | +|c.sext.h | | | |✓| | | +|c.zext.w | | | |✓| | | +|c.mul | | | |✓| | | +|c.not | | | |✓| | | 8+|*PUSH/POP and double move which overlap with _c.fsdsp_* -|cm.push | | | | |✓ | ✓ | -|cm.pop 
| | | | |✓ | ✓ | -|cm.popret | | | | |✓ | ✓ | -|cm.popretz | | | | |✓ | ✓ | -|cm.mva01s | | | | |✓ | | -|cm.mvsa01 | | | | |✓ | | +|cm.push | | | | |✓|✓| +|cm.pop | | | | |✓|✓| +|cm.popret | | | | |✓|✓| +|cm.popretz | | | | |✓|✓| +|cm.mva01s | | | | |✓| | +|cm.mvsa01 | | | | |✓| | 8+|*Reserved for EABI versions of PUSH/POP and double move which overlap with _c.fsdsp_* -|cm.push.e | | | | | | ✓ | -|cm.pop.e | | | | | | ✓ | -|cm.popret.e | | | | | | ✓ | -|cm.popretz.e | | | | | | ✓ | -|cm.mva01s.e | | | | | | ✓ | -|cm.mvsa01.e | | | | | | ✓ | -8+|*Table jump* -|cm.jt | | | | | | |✓ -|cm.jalt | | | | | | |✓ +|cm.push.e | | | | | |✓| +|cm.pop.e | | | | | |✓| +|cm.popret.e | | | | | |✓| +|cm.popretz.e | | | | | |✓| +|cm.mva01s.e | | | | | |✓| +|cm.mvsa01.e | | | | | |✓| +8+|*Table jump* +|cm.jalt | | | | | | |✓ |==================================================================================== [#Zca] === Zca -Zca is all of the existing C extension, _excluding_ all 16-bit floating point loads and stores: _c.flw_, _c.flwsp_, _c.fsw_, _c.fswsp_, _c.fld_, _c.fldsp_, _c.fsd_, _c.fsdsp_. +The Zca extension is added as a way to refer to instructions in the C extension that do not include the floating-point loads and stores. + +Therefore it _excludes_ all 16-bit floating point loads and stores: _c.flw_, _c.flwsp_, _c.fsw_, _c.fswsp_, _c.fld_, _c.fldsp_, _c.fsd_, _c.fsdsp_. + +NOTE: the C extension only includes F/D instructions when D and F are also specified [#Zcf] === Zcf (RV32 only) -Zcf is the existing set of single precision floating point loads and stores: _c.flw_, _c.flwsp_, _c.fsw_, _c.fswsp_. +Zcf is the existing set of compressed single precision floating point loads and stores: _c.flw_, _c.flwsp_, _c.fsw_, _c.fswsp_. Zcf is only relevant to RV32, it cannot be specified for RV64. Zcf requires the <> extension. +[#Zcd] +=== Zcd + +Zcd is the existing set of compressed double precision floating point loads and stores: _c.fld_, _c.fldsp_, _c.fsd_, _c.fsdsp_. 
+ +Zcd requires the <> extension. + + <<< [#Zcb] @@ -179,64 +191,13 @@ The _c.mul_ encoding uses the CR register format along with other instructions s <<< -[#Zcmb] -=== Zcmb - -This extension reuses some encodings from _c.fld_, _c.fldsp_, and _c.fsd_. Therefore it is _incompatible_ with the full C-extension. -It is compatible with F, D with Zdinx. - -Zcmb requires the <> extension, which in turn requires the <> extension. - -The instructions are all 16-bit versions of existing 32-bit load/store instructions. - -[%header,cols="^1,^1,4,8"] -|=== -|RV32 -|RV64 -|Mnemonic -|Instruction - -|✓ -|✓ -|cm.lbu _rd'_, uimm(_rs1'_) -|<<#insns-cm_lbu>> - -|✓ -|✓ -|cm.lhu _rd'_, uimm(_rs1'_) -|<<#insns-cm_lhu>> - -|✓ -|✓ -|cm.lb _rd'_, uimm(_rs1'_) -|<<#insns-cm_lb>> - -|✓ -|✓ -|cm.lh _rd'_, uimm(_rs1'_) -|<<#insns-cm_lh>> - -|✓ -|✓ -|cm.sb _rs2'_, uimm(_rs1'_) -|<<#insns-cm_sb>> - -|✓ -|✓ -|cm.sh _rs2'_, uimm(_rs1'_) -|<<#insns-cm_sh>> - -|=== - -<<< - [#Zcmp] === Zcmp -Zcmp is the set of sequenced instuctions for code-size reduction. +The Zcmp extension is a set of instructions which may be executed as a series of existing 32-bit RISC-V instructions. -This extension reuses some encodings from _c.fsdsp_. Therefore it is _incompatible_ with the full C-extension. -It is compatible with F, D with Zdinx. +This extension reuses some encodings from _c.fsdsp_. Therefore it is _incompatible_ with <>, + which is included when C and D extensions are both present. Zcmp requires the <> extension. @@ -292,12 +253,12 @@ The PUSH/POP assembly syntax uses several variables, the meaning of which are: [#Zcmpe] === Zcmpe -This extension reuses some encodings from _c.fsdsp_. Therefore it is _incompatible_ with the full C-extension. -It is compatible with F, D with Zdinx. +The Zcmpe extension offers EABI support for register mappings from <> where the _x_ register mapping is different to the UABI. -Zcmpe requires the <> extension. +This extension reuses some encodings from _c.fsdsp_. 
Therefore it is _incompatible_ with <>, + which is included when C and D extensions are both present. -Zcmpe offers EABI support for register mappings from <> where the _x_ register mapping is different to the UABI. +Zcmpe requires the <> extension. [NOTE] @@ -306,12 +267,12 @@ Zcmpe offers EABI support for register mappings from <> where the _x_ regi [#Zcmt] === Zcmt -This extension reuses some encodings from _c.fsdsp_. Therefore it is _incompatible_ with the full C-extension. -It is compatible with F, D with Zdinx. - -Zcmt is the set of table jump instuctions for code-size reduction, and also adds the JVT CSR. The JVT CSR requires a +Zcmt adds a table jump instruction and also adds the JVT CSR. The JVT CSR requires a state enable if Smstateen is implemented. See <> for details. +This extension reuses some encodings from _c.fsdsp_. Therefore it is _incompatible_ with <>, + which is included when C and D extensions are both present. + Zcmt requires the <> extension. [%header,cols="^1,^1,4,8"] @@ -321,11 +282,6 @@ Zcmt requires the <> extension. 
|Mnemonic |Instruction -|✓ -|✓ -|cm.jt _index_ -|<<#insns-cm_jt>> - |✓ |✓ |cm.jalt _index_ @@ -347,13 +303,6 @@ include::c_zext_w.adoc[] include::c_not.adoc[] include::c_mul.adoc[] -include::cm_lbu.adoc[] -include::cm_lhu.adoc[] -include::cm_lb.adoc[] -include::cm_lh.adoc[] -include::cm_sb.adoc[] -include::cm_sh.adoc[] - include::pushpop.adoc[] include::cm_push.adoc[] include::cm_pop.adoc[] @@ -364,6 +313,5 @@ include::cm_mva01s.adoc[] include::tablejump.adoc[] include::jvt_csr.adoc[] -include::cm_jt.adoc[] include::cm_jalt.adoc[] diff --git a/Zc-specification/Zcb_footer.adoc b/Zc-specification/Zcb_footer.adoc index c647568..4483786 100644 --- a/Zc-specification/Zcb_footer.adoc +++ b/Zc-specification/Zcb_footer.adoc @@ -7,6 +7,6 @@ Included in:: |Lifecycle state |Zcb (<>) -|v0.70.5 +|v1.0.0-RC3 |Stable |=== diff --git a/Zc-specification/Zcf_footer.adoc b/Zc-specification/Zcf_footer.adoc index 4fa6025..df56942 100644 --- a/Zc-specification/Zcf_footer.adoc +++ b/Zc-specification/Zcf_footer.adoc @@ -7,6 +7,6 @@ Included in:: |Lifecycle state |Zcf (<>) -|v0.70.5 -|Stable +|v1.0.0-RC4 +|Frozen |=== diff --git a/Zc-specification/Zcmp_footer.adoc b/Zc-specification/Zcmp_footer.adoc index b5fb847..63bbc7e 100644 --- a/Zc-specification/Zcmp_footer.adoc +++ b/Zc-specification/Zcmp_footer.adoc @@ -7,6 +7,6 @@ Included in:: |Lifecycle state |Zcmp (<>) -|v0.70.5 +|v1.0.0-RC3 |Stable |=== diff --git a/Zc-specification/Zcmpe_footer.adoc b/Zc-specification/Zcmpe_footer.adoc index ab6d116..472c7f1 100644 --- a/Zc-specification/Zcmpe_footer.adoc +++ b/Zc-specification/Zcmpe_footer.adoc @@ -7,6 +7,6 @@ Included in:: |Lifecycle state |Zcmpe (<>) -|v0.70.5 +|v1.0.0-RC4 |Stable |=== diff --git a/Zc-specification/Zcmt_footer.adoc b/Zc-specification/Zcmt_footer.adoc index 3d11810..78c1ca3 100644 --- a/Zc-specification/Zcmt_footer.adoc +++ b/Zc-specification/Zcmt_footer.adoc @@ -7,6 +7,6 @@ Included in:: |Lifecycle state |Zcmt (<>) -|v0.70.5 -|Stable +|v1.0.0-RC4 +|Frozen |=== diff --git 
a/Zc-specification/c_lbu.adoc b/Zc-specification/c_lbu.adoc index fc064a9..e2e929f 100644 --- a/Zc-specification/c_lbu.adoc +++ b/Zc-specification/c_lbu.adoc @@ -1,5 +1,5 @@ <<< -[#insns-c_lbu,reftext="c.lbu: Load unsigned byte, 16-bit encoding"] +[#insns-c_lbu,reftext="Load unsigned byte, 16-bit encoding"] === c.lbu Synopsis:: @@ -29,12 +29,6 @@ This instruction loads a byte from the memory address formed by adding _rs1'_ to [NOTE] _rd'_ and _rs1'_ are from the standard 8-register set x8-x15. -[NOTE] - For an longer immediate with a 16-bit encoding see <>. - -[NOTE] - To load _signed_ bytes with a 16-bit encoding see <>. - Prerequisites:: None diff --git a/Zc-specification/c_lh.adoc b/Zc-specification/c_lh.adoc index c779446..426e039 100644 --- a/Zc-specification/c_lh.adoc +++ b/Zc-specification/c_lh.adoc @@ -1,5 +1,5 @@ <<< -[#insns-c_lh,reftext="c.lh: Load signed halfword, 16-bit encoding"] +[#insns-c_lh,reftext="Load signed halfword, 16-bit encoding"] === c.lh Synopsis:: @@ -30,9 +30,6 @@ This instruction loads a halfword from the memory address formed by adding _rs1' [NOTE] _rd'_ and _rs1'_ are from the standard 8-register set x8-x15. -[NOTE] - For an longer immediate with a 16-bit encoding see <>. - Prerequisites:: None diff --git a/Zc-specification/c_lhu.adoc b/Zc-specification/c_lhu.adoc index 7dd9c76..1ae1012 100644 --- a/Zc-specification/c_lhu.adoc +++ b/Zc-specification/c_lhu.adoc @@ -1,5 +1,5 @@ <<< -[#insns-c_lhu,reftext="c.lhu: Load unsigned halfword, 16-bit encoding"] +[#insns-c_lhu,reftext="Load unsigned halfword, 16-bit encoding"] === c.lhu Synopsis:: @@ -30,9 +30,6 @@ This instruction loads a halfword from the memory address formed by adding _rs1' [NOTE] _rd'_ and _rs1'_ are from the standard 8-register set x8-x15. -[NOTE] - For an longer immediate with a 16-bit encoding see <>. 
- Prerequisites:: None diff --git a/Zc-specification/c_mul.adoc b/Zc-specification/c_mul.adoc index 826bbf9..0c5c8cf 100644 --- a/Zc-specification/c_mul.adoc +++ b/Zc-specification/c_mul.adoc @@ -1,5 +1,5 @@ <<< -[#insns-c_mul,reftext="c.mul: Multiply, 16-bit encoding"] +[#insns-c_mul,reftext="Multiply, 16-bit encoding"] === c.mul Synopsis:: diff --git a/Zc-specification/c_not.adoc b/Zc-specification/c_not.adoc index 01af86b..5adc2d3 100644 --- a/Zc-specification/c_not.adoc +++ b/Zc-specification/c_not.adoc @@ -1,5 +1,5 @@ <<< -[#insns-c_not,reftext="c.not: Bitwise not, 16-bit encoding"] +[#insns-c_not,reftext="Bitwise not, 16-bit encoding"] === c.not Synopsis:: diff --git a/Zc-specification/c_sb.adoc b/Zc-specification/c_sb.adoc index 7309a9b..597f840 100644 --- a/Zc-specification/c_sb.adoc +++ b/Zc-specification/c_sb.adoc @@ -1,5 +1,5 @@ <<< -[#insns-c_sb,reftext="c.sb: Store byte, 16-bit encoding"] +[#insns-c_sb,reftext="Store byte, 16-bit encoding"] === c.sb Synopsis:: @@ -29,9 +29,6 @@ This instruction stores the least significant byte of _rs2'_ to the memory addre [NOTE] _rs1'_ and _rs2'_ are from the standard 8-register set x8-x15. -[NOTE] - For an longer immediate with a 16-bit encoding see <>. - Prerequisites:: None diff --git a/Zc-specification/c_sext_b.adoc b/Zc-specification/c_sext_b.adoc index bf22886..3355554 100644 --- a/Zc-specification/c_sext_b.adoc +++ b/Zc-specification/c_sext_b.adoc @@ -1,5 +1,5 @@ <<< -[#insns-c_sext_b,reftext="c.sext.b: Sign extend byte, 16-bit encoding"] +[#insns-c_sext_b,reftext="Sign extend byte, 16-bit encoding"] === c.sext.b Synopsis:: @@ -30,7 +30,7 @@ in the byte (i.e., bit 7) to all of the more-significant bits. _rsd'_ is from the standard 8-register set x8-x15. Prerequisites:: -Zbb must also be configured. +Zbb is also required. 
32-bit equivalent:: <> from Zbb diff --git a/Zc-specification/c_sext_h.adoc b/Zc-specification/c_sext_h.adoc index 234a9b0..dd9c6d8 100644 --- a/Zc-specification/c_sext_h.adoc +++ b/Zc-specification/c_sext_h.adoc @@ -1,5 +1,5 @@ <<< -[#insns-c_sext_h,reftext="c.sext.h: Sign extend halfword, 16-bit encoding"] +[#insns-c_sext_h,reftext="Sign extend halfword, 16-bit encoding"] === c.sext.h Synopsis:: @@ -30,7 +30,7 @@ in the halfword (i.e., bit 15) to all of the more-significant bits. _rsd'_ is from the standard 8-register set x8-x15. Prerequisites:: -Zbb must also be configured. +Zbb is also required. 32-bit equivalent:: <> from Zbb diff --git a/Zc-specification/c_sh.adoc b/Zc-specification/c_sh.adoc index 0dbcad4..03df877 100644 --- a/Zc-specification/c_sh.adoc +++ b/Zc-specification/c_sh.adoc @@ -1,5 +1,5 @@ <<< -[#insns-c_sh,reftext="c.sh: Store halfword, 16-bit encoding"] +[#insns-c_sh,reftext="Store halfword, 16-bit encoding"] === c.sh Synopsis:: @@ -30,9 +30,6 @@ This instruction stores the least significant halfword of _rs2'_ to the memory a [NOTE] _rs1'_ and _rs2'_ are from the standard 8-register set x8-x15. -[NOTE] - For an longer immediate with a 16-bit encoding see <>. 
- Prerequisites:: None diff --git a/Zc-specification/c_zca_required.adoc b/Zc-specification/c_zca_required.adoc index 75cc449..f7b460c 100644 Binary files a/Zc-specification/c_zca_required.adoc and b/Zc-specification/c_zca_required.adoc differ diff --git a/Zc-specification/c_zext_b.adoc b/Zc-specification/c_zext_b.adoc index e772e9a..3350ad9 100644 --- a/Zc-specification/c_zext_b.adoc +++ b/Zc-specification/c_zext_b.adoc @@ -1,5 +1,5 @@ <<< -[#insns-c_zext_b,reftext="c.zext.b: Zero extend byte, 16-bit encoding"] +[#insns-c_zext_b,reftext="Zero extend byte, 16-bit encoding"] === c.zext.b Synopsis:: diff --git a/Zc-specification/c_zext_h.adoc b/Zc-specification/c_zext_h.adoc index 3f9b821..6720b48 100644 --- a/Zc-specification/c_zext_h.adoc +++ b/Zc-specification/c_zext_h.adoc @@ -1,5 +1,5 @@ <<< -[#insns-c_zext_h,reftext="c.zext.h: Zero extend halfword, 16-bit encoding"] +[#insns-c_zext_h,reftext="Zero extend halfword, 16-bit encoding"] === c.zext.h Synopsis:: @@ -30,7 +30,7 @@ the bits more significant than 15. _rsd'_ is from the standard 8-register set x8-x15. Prerequisites:: -Zbb must also be configured. +Zbb is also required. 32-bit equivalent:: <> from Zbb diff --git a/Zc-specification/c_zext_w.adoc b/Zc-specification/c_zext_w.adoc index 3836a10..5a62118 100644 --- a/Zc-specification/c_zext_w.adoc +++ b/Zc-specification/c_zext_w.adoc @@ -1,5 +1,5 @@ <<< -[#insns-c_zext_w,reftext="c.zext.w: Zero extend word, 16-bit encoding"] +[#insns-c_zext_w,reftext="Zero extend word, 16-bit encoding"] === c.zext.w Synopsis:: @@ -30,7 +30,7 @@ the bits more significant than 31. _rsd'_ is from the standard 8-register set x8-x15. Prerequisites:: -Zba must also be configured. +Zba is also required. 
32-bit equivalent:: [source,sail] diff --git a/Zc-specification/cm_decbnez.adoc b/Zc-specification/cm_decbnez.adoc index 1d944bd..4428621 100644 --- a/Zc-specification/cm_decbnez.adoc +++ b/Zc-specification/cm_decbnez.adoc @@ -1,5 +1,5 @@ <<< -[#insns-cm_decbnez,reftext="cm.decbnez: Decrement and branch, 16-bit encoding"] +[#insns-cm_decbnez,reftext="Decrement and branch, 16-bit encoding"] === cm.decbnez: This is in the _development_ phase, for benchmarking and prototyping only Synopsis:: diff --git a/Zc-specification/cm_jalt.adoc b/Zc-specification/cm_jalt.adoc index e646272..1a51c62 100644 --- a/Zc-specification/cm_jalt.adoc +++ b/Zc-specification/cm_jalt.adoc @@ -1,32 +1,24 @@ <<< -[#insns-cm_jalt,reftext="cm.jalt: jump via table and link to ra"] +[#insns-cm_jalt,reftext="Jump via table with optional link"] === cm.jalt Synopsis:: -jump via table and link to ra +jump via table with optional link Mnemonic:: -cm.jalt _index_ +cm.jalt _index_ Encoding (RV32, RV64):: [wavedrom, , svg] .... {reg:[ { bits: 2, name: 0x2, attr: ['C2'] }, - { bits: 8, name: 'index' , attr: [] }, + { bits: 8, name: 'index', attr: [] }, { bits: 3, name: 0x0, attr: [] }, { bits: 3, name: 0x5, attr: ['FUNCT3'] }, ],config:{bits:16}} .... -[NOTE] - - For this encoding to decode as _cm.jalt_, _index>=64_, otherwise it decodes as <>. - -[NOTE] - - The equivalent encoding with bit[10]=1 is reserved to allow future expansion of the table index. - Assembly Syntax:: [source,sail] @@ -36,7 +28,9 @@ cm.jalt index Description:: -_cm.jalt_ reads an entry from the jump vector table in memory and jumps to the address that was read, linking to _ra_. +_cm.jalt_ reads an entry from the jump vector table in memory and jumps to the address that was read. If the LSB of the table entry is 0 then link to ra, otherwise don't link. + +_cm.jalt_ is reserved if executing from a mode with higher privilege than user mode where the current XLEN does not match UXLEN. For further information see <>. 
@@ -55,6 +49,11 @@ Operation:: -- //This is not SAIL, it's pseudo-code. The SAIL hasn't been written yet. +if (mode==M && MXLEN !=UXLEN) reserved(); +if (mode==S && SXLEN !=UXLEN) reserved(); +if (mode==HS && HSXLEN!=UXLEN) reserved(); +if (mode==VS && VSXLEN!=UXLEN) reserved(); + # target_address is temporary internal state, it doesn't represent a real register # InstMemory is byte indexed @@ -66,12 +65,13 @@ switch(XLEN) { //fetch from the jump table target_address[XLEN-1:0] = InstMemory[table_address][XLEN-1:0]; -//jump to the target address -jalr ra, target_address[XLEN-1:0]&~0x1; +//jump to the target address, check the LSB to see whether to link +if (target_address[0]==1'b0) + jalr target_address[XLEN-1:0]&~0x1; +else + jr target_address[XLEN-1:0]&~0x1; -- - - include::Zcmt_footer.adoc[] diff --git a/Zc-specification/cm_jt.adoc b/Zc-specification/cm_jt.adoc index 9ed3b7f..0edd47b 100644 --- a/Zc-specification/cm_jt.adoc +++ b/Zc-specification/cm_jt.adoc @@ -1,9 +1,9 @@ <<< -[#insns-cm_jt,reftext="cm.jt: jump via table without link"] +[#insns-cm_jt,reftext="Jump via table with optional link"] === cm.jt Synopsis:: -jump via table without link +jump via table with optional link Mnemonic:: cm.jt _index_ @@ -13,8 +13,8 @@ Encoding (RV32, RV64):: .... {reg:[ { bits: 2, name: 0x2, attr: ['C2'] }, - { bits: 6, name: 'index', attr: [] }, - { bits: 5, name: 0x0, attr: [] }, + { bits: 8, name: 'index', attr: [] }, + { bits: 3, name: 0x0, attr: [] }, { bits: 3, name: 0x5, attr: ['FUNCT3'] }, ],config:{bits:16}} .... @@ -28,7 +28,7 @@ cm.jt index Description:: -_cm.jt_ reads an entry from the jump vector table in memory and jumps to the address that was read, without linking. +_cm.jt_ reads an entry from the jump vector table in memory and jumps to the address that was read. If the LSB of the table entry is 1 then link to ra, otherwise don't link. For further information see <>. 
@@ -58,8 +58,11 @@ switch(XLEN) { //fetch from the jump table target_address[XLEN-1:0] = InstMemory[table_address][XLEN-1:0]; -//jump to the target address -jr target_address[XLEN-1:0]&~0x1; +//jump to the target address, check the LSB to see whether to link +if (target_address[0]==1'b1) + jalr target_address[XLEN-1:0]&~0x1; +else + jr target_address[XLEN-1:0]&~0x1; -- diff --git a/Zc-specification/cm_lb.adoc b/Zc-specification/cm_lb.adoc index 1458454..06567d8 100644 --- a/Zc-specification/cm_lb.adoc +++ b/Zc-specification/cm_lb.adoc @@ -1,5 +1,5 @@ <<< -[#insns-cm_lb,reftext="cm.lb: Load signed byte, 16-bit encoding"] +[#insns-cm_lb,reftext="Load signed byte, 16-bit encoding"] === cm.lb Synopsis:: diff --git a/Zc-specification/cm_lbu.adoc b/Zc-specification/cm_lbu.adoc index 4820cd0..556505d 100644 --- a/Zc-specification/cm_lbu.adoc +++ b/Zc-specification/cm_lbu.adoc @@ -1,5 +1,5 @@ <<< -[#insns-cm_lbu,reftext="cm.lbu: Load unsigned byte, 16-bit encoding"] +[#insns-cm_lbu,reftext="Load unsigned byte, 16-bit encoding"] === cm.lbu Synopsis:: diff --git a/Zc-specification/cm_lh.adoc b/Zc-specification/cm_lh.adoc index f032517..3c21949 100644 --- a/Zc-specification/cm_lh.adoc +++ b/Zc-specification/cm_lh.adoc @@ -1,5 +1,5 @@ <<< -[#insns-cm_lh,reftext="cm.lh: Load signed halfword, 16-bit encoding"] +[#insns-cm_lh,reftext="Load signed halfword, 16-bit encoding"] === cm.lh Synopsis:: diff --git a/Zc-specification/cm_lhu.adoc b/Zc-specification/cm_lhu.adoc index 1bf9c7d..7bcdde3 100644 --- a/Zc-specification/cm_lhu.adoc +++ b/Zc-specification/cm_lhu.adoc @@ -1,5 +1,5 @@ <<< -[#insns-cm_lhu,reftext="cm.lhu: Load unsigned halfword, 16-bit encoding"] +[#insns-cm_lhu,reftext="Load unsigned halfword, 16-bit encoding"] === cm.lhu Synopsis:: diff --git a/Zc-specification/cm_mva01s.adoc b/Zc-specification/cm_mva01s.adoc index f98b0c0..efb6bd1 100644 --- a/Zc-specification/cm_mva01s.adoc +++ b/Zc-specification/cm_mva01s.adoc @@ -1,5 +1,5 @@ <<< 
-[#insns-cm_mva01s,reftext="cm.mva01s: move two s0-s7 registers into a0-a1"] +[#insns-cm_mva01s,reftext="Move two s0-s7 registers into a0-a1"] === cm.mva01s Synopsis:: @@ -50,7 +50,7 @@ Operation:: -- //This is not SAIL, it's pseudo-code. The SAIL hasn't been written yet. if (RV32E && (sreg1>1 || sreg2>1)) { - take_illegal_instruction_exception(); + reserved(); } xreg1 = {sreg1[2:1]>0,sreg1[2:1]==0,sreg1[2:0]}; xreg2 = {sreg2[2:1]>0,sreg2[2:1]==0,sreg2[2:0]}; diff --git a/Zc-specification/cm_mvsa01.adoc b/Zc-specification/cm_mvsa01.adoc index 4c30bf2..f07eeeb 100644 --- a/Zc-specification/cm_mvsa01.adoc +++ b/Zc-specification/cm_mvsa01.adoc @@ -1,5 +1,5 @@ <<< -[#insns-cm_mvsa01,reftext="cm.mvsa01: move a0-a1 into two different s0-s7 registers"] +[#insns-cm_mvsa01,reftext="Move a0-a1 into two different s0-s7 registers"] === cm.mvsa01 Synopsis:: @@ -53,7 +53,7 @@ Operation:: -- //This is not SAIL, it's pseudo-code. The SAIL hasn't been written yet. if (RV32E && (sreg1>1 || sreg2>1)) { - take_illegal_instruction_exception(); + reserved(); } xreg1 = {sreg1[2:1]>0,sreg1[2:1]==0,sreg1[2:0]}; xreg2 = {sreg2[2:1]>0,sreg2[2:1]==0,sreg2[2:0]}; diff --git a/Zc-specification/cm_pop.adoc b/Zc-specification/cm_pop.adoc index 3f6dcdf..5344ea9 100644 --- a/Zc-specification/cm_pop.adoc +++ b/Zc-specification/cm_pop.adoc @@ -1,5 +1,5 @@ <<< -[#insns-cm_pop,reftext="cm.pop: Destroy stack frame: pop registers, deallocate stack frame."] +[#insns-cm_pop,reftext="Pop registers, deallocate stack frame."] === cm.pop Synopsis:: diff --git a/Zc-specification/cm_popret.adoc b/Zc-specification/cm_popret.adoc index 25a8f92..ba2fbaa 100644 --- a/Zc-specification/cm_popret.adoc +++ b/Zc-specification/cm_popret.adoc @@ -1,5 +1,5 @@ <<< -[#insns-cm_popret,reftext="cm.popret: Destroy stack frame: pop registers, deallocate stack frame, return."] +[#insns-cm_popret,reftext="Pop registers, deallocate stack frame, return."] === cm.popret Synopsis:: diff --git a/Zc-specification/cm_popretz.adoc 
b/Zc-specification/cm_popretz.adoc index 65ee128..c7e942f 100644 --- a/Zc-specification/cm_popretz.adoc +++ b/Zc-specification/cm_popretz.adoc @@ -1,5 +1,5 @@ <<< -[#insns-cm_popretz,reftext="cm.popretz: Destroy stack frame: pop registers, deallocate stack frame, return zero."] +[#insns-cm_popretz,reftext="Pop registers, deallocate stack frame, return zero."] === cm.popretz Synopsis:: diff --git a/Zc-specification/cm_push.adoc b/Zc-specification/cm_push.adoc index 866af6c..b0a574e 100644 --- a/Zc-specification/cm_push.adoc +++ b/Zc-specification/cm_push.adoc @@ -1,5 +1,5 @@ <<< -[#insns-cm_push,reftext="cm.push: Create stack frame: push registers, allocate additional stack space."] +[#insns-cm_push,reftext="Create stack frame: push registers, allocate additional stack space."] === cm.push Synopsis:: diff --git a/Zc-specification/cm_sb.adoc b/Zc-specification/cm_sb.adoc index 40da6c0..cef16a8 100644 --- a/Zc-specification/cm_sb.adoc +++ b/Zc-specification/cm_sb.adoc @@ -1,5 +1,5 @@ <<< -[#insns-cm_sb,reftext="cm.sb: Store byte, 16-bit encoding"] +[#insns-cm_sb,reftext="Store byte, 16-bit encoding"] === cm.sb Synopsis:: diff --git a/Zc-specification/cm_sh.adoc b/Zc-specification/cm_sh.adoc index 0467655..659bb85 100644 --- a/Zc-specification/cm_sh.adoc +++ b/Zc-specification/cm_sh.adoc @@ -1,5 +1,5 @@ <<< -[#insns-cm_sh,reftext="cm.sh: Store halfword, 16-bit encoding"] +[#insns-cm_sh,reftext="Store halfword, 16-bit encoding"] === cm.sh Synopsis:: diff --git a/Zc-specification/jvt_csr.adoc b/Zc-specification/jvt_csr.adoc index 910efb6..80e25b6 100644 --- a/Zc-specification/jvt_csr.adoc +++ b/Zc-specification/jvt_csr.adoc @@ -11,24 +11,33 @@ Address:: Permissions:: URW -Format (RV32, RV64):: +Format (RV32):: [wavedrom, , svg] .... {reg:[ { bits: 6, name: 'mode', attr: ['6'] }, - { bits: 26, name: 'base', attr: ['XLEN-6'] }, + { bits: 26, name: 'base[XLEN-1:6] (WARL)', attr: ['XLEN-6'] }, ],config:{bits:32}} .... +Format (RV64):: +[wavedrom, , svg] +.... 
+{reg:[ + { bits: 6, name: 'mode', attr: ['6'] }, + { bits: 58, name: 'base[XLEN-1:6] (WARL)', attr: ['XLEN-6'] }, +],config:{bits:64}} +.... + Description:: -_JVT.base_ is a virtual address, whenever virtual memory is enabled. +The _JVT_ register is a UXLEN-bit *WARL* read/write register that holds the jump table configuration, consisting of the jump table base address (BASE) and the jump table mode (MODE). -_JVT.base[5:0]_ is implicitly zero, and is naturally aligned for all legal values of _XLEN_. +If <> is implemented then _JVT_ must also be implemented, but can contain a read-only value. If _JVT_ is writable, the set of values the register may hold can vary by implementation. The value in the BASE field must always be aligned on a 64-byte boundary. -_JVT.base_ is a WARL field as the set of values the register may hold can vary by implementation. +_JVT.base_ is a virtual address, whenever virtual memory is enabled. -The memory pointed to by _JVT.base_ is treated as instruction memory for the purpose of executing table jump instructions. +The memory pointed to by _JVT.base_ is treated as instruction memory for the purpose of executing table jump instructions, implying execute access permission. [#JVT-config-table] ._JVT.mode_ definition @@ -39,20 +48,18 @@ The memory pointed to by _JVT.base_ is treated as instruction memory for the pur | others | *reserved for future standard use* |============================================================================================= -_JVT.mode_ is a WARL field, so can only be programmed to modes which are implemented. Therefore the discovery mechanism is to +_JVT.mode_ is a *WARL* field, so can only be programmed to modes which are implemented. Therefore the discovery mechanism is to attempt to program different modes and read back the values to see which are available. Jump table mode _must_ be implemented. +NOTE: in future the RISC-V Unified Discovery method will report the available modes. 
+ Architectural State:: -JVT adds architectural state to the context, therefore must be saved/restored on context switches. +_JVT_ adds architectural state to the system software context (such as an OS process), therefore must be saved/restored on context switches. State Enable:: -Bit 2 of the Smstateen CSRs are allocated to control access to JVT. If the Smstateen extension is implemented, the following text is valid: - -Bit 2 applies only for the case that Zcmt is implemented, which includes the JVT CSR and the _cm.jt_ and _cm.jalt_ instructions. -If bit 2 of a controlling stateen0 CSR is zero, then the _cm.jt_ and _cm.jalt_ instructions both cause an illegal instruction trap -(or virtual instruction trap, if relevant). +If the Smstateen extension is implemented, then bit 2 in _mstateen0_, _sstateen0_, and _hstateen0_ is implemented. If bit 2 of a controlling _stateen0_ CSR is zero, then access to the _JVT_ CSR and execution of a _cm.jalt_ instruction result in an Illegal Instruction trap (or, if appropriate, a Virtual Instruction trap). include::Zcmt_footer.adoc[] diff --git a/Zc-specification/pushpop.adoc b/Zc-specification/pushpop.adoc index 99f3d85..7a3fc76 100644 --- a/Zc-specification/pushpop.adoc +++ b/Zc-specification/pushpop.adoc @@ -23,8 +23,8 @@ Common details for these instructions are in this section. PUSH, POP, POPRET are used to reduce the size of function prologues and epilogues. . The PUSH instruction -** pushes (stores) the registers specified in the register list to the stack frame ** adjusts the stack pointer to create the stack frame +** pushes (stores) the registers specified in the register list to the stack frame . The POP instruction ** pops (loads) the registers in the register list from the stack frame @@ -125,22 +125,23 @@ calling the millicode _save/restore_ routines and so may also perform better. 
==== Stack pointer adjustment handling The instructions all automatically adjust the stack pointer by enough to cover the memory required for the registers being saved or restored. -Additionally the _spimm_ field in the encoding allows the stack pointer to be adjusted by extra 16-byte blocks. There is only a small restricted +Additionally the _spimm_ field in the encoding allows the stack pointer to be adjusted in additional increments of 16 bytes. There is only a small restricted range available in the encoding; if the range is insufficient then a separate _c.addi16sp_ can be used to increase the range. ==== Register list handling -The instructions do not directly support _{ra, s0-s10}_ to reduce the amount of encoding space required. If this register list is required then _s11_ -should also be included. This costs a small amount of memory and performance, but saves code-size. +There is no support for the _{ra, s0-s10}_ register list without also adding _s11_. Therefore the _{ra, s0-s11}_ register list must be used in this case. [#pushpop-idempotent-memory] === PUSH/POP Fault handling Correct execution requires that _sp_ refers to idempotent memory (also see <>), because the core must be able to -handle faults detected during the sequence. -The entire PUSH/POP sequence is re-executed after returning from the fault handler, and multiple faults are possible during the sequence. +handle traps detected during the sequence. +The entire PUSH/POP sequence is re-executed after returning from the trap handler, and multiple traps are possible during the sequence. + +If a trap occurs during the sequence then _xEPC_ is updated with the PC of the instruction, _xTVAL_ (if not read-only-zero) is updated with the bad address if it was an access fault, and _xCAUSE_ is updated with the type of trap. -It is implementation defined whether interrupts can also be taken during the sequence execution.
+NOTE: It is implementation defined whether interrupts can also be taken during the sequence execution. [#pushpop-software-view] === Software view of execution @@ -155,11 +156,9 @@ From a software perspective the PUSH sequence appears as: ** Any of the bytes may be written multiple times. * A stack pointer adjustment -If an implementation allows interrupts during the sequence, and the interrupt handler uses _sp_ to allocate stack memory, then any stores which were executed before the interrupt may be overwritten by the handler. -This is safe because the memory is idempotent and the stores will be re-executed when execution resumes. +NOTE: If an implementation allows interrupts during the sequence, and the interrupt handler uses _sp_ to allocate stack memory, then any stores which were executed before the interrupt may be overwritten by the handler. This is safe because the memory is idempotent and the stores will be re-executed when execution resumes. -The stack pointer adjustment must only be committed only when it is certain that the entire PUSH instruction will complete -without triggering any precise faults (for example, page faults), and without the core taking an interrupt. +The stack pointer adjustment must be committed only when it is certain that the entire PUSH instruction will commit. Stores may also return imprecise faults from the bus. It is platform defined whether the core implementation waits for the bus responses before continuing to the final stage of the sequence, @@ -207,12 +206,11 @@ From a software perspective the POP/POPRET sequence appears as: * An optional `li a0, 0` * An optional `ret` -If an implementation allows interrupts during the sequence, then any loads which were executed before the interrupt may update architectural state. -The loads will be re-executed once the handler completes, so the values will be overwritten.
-Therefore it is permitted for an implementation to update some of the destination registers before taking an interrupt or other fault. +If a trap occurs during the sequence, then any loads which were executed before the trap may update architectural state. +The loads will be re-executed once the trap handler completes, so the values will be overwritten. +Therefore it is permitted for an implementation to update some of the destination registers before taking a fault. -The optional `li a0, 0`, stack pointer adjustment and optional `ret` must only be committed only when it is certain that the entire POP/POPRET instruction will complete -without triggering any precise faults (for example, page faults), and without the core taking an interrupt. +The optional `li a0, 0`, stack pointer adjustment and optional `ret` must be committed only when it is certain that the entire POP/POPRET instruction will commit. For POPRET once the stack pointer adjustment has been committed the `ret` must execute. @@ -244,10 +242,6 @@ addi sp, sp, 32 ret -- -=== Forward progress guarantee - -The PUSH/POP sequence has the same forward progress guarantee as executing the instructions from the equivalent assembly sequences. - [[pushpop_non-idem-mem]] === Non-idempotent memory handling @@ -257,7 +251,7 @@ If the core implementation does not support PUSH/POP to non-idempotent memories, load (POP/POPRET) or store (PUSH) access fault exception in order to avoid unpredictable results. If the core implementation does support PUSH/POP to non-idempotent memory, then it may not be possible to re-execute the sequence after a fault. -In this case the fault handler should complete the sequence in software. +In this case the fault handler should complete the sequence in software, and xTVAL must be written with the bad address to allow the handler to do so.
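The stack adjustment rule used by PUSH/POP (`stack_adj = stack_adj_base + spimm[5:4] * 16`, with _stack_adj_base_ being the register list size rounded up to a multiple of 16 bytes) can be sketched in C as follows. The function names are hypothetical. The register count per _rlist_ value is an assumption extrapolated from the register lists shown in the pseudo-code (_rlist_ 4 is `ra`, each step adds one s-register, and 15 is `ra, s0-s11`).

```c
#include <assert.h>
#include <stdint.h>

/* Number of registers selected by rlist (4..15).
 * ASSUMPTION: rlist 4 is {ra} and each step adds one s-register, except
 * that rlist 15 is {ra, s0-s11} because {ra, s0-s10} is not encodable. */
static unsigned pushpop_reg_count(unsigned rlist)
{
    assert(rlist >= 4 && rlist <= 15);   /* other values are reserved */
    return (rlist == 15) ? 13u : rlist - 3u;
}

/* stack_adj_base: bytes needed for the register list, rounded up to the
 * next multiple of 16. xlen is the register width in bits (32 or 64). */
static unsigned pushpop_stack_adj_base(unsigned rlist, unsigned xlen)
{
    assert(xlen == 32 || xlen == 64);
    unsigned bytes = pushpop_reg_count(rlist) * (xlen / 8);
    return (bytes + 15u) & ~15u;
}

/* stack_adj = stack_adj_base + spimm[5:4] * 16, where spimm54 is the raw
 * 2-bit value of the spimm[5:4] field (0..3). */
static unsigned pushpop_stack_adj(unsigned rlist, unsigned spimm54,
                                  unsigned xlen)
{
    assert(spimm54 <= 3);
    return pushpop_stack_adj_base(rlist, xlen) + spimm54 * 16u;
}
```

For example, under these assumptions `pushpop_stack_adj(15, 0, 32)` covers 13 registers of 4 bytes (52 bytes) and therefore gives a 64-byte frame; a frame larger than the encodable range would need the separate _c.addi16sp_ mentioned earlier.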
<<< diff --git a/Zc-specification/pushpop_extra_info.adoc b/Zc-specification/pushpop_extra_info.adoc index 14a083b..fb7c1d6 100644 --- a/Zc-specification/pushpop_extra_info.adoc +++ b/Zc-specification/pushpop_extra_info.adoc @@ -7,9 +7,9 @@ For further information see <>. Stack Adjustment Calculation:: -_stack_adj_base_ is the minimum number of bytes, in multiples of 16-byte blocks, required to cover the registers in the list. +_stack_adj_base_ is the minimum number of bytes, in multiples of 16-byte address increments, required to cover the registers in the list. -_spimm_ is the number of additional 16-byte blocks allocated for the stack frame. +_spimm_ is the number of additional 16-byte address increments allocated for the stack frame. The total stack adjustment represents the total size of the stack frame, which is _stack_adj_base_ added to _spimm_ scaled by 16, as defined above. diff --git a/Zc-specification/pushpop_vars.adoc b/Zc-specification/pushpop_vars.adoc index c1672ab..84c48aa 100644 --- a/Zc-specification/pushpop_vars.adoc +++ b/Zc-specification/pushpop_vars.adoc @@ -7,7 +7,7 @@ switch (rlist){ case 4: {reg_list="ra"; xreg_list="x1";} case 5: {reg_list="ra, s0"; xreg_list="x1, x8";} case 6: {reg_list="ra, s0-s1"; xreg_list="x1, x8-x9";} - default: take_illegal_instruction_exception(); + default: reserved(); } stack_adj = stack_adj_base + spimm[5:4] * 16; -- @@ -30,7 +30,7 @@ switch (rlist){ case 14: {reg_list="ra, s0-s9"; xreg_list="x1, x8-x9, x18-x25";} //note - to include s10, s11 must also be included case 15: {reg_list="ra, s0-s11"; xreg_list="x1, x8-x9, x18-x27";} - default: take_illegal_instruction_exception(); + default: reserved(); } stack_adj = stack_adj_base + spimm[5:4] * 16; -- diff --git a/Zc-specification/tablejump.adoc b/Zc-specification/tablejump.adoc index dbb918d..1aa140f 100644 --- a/Zc-specification/tablejump.adoc +++ b/Zc-specification/tablejump.adoc @@ -1,93 +1,48 @@ <<< -[#insns-tablejump,reftext="Table Jump Instructions"] -== 
Table Jump Instructions +[#insns-tablejump,reftext="Table Jump Overview"] +== Table Jump Overview -These instructions are collectively referred to as table jump: +<<#insns-cm_jalt>> is referred to as table jump. -* <<#insns-cm_jt>> -* <<#insns-cm_jalt>> +Table jump uses a 256-entry UXLEN wide table in instruction memory to contain function addresses, +and a flag in bit zero of each entry indicating whether jumping to the function address should link or not. +The table must be at least 64-byte aligned. -Common details for these instructions are in this section. +_cm.jalt_ encodings index the table, giving access to functions within the full UXLEN wide address space. -=== Table Jump Overview +This is a form of dictionary compression used to reduce the code size of _jal_ / _auipc+jalr_ / _jr_ / _auipc+jr_ instructions. -Table jump is a form of dictionary compression used to reduce the code size of _jal_ / _auipc+jalr_ / _jr_ / _auipc+jr_ instructions. +Table jump allows the linker to replace the following instruction sequences with a _cm.jalt_ encoding, and an entry in the table: -Function calls and jumps to fixed labels typically take 32-bit or 64-bit instruction sequences. - -Table jump allows the linker to: - -* replace 32-bit _j_ calls with _cm.jt_ -* replace 32-bit _jal_ ra calls with _cm.jalt_ -* replace 64-bit _auipc/jalr_ calls to fixed locations with _cm.jt_ -* replace 64-bit _auipc/jalr ra_ calls to fixed locations with _cm.jalt_ +* 32-bit _j_ calls +* 32-bit _jal_ ra calls +* 64-bit _auipc/jalr_ calls to fixed locations +* 64-bit _auipc/jalr ra_ calls to fixed locations ** The _auipc+jr/jalr_ sequence is used because the offset from the PC is out of the ±1MB range. === JVT -The base of the table is in the JVT CSR (see <>), each table entry is XLEN bits. - -The table entry number is from the _index_ field in the encoding, which controls the link register.
- -* cm.jt : entries 0-63, link to _zero_ -* cm.jalt : entries 64-255, link to _ra_ +The base of the table is in the JVT CSR (see <>), each table entry is UXLEN bits. -Note that the LSB of every jump vector table entry is _ignored_ which matches standard _jalr_ behaviour. +Bit zero of every table entry indicates whether to update the link register. If the same function is called with and without linking then it must have two entries in the table. -This case does happen in practice but only affects a small number of entries so it does not waste much space in the table. -It is typically caused by the same function being called with and without tail calling. - -<<< -[#tablejump-algorithm] -=== Recommended algorithm for allocating entries in the jump vector table - -Calls to each function are categorised as shown in <>. +This is typically caused by the same function being called with and without tail calling. -[#tablejump-savings] -.Table jump code size saving for each function call replacement -[width="100%",options=header] -|======================================================================================================================= -| original sequence | Table Jump saving -| _j_ | A*2-(XLEN/8) bytes -| _auipc+jr_ | B*6-(XLEN/8) bytes -| _jal ra_ | C*2-(XLEN/8) bytes -| _auipc+jalr ra_ | D*6-(XLEN/8) bytes -|======================================================================================================================= - -Each function is called by using one of the two link registers. The total saving per function is calculated by counting the number of calls and adding up the total saving from each replacement of the existing sequence with a Table Jump instruction, as follows: -[source,sourceCode,text] ----- -saving_per_function_cm_jt = A * 2 + B * 6 - 2*(XLEN-8) -saving_per_function_cm_jalt = C * 2 + D * 6 - 2*(XLEN-8) ----- - -The functions are sorted so that the one with the highest saving is in table entry 0, the second highest in entry 1 etc. 
for that encoding. - -[NOTE] - - This algorithm assumes that each function is only called with one link register. - If the same function is called with more than one link register, then it must have two entries in the table. - -This allows the core to cache the most frequent targets by caching the lowest numbered entries of each section of the jump vector table. -Only caching a few entries will greatly improve the performance. - -<<< [#tablejump-fault-handling] === Table Jump Fault handling For a table jump instruction, the table entry that the instruction selects is considered an extension of the instruction itself. -Hence, the execution of a table jump instruction involves two instruction fetches, the first to read the main instruction (_cm.jt_ -or _cm.jalt_) and the second to read from the jump vector table (JVT). Both instruction fetches are _implicit_ reads, and both require -execute permission; read permission is irrelevant. The address of the second fetch is not considered to be a PC, and so cannot be used for PC based debugging, tracing etc. +Hence, the execution of a table jump instruction involves two instruction fetches, the first to read the instruction (_cm.jalt_) +and the second to read from the jump vector table (JVT). Both instruction fetches are _implicit_ reads, and both require +execute permission; read permission is irrelevant. It is recommended that the second fetch be ignored for hardware triggers and breakpoints. Memory writes to the jump vector table require an instruction barrier (_fence.i_) to guarantee that they are visible to the instruction fetch. Multiple contexts may have different jump vector tables. JVT may be switched between them without an instruction barrier if the tables have not been updated in memory since the last _fence.i_. 
-If an exception occurs on either instruction fetch, xEPC is set to the PC of the table jump instruction, xCAUSE is set as expected for the type of fault and -xTVAL (if not set to zero) contains the address which caused the fault. +If an exception occurs on either instruction fetch, xEPC is set to the PC of the table jump instruction, xCAUSE is set as expected for the type of fault and xTVAL (if not set to zero) contains the fetch address which caused the fault. include::Zcmt_footer.adoc[] diff --git a/Zc-specification/tablejump_pseudocode.adoc b/Zc-specification/tablejump_pseudocode.adoc deleted file mode 100644 index 0f0e2ff..0000000 --- a/Zc-specification/tablejump_pseudocode.adoc +++ /dev/null @@ -1,25 +0,0 @@ -[source,sail] --- -//This is not SAIL, it's pseudo-code. The SAIL hasn't been written yet. - -# target_address is temporary internal state, it doesn't represent a real register -# Mem is byte indexed - -switch(XLEN) { - 32: table_address[XLEN-1:0] = JVT.base + (index<<2); - 64: table_address[XLEN-1:0] = JVT.base + (index<<3); -} - -//fetch from the jump table -target_address[XLEN-1:0] = InstMemory[table_address][XLEN-1:0]; - -//jump to the target address -if (OPCODE=="cm.jalt") { - jalr ra, target_address[XLEN-1:0]&~0x1; -} else { - jr target_address[XLEN-1:0]&~0x1; -} - --- - -
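As a rough replacement for the removed pseudo-code, the revised table jump lookup (one table of UXLEN-wide entries, with bit zero of each entry acting as the link flag) might be sketched in C as below. The names `table_jump_t`, `table_entry_address` and `table_entry_decode` are hypothetical, and the polarity of the flag (bit zero set means "link to _ra_") is an assumption, since the text only states that bit zero indicates whether to link.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Result of decoding one jump table entry. */
typedef struct {
    uint64_t target;  /* address to jump to, with bit 0 cleared */
    bool     link;    /* true if ra should receive the return address */
} table_jump_t;

/* Address of table entry `index`. Entries are xlen bits wide, so the
 * index is scaled by 4 (xlen 32) or 8 (xlen 64), as in the removed
 * pseudo-code (index<<2 / index<<3). */
static uint64_t table_entry_address(uint64_t jvt_base, uint8_t index,
                                    unsigned xlen)
{
    assert(xlen == 32 || xlen == 64);
    assert((jvt_base & 0x3F) == 0);      /* base is 64-byte aligned */
    return jvt_base + (uint64_t)index * (xlen / 8);
}

/* Decode a fetched entry.
 * ASSUMPTION: bit 0 set means "link"; the polarity is not stated in the
 * text above. Bit 0 is always cleared from the jump target. */
static table_jump_t table_entry_decode(uint64_t entry)
{
    table_jump_t r;
    r.link   = (entry & 1u) != 0;
    r.target = entry & ~(uint64_t)1;
    return r;
}
```

The fetch of the entry itself is left out on purpose: it is an implicit instruction fetch that requires execute permission, so a plain data load in C would not model it faithfully.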