Hey there!
The CFU has no access to the CPU's memory port, so it cannot read or write memory by itself. It operates on the CPU's register file only. Hence, all source operands (= the elements of your array/vector) have to be in the core's register file. If your custom hardware module (= the CFU in this case) needs direct access to memory, you might be better off using a custom co-processor outside of the CPU.
Another option to accelerate CFU data access would be to add additional read ports to the core's register file. For example, with one more read port you could use instructions with 3 source registers:
    neorv32_cfu_cmd0(0b0000000, arr[0], arr[1], arr[2]); // load arr[0] ... arr[2]
    neorv32_cfu_cmd0(0b0000001, arr[3], arr[4], arr[5]); // load arr[3] ... arr[5]
Having more than 3 register file read ports would require a customization of the compiler because there are no standard RISC-V instructions with more than 3 register sources (at least not in the scalar extensions).
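To make the idea of moving an array into the CFU three registers at a time more concrete, here is a small host-side software model. It is only a sketch: `cfu_model_t`, `cfu_model_load3` and `cfu_model_load_array` are hypothetical names invented for illustration, not part of the NEORV32 library, and the 6-word buffer size is an assumption that matches the two calls above.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the CFU-internal buffer holds 6 words, i.e. two 3-operand transfers. */
#define CFU_VEC_WORDS 6

/* Hypothetical software model of a CFU-internal vector buffer. */
typedef struct {
    uint32_t word[CFU_VEC_WORDS];
} cfu_model_t;

/* Models one custom instruction with three source registers.
 * 'slot' plays the role of the function-select bits and picks
 * which group of three buffer words gets written. */
static void cfu_model_load3(cfu_model_t *cfu, unsigned slot,
                            uint32_t rs1, uint32_t rs2, uint32_t rs3)
{
    assert(slot < CFU_VEC_WORDS / 3);
    cfu->word[3 * slot + 0] = rs1;
    cfu->word[3 * slot + 1] = rs2;
    cfu->word[3 * slot + 2] = rs3;
}

/* Transfers a 6-element array using two modeled instructions. */
static void cfu_model_load_array(cfu_model_t *cfu,
                                 const uint32_t arr[CFU_VEC_WORDS])
{
    cfu_model_load3(cfu, 0, arr[0], arr[1], arr[2]);
    cfu_model_load3(cfu, 1, arr[3], arr[4], arr[5]);
}
```

In this model every array element still has to pass through a CPU register first (the `rs1` ... `rs3` arguments), which is exactly the constraint described above: the extra read port only reduces the number of instructions, not the need to load the data into the register file.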
-
In case you're interested - I started upgrading the CFU to support R4-type instructions (three source operands) in #449. The modifications are fully backwards-compatible.
-
The CFU now supports R4-type instructions. There are two custom opcodes left. With these opcodes it would be possible to implement R5-type instructions with 4 source registers. As far as I know, there is no pre-defined RISC-V instruction word layout for this, but that is not really a problem. However, 4 source registers and 1 destination register (each requiring 5 bits in the instruction) plus the 7-bit opcode already fill the entire 32-bit instruction word (5 × 5 + 7 = 32), so no bits would be left to specify the actual operation.
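The bit budget can be checked with a few lines of C. This is a sketch only: `free_function_bits` and `encode_r4` are hypothetical helper names, and `encode_r4` follows the standard RISC-V R4-type field layout (rs3, funct2, rs2, rs1, funct3, rd, opcode), which may or may not match how the CFU decodes its fields internally.

```c
#include <assert.h>
#include <stdint.h>

enum {
    INSTR_BITS  = 32, /* width of a RISC-V base instruction word */
    OPCODE_BITS = 7,
    REG_BITS    = 5   /* one register address */
};

/* Bits left for selecting the operation, given n_src source
 * registers plus one destination register. */
static int free_function_bits(int n_src)
{
    assert(n_src >= 0);
    return INSTR_BITS - OPCODE_BITS - REG_BITS * (n_src + 1);
}

/* Packs an instruction word in the standard RISC-V R4-type layout:
 * rs3[31:27] funct2[26:25] rs2[24:20] rs1[19:15] funct3[14:12] rd[11:7] opcode[6:0] */
static uint32_t encode_r4(uint32_t opcode, uint32_t rd, uint32_t funct3,
                          uint32_t rs1, uint32_t rs2, uint32_t funct2,
                          uint32_t rs3)
{
    assert(opcode < 128 && rd < 32 && rs1 < 32 && rs2 < 32 && rs3 < 32);
    assert(funct3 < 8 && funct2 < 4);
    return (rs3 << 27) | (funct2 << 25) | (rs2 << 20) | (rs1 << 15) |
           (funct3 << 12) | (rd << 7) | opcode;
}
```

With these helpers, `free_function_bits(2)` gives 10 (funct7 plus funct3 of the R-type format), `free_function_bits(3)` gives 5 (funct3 plus funct2 of the R4-type format), and `free_function_bits(4)` gives 0, which is the R5-type problem described above. As an example value, `encode_r4(0x2B, 1, 0, 2, 3, 0, 4)` uses the RISC-V custom-1 opcode purely for illustration.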
-
Hello,
I would like to create some custom instructions to do 128-bit vector operations. The way I implement this functionality for now is to do something like this:
Is it possible for the CFU to access multiple memory elements based on some offset?
For example the command
could make the CFU access
arr[0], arr[0]+1, arr[0]+2, arr[0]+3
from memory immediately and concatenate them to my vector register file.
Thanks in advance,
Jimmy