Optimize wide vectors to use 64 bit entries #1629
Original Assignee: Wilson Snyder (@wsnyder)
This is to capture notes on a performance experiment.
Verilator uses arrays of 32-bit words for signals over 64 bits. Analyzing some performance reports, it looked like using an array of 64-bit words would greatly shrink the code.
For signals that are e.g. 65-96 bits, going from 3*32 bits to 2*64 bits (96 to 128 bits of storage) would increase code size and data cache pressure. A quick experiment showed only a 3-4% loss.
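The storage arithmetic for the 65-96 bit case can be sketched as below. This is only an illustration: `entriesFor` and `storageBits` are hypothetical helper names invented for this note, not functions from Verilator's runtime.

```cpp
#include <cstddef>

// Hypothetical helper (not part of Verilator): number of fixed-size entries
// needed to hold a signal of `bits` bits when each entry is `entryBits` wide.
constexpr std::size_t entriesFor(std::size_t bits, std::size_t entryBits) {
    return (bits + entryBits - 1) / entryBits;  // Round up to whole entries
}

// Hypothetical helper: total storage, in bits, occupied by those entries.
constexpr std::size_t storageBits(std::size_t bits, std::size_t entryBits) {
    return entriesFor(bits, entryBits) * entryBits;
}

// A 65-bit and a 96-bit signal both need three 32-bit entries (96 bits)...
static_assert(storageBits(65, 32) == 96, "3*32 bits");
static_assert(storageBits(96, 32) == 96, "3*32 bits");
// ...but two 64-bit entries (128 bits), so storage grows for this range.
static_assert(storageBits(65, 64) == 128, "2*64 bits");
static_assert(storageBits(96, 64) == 128, "2*64 bits");
```

For widths that are already a multiple of 64 the two layouts occupy the same storage, so the extra data footprint only appears for widths whose top 32-bit word would otherwise be the last one.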
I then made enough changes to generate some code, and got a good 10% improvement, though with wrong simulation results. Cleaning up further, I found the code was missing setting some upper values; fixing that resulted in a 4% loss overall. Therefore I am abandoning this for now.
Some changes for this were committed to master. "EData" is now the entry that a wide vector (WData) is composed of. EData may be redefined as 64 bits. Many defines like VL_EXTEND_I needed changing to have a flavor (generally ..._E) that works when EData is 64 bits. Part of the changes are kept out of master in a branch, as they would add dead code that would probably be confusing and would harm code coverage. The biggest missing piece is code to support the historical VPI/DPI/user code that needs to see 32-bit arrays.
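As a rough sketch of the idea, assuming a compile-time switch selects the entry width: the macro `SKETCH_EDATA_64` and the helpers `entriesInWide` and `extendToWide` are hypothetical names made up for this illustration, and the real definitions in Verilator's headers may differ. The helper also shows the kind of upper-entry initialization that is easy to miss when changing the entry width.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <cstdint>

// Hypothetical build switch (an assumption for this sketch): selects the
// width of one entry of a wide vector.
#ifdef SKETCH_EDATA_64
using EData = std::uint64_t;
#else
using EData = std::uint32_t;
#endif

// Number of bits in one entry, derived from the type rather than hard-coded.
constexpr std::size_t kEDataBits = sizeof(EData) * CHAR_BIT;

// Hypothetical helper: entries needed for a wide vector of `bits` bits.
constexpr std::size_t entriesInWide(std::size_t bits) {
    return (bits + kEDataBits - 1) / kEDataBits;
}

// Hypothetical entry-width-agnostic helper: zero-extend a 32-bit value into
// a wide vector of `obits` bits. Every upper entry must be written, or stale
// data remains in the vector.
inline void extendToWide(std::size_t obits, EData* owp, std::uint32_t value) {
    assert(obits >= 32);  // Destination must be able to hold the source
    owp[0] = value;
    for (std::size_t i = 1; i < entriesInWide(obits); ++i) owp[i] = 0;
}
```

Because the loop bound comes from `kEDataBits`, the same source works for either entry width. Code that assumes 32-bit arrays, such as VPI/DPI or user code, would still need a conversion layer that this sketch does not attempt.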