- Use `iaddl` and reorder some instructions, get 12.96
- Apply loop unrolling, finally choose to unroll 6 times, get 10.62
- Solve load/use hazard, get 9.83
- Use a binary search tree to improve performance on the remaining elements, get 8.96
- Modify `pipe-full.hcl` to make `JXX` more efficient (learned from zztoy / ComputerArch-Prj1), get 7.62
- Delete the `pushl` and `popl` to see a better CPE, finally 7.18 (just a test, because in this class these instructions can't be modified, unlike CSAPP's original version)
- 32x32: 259 misses (minimum)
- 64x64: 1083 misses
- 61x67: 1758 misses
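
The list above does not say how these miss counts were reached. As a rough sketch of the usual starting point, a blocked transpose in C might look like the following. The name `transpose_blocked` and the `bsize` parameter are hypothetical; the argument order (`M` columns, `N` rows of `A`) follows the Cache Lab's transpose signature. Reaching counts as low as the ones listed typically needs extra, size-specific handling (for example of diagonal blocks), which this sketch does not attempt.

```c
/* Sketch only: transpose the N x M matrix A into the M x N matrix B,
 * working on bsize x bsize tiles so that each tile's cache lines are
 * reused before they are evicted. Edge tiles are clipped, so sizes
 * that are not multiples of bsize (such as 61x67) also work. */
void transpose_blocked(int M, int N, int A[N][M], int B[M][N], int bsize)
{
    for (int ii = 0; ii < N; ii += bsize) {
        for (int jj = 0; jj < M; jj += bsize) {
            /* Clip the tile at the matrix boundary. */
            int i_end = (ii + bsize < N) ? ii + bsize : N;
            int j_end = (jj + bsize < M) ? jj + bsize : M;

            for (int i = ii; i < i_end; i++) {
                for (int j = jj; j < j_end; j++) {
                    B[j][i] = A[i][j];
                }
            }
        }
    }
}
```

With 32-byte cache lines, one line holds 8 `int` values, so `bsize = 8` is a natural first choice for the 32x32 case; other sizes generally need experimentation.
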