## Scalar Processor Diagram



## Planned Architectural Features

- Pipelining to exploit instruction-level parallelism (ILP)
- Pipeline registers between each pair of adjacent stages
- Multiple execution units (EUs), each handling different tasks, for better ILP
- Branch prediction
- Reservation stations for out-of-order (OoO) execution
- Reorder buffer (ROB)
- Floating-point support
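
As a rough illustration of why pipelining is expected to help, the ideal cycle counts can be sketched as below. This is a back-of-the-envelope model, not this project's simulator: the function names, the `stall_cycles` parameter, and the assumption that one instruction enters the pipeline per cycle are all hypothetical.

```python
def unpipelined_cycles(num_instructions: int, num_stages: int) -> int:
    """Each instruction passes through every stage before the next one starts."""
    return num_instructions * num_stages


def pipelined_cycles(num_instructions: int, num_stages: int, stall_cycles: int = 0) -> int:
    """Ideal scalar pipeline (assumption: one instruction issued per cycle).

    The first instruction needs num_stages cycles to drain through the
    pipeline; every later instruction completes one cycle after the previous
    one. stall_cycles is a hypothetical lump sum for hazards and mispredictions.
    """
    if num_instructions == 0:
        return 0
    return num_stages + (num_instructions - 1) + stall_cycles


if __name__ == "__main__":
    # Hypothetical workload: 100 instructions on a 5-stage pipeline.
    print(unpipelined_cycles(100, 5))  # 500
    print(pipelined_cycles(100, 5))    # 104
```

In this model, stalls are the only thing separating the pipelined count from the ideal, which is why the later features (branch prediction, reservation stations, ROB) all aim at reducing them.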

## Future Experiments

- Show that a single pipeline reduces the number of cycles needed to execute vector addition.
- Show that increasing the number of EUs further reduces the cycle count for vector addition.
- Show that a 2-bit dynamic branch predictor has a miss rate of <10% on a quicksort of a 30-item list.
- Show that a vector addition loop with branch prediction comes within 10% of the performance (number of cycles) of an unrolled vector addition loop.
- Show that reservation stations reduce stalls by 25% for a recursive factorial algorithm (many multiplications and memory accesses, so plenty of opportunities for stalls).
- Show that changing from a unified to a grouped reservation station reduces stalls by 25%.
- Show that using a ROB with reservation stations minimises pipeline stalls and reduces total cycles.
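
The branch prediction experiment could measure miss rate along the following lines. This is a minimal sketch of a 2-bit saturating-counter predictor, assuming one counter per branch address and a "weakly not taken" initial state. The class name, the `miss_rate` helper, and the `(pc, taken)` trace format are hypothetical and not part of the planned design.

```python
from typing import Dict, Iterable, Tuple


class TwoBitPredictor:
    """2-bit saturating counter per branch address.

    States 0 and 1 predict not taken; states 2 and 3 predict taken.
    """

    def __init__(self, initial_state: int = 1) -> None:
        # Assumption: unseen branches start in state 1 (weakly not taken).
        self.initial_state = initial_state
        self.counters: Dict[int, int] = {}

    def predict(self, pc: int) -> bool:
        return self.counters.get(pc, self.initial_state) >= 2

    def update(self, pc: int, taken: bool) -> None:
        state = self.counters.get(pc, self.initial_state)
        # Move towards 3 on taken and towards 0 on not taken, saturating at both ends.
        self.counters[pc] = min(state + 1, 3) if taken else max(state - 1, 0)


def miss_rate(trace: Iterable[Tuple[int, bool]]) -> float:
    """Fraction of mispredicted branches in a trace of (pc, taken) pairs."""
    predictor = TwoBitPredictor()
    misses = 0
    total = 0
    for pc, taken in trace:
        if predictor.predict(pc) != taken:
            misses += 1
        predictor.update(pc, taken)
        total += 1
    return misses / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical trace: a loop branch taken 9 times, then falling through once.
    loop_trace = [(0x40, True)] * 9 + [(0x40, False)]
    print(miss_rate(loop_trace))  # 0.2 (first iteration and loop exit mispredict)
```

For the quicksort experiment, the trace would come from the simulator's resolved branches rather than a hand-written list.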