## Core Microarchitecture Enhancements



| Parameter                                | Broadwell<br>uArch | Skylake<br>uArch              |
|------------------------------------------|--------------------|-------------------------------|
| Out-of-order<br>Window                   | 192                | 224                           |
| In-flight Loads +<br>Stores              | 72 + 42            | 72 + **56**                   |
| Scheduler Entries                        | 60                 | 97                            |
| Registers:<br>Integer + FP               | 168 + 168          | **180** + 168                 |
| Allocation Queue                         | 56                 | 64/thread                     |
| L1D Bandwidth (B/cycle):<br>Load + Store | 64 + 32            | 128 + 64                      |
| L2 Unified TLB<br>(entries)              | 4K+2M: 1024        | 4K+2M: **1536**<br>**1G: 16** |
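
The relative size of each change is easier to compare as a percentage. The following is a minimal sketch that only restates the table's numbers; the dictionary keys and the `percent_increase` helper are illustrative names, not part of the source. The Allocation Queue row is left out because the Skylake figure is quoted per thread and the table does not state the basis of the Broadwell figure.

```python
# (Broadwell, Skylake) pairs copied from the table above.
# The row labels are paraphrased; the numbers are unchanged.
table = {
    "Out-of-order window": (192, 224),
    "In-flight loads": (72, 72),
    "In-flight stores": (42, 56),
    "Scheduler entries": (60, 97),
    "Integer registers": (168, 180),
    "FP registers": (168, 168),
    "L1D load BW (B/cycle)": (64, 128),
    "L1D store BW (B/cycle)": (32, 64),
    "L2 TLB entries (4K+2M)": (1024, 1536),
}


def percent_increase(old, new):
    """Relative growth from old to new, in percent."""
    return 100.0 * (new - old) / old


increases = {name: percent_increase(b, s) for name, (b, s) in table.items()}

for name, pct in increases.items():
    print(f"{name:<26} {pct:6.1f}%")
```

On these numbers the L1D bandwidth doubles in both directions, while the out-of-order window grows by roughly one sixth.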

- Larger and improved branch predictor, higher-throughput decoder, larger window to extract ILP
- Improved scheduler and execution engine, improved throughput and latency of divide/sqrt
- More load/store bandwidth, deeper load/store buffers, improved prefetcher
- Data-center-specific enhancements: Intel® AVX-512 with 2 FMAs per core, larger 1 MB MLC (mid-level cache)
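
As a rough illustration of what "AVX-512 with 2 FMAs per core" implies for peak floating-point throughput, the sketch below counts operations per cycle: each 512-bit FMA unit processes 512 / element-width lanes, and a fused multiply-add counts as two floating-point operations. The function names are illustrative, and the clock frequency passed to `peak_gflops` is a hypothetical input, not a figure from the source.

```python
VECTOR_BITS = 512        # AVX-512 register width
FMA_UNITS_PER_CORE = 2   # from the bullet above
FLOPS_PER_FMA = 2        # one multiply plus one add per lane


def peak_flops_per_cycle(element_bits):
    """Theoretical peak FLOPs per cycle per core for a given element width."""
    lanes = VECTOR_BITS // element_bits
    return FMA_UNITS_PER_CORE * lanes * FLOPS_PER_FMA


def peak_gflops(element_bits, ghz):
    """Peak GFLOP/s per core at a caller-supplied (hypothetical) clock in GHz."""
    return peak_flops_per_cycle(element_bits) * ghz


print(peak_flops_per_cycle(64))  # double precision: 32
print(peak_flops_per_cycle(32))  # single precision: 64
```

These are upper bounds that assume both FMA units issue every cycle; sustained throughput depends on the workload and on the frequency the core actually runs at under AVX-512 load.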

ABOUT 10% PERFORMANCE IMPROVEMENT PER CORE ON INTEGER APPLICATIONS AT SAME FREQUENCY