Performance Papers

Travis Downs edited this page Oct 3, 2021 · 6 revisions

CPU performance papers, often Intel- or x86-specific.

Non-Determinism and Overcount on Modern Hardware Performance Counter Implementations

This paper describes cases where x86 PMU counters are inexact, usually due to overcounting when external events occur.

Attack Directories, Not Caches: Side-Channel Attacks in a Non-Inclusive World

This paper describes in some detail the structure of the SKX (Skylake-SP, Skylake-X, etc.) non-inclusive L3 cache, including the snoop filter structure. Interesting even apart from any cache side-channel possibilities.

Reverse Engineering of Cache Replacement Policies in Intel Microprocessors and Their Evaluation

This paper describes pseudo-LRU (PLRU) replacement models for the CPU cache, including experimental evaluation of the Core 2 Duo caches, with the result that three different strategies appear to be used across three models.
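
For readers unfamiliar with PLRU, here is a minimal sketch of the textbook tree-based variant for a single cache set. It is an illustration only, not one of the specific policies the paper identifies: the class name `TreePLRU`, its methods, and the bit convention (0 means the next victim is in the left subtree) are all assumptions made for this example.

```python
class TreePLRU:
    """Textbook tree-PLRU for one cache set (illustrative, hypothetical names)."""

    def __init__(self, ways):
        # The binary tree needs a power-of-two number of ways.
        assert ways >= 2 and ways & (ways - 1) == 0
        self.ways = ways
        # One bit per internal node, stored in heap order.
        # 0: next victim is in the left subtree, 1: in the right subtree.
        self.bits = [0] * (ways - 1)

    def touch(self, way):
        """Record an access: point every node on the path away from `way`."""
        node, lo, hi = 0, 0, self.ways
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if way < mid:
                self.bits[node] = 1          # accessed left, so victim is right
                node, hi = 2 * node + 1, mid
            else:
                self.bits[node] = 0          # accessed right, so victim is left
                node, lo = 2 * node + 2, mid

    def victim(self):
        """Follow the bits from the root to the way that would be evicted."""
        node, lo, hi = 0, 0, self.ways
        while hi - lo > 1:
            mid = (lo + hi) // 2
            if self.bits[node] == 0:
                node, hi = 2 * node + 1, mid
            else:
                node, lo = 2 * node + 2, mid
        return lo


if __name__ == "__main__":
    plru = TreePLRU(4)
    for way in (0, 1, 2, 3):
        plru.touch(way)
    print(plru.victim())   # 0, same as true LRU
    plru.touch(0)
    print(plru.victim())   # 2, whereas true LRU would evict way 1
```

The second result shows why PLRU is only an approximation: with `ways - 1` bits per set it cannot track the full access order, so it sometimes evicts a line that is not the least recently used.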

BlackjackBench: Portable Hardware Characterization with Automated Results Analysis

This paper presents micro-benchmarks designed to suss out hardware details like cache sizes, instruction latencies, etc., and includes a description of how the results are interpreted automatically.

Learning to Superoptimize Real-world Programs

This paper uses AI for superoptimization. It introduces a "Big Assembly" benchmark of 25K kernels extracted from open source projects.