ithemal/timing-harness

More information about this profiler can be found in the following paper.

  • BHive: A Benchmark Suite and Measurement Framework for Validating x86-64 Basic Block Performance Models
    Yishen Chen, Ajay Brahmakshatriya, Charith Mendis, Alex Renda, Eric Atkinson, Ondrej Sykora, Saman Amarasinghe, and Michael Carbin
    2019 IEEE International Symposium on Workload Characterization
@inproceedings{bhive,
  title={BHive: A Benchmark Suite and Measurement Framework for Validating x86-64 Basic Block Performance Models},
  author={Chen, Yishen and Brahmakshatriya, Ajay and Mendis,  Charith and Renda, Alex and Atkinson, Eric and Sykora, Ondrej and Amarasinghe, Saman and Carbin, Michael},
  booktitle={2019 IEEE international symposium on workload characterization (IISWC)},
  year={2019},
  organization={IEEE}
}

Use it as follows, where <hex> is the hex encoding of the basic block that you want to profile and <reps> is the number of iterations to execute.

./test <hex> <reps>
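If the machine code of the block is already available as raw bytes, the hex argument can be produced with a few lines of Python. This is only a sketch: the helper names `to_harness_arg` and `harness_command` are hypothetical, and it assumes the harness takes the bytes as one contiguous hex string, as in the `50` example for `pushq %rax`. Whether upper- or lowercase hex digits are expected is not stated here.

```python
from typing import List


def to_harness_arg(machine_code: bytes) -> str:
    """Return the contiguous hex string for a basic block's machine code.

    Hypothetical helper; assumes the harness wants plain hex with no
    separators or 0x prefix, as in the `./test 50 100` example.
    """
    if not machine_code:
        raise ValueError("basic block must contain at least one byte")
    return machine_code.hex()


def harness_command(machine_code: bytes, reps: int, binary: str = "./test") -> List[str]:
    """Build the argument vector `./test <hex> <reps>` (hypothetical helper)."""
    if reps <= 0:
        raise ValueError("reps must be positive")
    return [binary, to_harness_arg(machine_code), str(reps)]


# `pushq %rax` encodes to the single byte 0x50.
print(harness_command(b"\x50", 100))
```

The resulting list can be passed to `subprocess.run` once the `test` binary has been built.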

For instance, you can profile the throughput of pushq %rax like this.

$ ./test 50 100 # hex code of `pushq %rax` is `50`
Core_cyc	L1_read_misses	L1_write_misses	iCache_misses	Context_switches
868	         21	              -1	             0	           0
840	          0	              -1	             0	           0
790	          0	              -1	             0	           0
791	          0	              -1	             0	           0
790	          0	              -1	             0	           0
791	          0	              -1	             0	           0
793	          0	              -1	             0	           0
791	          0	              -1	             0	           0
793	          0	              -1	             0	           0
794	          0	              -1	             0	           0
792	          0	              -1	             0	           0
791	          0	              -1	             0	           0
790	          0	              -1	             0	           0
794	          0	              -1	             0	           0
795	          0	              -1	             0	           0

Note that -1 signifies that the performance counter is not available in your hardware setup.
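When post-processing the output, it can help to turn the -1 sentinel into an explicit "missing" value so that it is never averaged as if it were a real count. The sketch below assumes the output has the whitespace-separated layout shown above (one header line followed by integer rows); `parse_harness_output` is a hypothetical helper, not part of the harness.

```python
from typing import Dict, List, Optional


def parse_harness_output(text: str) -> List[Dict[str, Optional[int]]]:
    """Parse the harness table into one dict per measurement row.

    Assumes a header line followed by whitespace-separated integer rows.
    A value of -1 (counter unavailable) is mapped to None.
    """
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return []
    header = lines[0].split()
    rows: List[Dict[str, Optional[int]]] = []
    for ln in lines[1:]:
        values = [int(tok) for tok in ln.split()]
        if len(values) != len(header):
            raise ValueError(f"expected {len(header)} columns, got {len(values)}: {ln!r}")
        rows.append({name: (None if v == -1 else v) for name, v in zip(header, values)})
    return rows


sample = (
    "Core_cyc\tL1_read_misses\tL1_write_misses\tiCache_misses\tContext_switches\n"
    "868\t         21\t              -1\t             0\t           0\n"
    "840\t          0\t              -1\t             0\t           0\n"
)
rows = parse_harness_output(sample)
print(rows[0]["Core_cyc"], rows[0]["L1_write_misses"])
```

Splitting on arbitrary whitespace rather than on tabs keeps the parser tolerant of rows that mix tabs and spaces.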

# Get latency for 200 iterations 
$ ./test 50 200
Core_cyc	L1_read_misses	L1_write_misses	iCache_misses	Context_switches
1142	           24	              -1	             0	         0
933	            0	              -1	             0	         0
891	            0	              -1	             0	         0
892	            0	              -1	             0	         0
891	            0	              -1	             0	         0
897	            0	              -1	             0	         0
896	            0	              -1	             0	         0
894	            0	              -1	             0	         0
891	            0	              -1	             0	         0
895	            0	              -1	             0	         0
894	            0	              -1	             0	         0
892	            0	              -1	             0	         0
891	            0	              -1	             0	         0
895	            0	              -1	             0	         0
894	            0	              -1	             0	         0

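# The Core_cyc column reports the latency (including measurement overhead) of executing the basic block for 100 (or 200) iterations.
# Subtracting the two runs cancels the constant overhead, so we calculate the throughput (in cycles per iteration) as follows.
$ python3 -c 'print("Throughput: %.2f" % ((891.0 - 791.0) / (200 - 100)))'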
Throughput: 1.00
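The same calculation can be scripted over the full Core_cyc columns instead of hand-picked values. The sketch below takes the most frequent value of each run as its representative latency; that choice is an assumption on our part (it happens to select 791 and 891 for the two runs listed above), and the function name `cycles_per_iteration` is hypothetical.

```python
from statistics import mode
from typing import Sequence


def cycles_per_iteration(
    cyc_short: Sequence[int], reps_short: int,
    cyc_long: Sequence[int], reps_long: int,
) -> float:
    """Estimate cycles per iteration from two runs with different rep counts.

    Each run is summarized by its most frequent Core_cyc value (an
    assumption, not necessarily the harness authors' method). Taking the
    difference removes the constant measurement overhead.
    """
    if reps_long <= reps_short:
        raise ValueError("reps_long must be greater than reps_short")
    return (mode(cyc_long) - mode(cyc_short)) / (reps_long - reps_short)


# Core_cyc columns copied from the 100- and 200-iteration runs above.
run_100 = [868, 840, 790, 791, 790, 791, 793, 791, 793, 794, 792, 791, 790, 794, 795]
run_200 = [1142, 933, 891, 892, 891, 897, 896, 894, 891, 895, 894, 892, 891, 895, 894]

print("Throughput: %.2f" % cycles_per_iteration(run_100, 100, run_200, 200))
# prints "Throughput: 1.00"
```

The first rows of each run are noticeably slower (they include cold-cache effects, as the nonzero L1_read_misses suggests), which is why a robust summary such as the mode or minimum is preferable to the mean.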

If you fancy using the harness as a library, just include harness.h and link against harness.a.

Harness for profiling arbitrary basic blocks.
