
README written by Jee W. Choi.
Benchmarks written by Jee W. Choi, Marat Dukhan, and Xing Liu.

This is a collection of microbenchmarks (ubenchmarks), each optimized for performance on a different architecture.

The benchmarks are:
1) intensity
2) L1 cache
3) L2 cache
4) random access
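
To illustrate what an intensity benchmark measures, here is a minimal C++ sketch. It is not taken from this repository; the names intensity_kernel and flops_per_byte are hypothetical, and the real benchmarks are hand-tuned per architecture. The idea shown is only the general one: do a configurable number of floating-point operations for every byte loaded, so the ratio of compute to memory traffic can be swept.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical helper (not from the repository): loads every element of
// `data` once and performs `fmas_per_elem` multiply-add steps on it.
// The running total is returned so the compiler cannot discard the work.
double intensity_kernel(const std::vector<double>& data, int fmas_per_elem) {
    double total = 0.0;
    for (double x : data) {
        double acc = x;
        for (int k = 0; k < fmas_per_elem; ++k) {
            acc = acc * 0.5 + 1.0;  // one multiply + one add = 2 flops
        }
        total += acc;  // bookkeeping add, not counted below
    }
    return total;
}

// Nominal arithmetic intensity of the kernel above, in flops per byte,
// assuming each 8-byte double is read from memory exactly once.
double flops_per_byte(int fmas_per_elem) {
    return (2.0 * fmas_per_elem) / static_cast<double>(sizeof(double));
}
```

Timing intensity_kernel for increasing values of fmas_per_elem, on an array much larger than the last-level cache, would be one way to see where a system moves from memory-bound to compute-bound.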

The target systems are:
    Type (Name)         Done ubenchmarks                                 To-do ubenchmarks
 1) CPU (Nehalem)       (DRAM/cache read/random,intensity)               (DONE)
 2) CPU (Ivy Bridge)    (DRAM/cache read/random,intensity)               (DONE)
 3) GPU (Fermi)         (intensity,comp,DRAM/cache/SM read,DRAM random)  (DONE)
 4) GPU (Kepler)        (intensity,comp,DRAM/cache/SM read,DRAM random)  (DONE)
 5) ARM (Cortex A9)     (DRAM/cache read/random,intensity)               (DONE)
 6) ARM (Cortex A15)    (DRAM/cache read/random,intensity)               (DONE)
 7) GPU (Mali T604)     (intensity,DRAM rand,cache read)                 (DONE)
 8) CPU (AMD APU)       (intensity,DRAM/cache read/random)               (DONE)
 9) GPU (Radeon)        (intensity,cache read,DRAM random)               (DONE)
10) Xeon Phi            (DRAM/cache read/random,intensity)               (DONE)

The DRAM/cache read and random access benchmarks for ARM, x86, and Xeon Phi are located under ./generic/.
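
For readers unfamiliar with random access benchmarks, the following C++ sketch shows one common technique, pointer chasing over a random single-cycle permutation. This is an assumption about the general approach, not a description of the code under ./generic/; the names make_cycle and chase are hypothetical.

```cpp
#include <cstddef>
#include <numeric>
#include <random>
#include <utility>
#include <vector>

// Hypothetical helper: builds a table where next[i] is the index to visit
// after i. Sattolo's algorithm guarantees the table forms one cycle of
// length n, so a chase touches every slot before it repeats.
std::vector<std::size_t> make_cycle(std::size_t n, unsigned seed) {
    std::vector<std::size_t> next(n);
    std::iota(next.begin(), next.end(), std::size_t{0});
    std::mt19937 rng(seed);
    for (std::size_t i = n; i-- > 1;) {
        // Swap slot i with a strictly earlier slot j in [0, i-1].
        std::uniform_int_distribution<std::size_t> pick(0, i - 1);
        std::swap(next[i], next[pick(rng)]);
    }
    return next;
}

// Follows the chain for `steps` hops starting at slot 0. Each load depends
// on the previous one, so the hardware cannot overlap or prefetch them.
std::size_t chase(const std::vector<std::size_t>& next, std::size_t steps) {
    std::size_t pos = 0;
    for (std::size_t s = 0; s < steps; ++s) {
        pos = next[pos];
    }
    return pos;
}
```

Dividing the elapsed time of a long chase by the number of steps gives an estimate of per-access latency; choosing n so the table fits in L1, L2, or only in DRAM would target different levels of the memory hierarchy.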