Table A: Cortex-A53 and Cortex-A73 CPU performance counters

|  |  |  |  |  |  |  |  |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| *Branch(A53)* | Immediate | Indirect | Mispredicted | PC change | Potential prediction | Taken |  |  |  |  |  |
| *Branch(A73)* | Immediate | Indirect | Mispredicted | PC change | Potential prediction |  |  |  |  |  |  |
| *Bus(A53)* | Access |  |  |  | Cycle | Read | Write |  |  |  |  |
| *Bus(A73)* | Access | Access normal | Access not shared | Access shared | Cycle |  |  | Peripheral |  |  |  |
| *Cache(A53)* |  | Allocate mode | Allocate mode enter | Data access | Data refill | Data TLB refill |  | Inst TLB refill | Instruction refill |  |  |
| *Cache(A73)* | BATC read |  |  | Data access | Data refill | Data TLB refill | Data ways | Inst TLB refill | Instruction refill |  |  |
| *Cache L1 (A53)* |  |  |  | Data error | Data write | Inst access | Inst error |  |  |  |  |
| *Cache L1 (A73)* | CP15 TLB refill | Data access write | Data read |  | Data write | Inst access |  | PLD TLB refill | TLB flush |  |  |
| *Cache L2 (A53)* | Data access |  |  |  |  | Data refill |  | Data write |  |  |  |
| *Cache L2 (A73)* | Data access | Data access write | Data clean | Data invalidate | Data read | Data refill | Data victim | Data write | TLB access | TLB miss |  |
| *Cache (A53)* | Linefill | Throttle | TLB error |  |  |  |  |  |  |  |  |
| *Cache (A73)* |  |  |  |  |  |  |  |  |  |  |  |
| *Clock(A53)* | Cycles |  |  |  |  |  |  |  |  |  |  |
| *Clock(A73)* | Cycles |  |  |  |  |  |  |  |  |  |  |
| *Counter chain(A53)* | Odd performance |  |  |  |  |  |  |  |  |  |  |
| *Counter chain(A73)* | Odd performance |  |  |  |  |  |  |  |  |  |  |
| *Exception(A53)* | Return | Taken |  |  |  |  |  |  |  |  |  |
| *Exception(A73)* | Return | Taken | Hypervisor |  |  |  |  |  |  |  |  |
| *ETM (A53)* |  |  |  |  |  |  |  |  |  |  |  |
| *ETM (A73)* | Output 0 | Output 1 |  |  |  |  |  |  |  |  |  |
| *Hypervisor (A53)* |  |  |  |  |  |  |  |  |  |  |  |
| *Hypervisor (A73)* | Traps |  |  |  |  |  |  |  |  |  |  |
| *Instruction(A53)* |  | CONTEXTIDR | Data read | Data executed | Memory write |  |  |  |  |  |  |
| *Instruction(A73)* | Advanced SIMD | CONTEXTIDR |  | Data executed |  | Crypto | DMB | DSB | Integer | ISB |  |
| *Instruction (A53)* |  |  |  |  |  |  |  |  |  |  |  |
| *Instruction (A73)* | Load | Load/store | Speculative | Stalled linefill | Stalled page table walk | Store | VFP |  |  |  |  |
| *Intrinsic (A53)* |  |  |  |  |  |  |  |  |  |  |  |
| *Intrinsic (A73)* | LDREX | STREX fail |  |  |  |  |  |  |  |  |  |
| *Interrupts(A53)* | FIQ | IRQ |  |  |  |  |  |  |  |  |  |
| *Interrupts(A73)* |  |  |  |  |  |  |  |  |  |  |  |
| *Memory(A53)* | Error | External request | Memory access |  |  |  |  | Non-cacheable ext req | Snoop | Unaligned access | Write stall |
| *Memory(A73)* |  |  | Memory access | Read | Translation table | Unaligned | Write |  |  |  |  |
| *Pre-decoder(A53)* | Error |  |  |  |  |  |  |  |  |  |  |
| *Pre-decoder(A73)* |  |  |  |  |  |  |  |  |  |  |  |
| *Procedure(A53)* | Return |  |  |  |  |  |  |  |  |  |  |
| *Procedure(A73)* | Return |  |  |  |  |  |  |  |  |  |  |
| *MMU (A53)* |  |  |  |  |  |  |  |  |  |  |  |
| *MMU (A73)* | CP15 Table Walk | Instruction Table Walk | LSU Table Walk | Preload Table Walk | Stage1 Table Walk | Stage2 Table Walk | Table Walk |  |  |  |  |
| *Slots (A53)* |  |  |  |  |  |  |  |  |  |  |  |
| *Slots (A73)* | Data engine issue Q | Data processing issue Q | Load/store issue Q | Load/store unit |  |  |  |  |  |  |  |
| *Software(A53)* | Increment |  |  |  |  |  |  |  |  |  |  |
| *Software(A73)* |  |  |  |  |  |  |  |  |  |  |  |
| *Stall(A53)* | Cache miss | DPU IP empty | Interlock address | Interlock other | SIMD/FPU | Load miss | Pre-decoder error | Store | TLB miss |  |  |
| *Stall(A73)* |  |  |  |  |  |  |  |  |  |  |  |

Table B: Other performance counters (such as Mali GPU performance counters)

| Ftrace | Linux | Mali job manager | Mali L2 cache | Mali shared core | Mali tiler |
| --- | --- | --- | --- | --- | --- |
| Block: block\_rq\_complete | Clock: Frequency (Cortex-A73) | Cycles: Fragment cycles | Beats: Read beats | Mali Core Cycles: Compute cycles | Culling: Culled by facing test |
| Block: block\_rq\_issue | Clock: Frequency (Cortex-A53) | Cycles: Fragment tasks | Beats: Write beats | Mali Core Cycles: Execution Core cycles | Culling: Culled by frustum test |
| Ext4: ext4\_da\_write | CPU Activity: System (Cortex-A73) | Cycles: GPU cycles | Lookups: Lookups | Mali Core Cycles: Fragment cycles | Culling: Culled by sample test |
| Kmem: kmalloc | CPU Activity: System (Cortex-A53) | Cycles: IRQ cycles | Lookups: Reads | Mali Core Cycles: Fragments queued cycles | Culling: Visible |
| Power: clock\_set\_rate | CPU Activity: User (Cortex-A73) | Cycles: Vertex-Tiling-Compute cycles | Lookups: Writes | Mali EZS Test: Fragment quads early ZS killed | Cycles: Tiler cycles |
| Power: cpu\_idle | CPU Activity: User (Cortex-A53) |  | Read Latency: 0-127 | Mali EZS Test: Fragment quads early ZS tested | Facing: Back facing prims |
|  | CPU Contention: Wait |  | Read Latency: 128-191 | Mali EZS Test: Fragment quads early ZS updated | Facing: Front facing prims |
|  | CPU I/O: Wait |  | Read Latency: 192-255 | Mali EZS Test: Fragment quads rasterized | Primitives: Lines |
|  | Disk I/O: Read |  | Read Latency: 256-319 | Mali Fragment Primitives: Primitives rasterized | Primitives: Points |
|  | Disk I/O: Write |  | Read Latency: 320-383 | Mali Fragment Quads: Opaque quads queued | Primitives: Triangles |
|  | Interrupts: IRQ (Cortex-A73) |  | Read Outstanding: 0-25 | Mali Instructions: Attribute instructions | Vertex Shading: Position shading requests |
|  | Interrupts: IRQ (Cortex-A53) |  | Read Outstanding: 25-50 | Mali Instructions: Instruction count | Vertex Shading: Varying shading requests |
|  | Interrupts: SoftIRQ (Cortex-A73) |  | Read Outstanding: 50-75 | Mali Instructions: Instruction diverged |  |
|  | Interrupts: SoftIRQ (Cortex-A53) |  | Stall: Read stalls | Mali Instructions: Texture instructions |  |
|  | Memory: Buffer |  | Stall: Write stalls | Mali Instructions: Varying instructions |  |
|  | Memory: Cached |  | Transactions: Reads | Mali LSC: Full reads |  |
|  | Memory: Free |  | Transactions: Writes | Mali LSC: Full writes |  |
|  | Memory: Slab |  | Write Outstanding: 0-25 | Mali LSC: Short reads |  |
|  | Memory: Used |  | Write Outstanding: 25-50 | Mali LSC: Short writes |  |
|  | Network: Receive |  | Write Outstanding: 50-75 | Mali LZS Test: Fragment threads late ZS killed |  |
|  | Network: Transmit |  |  | Mali LZS Test: Fragment threads late ZS tested |  |
|  | Scheduler: Switch (Cortex-A73) |  |  | Mali Quads: Compute quads shaded |  |
|  | Scheduler: Switch (Cortex-A53) |  |  | Mali Quads: Fragment partial quads shaded |  |
|  |  |  |  | Mali Quads: Fragment quads shaded |  |
|  |  |  |  | Mali Shader External Reads: LSC external read beats |  |
|  |  |  |  | Mali Shader External Reads: Texture external read beats |  |
|  |  |  |  | Mali Shader Reads: LSC read beats |  |
|  |  |  |  | Mali Shader Reads: Texture read beats |  |
|  |  |  |  | Mali Shader Writes: LSC write beats |  |
|  |  |  |  | Mali Shader Writes: Tilebuffer write beats |  |
|  |  |  |  | Mali Texture Cycles: Texture cycles |  |
|  |  |  |  | Mali Texture Usage: 3D |  |
|  |  |  |  | Mali Texture Usage: Compressed |  |
|  |  |  |  | Mali Texture Usage: Trilinear |  |
|  |  |  |  | Mali Tiles: Tiles rendered |  |
|  |  |  |  | Mali Tiles: Tiles writes discarded |  |
|  |  |  |  | Mali Varying Usage: Varying 16-bit cycles |  |
|  |  |  |  | Mali Varying Usage: Varying 32-bit cycles |  |

**CPU Activity**

The percentage of CPU time that is spent in system or user code; the remainder is idle time.
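As a rough illustration of how such percentages can be derived, the sketch below computes user, system, and idle shares from two samples of cumulative tick counts. The `cpu_activity` helper and the sample dictionaries are hypothetical and only mirror the definition above; they are not how the profiling tool itself gathers the data.

```python
def cpu_activity(prev, curr):
    """Return (user_pct, system_pct, idle_pct) between two tick samples.

    Each sample is a dict of cumulative 'user', 'system', and 'idle' ticks
    (hypothetical layout, chosen for illustration only).
    """
    d_user = curr["user"] - prev["user"]
    d_system = curr["system"] - prev["system"]
    d_idle = curr["idle"] - prev["idle"]
    total = d_user + d_system + d_idle
    if total <= 0:
        # No time elapsed between samples: report nothing rather than divide by zero.
        return (0.0, 0.0, 0.0)
    return (
        100.0 * d_user / total,
        100.0 * d_system / total,
        100.0 * d_idle / total,
    )


# Made-up sample values: 60 user ticks, 20 system ticks, 20 idle ticks elapsed.
prev = {"user": 100, "system": 50, "idle": 850}
curr = {"user": 160, "system": 70, "idle": 870}
print(cpu_activity(prev, curr))  # (60.0, 20.0, 20.0)
```

Working on deltas between samples, rather than on the cumulative totals, is what makes the result a per-interval percentage that can be charted over time.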

**Cache**

The number of memory reads or writes that cause a cache access or a cache refill of at least the level of data or unified cache closest to the processor.

**Clock**

The number of cycles that are used by each core.

**Disk I/O**

The number of bytes read from or written to disk.

**Instruction**

An approximate count of the total number of instructions that each core executes, and the number of instructions that read from or write to memory.

**Interrupts**

Charts the number of both soft IRQs and standard hardware IRQs. Soft IRQs are similar to IRQs, but are handled in software. Soft IRQs are usually delivered at a time that is relatively convenient for the kernel code.

**Memory**

Charts the available system memory over the course of execution.
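The Linux memory counters listed in Table B (Buffer, Cached, Free, Slab, Used) resemble fields of the kernel's `/proc/meminfo` file. The sketch below shows one way such values could be extracted from text in that format. The helper names are hypothetical, and computing "Used" as total minus free is an assumption made for illustration, not a statement of how the tool derives it.

```python
def parse_meminfo(text):
    """Parse '/proc/meminfo'-style text into a dict of integer values (in kB)."""
    fields = {}
    for line in text.splitlines():
        name, sep, rest = line.partition(":")
        if not sep:
            continue  # skip lines without a 'Name: value' shape
        parts = rest.split()
        if parts and parts[0].isdigit():
            fields[name.strip()] = int(parts[0])
    return fields


def memory_series(fields):
    """Map parsed fields onto the counter names used in Table B.

    Assumption: 'Used' is MemTotal minus MemFree.
    """
    return {
        "Free": fields["MemFree"],
        "Buffer": fields.get("Buffers", 0),
        "Cached": fields.get("Cached", 0),
        "Slab": fields.get("Slab", 0),
        "Used": fields["MemTotal"] - fields["MemFree"],
    }


# Made-up sample in /proc/meminfo format; on a Linux target the text could
# instead be read from the real file.
sample = (
    "MemTotal:        4000 kB\n"
    "MemFree:         1000 kB\n"
    "Buffers:          200 kB\n"
    "Cached:           800 kB\n"
    "Slab:             100 kB\n"
)
print(memory_series(parse_meminfo(sample)))
```

Sampling these values periodically and plotting them against time would give a chart of the kind described above.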