@FIR-781 - LLama.cpp ggml Stats:Adding Backend and Unary OP Detail #31

akapoor3518 · 2025-06-30T23:15:51Z

Enhancements to Performance Statistics:

Added backend-level breakdown (e.g., CPU, TSAVORITE) for each operation.

Included unary operation details in both summary and detailed outputs.

Fixed column formatting and alignment in the summary and detailed CSV output for improved readability.
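
The breakdown described above can be pictured as three levels of counters: per op, per (op, backend), and per unary sub-op. A minimal sketch of that bookkeeping follows, written in Python for brevity. The actual change is in ggml's C/C++ code, and the names `Totals`, `PerfStats`, and `record` are hypothetical, not taken from the patch.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Totals:
    """Run count and accumulated time for one bucket (hypothetical type)."""
    runs: int = 0
    total_us: int = 0

    @property
    def avg_us(self) -> float:
        # Guard against empty buckets so the average is always defined.
        return self.total_us / self.runs if self.runs else 0.0


class PerfStats:
    """Illustrative aggregator; not the structure used by the patch."""

    def __init__(self):
        self.by_op = defaultdict(Totals)        # key: op
        self.by_backend = defaultdict(Totals)   # key: (op, backend)
        self.by_unary = defaultdict(Totals)     # key: (op, backend, unary_op)

    def record(self, op, backend, elapsed_us, unary_op=None):
        # Every run counts toward its op and its (op, backend) bucket;
        # unary ops additionally count toward the specific sub-op.
        buckets = [self.by_op[op], self.by_backend[(op, backend)]]
        if unary_op is not None:
            buckets.append(self.by_unary[(op, backend, unary_op)])
        for bucket in buckets:
            bucket.runs += 1
            bucket.total_us += elapsed_us
```

With this layout, an op-level row is always the sum of its backend rows, which is the property the summary output relies on.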

##########
Terminal output
[akapoor@wssw01 llama.cpp]$ ./build-posix/bin/llama-cli -p "my cat's name is" -m /proj/work/akapoor/llama.cpp-may22/llama.cpp/models/Tiny-Llama-v0.3-FP32-1.1B-F32.gguf --device tsavorite -c 12288 --temp 0.0 --n-predict 1 --repeat-penalty 1.5 -b 1024 --top-k 50 --top-p 0.9 --repeat-last-n 5 --no-warmup
my cat's name is L

llama_perf_sampler_print: sampling time = 2.02 ms / 8 runs ( 0.25 ms per token, 3966.29 tokens per second)
llama_perf_context_print: load time = 16983.31 ms
llama_perf_context_print: prompt eval time = 16428.90 ms / 7 tokens ( 2346.99 ms per token, 0.43 tokens per second)
llama_perf_context_print: eval time = 0.00 ms / 1 runs ( 0.00 ms per token, inf tokens per second)
llama_perf_context_print: total time = 16985.71 ms / 8 tokens

=== GGML Perf Summary ===
Op               :  Runs   Total us     Avg us
ADD              :   171      28077     164.19
  [CPU ]         :   170       5135      30.21
  [TSAVORITE ]   :     1      22942   22942.00
MUL              :   133    6164882   46352.50
  [CPU ]         :    88       7098      80.66
  [TSAVORITE ]   :    45    6157784  136839.64
RMS_NORM         :   180       3266      18.14
  [CPU ]         :   180       3266      18.14
MUL_MAT          :   713    7003799    9823.00
  [CPU ]         :   713    7003799    9823.00
CPY              :   170       1426       8.39
  [CPU ]         :   170       1426       8.39
CONT             :    86        264       3.07
  [CPU ]         :    86        264       3.07
RESHAPE          :   310        183       0.59
  [CPU ]         :   310        183       0.59
VIEW             :   294         42       0.14
  [CPU ]         :   294         42       0.14
PERMUTE          :   303         68       0.22
  [CPU ]         :   303         68       0.22
TRANSPOSE        :    78         19       0.24
  [CPU ]         :    78         19       0.24
GET_ROWS         :    11       6916     628.73
  [CPU ]         :    11       6916     628.73
SOFT_MAX         :    88       5600      63.64
  [CPU ]         :    88       5600      63.64
ROPE             :   170       2998      17.64
  [CPU ]         :   170       2998      17.64
UNARY            :    22    8308663  377666.50
  [TSAVORITE ]   :    22    8308663  377666.50
    -> SILU      :    22    8308663  377666.50
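
As a reading aid: the op-level rows in this summary appear to be the sum of their backend rows, and `Avg us` is `Total us / Runs`. The sketch below checks that against three entries copied from the output above. The dictionary layout and the helper names are illustrative only.

```python
# (runs, total_us) pairs copied from the summary above.
SUMMARY = {
    "ADD": {"total": (171, 28077),
            "backends": {"CPU": (170, 5135), "TSAVORITE": (1, 22942)}},
    "MUL": {"total": (133, 6164882),
            "backends": {"CPU": (88, 7098), "TSAVORITE": (45, 6157784)}},
    "UNARY": {"total": (22, 8308663),
              "backends": {"TSAVORITE": (22, 8308663)}},
}


def backends_add_up(entry):
    """True if the backend rows sum to the op-level runs and total."""
    runs, total_us = entry["total"]
    rows = list(entry["backends"].values())
    return (runs == sum(r for r, _ in rows)
            and total_us == sum(t for _, t in rows))


def avg_us(runs, total_us):
    """Average microseconds per run, rounded like the two-decimal column."""
    return round(total_us / runs, 2)


def backend_share(entry, backend):
    """Fraction of an op's total time spent on one backend."""
    _, total_us = entry["total"]
    _, backend_us = entry["backends"][backend]
    return backend_us / total_us
```

For `MUL`, for example, `backend_share(SUMMARY["MUL"], "TSAVORITE")` shows that nearly all of the time comes from the 45 TSAVORITE runs rather than the 88 CPU runs.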

GGML Tsavorite Profiling Results:

Calls Total(ms) T/call Self(ms) Function

1      2.000     2.000     2.000  [ 0%] GGML Tsavorite

========================================================================================================================
1 18573.000 18573.000 18573.000 [100%] TOTAL

[akapoor@wssw01 llama.cpp]$

Snapshot of the detailed output written to file:
#########
[akapoor@wssw01 llama.cpp]$ cat ggml_perf-all-shape.log |more

=== GGML Detailed Op Perf (21526.203 ms total) ===
Backend    Op         Runs  Total ms   Avg ms  ne[0]  ne[1]  ne[2]  ne[3]
CPU        GET_ROWS      4     6.902    1.726   2048      7      1      1
CPU        RMS_NORM      4     0.347    0.087   2048      7      1      1
TSAVORITE  MUL           1   142.612  142.612   2048      7      1      1
CPU        MUL_MAT       4    34.957    8.739   2048      7      1      1
CPU        RESHAPE       2     0.004    0.002     64     32      7      1
CPU        ROPE          4     0.270    0.068     64     32      7      1
CPU        MUL_MAT       4     3.840    0.960    256      7      1      1
CPU        RESHAPE       4     0.003    0.001     64      4      7      1
CPU        ROPE          3     0.027    0.009     64      4      7      1
CPU        MUL_MAT       4     3.811    0.953    256      7      1      1
CPU        RESHAPE       4     0.002    0.001     64      4      7      1
CPU        VIEW          2     0.003    0.002   1792      1      1      1
CPU        CPY           4     0.043    0.011   1792      1      1      1
CPU        RESHAPE       2     0.000    0.000    256      7      1      1
CPU        TRANSPOSE     3     0.003    0.001      7    256      1      1
CPU        VIEW          3     0.000    0.000      7    256      1      1
CPU        CPY           4     0.034    0.009      7    256      1      1
CPU        VIEW          4     0.001    0.000     32      4     64      1
CPU        PERMUTE       4     0.004    0.001     32     64      4      1
CPU        VIEW          2     0.000    0.000     64      4     32      1
CPU        PERMUTE       2     0.001    0.001     64     32      4      1
CPU        PERMUTE       3     0.000    0.000     64      7     32      1
CPU        MUL_MAT       4     0.868    0.217     32      7     32      1
CPU        SOFT_MAX      4     0.256    0.064     32      7     32      1
--More--
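
To compare backends using the detailed log, the rows can be grouped by the backend column. This is a small sketch, assuming each data row is whitespace-separated in the order Backend, Op, Runs, Total ms, Avg ms, ne[0] to ne[3]. The function name `total_ms_by_backend` is hypothetical and the row detection is a simplification.

```python
from collections import defaultdict


def total_ms_by_backend(lines):
    """Sum the 'Total ms' column per backend for rows in the assumed layout."""
    totals = defaultdict(float)
    for line in lines:
        fields = line.split()
        # Data rows have exactly 9 fields with an integer run count in the
        # third position; this skips titles, headers, blanks, and pager lines.
        if len(fields) != 9 or not fields[2].isdigit():
            continue
        backend, total_ms = fields[0], float(fields[3])
        totals[backend] += total_ms
    return dict(totals)
```

Passing the lines of `ggml_perf-all-shape.log` to this function would give one total per backend, which makes it easy to see how much of the run is spent on TSAVORITE versus CPU.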

@FIR-781 - LLama.cpp ggml Stats:Adding Backend and Unary OP Detail

fa46e7e

akapoor3518 requested review from LewisLui777, atrivedi-tsavoritesi and sh1r1sh June 30, 2025 23:16

atrivedi-tsavoritesi approved these changes Jun 30, 2025

akapoor3518 merged commit 8736109 into master Jul 1, 2025
