@FIR-770 - LLama.cpp: Adding Perf Status for all GGML Operation #26

akapoor3518 · 2025-06-27T05:50:21Z

Perf Summary:
=== GGML Perf Summary ===
ADD : 172 runs, 63454 us total, avg 368.92 us
MUL : 133 runs, 6807770 us total, avg 51186.24 us
RMS_NORM : 178 runs, 3009 us total, avg 16.90 us
MUL_MAT : 676 runs, 6675856 us total, avg 9875.53 us
CPY : 168 runs, 1123 us total, avg 6.68 us
CONT : 87 runs, 344 us total, avg 3.95 us
RESHAPE : 316 runs, 284 us total, avg 0.90 us
VIEW : 294 runs, 67 us total, avg 0.23 us
PERMUTE : 294 runs, 105 us total, avg 0.36 us
TRANSPOSE : 68 runs, 31 us total, avg 0.46 us
GET_ROWS : 9 runs, 127 us total, avg 14.11 us
SOFT_MAX : 88 runs, 5991 us total, avg 68.08 us
ROPE : 175 runs, 3593 us total, avg 20.53 us
UNARY : 22 runs, 8535113 us total, avg 387959.68 us
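
To see at a glance which operations dominate, the summary can be post-processed with a small script. The following is a minimal sketch, not part of this PR: it assumes the summary lines keep the exact `OP : N runs, T us total, avg A us` layout shown above, and the names `parse_summary` and `rank_by_total` are hypothetical helpers introduced only for illustration.

```python
import re

# Matches one line of the "GGML Perf Summary" block, e.g.
# "MUL_MAT : 676 runs, 6675856 us total, avg 9875.53 us"
# (format assumed from the sample output in this PR description)
SUMMARY_RE = re.compile(
    r"^\s*(?P<op>[A-Z_0-9]+)\s*:\s*(?P<runs>\d+) runs,\s*"
    r"(?P<total_us>\d+) us total,\s*avg\s*(?P<avg_us>[\d.]+) us\s*$"
)


def parse_summary(text):
    """Return a list of (op, runs, total_us) tuples, skipping non-matching lines."""
    rows = []
    for line in text.splitlines():
        m = SUMMARY_RE.match(line)
        if m:
            rows.append((m["op"], int(m["runs"]), int(m["total_us"])))
    return rows


def rank_by_total(rows):
    """Sort ops by total time (descending) and attach each op's share of the sum."""
    grand_total = sum(total for _, _, total in rows)
    ranked = sorted(rows, key=lambda r: r[2], reverse=True)
    return [(op, runs, total, total / grand_total) for op, runs, total in ranked]


if __name__ == "__main__":
    sample = """\
=== GGML Perf Summary ===
ADD : 172 runs, 63454 us total, avg 368.92 us
MUL : 133 runs, 6807770 us total, avg 51186.24 us
MUL_MAT : 676 runs, 6675856 us total, avg 9875.53 us
UNARY : 22 runs, 8535113 us total, avg 387959.68 us
"""
    for op, runs, total, share in rank_by_total(parse_summary(sample)):
        print(f"{op:<10} {runs:>5} runs {total / 1000:>12.3f} ms {share:7.1%}")
```

In the full summary above, UNARY, MUL and MUL_MAT together account for over 99% of the summed time, so ranking by total rather than by run count is the more useful view.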

Detailed per-node information, including the shape of each tensor node, is written to a log file (ggml_perf.log in the sample below). Sample output:
cat ggml_perf.log
ggml_graph_compute_perf: total compute time: 22096.867 ms

BACKEND:CPU OP:GET_ROWS: total 0.110 ms over 4 runs (avg 0.028 ms) [shape=2048,6,1]
BACKEND:CPU OP:RMS_NORM: total 0.148 ms over 4 runs (avg 0.037 ms) [shape=2048,6,1]
BACKEND:TSAVORITE OP:MUL: total 143.046 ms over 1 runs (avg 143.046 ms) [shape=2048,6,1]
BACKEND:CPU OP:MUL_MAT: total 34.345 ms over 4 runs (avg 8.586 ms) [shape=2048,6,1]
BACKEND:CPU OP:RESHAPE: total 0.005 ms over 2 runs (avg 0.003 ms) [shape=64,32,6]
BACKEND:CPU OP:ROPE: total 0.329 ms over 4 runs (avg 0.082 ms) [shape=64,32,6]
BACKEND:CPU OP:MUL_MAT: total 4.916 ms over 4 runs (avg 1.229 ms) [shape=256,6,1]
BACKEND:CPU OP:RESHAPE: total 0.002 ms over 4 runs (avg 0.001 ms) [shape=64,4,6]
BACKEND:CPU OP:ROPE: total 0.032 ms over 4 runs (avg 0.008 ms) [shape=64,4,6]
BACKEND:CPU OP:MUL_MAT: total 4.840 ms over 4 runs (avg 1.210 ms) [shape=256,6,1]
BACKEND:CPU OP:RESHAPE: total 0.002 ms over 3 runs (avg 0.001 ms) [shape=64,4,6]
BACKEND:CPU OP:VIEW: total 0.007 ms over 4 runs (avg 0.002 ms) [shape=1536,1,1]
BACKEND:CPU OP:CPY: total 0.021 ms over 4 runs (avg 0.005 ms) [shape=1536,1,1]
BACKEND:CPU OP:RESHAPE: total 0.000 ms over 3 runs (avg 0.000 ms) [shape=256,6,1]
BACKEND:CPU OP:TRANSPOSE: total 0.002 ms over 4 runs (avg 0.001 ms) [shape=6,256,1]
BACKEND:CPU OP:VIEW: total 0.000 ms over 4 runs (avg 0.000 ms) [shape=6,256,1]
BACKEND:CPU OP:CPY: total 0.040 ms over 4 runs (avg 0.010 ms) [shape=6,256,1]
BACKEND:CPU OP:VIEW: total 0.001 ms over 4 runs (avg 0.000 ms) [shape=32,4,64]
BACKEND:CPU OP:PERMUTE: total 0.004 ms over 4 runs (avg 0.001 ms) [shape=32,64,4]
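
The per-node log is regular enough to be grouped by backend, which helps when comparing CPU and TSAVORITE time. The following is a rough sketch under the assumption that every entry follows the `BACKEND:... OP:...: total ... ms over ... runs (avg ... ms) [shape=...]` layout of the sample above; `parse_perf_log` and `total_ms_by_backend` are hypothetical names, not functions added by this PR.

```python
import re
from collections import defaultdict

# One entry of ggml_perf.log, as it appears in the sample output above.
LINE_RE = re.compile(
    r"^BACKEND:(?P<backend>\w+) OP:(?P<op>\w+): total (?P<total_ms>[\d.]+) ms "
    r"over (?P<runs>\d+) runs \(avg (?P<avg_ms>[\d.]+) ms\) "
    r"\[shape=(?P<shape>[\d,]+)\]$"
)


def parse_perf_log(text):
    """Parse log text into a list of dicts; header and blank lines are skipped."""
    records = []
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue
        records.append({
            "backend": m["backend"],
            "op": m["op"],
            "total_ms": float(m["total_ms"]),
            "runs": int(m["runs"]),
            # "2048,6,1" -> (2048, 6, 1)
            "shape": tuple(int(d) for d in m["shape"].split(",")),
        })
    return records


def total_ms_by_backend(records):
    """Sum the 'total' column per backend name."""
    totals = defaultdict(float)
    for r in records:
        totals[r["backend"]] += r["total_ms"]
    return dict(totals)


if __name__ == "__main__":
    # Assumes the log file is in the current directory, as in the `cat` above.
    with open("ggml_perf.log") as f:
        for backend, ms in total_ms_by_backend(parse_perf_log(f.read())).items():
            print(f"{backend}: {ms:.3f} ms")
```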

ggml/include/ggml.h

ggml/src/ggml-cpu/ggml-cpu.c

ggml/src/ggml.c

src/llama-context.cpp

LewisLui777

Great job, this looks good.


atrivedi-tsavoritesi

Most of the changes look good; just a few minor indentation issues need to be taken care of. Feel free to push the changes once the indentation is addressed.


@FIR-770 - LLama.cpp: Adding Perf Status for all GGML Operation

a6918d8

akapoor3518 requested review from LewisLui777, atrivedi-tsavoritesi, reach2shaunak and sh1r1sh June 27, 2025 05:52

Updated the README

c7e934d

atrivedi-tsavoritesi reviewed Jun 27, 2025


Addressed Ashish's PR comments

d1bb3f6

atrivedi-tsavoritesi approved these changes Jun 27, 2025


LewisLui777 approved these changes Jun 29, 2025


Since some status coming along with prompt response, move the code. now it will be printed at last

d5ca4f7

atrivedi-tsavoritesi reviewed Jun 30, 2025


ggml/src/ggml.c (outdated)

src/llama-context.cpp

atrivedi-tsavoritesi approved these changes Jun 30, 2025


ggml/src/ggml-cpu/ggml-cpu.c (outdated)

Addressed indentation comments raised by Ashish

44daa2b

akapoor3518 merged commit 83de276 into master Jun 30, 2025
