@FIR1016 - ggml: Release change with new compiler SDK release #62

akapoor3518 · 2025-10-10T23:20:34Z

POSIX result
[akapoor@wssw01 llama.cpp]$ build-posix/bin/llama-cli -p "my cat's name" -m /proj/rel/sw/ggml/models/Tiny-Llama-v0.3-FP32-1.1B-F32.gguf --device tSavorite -c 12288 --temp 0.0 --n-predict 10 --repeat-penalty 1.5 -b 1024 --top-k 50 --top-p 0.9 --repeat-last-n 5 --no-warmup --no-display-prompt
is Luna.
I'm a

llama_perf_sampler_print: sampling time = 25.53 ms / 16 runs ( 1.60 ms per token, 626.66 tokens per second)
llama_perf_context_print: load time = 3486.81 ms
llama_perf_context_print: prompt eval time = 2823.51 ms / 6 tokens ( 470.59 ms per token, 2.13 tokens per second)
llama_perf_context_print: eval time = 4913.36 ms / 9 runs ( 545.93 ms per token, 1.83 tokens per second)
llama_perf_context_print: total time = 8428.57 ms / 15 tokens
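
The per-token figures in the log above follow directly from the reported times and counts. As a minimal sketch (the regular expression and the helper name `derive_rates` are illustrative assumptions, not part of llama.cpp), the two eval lines can be re-derived like this:

```python
import re

# Two lines copied verbatim from the log above.
LOG = """\
llama_perf_context_print: prompt eval time = 2823.51 ms / 6 tokens ( 470.59 ms per token, 2.13 tokens per second)
llama_perf_context_print: eval time = 4913.36 ms / 9 runs ( 545.93 ms per token, 1.83 tokens per second)
"""

# Assumed line shape: ": <label> time = <ms> ms / <n> tokens|runs"
PATTERN = re.compile(
    r": (?P<label>[a-z ]+) time = (?P<ms>[\d.]+) ms / (?P<n>\d+) (?:tokens|runs)"
)

def derive_rates(log):
    """Return {label: (ms_per_token, tokens_per_second)} for each matching line."""
    rates = {}
    for line in log.splitlines():
        m = PATTERN.search(line)
        if m is None:
            continue
        ms = float(m.group("ms"))
        n = int(m.group("n"))
        rates[m.group("label")] = (ms / n, n / (ms / 1000.0))
    return rates

for label, (ms_per_tok, tok_per_s) in derive_rates(LOG).items():
    print(f"{label}: {ms_per_tok:.2f} ms per token, {tok_per_s:.2f} tokens per second")
```

The recomputed values agree with the logged ones to within rounding. Small differences in the last digit are to be expected, since the log was presumably printed from finer-grained timers than the two-decimal millisecond values shown.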

=== GGML Perf Summary ===
Op          Target    Runs    Total us     Avg us
ADD         OPU       2024     5320381    2628.65
MUL         OPU       2070     1538497     743.24
RMS_NORM    OPU       2070     1757942     849.25
MUL_MAT     CPU      36427    55584308    1525.91
CONT        CPU       7723      437475      56.65
RESHAPE     CPU      11372        6246       0.55
VIEW        CPU      17813        2351       0.13
PERMUTE     CPU      13764        3058       0.22
TRANSPOSE   CPU       3506         926       0.26
GET_ROWS    CPU        409        1006       2.46
SET_ROWS    CPU       7121        6517       0.92
SOFT_MAX    CPU       3463      330536      95.45
ROPE        CPU       7709       41772       5.42
GLU         OPU       1012     2989455    2954.01
cat
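
To see where the time in the summary goes, the rows can be totalled per target. This is a rough sketch that assumes each row is whitespace-separated as Op, Target, Runs, Total us, Avg us; the helper name `totals_by_target` is hypothetical and not part of ggml.

```python
from collections import defaultdict

# Rows copied from the GGML Perf Summary above (header line omitted).
SUMMARY = """\
ADD OPU 2024 5320381 2628.65
MUL OPU 2070 1538497 743.24
RMS_NORM OPU 2070 1757942 849.25
MUL_MAT CPU 36427 55584308 1525.91
CONT CPU 7723 437475 56.65
RESHAPE CPU 11372 6246 0.55
VIEW CPU 17813 2351 0.13
PERMUTE CPU 13764 3058 0.22
TRANSPOSE CPU 3506 926 0.26
GET_ROWS CPU 409 1006 2.46
SET_ROWS CPU 7121 6517 0.92
SOFT_MAX CPU 3463 330536 95.45
ROPE CPU 7709 41772 5.42
GLU OPU 1012 2989455 2954.01
"""

def totals_by_target(summary):
    """Sum the 'Total us' column per target and keep the per-op totals."""
    totals = defaultdict(int)
    per_op = {}
    for line in summary.splitlines():
        op, target, _runs, total_us, _avg_us = line.split()
        totals[target] += int(total_us)
        per_op[op] = int(total_us)
    return dict(totals), per_op

totals, per_op = totals_by_target(SUMMARY)
grand_total = sum(totals.values())
for target, us in sorted(totals.items()):
    print(f"{target}: {us / 1e6:.2f} s ({100 * us / grand_total:.1f}%)")

slowest = max(per_op, key=per_op.get)
print(f"largest single op: {slowest} ({100 * per_op[slowest] / grand_total:.1f}%)")
```

On these rows, MUL_MAT on the CPU accounts for roughly 82% of the summed op time, while the four OPU ops (ADD, MUL, RMS_NORM, GLU) together make up about 17%.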

OPU Profiling Results:

Calls   Total(ms)    T/call   Self(ms)             Function
2090    1292.2790    0.6183   117.3370 [12.33%]    [Thread] tsi::runtime::TsavRTPosix::loadBlob

atrivedi-tsavoritesi

You need to make changes to ggml-kernel as well

@FIR1016 - ggml: Release change with new compiler SDK release

efe4fdb

akapoor3518 requested review from Nithyanand-G, atrivedi-tsavoritesi, dineshReddy6381, dmpatra, gkethamallax, mikeuhler and mmankal as code owners October 10, 2025 23:20

atrivedi-tsavoritesi reviewed Oct 10, 2025

atrivedi-tsavoritesi approved these changes Oct 10, 2025

updated submodule latest tag

7fdc7c1

akapoor3518 merged commit 272b85c into master Oct 10, 2025
