Conversation

@gabe-l-hart
Collaborator

Description

This PR is extracted from #16982 since it's an isolated change that's not strictly related to implementing the SSD algorithm.

This PR adds extra output to the llama-gguf tool, showing each tensor's type and element count.

Example Output

...
gguf_ex_read_1: tensor[0]: name = token_embd.weight, size = 513802240, offset = 0, type = bf16, n_elts = 256901120
gguf_ex_read_1: tensor[1]: name = blk.0.attn_norm.weight, size = 10240, offset = 513802240, type = f32, n_elts = 2560
gguf_ex_read_1: tensor[2]: name = blk.0.ffn_norm.weight, size = 10240, offset = 513812480, type = f32, n_elts = 2560
gguf_ex_read_1: tensor[3]: name = blk.0.attn_k.weight, size = 2621440, offset = 513822720, type = bf16, n_elts = 1310720
gguf_ex_read_1: tensor[4]: name = blk.0.attn_output.weight, size = 13107200, offset = 516444160, type = bf16, n_elts = 6553600
gguf_ex_read_1: tensor[5]: name = blk.0.attn_q.weight, size = 13107200, offset = 529551360, type = bf16, n_elts = 6553600
gguf_ex_read_1: tensor[6]: name = blk.0.attn_v.weight, size = 2621440, offset = 542658560, type = bf16, n_elts = 1310720
gguf_ex_read_1: tensor[7]: name = blk.0.ffn_gate.weight, size = 41943040, offset = 545280000, type = bf16, n_elts = 20971520
...
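For context, here is a minimal sketch of a standalone read loop in the spirit of the tool's gguf_ex_read_1, not the PR's actual diff. The file name, error handling, and printf format are illustrative, and the header layout varies slightly across ggml versions (older versions expose the gguf API from ggml.h).

#include "ggml.h"
#include "gguf.h"

#include <cstdio>

int main() {
    struct gguf_init_params params = {
        /*.no_alloc =*/ true, // read metadata only, do not allocate tensor data
        /*.ctx      =*/ NULL,
    };

    struct gguf_context * ctx = gguf_init_from_file("model.gguf", params);
    if (!ctx) {
        fprintf(stderr, "failed to read GGUF file\n");
        return 1;
    }

    const int64_t n_tensors = gguf_get_n_tensors(ctx);
    for (int64_t i = 0; i < n_tensors; ++i) {
        const char * name   = gguf_get_tensor_name  (ctx, i);
        const size_t size   = gguf_get_tensor_size  (ctx, i);
        const size_t offset = gguf_get_tensor_offset(ctx, i);
        const auto   type   = gguf_get_tensor_type  (ctx, i);

        // ggml_type_size() is the size of one element (one block for
        // quantized types), so for simple types like f32/bf16 the
        // division below yields the tensor's element count
        const size_t n_elts = size / ggml_type_size(type);

        printf("tensor[%lld]: name = %s, size = %zu, offset = %zu, type = %s, n_elts = %zu\n",
               (long long) i, name, size, offset, ggml_type_name(type), n_elts);
    }

    gguf_free(ctx);
    return 0;
}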

Branch: Mamba2Perf

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
Branch: Mamba2SSD

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
Comment on lines 187 to 190
const auto type = gguf_get_tensor_type(ctx, i);
const char * type_name = ggml_type_name(type);
const size_t type_size = ggml_type_size(type);
const size_t n_elements = size / type_size;
Member

valign (vertical alignment):

Suggested change

-const auto type = gguf_get_tensor_type(ctx, i);
-const char * type_name = ggml_type_name(type);
-const size_t type_size = ggml_type_size(type);
-const size_t n_elements = size / type_size;
+const auto   type       = gguf_get_tensor_type  (ctx, i);
+const char * type_name  = ggml_type_name(type);
+const size_t type_size  = ggml_type_size(type);
+const size_t n_elements = size / type_size;
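For reference, this follows the vertical-alignment convention used throughout ggml/llama.cpp: consecutive declarations are padded so that the variable names, = signs, and call parentheses line up, which makes repeated statements easier to scan and batch-edit.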

Branch: GGUFToolOutputs

Signed-off-by: Gabe Goodhart <ghart@us.ibm.com>
@ggerganov merged commit 5886f4f into ggml-org:master on Nov 5, 2025
8 checks passed
@gabe-l-hart deleted the GGUFToolOutputs branch on November 5, 2025 at 17:58
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request Nov 5, 2025
* origin/master: (21 commits)
vulkan: Fix GGML_VULKAN_CHECK_RESULTS to better handle fusion (ggml-org#16919)
examples(gguf): GGUF example outputs (ggml-org#17025)
mtmd: allow QwenVL to process larger image by default (ggml-org#17020)
server : do not default to multiple slots with speculative decoding (ggml-org#17017)
mtmd: improve struct initialization (ggml-org#16981)
docs: Clarify the endpoint that webui uses (ggml-org#17001)
model : add openPangu-Embedded (ggml-org#16941)
ggml webgpu: minor set rows optimization (ggml-org#16810)
sync : ggml
ggml : fix conv2d_dw SVE path (ggml/1380)
CUDA: update ops.md (ggml-org#17005)
opencl: update doc (ggml-org#17011)
refactor: replace sprintf with snprintf for safer string handling in dump functions (ggml-org#16913)
vulkan: remove the need for the dryrun (ggml-org#16826)
server : do context shift only while generating (ggml-org#17000)
readme : update hot topics (ggml-org#17002)
ggml-cpu : bicubic interpolation (ggml-org#16891)
ci : apply model label to models (ggml-org#16994)
chore : fix models indent after refactor (ggml-org#16992)
Fix garbled output with REPACK at high thread counts (ggml-org#16956)
...