A quick question: how do I calculate overhead for a model? #356

znsoftm · 2023-07-08T09:52:34Z

For the BERT model, the overhead is calculated as:

model_mem_req += (5 + 16 * n_layer) * 256; // object overhead

Can anyone explain the meaning of 5, 16, and 256?

znsoftm · 2023-07-08T09:54:01Z

For GPT-2, it is:

ctx_size += (6 + 12*n_layer)*512; // object overhead

znsoftm · 2023-07-08T10:53:27Z

BERT has n_layer layers, each with 16 tensors, plus 5 extra tensors outside the layers. But what is the 512 for? I guessed it was the size of "struct ggml_tensor", but it is not...
So what is the 256 for? It must be the size of something...

sizeof(struct ggml_tensor) is 208, so is 256 that size rounded up to a multiple of 16?

ggerganov · 2023-07-11T17:42:32Z

The object overhead is calculated as:

size_t overhead_bytes = n_tensors * ggml_tensor_overhead();
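
A minimal sketch of how the formula above lines up with the hard-coded expressions quoted in the question. The helper names (`count_tensors`, `object_overhead`, `tensor_overhead_stub`) are hypothetical and only for illustration; the stub returns 256 purely to mirror the constant from the BERT snippet, not as a claim about what `ggml_tensor_overhead()` returns. In real code you would include `ggml.h` and call `ggml_tensor_overhead()` instead.

```c
#include <assert.h>
#include <stddef.h>

// Hypothetical stand-in for ggml_tensor_overhead(). The value 256 is an
// assumption copied from the BERT snippet in this thread; in real code,
// query the library instead of hard-coding it.
static size_t tensor_overhead_stub(void) {
    return 256;
}

// Total tensor count: tensors outside the layers plus tensors per layer.
// For the BERT snippet this is 5 + 16 * n_layer; for GPT-2, 6 + 12 * n_layer.
static size_t count_tensors(int n_fixed, int n_per_layer, int n_layer) {
    assert(n_fixed >= 0 && n_per_layer >= 0 && n_layer >= 0);
    return (size_t) n_fixed + (size_t) n_per_layer * (size_t) n_layer;
}

// Object overhead in bytes: one fixed-size bookkeeping cost per tensor.
static size_t object_overhead(size_t n_tensors, size_t per_tensor_bytes) {
    return n_tensors * per_tensor_bytes;
}
```

For example, with n_layer = 12, `object_overhead(count_tensors(5, 16, 12), tensor_overhead_stub())` gives the same number as `(5 + 16 * n_layer) * 256`. Asking the library for the per-tensor size avoids a magic constant that goes stale if `struct ggml_tensor` changes.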


znsoftm mentioned this issue Jul 10, 2023

About the calculation of overhead. skeskinen/bert.cpp#19

Open

ggerganov closed this as completed Jul 11, 2023
