A quick question: how do I calculate overhead for a model? #356
Comments
For GPT-2, it is: ctx_size += (6 + 12*n_layer)*512; // object overhead
The BERT model has n_layer layers; each layer has 16 tensors, plus 5 extra parameter tensors. But what is the 512 for? I guessed it was the size of "struct ggml_tensor", but it is not: sizeof(struct ggml_tensor) is 208. So is 256 that size rounded up to some multiple of 16?
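The rounding guess above can be checked with plain arithmetic. The sketch below is only a check on the numbers quoted in this thread (208 and 256); the helper names are hypothetical, and nothing here claims that ggml derives its constant this way.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: round n up to the next multiple of align.
   align must be a non-zero power of two. */
static size_t round_up_multiple(size_t n, size_t align) {
    assert(align != 0 && (align & (align - 1)) == 0);
    return (n + align - 1) & ~(align - 1);
}

/* Hypothetical helper: smallest power of two that is >= n. */
static size_t next_pow2(size_t n) {
    size_t p = 1;
    while (p < n) {
        p <<= 1;
    }
    return p;
}
```

With the 208 bytes quoted above, round_up_multiple(208, 16) stays at 208 because 208 is already 13 * 16, while next_pow2(208) gives 256. So 256 looks more like a power-of-two budget per object than a 16-byte alignment of the struct, though that remains a guess.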
The object overhead is calculated as:
|
For the BERT model, the overhead is calculated as:
model_mem_req += (5 + 16 * n_layer) * 256; // object overhead
Can anyone explain the meaning of 5, 16, and 256?
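Both formulas in this thread have the same shape: a tensor count multiplied by a fixed byte budget per object. Here is a minimal sketch of that reading, assuming the first two constants count tensors (some outside the layers, some per layer) and the last is a per-object size budget. The function name and parameter names are hypothetical and not part of ggml.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper mirroring (n_extra + per_layer * n_layer) * bytes_per_object.
   n_extra:          tensors counted once (assumed reading of the 5 or 6)
   per_layer:        tensors counted per layer (assumed reading of the 16 or 12)
   bytes_per_object: fixed budget reserved per tensor object (256 or 512) */
static size_t object_overhead(size_t n_extra, size_t per_layer,
                              size_t n_layer, size_t bytes_per_object) {
    assert(bytes_per_object > 0);
    return (n_extra + per_layer * n_layer) * bytes_per_object;
}
```

For example, with n_layer = 12 the BERT expression gives object_overhead(5, 16, 12, 256) = 197 * 256 = 50432 bytes, and the GPT-2 expression gives object_overhead(6, 12, 12, 512) = 150 * 512 = 76800 bytes.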