About the calculation of overhead. #19
For the BERT model, the overhead is calculated as: model_mem_req += (5 + 16 * n_layer) * 256; // object overhead. Can anyone explain what this means? My guess is that 5 is the number of extra tensors and 16 is the number of tensors in each layer, but what is the 256 for? Is it the size of the ggml_tensor struct? The actual size is 208 bytes, so is 256 the rounded-up size?
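To make the arithmetic in the question concrete, here is a minimal sketch in C. The helper names `object_overhead` and `round_up_pow2` are hypothetical and not part of ggml, and the reading of 5 as extra tensors, 16 as tensors per layer, and 256 as 208 rounded up is only the guess stated above, not confirmed behavior.

```c
#include <stddef.h>

/* Hypothetical helper mirroring the expression from the question:
 *   (5 + 16 * n_layer) * bytes_per_object
 * The meaning of the constants is the asker's assumption. */
static size_t object_overhead(size_t n_layer, size_t bytes_per_object) {
    const size_t n_extra = 5;       /* assumed: tensors outside the layers */
    const size_t n_per_layer = 16;  /* assumed: tensors inside each layer */
    return (n_extra + n_per_layer * n_layer) * bytes_per_object;
}

/* Hypothetical helper: smallest power of two that is >= n.
 * Illustrates the guess that 208 bytes was rounded up to 256. */
static size_t round_up_pow2(size_t n) {
    size_t p = 1;
    while (p < n) {
        p <<= 1;
    }
    return p;
}
```

For example, with a hypothetical n_layer of 12, `object_overhead(12, 256)` reserves 197 * 256 bytes, and passing 512 as the second argument doubles that reservation.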
My memory is a little hazy on this subject.
Thanks for your answer :)
I have tested with the latest ggml, and the 256 has to be changed to 512. I do not understand why :(
ggerganov/ggml#356