
performance benchmark #17

Closed
touhi99 opened this issue May 8, 2023 · 4 comments

Comments

@touhi99

touhi99 commented May 8, 2023

Hi,

is there a possibility to add performance benchmarks for the open-sourced LLMs?

@LudwigStumpp
Collaborator

Hi,

do you mean adding the performance results directly to the table?

For now, we have listed some external resources under the Evals on open LLMs section, which cover the performance of the models on various benchmarks. Do you think this is enough?

@touhi99
Author

touhi99 commented May 10, 2023

> Hi,
>
> do you mean adding the performance results directly to the table?
>
> For now, we have listed some external resources under the Evals on open LLMs section, which cover the performance of the models on various benchmarks. Do you think this is enough?

Yes, that would give a rough idea of which model to choose among the many. I will have a look at the Evals, thanks.

Besides performance, the GPU/hardware requirements would personally also be an interesting benchmark for estimating a solution. If I propose an LLM-based solution, what would the minimum hardware requirements be for training/fine-tuning/inference? So far many models are coming in and out, but I haven't found any reliable data. For example: model X needs at least Y GB of GPU RAM for inference.

@LudwigStumpp
Collaborator

LudwigStumpp commented May 10, 2023

@touhi99 Great points you're raising, thanks for that!

Here are some remarks from my side, to keep the discussion going and to find a suitable spot for the information you requested:

Adding evals results right inside the table

  • there are many different benchmarks, which would require us to add many additional columns to the table
  • furthermore, one row inside the table currently covers all available variations of a model, and each variation performs differently on the eval benchmarks. Adding evals to the table would therefore require us to split these rows apart. I currently think this is not a direction we want to go, but I will keep it in the back of my head and discuss it with @eugeneyan and @Muhtasham
  • for identifying well-performing models, I created the LLM-Leaderboard, which covers both open and closed models

GPU memory requirements

  • Roughly speaking, the memory requirements to load a model depend on two things:
    • the number of parameters
    • the precision used for these parameters (float32, float16, bfloat16, int8)
  • while the number of parameters is fixed, the precision you use to load the model generally is not
  • you can easily calculate the memory requirements yourself (see the sketch below). For simplicity, assume 1 Giga (G) ~= 1 Billion (B). Example:
    • 7B model
    • multiply by the bytes per parameter:
      • ×1 (e.g. int8) = 7 GB
      • ×2 (e.g. float16) = 14 GB
      • ×4 (e.g. float32) = 28 GB
    • and you get a rough estimate of the GPU memory requirements
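To make the rule of thumb concrete, here is a minimal Python sketch of the calculation above. The function name and the bytes-per-parameter table are my own, chosen just for illustration:

```python
# Rough bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"int8": 1, "float16": 2, "bfloat16": 2, "float32": 4}

def inference_memory_gb(params_billion: float, precision: str) -> float:
    """Rough GPU memory (GB) needed just to load the model weights.

    Uses the simplification 1 Giga ~= 1 Billion from above.
    """
    return params_billion * BYTES_PER_PARAM[precision]

# Example: a 7B model at different precisions
for precision in ("int8", "float16", "float32"):
    print(f"{precision}: {inference_memory_gb(7, precision):.0f} GB")
# int8: 7 GB, float16: 14 GB, float32: 28 GB
```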

EDIT: This is a little naive, as one also needs to account for:

  • gradients for backprop (~ assume the same size as the model params)
  • first- and second-order momentum terms of the Adam optimizer (~ assume 2 times the size of the model params)
  • feature maps in the forward pass (depends on the architecture, ignore for now)
  • batch size (affects gradient and feature-map storage, but ignore for now)

Taking the above into account, we can get a very naive estimate for fine-tuning with:
MODEL_SIZE [Billion] * PRECISION [Bytes] * 4 (model weights + gradients + Adam moments)

So for our 7B model above:

  • float32: 7 * 4 * 4 = 112 GB
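Extending the sketch above (again, the names are mine, and the factor of 4 ignores activations and batch size, as noted):

```python
BYTES_PER_PARAM = {"int8": 1, "float16": 2, "bfloat16": 2, "float32": 4}

def finetune_memory_gb(params_billion: float, precision: str) -> float:
    """Very naive fine-tuning estimate, ignoring activations and batch size.

    Factor of 4 = weights (1x) + gradients (1x) + Adam moments (2x).
    """
    return params_billion * BYTES_PER_PARAM[precision] * 4

print(finetune_memory_gb(7, "float32"))  # 112.0 GB, matching the estimate above
```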

More on this topic:

@martinezpl

For anyone interested in this topic:

https://huggingface.co/spaces/HuggingFaceH4/open_llm_leaderboard
