
How to evaluate the model memory efficiently? #52

Closed
Godofnothing opened this issue Mar 31, 2023 · 6 comments

Comments

@Godofnothing

Godofnothing commented Mar 31, 2023

Thanks for the great work and convenient benchmarking tool!

I would like to evaluate the CodeGen-16B model on the HumanEval benchmark. At my disposal I have A6000 GPUs with 48 GB of memory each. The evaluation script crashes with a CUDA out-of-memory error here (i.e. at accelerator.prepare), even with the smallest batch size of 1.

Since this is evaluation only, I would expect most of the memory to be occupied by the model parameters (there are no optimizer states).
Naively, the model should fit on a single GPU when loaded in half precision, since 2 bytes × 16B parameters = 32 GB < 48 GB. However, even when I set mixed precision to fp16 in accelerate launch, I still hit the OOM error.

What measures would you suggest to fit the model onto a single GPU?
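For reference, a minimal sketch of loading the weights directly in half precision with plain transformers, outside the harness (the exact checkpoint name and the device placement are assumptions):

```python
# Minimal sketch: load the weights directly in fp16 so a 16B model takes
# roughly 32 GB instead of ~64 GB in fp32. Checkpoint name and device
# placement are illustrative assumptions.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_name = "Salesforce/codegen-16B-multi"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    torch_dtype=torch.float16,  # avoid materializing the weights in fp32
).to("cuda")

print(f"allocated: {torch.cuda.memory_allocated() / 1e9:.1f} GB")
```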

@arjunguha
Contributor

This is not going to be a full solution, but I have gotten CodeGen-16B-multi to work on an A6000/48GB. The script we used to pull it off is here:

https://github.com/nuprl/MultiPL-E/blob/main/inference/codegen.py

Note the crazy code for the stopping criteria. IIRC it was necessary to get things to work.
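For anyone curious, a string-based stopping criterion for transformers' generate typically looks roughly like the sketch below; this is an illustrative reconstruction, not the code from the linked script, and the stop strings are placeholders:

```python
# Illustrative reconstruction of a string-based stopping criterion for
# model.generate; not the code from the linked script.
import torch
from transformers import StoppingCriteria, StoppingCriteriaList

class StopOnStrings(StoppingCriteria):
    """Stop generation once any stop string appears in the decoded continuation."""

    def __init__(self, stop_strings, tokenizer, prompt_length):
        self.stop_strings = stop_strings
        self.tokenizer = tokenizer
        self.prompt_length = prompt_length  # number of prompt tokens to skip

    def __call__(self, input_ids, scores, **kwargs):
        # Decode only the generated continuation of the first sequence.
        text = self.tokenizer.decode(input_ids[0, self.prompt_length:])
        return any(s in text for s in self.stop_strings)

# Usage sketch (stop strings here are just examples for Python completions):
# criteria = StoppingCriteriaList(
#     [StopOnStrings(["\ndef ", "\nclass "], tokenizer, input_ids.shape[1])]
# )
# model.generate(input_ids, stopping_criteria=criteria, max_new_tokens=256)
```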

@loubnabnl
Collaborator

Can you make sure that FP16 is actually set, and follow the memory consumption up until accelerator.prepare?
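Something like the sketch below can be dropped into the script to log memory at each stage (the helper name and the call sites are just suggestions):

```python
# Sketch of a small helper to log GPU memory at each stage; the commented
# call sites are assumptions about where it would go in the evaluation script.
import torch

def report_memory(tag):
    allocated = torch.cuda.memory_allocated() / 1e9
    reserved = torch.cuda.memory_reserved() / 1e9
    print(f"[{tag}] allocated={allocated:.1f} GB, reserved={reserved:.1f} GB")

# report_memory("after model load")
# model = accelerator.prepare(model)
# report_memory("after accelerator.prepare")
```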

@Godofnothing
Author

@loubnabnl I set fp16 via accelerate launch --mixed_precision fp16, but it doesn't help. There is no GPU memory consumption before accelerator.prepare.

@loubnabnl
Collaborator

loubnabnl commented Apr 20, 2023

@Godofnothing we found a bug that made the memory consumption higher than necessary. Can you try running the evaluation with the code from PR #61? You now need to specify --precision fp16.

@loubnabnl
Collaborator

Closing this issue, as I tried loading CodeGen-16B in mixed precision and it fits in under 40 GB of memory.
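For reference, peak usage can be confirmed with the standard torch.cuda statistics; a sketch, with the generation call itself omitted:

```python
# Sketch for checking peak GPU memory, assuming the model is already loaded;
# these are standard torch.cuda APIs, not part of the harness.
import torch

torch.cuda.reset_peak_memory_stats()
# ... run generation here ...
peak_gb = torch.cuda.max_memory_allocated() / 1e9
print(f"peak allocated: {peak_gb:.1f} GB")
```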

@Godofnothing
Author

Sorry for the long delay. I've pulled the latest version of the code and the model now fits within 40 GB. Thanks for your help and the responses.
