remove model from accelerate prepare and add precision argument #61

loubnabnl · 2023-04-20T20:53:45Z

Passing both the model and the dataloader to accelerate.prepare uses unnecessary memory, as noticed by @RaymondLi0, and causes OOM errors for large models.
This is because the model gets wrapped in the DistributedDataParallel class, which reserves memory for gradients as if for training (issue). We now wrap only the dataloader, and we also add a precision argument to load the model properly in bf16 or fp16. (The mixed-precision argument in the accelerate config is meant for mixed-precision training and will load two copies of the model.)
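
For illustration, the change described above can be sketched roughly as follows. This is a minimal sketch, not the code from the diff: the names `PRECISION_TO_DTYPE`, `dtype_name`, and `load_for_generation`, the accepted flag values, and the default are all assumptions made for the example. It relies only on the public `Accelerator.prepare`, `Accelerator.device`, and `from_pretrained(..., torch_dtype=...)` calls.

```python
# Sketch only: the helper names and the precision values accepted here are
# illustrative assumptions, not taken from the PR diff.
PRECISION_TO_DTYPE = {"fp32": "float32", "fp16": "float16", "bf16": "bfloat16"}


def dtype_name(precision):
    """Map a precision flag value to the name of the matching torch dtype."""
    if precision not in PRECISION_TO_DTYPE:
        raise ValueError(
            f"unknown precision {precision!r}; expected one of {sorted(PRECISION_TO_DTYPE)}"
        )
    return PRECISION_TO_DTYPE[precision]


def load_for_generation(checkpoint, dataloader, precision="bf16"):
    """Load a model in the requested dtype and prepare only the dataloader."""
    # Imports are local so the mapping above works without torch installed.
    import torch
    from accelerate import Accelerator
    from transformers import AutoModelForCausalLM

    accelerator = Accelerator()
    # Load the weights directly in the requested dtype rather than loading
    # in fp32 and converting the model afterwards.
    model = AutoModelForCausalLM.from_pretrained(
        checkpoint, torch_dtype=getattr(torch, dtype_name(precision))
    )
    # Move the model to the process device by hand; it is deliberately NOT
    # passed to accelerator.prepare().
    model = model.to(accelerator.device)
    model.eval()
    # Only the dataloader is prepared, so batches are still split across
    # processes without wrapping the model in DistributedDataParallel.
    dataloader = accelerator.prepare(dataloader)
    return accelerator, model, dataloader
```

Keeping the model out of `prepare` means each process holds a plain copy of the model for inference, while the prepared dataloader still distributes the prompts across processes.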

Todo: add cpu case

lm_eval/generation.py

tests/test_generation_evaluation.py

loubnabnl added 4 commits April 20, 2023 20:37

remove model from accelerate prepare and add precision

622b060

add fp16

7f64821

reformet code

20a4cf1

fix CI

d2f9c3f

loubnabnl mentioned this pull request Apr 20, 2023

How to evaluate the model memory efficiently? #52

Closed

loubnabnl added 2 commits April 21, 2023 09:07

use accelerate device

2d9d9e5

update help message in precision arg

7cb8f63

loubnabnl mentioned this pull request Apr 21, 2023

Commit / Edit / Diff models & their evaluation #47

Closed

loubnabnl requested a review from Muennighoff April 21, 2023 11:34

Muennighoff reviewed Apr 21, 2023

lm_eval/generation.py (outdated, resolved)

Muennighoff reviewed Apr 21, 2023

tests/test_generation_evaluation.py (outdated, resolved)

loubnabnl and others added 3 commits April 21, 2023 14:17

change default precision in tests to None

dd7011a

Co-authored-by: Niklas Muennighoff <n.muennighoff@gmail.com>

use torch dtype to set precision instead of converting model

2b5df4a

Merge branch 'handle-large-model' of https://github.com/bigcode-proje…

726e4c3

…ct/bigcode-evaluation-harness into handle-large-model

loubnabnl merged commit 705b007 into main Apr 21, 2023

loubnabnl deleted the handle-large-model branch June 12, 2023 14:52
