
What is the best way to handle large models? #31

Closed
Pier297 opened this issue Dec 27, 2022 · 2 comments

Comments


Pier297 commented Dec 27, 2022

Hi all,
I was trying to fine-tune GPT-J 6B, but I ran into out-of-memory errors on a single GPU. For non-private training I managed to solve this with DeepSpeed, but it doesn't seem to work with Opacus or with this codebase. Do you know how I could solve this problem?
Thank you in advance :)
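(Not an official answer from this repo, and not verified against Opacus's per-sample gradient machinery: one generic workaround for activation-memory pressure is gradient checkpointing, which recomputes intermediate activations during the backward pass instead of storing them. HuggingFace models expose this via `model.gradient_checkpointing_enable()`; the sketch below shows the underlying idea with plain `torch.utils.checkpoint` on a hypothetical toy stack standing in for transformer blocks.)

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class Block(nn.Module):
    """Toy residual feed-forward block standing in for a transformer layer."""
    def __init__(self, dim):
        super().__init__()
        self.ff = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x):
        return x + self.ff(x)

class CheckpointedStack(nn.Module):
    """Wraps each block in torch.utils.checkpoint so its activations
    are recomputed in the backward pass rather than kept in memory."""
    def __init__(self, dim, depth):
        super().__init__()
        self.blocks = nn.ModuleList(Block(dim) for _ in range(depth))

    def forward(self, x):
        for blk in self.blocks:
            # use_reentrant=False is the non-reentrant checkpointing path
            x = checkpoint(blk, x, use_reentrant=False)
        return x

model = CheckpointedStack(dim=16, depth=4)
x = torch.randn(2, 16, requires_grad=True)
loss = model(x).sum()
loss.backward()  # gradients still flow through the checkpointed blocks
print(x.grad.shape)
```

This trades extra forward compute for lower peak memory; whether it is enough for GPT-J 6B on a single GPU (and whether it composes with this codebase's privacy engine) is untested here.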

@lxuechen
Owner

Hi,

Thanks for your interest. I have detailed thoughts on this, but the short answer is that we likely need to make some non-trivial changes to the codebase to enable that (if you have 80G A100 GPUs, things might be easier).

If you're interested in making progress on this, I'm happy to chat in depth via email.

Thanks.

@lxuechen
Owner

Closing this now.
