
What is the best way to handle large models? #31

Closed
Pier297 opened this issue Dec 27, 2022 · 2 comments

Comments


Pier297 commented Dec 27, 2022

Hi all,
I was trying to fine-tune GPT-J 6B, but I ran into out-of-memory errors on a single GPU. For non-private training I managed to solve this with DeepSpeed, but it doesn't seem to work with Opacus or with this codebase. Do you know how I could solve this problem?
Thank you in advance :)
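(Not an official answer from this repo, and not verified against Opacus's per-sample gradient machinery: one generic workaround for activation-memory pressure is gradient checkpointing, which recomputes intermediate activations during the backward pass instead of storing them. HuggingFace models expose this via `model.gradient_checkpointing_enable()`; the sketch below shows the underlying idea with plain `torch.utils.checkpoint` on a hypothetical toy stack standing in for transformer blocks.)

```python
import torch
import torch.nn as nn
from torch.utils.checkpoint import checkpoint

class Block(nn.Module):
    """Toy residual feed-forward block standing in for a transformer layer."""
    def __init__(self, dim):
        super().__init__()
        self.ff = nn.Sequential(nn.Linear(dim, dim), nn.ReLU(), nn.Linear(dim, dim))

    def forward(self, x):
        return x + self.ff(x)

class CheckpointedStack(nn.Module):
    """Wraps each block in torch.utils.checkpoint so its activations
    are recomputed in the backward pass rather than kept in memory."""
    def __init__(self, dim, depth):
        super().__init__()
        self.blocks = nn.ModuleList(Block(dim) for _ in range(depth))

    def forward(self, x):
        for blk in self.blocks:
            # use_reentrant=False is the non-reentrant checkpointing path
            x = checkpoint(blk, x, use_reentrant=False)
        return x

model = CheckpointedStack(dim=16, depth=4)
x = torch.randn(2, 16, requires_grad=True)
loss = model(x).sum()
loss.backward()  # gradients still flow through the checkpointed blocks
print(x.grad.shape)
```

This trades extra forward compute for lower peak memory; whether it is enough for GPT-J 6B on a single GPU (and whether it composes with this codebase's privacy engine) is untested here.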

@lxuechen
Owner

Hi,

Thanks for your interest. I have detailed thoughts on this, but the short answer is that we likely need to make some non-trivial changes to the codebase to enable that (if you have 80G A100 GPUs, things might be easier).

If you're interested in making progress on this, I'm happy to chat in depth via email.

Thanks.

@lxuechen
Owner

Closing this now.
