
Approximately how much GPU memory is needed for training? #1

Closed
doublecheng12 opened this issue Nov 22, 2023 · 6 comments

Comments

@doublecheng12

No description provided.

@zhengbw0324
Owner

@doublecheng12 Hello!
Our current setting is a batch size of 8 per GPU on 40GB GPUs, and GPU memory is generally fully occupied during training. However, you can reduce the batch size and increase gradient_accumulation_steps to save memory.
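For reference, a minimal sketch of that trade-off, assuming a HuggingFace-style TrainingArguments interface; the repository's training script may expose these options under different names:

```python
# Hedged sketch: halve the per-device batch size and double the accumulation
# steps so the effective batch size per GPU stays at 8 while peak memory drops.
from transformers import TrainingArguments

args = TrainingArguments(
    output_dir="./output",              # placeholder path
    per_device_train_batch_size=4,      # was 8 on 40GB GPUs
    gradient_accumulation_steps=2,      # was 1; 4 * 2 = 8 effective per GPU
    bf16=True,                          # mixed precision further reduces memory
)
```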

@doublecheng12
Author

Thank you for the reply. So the training was done with 8 × 40GB GPUs?

@zhengbw0324
Owner

@doublecheng12
Yes, 8 GPUs with 40GB each. In practice, fewer GPUs can also complete the training, just more slowly. Alternatively, you can try LoRA fine-tuning, for which we also provide code.
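In case it helps, a minimal LoRA sketch using the PEFT library; the repository ships its own LoRA code, so the base model name and target modules below are illustrative assumptions, not the repo's actual configuration:

```python
# Hedged sketch of LoRA fine-tuning with PEFT; only the low-rank adapter
# weights are trained, which greatly reduces memory compared to full tuning.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

model = AutoModelForCausalLM.from_pretrained("your-base-model")  # placeholder

lora_config = LoraConfig(
    r=8,
    lora_alpha=16,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumed attention projection names
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only a small fraction of weights are trainable
```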

@doublecheng12
Author

Thanks again for your reply. The idea behind this work is very novel. Could the training be completed with 80GB of GPU memory? My lab does not have that many GPUs, so at most I could probably request 1-2 A800s.

@zhengbw0324
Owner

I haven't tried a single 80GB A800, but two 80GB GPUs will definitely work.

@doublecheng12
Author

Thank you for replying so late. Wishing you smooth progress in your research.
