In text2sql and schema_item_classifier, is the fine-tuning approach full-parameter or partial-parameter fine-tuning? #40

Closed
ManchesterWuer opened this issue Jul 4, 2023 · 3 comments

Comments

@ManchesterWuer

I read through the code and could not find anywhere that freezes a subset of the parameters or specifies how many are frozen.

@lihaoyang-ruc
Contributor

Full-parameter fine-tuning.

@ManchesterWuer
Author

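mT5-XL (3.7 billion parameters) with full-parameter fine-tuning: did you do this on a single A100? I may have underestimated what an A100 can do.
With one A100 and the CSpider data (about 10,000 examples), roughly how long does training take?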
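
For context on the single-A100 question, here is a rough back-of-the-envelope sketch of the memory that weights, gradients, and optimizer state would need for a 3.7 billion parameter model. Everything except the parameter count is an assumption for illustration: the thread does not say which optimizer or numeric precision was used, the helper name `training_memory_gb` is hypothetical, and activation memory is ignored entirely.

```python
def training_memory_gb(num_params, bytes_weights=4, bytes_grads=4, bytes_optimizer=8):
    """Rough memory (decimal GB) for weights + gradients + optimizer state.

    Defaults assume fp32 weights and gradients plus two fp32 Adam moments
    per parameter. Activations and framework overhead are NOT included.
    """
    bytes_per_param = bytes_weights + bytes_grads + bytes_optimizer
    return num_params * bytes_per_param / 1e9


MT5_XL_PARAMS = 3.7e9  # parameter count quoted in this thread

# Assumption: fp32 Adam, 16 bytes per parameter in total.
adam_fp32 = training_memory_gb(MT5_XL_PARAMS)

# Assumption: an optimizer whose per-parameter state is negligible
# (for example, one with factored second moments), 8 bytes per parameter.
factored = training_memory_gb(MT5_XL_PARAMS, bytes_optimizer=0)

print(f"fp32 Adam:            {adam_fp32:.1f} GB")
print(f"negligible opt state: {factored:.1f} GB")
```

Under these assumptions the two cases come to about 59.2 GB and 29.6 GB before activations, so whether a single card suffices would depend heavily on the optimizer, the precision, and the card's memory size.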

@lihaoyang-ruc
Contributor

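Sorry, I don't remember exactly, but training can be completed within two days. On top of the training time itself, we also need some time to evaluate the intermediate checkpoints.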

2 participants