A question about the XLNet pretraining process #32

Open
genggui001 opened this issue Jun 23, 2019 · 3 comments

Comments

@genggui001

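Take a piece of text, select K of its words, and mask only one of them at a time, producing K training examples. Then maximize the log-probability of the correct word across those K examples.

Wouldn't that achieve the same effect as XLNet?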

@zihangdai
Owner

Objective-wise, this is the same as XLNet. However, this procedure requires K different forward and backward passes, which makes it too expensive to use.
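To make the cost concrete, here is a minimal sketch of the mask-one-at-a-time procedure described in the question. It is only an illustration, not code from the XLNet repository: the `[MASK]` placeholder, the function name `single_mask_examples`, and the toy sentence are all assumptions made for the example.

```python
from typing import List, Tuple

MASK = "[MASK]"  # hypothetical placeholder token, not taken from the XLNet code


def single_mask_examples(
    tokens: List[str], positions: List[int]
) -> List[Tuple[List[str], int, str]]:
    """Build one training example per chosen position, masking only that position."""
    examples = []
    for pos in positions:
        masked = list(tokens)  # copy so the original sequence is left untouched
        target = masked[pos]
        masked[pos] = MASK
        examples.append((masked, pos, target))
    return examples


tokens = "the cat sat on the mat".split()
positions = [1, 3, 5]  # K = 3 chosen positions
examples = single_mask_examples(tokens, positions)

# Each example is a separate input sequence, so each one needs its own
# forward and backward pass: the number of passes grows linearly with K.
passes_needed = len(examples)
```

With K chosen positions this yields K distinct input sequences for a single piece of text, which is where the K forward and backward passes come from.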

@kimiyoung
Collaborator

@genggui001 That would take 85x more machines, which makes training almost impossible. Also, given 85x more machines, simply scaling up XLNet would probably work better because of its better data efficiency.

@guotong1988

Thank you. Learned a lot here.
