How to further pre-train with my own data #64

Open
cxyccc opened this issue Jul 26, 2022 · 2 comments
Comments

@cxyccc

cxyccc commented Jul 26, 2022

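Hello! Do you have the code used to pre-train the model? I tried to continue pre-training with run_mlm.py [https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling/run_mlm.py], but the tokenizer that script calls is different from the tokenizer used with your model (BertMaskDataset). After swapping it in I ran into many problems. I would appreciate any help. Thank you!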

cxyccc changed the title from "How to fine-tune with my own data" to "How to further pre-train with my own data" on Aug 1, 2022
@yanghh2000

Hello, do you have the file [方正古隶繁体.ttf24.npy]? It can no longer be downloaded. Could you send me a copy?

@Nonponder

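> Hello! Do you have the code used to pre-train the model? I tried to continue pre-training with run_mlm.py [https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling/run_mlm.py], but the tokenizer that script calls is different from the tokenizer used with your model (BertMaskDataset). After swapping it in I ran into many problems. I would appreciate any help. Thank you!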

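Did you manage to solve this? I have been trying the same thing recently and have not been able to get it running. Would you be open to comparing notes?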
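
For anyone hitting the same tokenizer mismatch: as far as I can tell, the masking step in run_mlm.py is done by a data collator that relies on the Hugging Face tokenizer interface, so it is one of the places that may break when a custom tokenizer is swapped in. Below is a minimal, library-free sketch of BERT-style dynamic masking (pick about 15% of tokens; of those, 80% become the mask token, 10% a random token, 10% stay unchanged). This is not the maintainers' pre-training code. The function name `mask_tokens` and the token ids in the demo are hypothetical, and any extra inputs the model's own tokenizer produces are not handled here.

```python
import random


def mask_tokens(input_ids, mask_token_id, vocab_size,
                special_ids=(), mlm_probability=0.15, rng=None):
    """Return (masked_ids, labels) for one sequence of token ids.

    Hypothetical helper, not part of any library. Labels are -100 at
    positions that were not selected, which is the default ignore_index
    of PyTorch's CrossEntropyLoss.
    """
    rng = rng or random.Random()
    special = set(special_ids)
    masked = list(input_ids)
    labels = [-100] * len(input_ids)

    for i, tok in enumerate(input_ids):
        # Never select special tokens such as [CLS] / [SEP] / padding.
        if tok in special or rng.random() >= mlm_probability:
            continue
        labels[i] = tok                 # the model must predict the original id
        roll = rng.random()
        if roll < 0.8:
            masked[i] = mask_token_id   # 80%: replace with the mask token
        elif roll < 0.9:
            masked[i] = rng.randrange(vocab_size)  # 10%: random token
        # remaining 10%: keep the original token unchanged
    return masked, labels


if __name__ == "__main__":
    # Assumed ids for illustration only; read the real ones from your vocab file.
    ids = [101, 2769, 4263, 872, 102]
    masked, labels = mask_tokens(ids, mask_token_id=103, vocab_size=21128,
                                 special_ids=(101, 102), rng=random.Random(0))
    print(masked, labels)
```

Because the function only needs plain integer ids, it can be called on the output of whatever tokenizer the model ships with before batching, instead of adapting that tokenizer to the interface the script expects.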

3 participants