How to further pre-train with my own data #64

cxyccc · 2022-07-26T08:21:40Z

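Hello! Do you have the code used to pre-train the model? I tried using run_mlm.py[https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling/run_mlm.py] for further pre-training, but the tokenizer that script calls is different from the tokenizer in your model (BertMaskDataset). After swapping it in, I ran into many problems. I hope you can help. Thank you!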

yanghh2000 · 2022-08-19T08:55:48Z

Hello, do you have the file [方正古隶繁体.ttf24.npy]? It can no longer be downloaded. Could you send me a copy?

Nonponder · 2022-11-03T12:51:14Z

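> Hello! Do you have the code used to pre-train the model? I tried using run_mlm.py[https://github.com/huggingface/transformers/tree/main/examples/pytorch/language-modeling/run_mlm.py] for further pre-training, but the tokenizer that script calls is different from the tokenizer in your model (BertMaskDataset). After swapping it in, I ran into many problems. I hope you can help. Thank you!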

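Have you solved this problem? I have been trying the same thing recently and still cannot get it to run. Would you be open to discussing it?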

cxyccc changed the title ~~How to fine-tune with my own data~~ How to further pre-train with my own data Aug 1, 2022
