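https://github.com/PaddlePaddle/models/tree/release/1.8/PaddleNLP/pretrain_language_models/BERT
In the data preprocessing section, the ID-converted example shows the first token with id 1? Which vocabulary are you using? With the bert-base vocabulary, cls should be 102 and sep should be 103, right?
Alternatively, could you release the corresponding create_train_data.py code? That shouldn't be much trouble.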
Could someone answer this? Without knowing which vocabulary is used, pre-training can't even get started.
I just looked into this; here are some results:
data/demo_config/vocab.txt
data/demo_wiki_tokens.txt
龙 江 ic ( 平 假 名 : ) 是 位 于 长 野 县 饭 田 市 的 三 远 南 信 自 动 车 道 之 交 流 道 。 现 时 还 未 启 用 。
data/train/demo_wiki_train.gz
1
2
As for the requested create_train_data.py, it should be train.py [4] and tokenization.py [5].
[1] https://github.com/PaddlePaddle/models/blob/release/1.8/PaddleNLP/pretrain_language_models/BERT/data/demo_config/vocab.txt
[2] https://github.com/PaddlePaddle/models/blob/release/1.8/PaddleNLP/pretrain_language_models/BERT/data/demo_wiki_tokens.txt
[3] https://github.com/PaddlePaddle/models/blob/release/1.8/PaddleNLP/pretrain_language_models/BERT/data/train/demo_wiki_train.gz
[4] https://github.com/PaddlePaddle/models/blob/release/1.8/PaddleNLP/pretrain_language_models/BERT/train.py
[5] https://github.com/PaddlePaddle/models/blob/release/1.8/PaddleNLP/pretrain_language_models/BERT/tokenization.py
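
To make the vocabulary question concrete, here is a minimal sketch of how a vocab.txt-style file can determine token ids. It is not the repository's tokenization.py; it assumes one token per line, with an optional tab-separated explicit id, and otherwise uses the zero-based line index as the id. The `load_vocab` helper and the six-entry `demo_lines` vocabulary are hypothetical and only illustrate why the id of `[CLS]` depends entirely on which vocabulary file is used.

```python
def load_vocab(lines):
    """Map each token to an id.

    Assumption: each line is either "token" or "token<TAB>id".
    Without an explicit id, the zero-based line index is used.
    """
    vocab = {}
    for index, line in enumerate(lines):
        items = line.rstrip("\n").split("\t")
        token = items[0].strip()
        if not token:
            continue  # skip blank lines; the line index still advances
        vocab[token] = int(items[1]) if len(items) == 2 else index
    return vocab


# Hypothetical miniature vocabulary, NOT the contents of the demo vocab.txt.
demo_lines = ["[PAD]", "[CLS]", "[SEP]", "[MASK]", "龙", "江"]
vocab = load_vocab(demo_lines)

# Convert a short token sequence to ids.
tokens = ["[CLS]", "龙", "江", "[SEP]"]
ids = [vocab[t] for t in tokens]
print(ids)
```

With a vocabulary ordered like this hypothetical one, `[CLS]` maps to a small id simply because it sits near the top of the file; to reproduce the ids in the demo data, open the actual vocab.txt [1] with the same kind of loader and look up the special tokens there.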
https://github.com/PaddlePaddle/PaddleNLP/tree/develop/examples/language_model/bert