How to train bert from scratch using line_by_line corpus #1794
Unanswered
yiqiang-zhao
asked this question in
Community | Q&A
Replies: 1 comment 3 replies
-
Hi Yiqiang, thanks for your question. Could you share the error you ran into, i.e. the output that followed the command?
-
Hi, I want to train a BERT model from scratch; here is my code. I followed this example and used the bert_base_tp1d.py config. I rewrote
build_data
to load a line-by-line corpus. My environment is a Docker container based on the image hpcaitech/colossalai:0.1.8, on a machine with 8 V100-16G GPUs. I used the training command below:
colossalai run --nproc_per_node=8 cai_bert_mlm_trainer_clean.py --config=bert_base_tp1d.py --train /tf/wiki_zh_aa_09.sent --tokenizer /tf/bert-new-chinese --checkpoint_dir /tf/colossalai-bert-mlm --dataset_fmt text --line_by_line --from_torch
But I got the following errors:
So far I can't figure out the problem, so I need a favor. Please give me some advice or point me to a similar working example, thanks.
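For anyone debugging a similar build_data rewrite, here is a minimal sketch of what "line-by-line" loading usually means: one training example per non-blank line of the corpus file. The class name `LineByLineDataset` and the `tokenize` callable are hypothetical, not ColossalAI or bert_base_tp1d.py APIs; in a real setup `tokenize` would wrap the BERT tokenizer (e.g. the one loaded from /tf/bert-new-chinese).

```python
class LineByLineDataset:
    """Reads non-empty lines from a text file and tokenizes each one.

    Hypothetical sketch, not a ColossalAI API: `tokenize` is any callable
    mapping a string to a list of tokens or token ids.
    """

    def __init__(self, path, tokenize, max_length=128):
        self.tokenize = tokenize
        self.max_length = max_length
        with open(path, encoding="utf-8") as f:
            # Keep only non-blank lines: one training example per line.
            self.lines = [ln.strip() for ln in f if ln.strip()]

    def __len__(self):
        return len(self.lines)

    def __getitem__(self, idx):
        # Truncate each tokenized line to at most max_length tokens.
        return self.tokenize(self.lines[idx])[: self.max_length]
```

A loader like this can be wrapped in a DataLoader and plugged into the trainer in place of the original build_data output; the key point is that blank lines are dropped and each remaining line becomes an independent example.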