Incorrect triple predictions after fine-tuning #35
Comments
Please post your training parameters (learning rate, optimizer, etc.). Also check whether your training data was processed correctly. The complete fine-tuning code is here: finetune_IE_task.ipynb
Hi, the code I ran is identical to the main branch; I only changed the base path to the t5 model (google-t5/t5-base). The parameters are:
SFTconfig(max_seq_len=512, tokenizer_dir='/home/xxx/mycode/demo2/model_save/', sft_train_file='./data/my_train.json', batch_size=16, num_train_epochs=6, save_steps=3000, gradient_accumulation_steps=4, learning_rate=5e-05, logging_first_step=True, logging_steps=20, output_dir='./model_save/ie_task', warmup_steps=1000, fp16=True, seed=23333)
training_args = Seq2SeqTrainingArguments(
output_dir=config.output_dir,
per_device_train_batch_size=config.batch_size,
auto_find_batch_size=True,  # avoid OOM
gradient_accumulation_steps=config.gradient_accumulation_steps,
learning_rate=config.learning_rate,
logging_steps=config.logging_steps,
num_train_epochs=config.num_train_epochs,
optim="adafactor",
report_to='tensorboard',
log_level='info',
save_steps=config.save_steps,
save_total_limit=3,
fp16=config.fp16,
logging_first_step=config.logging_first_step,
warmup_steps=config.warmup_steps,
seed=config.seed,
generation_config=generation_config,
)

I found it: I had changed one piece of the code myself:

def sft_train(config: SFTconfig) -> None:
    # step 1. load the tokenizer
    tokenizer = PreTrainedTokenizerFast.from_pretrained(config.tokenizer_dir)
    tokenizer.add_special_tokens({'pad_token': '[PAD]'})  # line I added here
    ...

I added it because an earlier run raised this error:

ValueError: Asking to pad but the tokenizer does not have a padding token. Please select a token to use as `pad_token` `(tokenizer.pad_token = tokenizer.eos_token e.g.)` or add a new pad token via `tokenizer.add_special_tokens({'pad_token': '[PAD]'})`

It looks like a package-version mismatch.
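As an aside, unconditionally adding a `[PAD]` token is risky: if the token really is new, the model's embedding matrix must also be resized, otherwise the new token id has no embedding row and training degrades. A minimal sketch of the safer pattern; the helper name `ensure_pad_token` is mine, not from the project:

```python
def ensure_pad_token(tokenizer, model=None):
    """Add a [PAD] token only if the tokenizer lacks one.

    If a token is added and a model is given, resize the model's
    embedding table so the new token id points at a real row.
    """
    if tokenizer.pad_token is None:
        tokenizer.add_special_tokens({"pad_token": "[PAD]"})
        if model is not None:
            model.resize_token_embeddings(len(tokenizer))
    return tokenizer.pad_token
```

With the project's own tokenizer this is a no-op, since `pad_token` is already set; with a tokenizer that genuinely lacks one, it adds `[PAD]` and keeps the embedding table consistent.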
Recreate a virtual environment from requirements.txt, comment out the line you added, and run it again.
If you use the project's tokenizer, pad_token already exists and you don't need to add it yourself. If the dependencies match, then most likely the model files are incomplete; try re-downloading them. From mainland China you can download via modelscope:
from modelscope import snapshot_download
model_id = 'charent/ChatLM-mini-Chinese'
model_id = snapshot_download(model_id, cache_dir='./model_save')
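To sanity-check that a download is actually complete before loading it, one can confirm the snapshot directory contains both a config and a weights file. A rough sketch; the helper `snapshot_looks_complete` is hypothetical, and the file names assume a standard Hugging Face checkpoint layout:

```python
import os

def snapshot_looks_complete(model_dir):
    """Return True if the directory has a config file and at least
    one recognized weights file (PyTorch bin or safetensors)."""
    names = os.listdir(model_dir)
    has_config = "config.json" in names
    has_weights = any(
        f in names for f in ("pytorch_model.bin", "model.safetensors")
    )
    return has_config and has_weights
```

This only catches missing files, not truncated ones; for a stronger check, compare file sizes against the listing on the model hub.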
Okay, got it. Thanks!
Hi, I fine-tuned the triple-extraction task with the provided script. After 5 epochs the loss converged to 0.087100, but the predicted triples on test samples look implausible.
Is this caused by incorrect default parameter settings, or by package version differences? Both training and test data use DuIE 1.0, processed into the required format. The base model is t5-base downloaded from Hugging Face.