Using a Chinese dataset and model loading problems #19

yedongyu1996 · 2022-10-26T11:50:05Z

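Hello,
Following your paper A Unified Generative Framework for Aspect-Based Sentiment, I would like to use this model for Chinese ABSA, so I replaced the original facebook/bart-base with fnlp/bart-base-chinese. However, I have run into the following problems: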

1. With transformers 4.4.1, loading the model fails with:

RuntimeError: Error(s) in loading state_dict for BartModel:
size mismatch for encoder.embed_positions.weight: copying a param with shape torch.Size([514, 768]) from checkpoint, the shape in current model is torch.Size([512, 768]).
size mismatch for encoder.embed_positions.weight: copying a param with shape torch.Size([514, 768]) from checkpoint, the shape in current model is torch.Size([512, 768]).

The error is raised mainly at this call:
model = BartSeq2SeqModel.build_model(bart_name, tokenizer, label_ids=label_ids, decoder_type=decoder_type, copy_gate=False, use_encoder_mlp=use_encoder_mlp, use_recur_pos=False)
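The message above reports a position embedding table with 514 rows in the checkpoint and 512 rows in the freshly built model. A minimal sketch for listing every parameter whose shape differs is given here. It works on plain dictionaries of shapes, so it runs without PyTorch. The helper name `find_shape_mismatches` is hypothetical and is not part of this repository or of transformers. With real models the shape dictionaries could be built as `{k: tuple(v.shape) for k, v in state_dict.items()}`. It only locates mismatches and does not establish their cause.

```python
def find_shape_mismatches(checkpoint_shapes, model_shapes):
    """Return {name: (checkpoint_shape, model_shape)} for parameters whose shapes differ.

    Hypothetical helper: both arguments map parameter names to shape tuples.
    Parameters that are missing from the model are skipped.
    """
    mismatches = {}
    for name, ckpt_shape in checkpoint_shapes.items():
        model_shape = model_shapes.get(name)
        if model_shape is not None and tuple(ckpt_shape) != tuple(model_shape):
            mismatches[name] = (tuple(ckpt_shape), tuple(model_shape))
    return mismatches


# Shapes copied from the error message in this post.
checkpoint_shapes = {"encoder.embed_positions.weight": (514, 768)}
model_shapes = {"encoder.embed_positions.weight": (512, 768)}

for name, (ckpt, model) in find_shape_mismatches(checkpoint_shapes, model_shapes).items():
    print(f"{name}: checkpoint {ckpt} vs model {model}")
```

If only the position embeddings show up, the checkpoint's configuration and the model-building code probably disagree on the number of positions. That is an assumption to verify against the checkpoint's config.json.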

2. The bart-base released by facebook ships merges.txt and a JSON vocab file, which differs from the files you provide on Hugging Face. When I load the bart-base-chinese files you provide on Hugging Face with tokenizer.from_pretrained("bart-base-chinese"), I get this error:

OSError: Can't load tokenizer for 'bart-base-chinese'. Make sure that:
'bart-base-chinese' is a correct model identifier listed on 'https://huggingface.co/models'
or 'bart-base-chinese' is the correct path to a directory containing relevant tokenizer files

How can I solve this?
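A small sketch for checking which tokenizer format a local model directory contains. In transformers, `BartTokenizer` reads `vocab.json` and `merges.txt` (byte-level BPE), while `BertTokenizer` reads `vocab.txt` (WordPiece). A directory with only `vocab.txt` therefore cannot be loaded by the BART tokenizer class. The function name `guess_tokenizer_family` and its return labels are hypothetical, chosen for illustration. As far as I know, the model card for fnlp/bart-base-chinese recommends `BertTokenizer`, which is worth confirming there. Note also that the name in the error lacks the `fnlp/` namespace used earlier in this post, so it would have to resolve as a local directory.

```python
import os


def guess_tokenizer_family(model_dir):
    """Guess the tokenizer format from the files present in model_dir.

    Hypothetical helper. Returns one of:
      "missing"   - model_dir is not an existing directory
      "bpe"       - vocab.json and merges.txt found (BartTokenizer-style files)
      "wordpiece" - vocab.txt found (BertTokenizer-style files)
      "unknown"   - none of the expected vocabulary files found
    """
    if not os.path.isdir(model_dir):
        return "missing"
    files = set(os.listdir(model_dir))
    if {"vocab.json", "merges.txt"} <= files:
        return "bpe"
    if "vocab.txt" in files:
        return "wordpiece"
    return "unknown"


if __name__ == "__main__":
    # "bart-base-chinese" is the local directory name used in this post.
    print(guess_tokenizer_family("bart-base-chinese"))
```

If the result is "wordpiece", loading the directory with `BertTokenizer.from_pretrained(model_dir)` is the natural thing to try. If it is "missing", the path passed to `from_pretrained` does not point at the downloaded files.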

zr941436946 · 2023-11-11T17:57:55Z

Hello, have you managed to solve this problem?
