Training error on other datasets #5

Closed
Yuknoshita opened this issue Oct 30, 2021 · 5 comments
Yuknoshita commented Oct 30, 2021

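I applied your data processing (the one used for nytAndWebnlg) to another dataset.
My guess is that this code in your dataloader:

    subj = entity[subj_idx]
    obj = entity[obj_idx]
    rc_head_labels += [subj['start'], obj['start'], re['type']]
    rc_tail_labels += [subj['end']-1, obj['end']-1, re['type']]

means that the entities become [1, 1, 'None', 16, 20, 'None'], where the two numbers are the word-level start and end indices of an entity and None is its type. The relations are rc_head_labels = [1, 1, '/location/location/contains'] and rc_tail_labels = [16, 20, '/location/location/contains'], that is, the word-index pairs of the head and tail entities plus the relation type.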

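My questions are:

  1. Is my understanding of the data processing correct?
  2. Is there any difference between setting the entity type to None and using the actual type?
  3. During training, forward() fails with a dimension mismatch. What causes this, and how should I fix it?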

Traceback (most recent call last):
File "B:\work\pycharm\PYCHARM\PyCharm Community Edition 2020.3.4\plugins\python-ce\helpers\pydev\pydevd.py", line 1483, in _exec
pydev_imports.execfile(file, globals, locals) # execute the script
File "B:\work\pycharm\PYCHARM\PyCharm Community Edition 2020.3.4\plugins\python-ce\helpers\pydev_pydev_imps_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "B:/model/PFN-nested/main.py", line 197, in <module>
ner_pred, re_head_pred, re_tail_pred = model(text, mask)
File "B:\work\anaconda\envs\pfn\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "B:\model\PFN-nested\model\pfn.py", line 260, in forward
x = self.bert(**x)[0]
File "B:\work\anaconda\envs\pfn\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "B:\work\anaconda\envs\pfn\lib\site-packages\transformers\models\bert\modeling_bert.py", line 989, in forward
past_key_values_length=past_key_values_length,
File "B:\work\anaconda\envs\pfn\lib\site-packages\torch\nn\modules\module.py", line 1051, in _call_impl
return forward_call(*input, **kwargs)
File "B:\work\anaconda\envs\pfn\lib\site-packages\transformers\models\bert\modeling_bert.py", line 221, in forward
embeddings += position_embeddings
RuntimeError: The size of tensor a (588) must match the size of tensor b (512) at non-singleton dimension 1

Process finished with exit code 1

@Coopercoppers
Owner

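  1. rc_head_labels should be subj_head, obj_head, relation. The 1, 1 you wrote above is subj_head, subj_tail, which is incorrect.
  2. If you do not need to classify entities by type, just set the type to None.
  3. This is probably because one of your sentences is longer than 512 tokens after tokenization. Check your data to see which one exceeds the limit.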
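To make the layout in point 1 concrete, here is a minimal sketch of how the two label lists could be built for the example from the question. It assumes each entity has an exclusive `end` index (inferred from the `-1` in the quoted dataloader code); the function name `build_relation_labels` and the `head`/`tail` field names are hypothetical and not taken from the repository.

```python
def build_relation_labels(entities, relations):
    """Build flat head/tail label lists from entity spans and relations.

    Assumptions (illustrative, not necessarily the repository's format):
    - each entity is {'start': int, 'end': int} with an exclusive 'end';
    - each relation is {'head': int, 'tail': int, 'type': str}, where
      'head' and 'tail' index into `entities`.
    """
    rc_head_labels, rc_tail_labels = [], []
    for rel in relations:
        subj = entities[rel["head"]]
        obj = entities[rel["tail"]]
        # Head labels pair the FIRST token of the subject with the first token of the object.
        rc_head_labels += [subj["start"], obj["start"], rel["type"]]
        # Tail labels pair the LAST token of the subject with the last token of the object.
        rc_tail_labels += [subj["end"] - 1, obj["end"] - 1, rel["type"]]
    return rc_head_labels, rc_tail_labels


# Subject covers token 1 only; object covers tokens 16 to 20.
entities = [{"start": 1, "end": 2}, {"start": 16, "end": 21}]
relations = [{"head": 0, "tail": 1, "type": "/location/location/contains"}]

head_labels, tail_labels = build_relation_labels(entities, relations)
print(head_labels)  # [1, 16, '/location/location/contains']
print(tail_labels)  # [1, 20, '/location/location/contains']
```

So for this example the head list is [1, 16, relation] and the tail list is [1, 20, relation], rather than [1, 1, relation] and [16, 20, relation].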

@Yuknoshita
Author

Yuknoshita commented Oct 30, 2021 via email

@Yuknoshita
Author

Yuknoshita commented Oct 30, 2021 via email

@Coopercoppers
Owner

Pull out the sentence whose length is 588 after tokenization, print it, and then analyze it.
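One way to locate such a sentence is to scan the dataset and report every input whose token count exceeds the 512-position limit from the error message. The sketch below is only an illustration: `find_overlong` and `toy_encode` are hypothetical names, and the toy encoder stands in for the real tokenizer so the example runs offline.

```python
def find_overlong(sentences, encode, max_len=512):
    """Return (index, token_count, sentence) for every input longer than max_len."""
    overlong = []
    for i, sentence in enumerate(sentences):
        n_tokens = len(encode(sentence))
        if n_tokens > max_len:
            overlong.append((i, n_tokens, sentence))
    return overlong


def toy_encode(sentence):
    # Stand-in encoder: one token per whitespace-separated word plus two
    # special tokens. With Hugging Face transformers, a tokenizer's `encode`
    # method could be passed instead; the model's own preprocessing may
    # tokenize differently, so treat the counts as an approximation.
    return ["[CLS]"] + sentence.split() + ["[SEP]"]


sentences = ["a short sentence", " ".join(["word"] * 586)]
for index, n_tokens, sentence in find_overlong(sentences, toy_encode):
    # Print only the beginning of the sentence to keep the output readable.
    print(index, n_tokens, sentence[:40])
```

Once the offending sentences are known, they can be inspected, split, or dropped before training.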

@Yuknoshita
Author

Yuknoshita commented Oct 30, 2021 via email
