Bug found in the NER tutorial #217

cocoa0409 · 2021-01-31T16:38:28Z

Hi,
https://aistudio.baidu.com/aistudio/projectdetail/1317771
In this NER task, the main model code is:

    class BiGRUWithCRF2(nn.Layer):
        def __init__(self, emb_size, hidden_size, word_num, label_num):
            super(BiGRUWithCRF2, self).__init__()
            self.word_emb = TokenEmbedding(extended_vocab_path='./conf/word.dic', unknown_token='OOV')  # EMB

TokenEmbedding is used incorrectly here.

I checked the source code. The file passed as extended_vocab_path is read as a dictionary, and _read_vocab_list_from_file extracts the vocabulary list from it:

    def _read_vocab_list_from_file(self, extended_vocab_path):
        # load new vocab table from file
        vocab_list = []
        with open(extended_vocab_path, "r", encoding="utf-8") as f:
            for line in f.readlines():
                vocab = line.rstrip("\n").split("\t")[0]
                vocab_list.append(vocab)
        return vocab_list

In the word.dic dictionary for this task, the first column is the index id, not the vocab token.

As a result, TokenEmbedding cannot correctly load the pretrained weights.
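
The mismatch can be reproduced without Paddle. The following is a minimal sketch, not the fix the maintainers applied (the thread does not describe it). It assumes word.dic holds one tab-separated `<id>\t<token>` pair per line; the sample rows, the `column` parameter, and the `word_tokens.dic` file name are hypothetical and exist only for illustration. The parsing logic is copied from `_read_vocab_list_from_file` as a standalone function.

```python
import os
import tempfile


def read_vocab_list_from_file(path, column=0):
    """Standalone copy of the quoted parsing logic.

    `column` is a hypothetical addition; column=0 matches the original behavior.
    """
    vocab_list = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            vocab_list.append(line.rstrip("\n").split("\t")[column])
    return vocab_list


# Assumed layout of word.dic: "<id>\t<token>" per line (made-up sample rows).
sample = "0\t中\n1\t国\n2\tOOV\n"

with tempfile.TemporaryDirectory() as tmp:
    dic_path = os.path.join(tmp, "word.dic")
    with open(dic_path, "w", encoding="utf-8") as f:
        f.write(sample)

    # What the quoted loader sees: the first column, i.e. the ids.
    as_loaded = read_vocab_list_from_file(dic_path)
    # What the tutorial presumably intended: the tokens in the second column.
    intended = read_vocab_list_from_file(dic_path, column=1)

    # One possible workaround: write a token-first copy of the dictionary,
    # so that the unmodified loader (column 0) picks up real tokens.
    fixed_path = os.path.join(tmp, "word_tokens.dic")
    with open(fixed_path, "w", encoding="utf-8") as f:
        f.write("\n".join(intended) + "\n")
    after_fix = read_vocab_list_from_file(fixed_path)

print(as_loaded)  # ['0', '1', '2']
print(intended)   # ['中', '国', 'OOV']
print(after_fix)  # ['中', '国', 'OOV']
```

With the ids loaded as "words", none of the extended vocabulary entries would match tokens in the pretrained embedding, which is consistent with the report that the pretrained weights are not picked up.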


chenxiaozeng · 2021-02-02T08:40:58Z

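Hello, our developers have investigated and confirmed that this was indeed written incorrectly. We expect to fix the bug this week. Thank you for your feedback!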

ZeyuChen · 2021-05-11T12:35:16Z

This bug has already been fixed.

ZeyuChen transferred this issue from PaddlePaddle/models Apr 1, 2021

ZeyuChen assigned kinghuin Apr 2, 2021

ZeyuChen added the bug label Apr 22, 2021

ZeyuChen closed this as completed May 11, 2021
