Bug found in the NER tutorial #217

Closed
cocoa0409 opened this issue Jan 31, 2021 · 2 comments

@cocoa0409

Hi,
https://aistudio.baidu.com/aistudio/projectdetail/1317771
In this NER task, the main model code uses TokenEmbedding incorrectly:

```python
class BiGRUWithCRF2(nn.Layer):
    def __init__(self, emb_size, hidden_size, word_num, label_num):
        super(BiGRUWithCRF2, self).__init__()
        self.word_emb = TokenEmbedding(extended_vocab_path='./conf/word.dic', unknown_token='OOV')  # EMB
```

I checked the source code. The file passed as extended_vocab_path is read as a dictionary, and the vocabulary list is extracted from it by _read_vocab_list_from_file:
```python
def _read_vocab_list_from_file(self, extended_vocab_path):
    # load new vocab table from file
    vocab_list = []
    with open(extended_vocab_path, "r", encoding="utf-8") as f:
        for line in f.readlines():
            vocab = line.rstrip("\n").split("\t")[0]
            vocab_list.append(vocab)
    return vocab_list
```
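In word.dic, the dictionary used for this task, the first column is the index id, not the vocab token.

As a result, TokenEmbedding takes the ids as the vocabulary and cannot load the pretrained weights correctly.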
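To illustrate the mismatch, here is a minimal sketch that reproduces the first-column parsing on a small file. The sample contents, the file name `word_tokens.dic`, and the helpers `read_first_column` and `write_token_first_vocab` are hypothetical; they assume word.dic uses an `id<TAB>token` layout as described above. This is one possible workaround on the user side, not necessarily the fix the maintainers applied.

```python
import os
import tempfile


def read_first_column(path):
    """Same parsing as the quoted method: keep the first tab-separated field."""
    vocab_list = []
    with open(path, "r", encoding="utf-8") as f:
        for line in f:
            vocab_list.append(line.rstrip("\n").split("\t")[0])
    return vocab_list


def write_token_first_vocab(src_path, dst_path):
    """Rewrite an `id<TAB>token` file so each line holds only the token."""
    with open(src_path, "r", encoding="utf-8") as src, \
            open(dst_path, "w", encoding="utf-8") as dst:
        for line in src:
            fields = line.rstrip("\n").split("\t")
            if len(fields) >= 2:  # skip lines without a token column
                dst.write(fields[1] + "\n")


# Hypothetical sample in the assumed layout: id first, token second.
sample_lines = ["0\t我", "1\t是", "2\tOOV"]

tmp_dir = tempfile.mkdtemp()
src_path = os.path.join(tmp_dir, "word.dic")
dst_path = os.path.join(tmp_dir, "word_tokens.dic")
with open(src_path, "w", encoding="utf-8") as f:
    f.write("\n".join(sample_lines) + "\n")

# Parsing the original file yields the ids, not the tokens.
ids_as_vocab = read_first_column(src_path)

# After rewriting, the same parsing yields the tokens.
write_token_first_vocab(src_path, dst_path)
tokens_as_vocab = read_first_column(dst_path)

print(ids_as_vocab)
print(tokens_as_vocab)
```

Pointing extended_vocab_path at a token-first file like the rewritten one would give the parser real tokens to match against the pretrained vocabulary, instead of numeric ids.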

@chenxiaozeng
Contributor

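Hello, our developers have looked into this and confirmed that the code here is indeed wrong. We expect to fix the bug this week. Thank you for your feedback!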

@ZeyuChen ZeyuChen transferred this issue from PaddlePaddle/models Apr 1, 2021
@ZeyuChen ZeyuChen added the bug Something isn't working label Apr 22, 2021
@ZeyuChen
Member

This bug has already been fixed.
