about char pretrained embedding #12

ghost · 2018-04-26T05:13:49Z

Thank you for this excellent open-source code.
I have one question about the pretrained character embeddings. The class "Data" loads the pretrained embeddings for characters, but I do not know where they are used. Maybe I need to add a parameter called "pretrain_char_embedding", pass it into a class such as CharBilstm, and modify the code like below:
    if pretrain_char_embedding is not None:
        self.char_embeddings.weight.data.copy_(
            torch.from_numpy(pretrain_char_embedding))
    else:
        self.char_embeddings.weight.data.copy_(
            torch.from_numpy(self.random_embedding(alphabet_size, embedding_dim)))

jiesutd · 2018-04-26T06:27:28Z

@fengxiachong, thank you very much for the report.
Yes, the previous version only included the interface for the pretrained character embeddings but did not use it. I have just implemented support for the pretrained char embeddings, so you can use the updated code.
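
For readers following along, here is a minimal sketch of what such a hook might look like. It is not the repository's actual code: the class name CharEmbedder, the standalone random_embedding helper, and its uniform initialization range are assumptions made for illustration. Only the parameter name pretrain_char_embedding and the copy_ pattern come from the snippet in the question.

```python
import numpy as np
import torch
import torch.nn as nn


def random_embedding(vocab_size, embedding_dim):
    # Hypothetical fallback initializer: uniform values in [-scale, scale].
    scale = np.sqrt(3.0 / embedding_dim)
    return np.random.uniform(-scale, scale, (vocab_size, embedding_dim))


class CharEmbedder(nn.Module):
    """Illustrative character embedding layer with optional pretrained weights."""

    def __init__(self, alphabet_size, embedding_dim, pretrain_char_embedding=None):
        super().__init__()
        self.char_embeddings = nn.Embedding(alphabet_size, embedding_dim)
        if pretrain_char_embedding is not None:
            init = np.asarray(pretrain_char_embedding)
            # Fail early if the pretrained matrix does not match the alphabet.
            if init.shape != (alphabet_size, embedding_dim):
                raise ValueError(
                    "expected shape %s, got %s"
                    % ((alphabet_size, embedding_dim), init.shape)
                )
        else:
            init = random_embedding(alphabet_size, embedding_dim)
        # copy_ casts the NumPy float64 values to the weight's dtype.
        self.char_embeddings.weight.data.copy_(torch.from_numpy(init))

    def forward(self, char_ids):
        # char_ids: LongTensor of character indices, any shape.
        return self.char_embeddings(char_ids)
```

Passing None for pretrain_char_embedding keeps the random initialization, so existing configurations would behave as before under this sketch.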

Generally, pretrained character embeddings work for languages such as Chinese, which have a large character alphabet. In my experience, the model is sometimes unstable when pretrained char embeddings are added for Chinese. In that case, you may try normalizing the embeddings first.
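
One possible reading of "normalize" here is scaling each character vector to unit L2 norm before copying it into the embedding layer. The sketch below assumes that interpretation; the function name normalize_rows and the eps guard are hypothetical and not taken from the repository.

```python
import numpy as np


def normalize_rows(embedding, eps=1e-12):
    """Scale each row (one character vector) to unit L2 norm.

    Rows that are all zeros (for example a padding entry) are left as zeros.
    """
    embedding = np.asarray(embedding, dtype=np.float64)
    norms = np.sqrt(np.sum(np.square(embedding), axis=1, keepdims=True))
    # eps only prevents division by zero; it does not change nonzero rows.
    return embedding / np.maximum(norms, eps)
```

The normalized matrix can then be passed wherever the raw pretrained matrix was used. Whether this stabilizes training depends on the data, so it is worth comparing both variants.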

I would appreciate your feedback on whether the pretrained char embeddings work well in your experiments.

ghost · 2018-04-26T06:30:01Z

Thank you very much~

jiesutd closed this as completed Apr 26, 2018
