How to get token embedding and the weighted vector in ELMo? #2245
(1) How do I get the token embedding $x_k$? In practice, we need to combine the token embedding with the outputs of the different LSTM layers.
(2) How do I get the final weighted vector that combines those layers?
Hi, maybe this document helps? https://github.com/allenai/allennlp/blob/master/tutorials/how_to/elmo.md#using-elmo-as-a-pytorch-module-to-train-a-new-model
If you just want per-token embeddings in an AllenNLP model, you can use the `Elmo` PyTorch module described in that tutorial.
(1) The token embedding x_k is the output of the character-level convnet (the `_ElmoCharacterEncoder` in the code); it's the layer-0 activation of the biLM, before any LSTM layers.
(2) The output of the `Elmo` module is a learned weighted average (the `ScalarMix`) of that token embedding and the two LSTM layer outputs.
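Something like this sketch shows both (untested; the file paths are placeholders you'd point at a downloaded options/weights pair):

```python
from allennlp.modules.elmo import Elmo, _ElmoBiLm, batch_to_ids

# Placeholder paths -- substitute a real ELMo options/weights pair.
options_file = "elmo_options.json"
weight_file = "elmo_weights.hdf5"

# (2) The Elmo module returns the learned weighted average over the layers.
elmo = Elmo(options_file, weight_file, num_output_representations=1, dropout=0)
character_ids = batch_to_ids([["How", "to", "get", "x_k", "?"]])  # (1, num_tokens, 50)
weighted = elmo(character_ids)["elmo_representations"][0]         # (1, num_tokens, 1024)

# (1) For the raw per-layer activations, the lower-level _ElmoBiLm returns
# all three; layer 0 is the character-convnet token embedding x_k (duplicated
# for the forward/backward directions, and including <S>/</S> positions).
bilm = _ElmoBiLm(options_file, weight_file)
activations = bilm(character_ids)["activations"]  # list of 3 tensors
x_k = activations[0]
```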
Hope that helps! Closing this for now, but feel free to re-open if you have further questions.
@nelson-liu Thanks for your nice reply!
That is to say, this representation is exactly the final weighted embedding of the 3 layers (character-convnet output, first-layer LSTM output, second-layer LSTM output) mentioned in the paper (Eq. (1)).
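If I'm reading the paper correctly, Eq. (1) defines exactly that combination:

$$\mathrm{ELMo}_k^{task} = \gamma^{task} \sum_{j=0}^{L} s_j^{task} \, \mathbf{h}_{k,j}^{LM},$$

where $\mathbf{h}_{k,0}^{LM} = \mathbf{x}_k$ is the character-convnet token embedding, $\mathbf{h}_{k,1}^{LM}$ and $\mathbf{h}_{k,2}^{LM}$ are the two biLSTM layer outputs, the $s_j^{task}$ are softmax-normalized learned weights, and $\gamma^{task}$ is a learned scalar.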
However, according to the doc of `Elmo`, the module can return multiple output representations (the `num_output_representations` parameter). So another question: what does it mean when this parameter is larger than 1? Is it learning a different weight for the linear combination of the 3 layers (charcnn, lstm1, lstm2) for each output representation?
Thanks very much!
Yeah, exactly: multiple output representations means that you're learning a different linear combination of the 3 layers for each one. If you want to use ELMo in two different places in your model, it's inefficient to run the forward pass of the LSTM twice just so you can re-weight the layers differently.
For instance, the paper includes ELMo both at the input and at the output of the task RNN for some tasks (e.g. SQuAD and SNLI); with `num_output_representations=2` you run the biLM once and learn a separate scalar mix for each location.
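Concretely, a sketch (paths are placeholders) of that two-mix setup:

```python
from allennlp.modules.elmo import Elmo, batch_to_ids

options_file = "elmo_options.json"   # placeholder path
weight_file = "elmo_weights.hdf5"    # placeholder path

# One biLM forward pass, two independent ScalarMixes (layer weights + gamma).
elmo = Elmo(options_file, weight_file, num_output_representations=2, dropout=0)

character_ids = batch_to_ids([["The", "cat", "sat", "."]])
output = elmo(character_ids)
input_mix, output_mix = output["elmo_representations"]  # two (1, 4, 1024) tensors

# e.g. concatenate input_mix with your word embeddings at the encoder input,
# and output_mix with the encoder's output, as in the paper's SQuAD/SNLI setups.
```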
Does that clarify things? I agree that the overloaded use of the word "layers" in the documentation is confusing; PRs to clarify it would be great.