Description
1) nn.Embedding documentation says:
padding_idx (int, optional): If given, pads the output with the embedding vector at :attr:`padding_idx` (initialized to zeros) whenever it encounters the index.
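For comparison, the default `nn.Embedding` constructor does zero-initialize the row at `padding_idx`, so a lookup of that index returns a zero vector as documented (quick sketch):

```python
import torch

# Default constructor: the row at padding_idx is initialized to zeros,
# so looking up index 0 returns a zero vector.
emb = torch.nn.Embedding(10, 3, padding_idx=0)
print(emb(torch.LongTensor([0])))  # tensor([[0., 0., 0.]], grad_fn=...)
```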
But in the case of the from_pretrained version, if the input is equal to padding_idx, weights[padding_idx] gets returned and not a vector of zeros.
Code example for reproducing:
```python
import torch

predefined_weights = torch.rand(10, 3)
emb = torch.nn.Embedding.from_pretrained(embeddings=predefined_weights, padding_idx=0)
result = emb(torch.LongTensor([1, 2, 0]))
```
The 3rd vector in the result is not a zero vector.
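As a possible workaround (my own sketch, not something the docs promise), the padding row can be zeroed in the pretrained weights before calling from_pretrained:

```python
import torch

# Possible workaround: zero the padding row in the pretrained weights yourself
# before handing them to from_pretrained.
predefined_weights = torch.rand(10, 3)
predefined_weights[0] = 0.0
emb = torch.nn.Embedding.from_pretrained(embeddings=predefined_weights, padding_idx=0)
result = emb(torch.LongTensor([1, 2, 0]))
print(result[2])  # tensor([0., 0., 0.])
```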
This might be a documentation issue, and the doc should say that when from_pretrained is used, padding_idx does not produce a zero vector but weights[padding_idx] instead. However, that makes padding_idx in from_pretrained redundant, so I would suggest removing padding_idx from the arguments if this is indeed the case.
I can open a PR for this but need some feedback regarding the approach to take here, please advise.
2) nn.functional.embedding documentation says:
padding_idx (int, optional): If given, pads the output with the embedding vector at :attr:`padding_idx` (initialized to zeros) whenever it encounters the index.
But the result for index = padding_idx is weights[padding_idx] and not a zero vector.
Code example for reproducing:
```python
import torch

predefined_weights = torch.rand(10, 3)
result = torch.nn.functional.embedding(torch.LongTensor([1, 2, 0]), predefined_weights, padding_idx=0)
```
The 3rd vector in the result is not a zero vector.
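A quick check confirms that the row returned for the padding index is simply weights[padding_idx]:

```python
import torch

predefined_weights = torch.rand(10, 3)
result = torch.nn.functional.embedding(
    torch.LongTensor([1, 2, 0]), predefined_weights, padding_idx=0
)
# The "padded" position is a plain lookup of row 0, not a zero vector.
print(torch.equal(result[2], predefined_weights[0]))  # True
```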
IMO this is a documentation issue, and the doc should say that padding_idx pads the output with the embedding vector at padding_idx (and not with a zero vector).
cc @ezyang @gchanan @zou3519 @bdhirsh @heitorschueroff @jlin27 @mruberry @albanD