Question about the placement of self.bias in _embeddings_layers.py #60
Comments
Hey @jmpark-swk, thanks for opening the issue. No, there is no particular reason, and I think you are right: we should add the bias and THEN apply dropout. We are about to merge a PR from @5uperpalo (check #56) and we will fix it there. @5uperpalo, can you take care of this? If not I will edit it myself, no probs. Thanks again. In the meantime, please join us in Slack! 🙂: https://join.slack.com/t/pytorch-widedeep/shared_invite/zt-soss7stf-iXpVuLeKZz8lGTnxxtHtTw
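For context, the fix discussed here amounts to swapping two operations in the embeddings forward pass. A minimal, self-contained sketch of the suggested ordering (the shapes and variable names are illustrative, not the library's exact code):

import torch
from torch import nn

n_tokens, embed_dim = 10, 5
x = torch.rand(1, n_tokens, embed_dim)                 # hypothetical embeddings output
bias = nn.Parameter(torch.zeros(n_tokens, embed_dim))  # learnable per-token bias
dropout = nn.Dropout(p=0.1)

# suggested order: add the learnable bias first, THEN apply dropout,
# so the bias term is regularized together with the embeddings
out = dropout(x + bias)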
@jrzaurin
I understand that FullEmbeddingDropout and nn.Dropout have the same mathematical effect, and I wonder about the purpose of the implementation. As currently written, if FullEmbeddingDropout is used, dropout is applied in both training and evaluation, and I wonder whether this is the intended behaviour.
Hey @jmpark-swk, let's address the question in two parts: first, what FullEmbeddingDropout actually does compared to nn.Dropout; and second, its behaviour at training versus evaluation time.
import torch
from torch import nn

X = torch.rand(1, 10, 5)
p = 0.2

# Full embed dropout: zeroes entire rows (whole token embeddings),
# rescaled by 1 / (1 - p) to preserve the expected value
fedp = (X.new().resize_((X.size(1), 1)).bernoulli_(1 - p).expand_as(X) / (1 - p)) * X

# Regular dropout: zeroes individual elements, with the same rescaling
dp = nn.Dropout(p)(X)
fedp
tensor([[[0.1910, 1.0102, 0.9767, 0.2338, 0.2881],
         [0.7028, 0.7615, 1.1196, 0.1449, 0.0463],
         [0.7549, 0.9568, 0.1248, 0.9500, 0.9484],
         [0.3327, 0.5511, 0.4135, 1.0409, 0.1293],
         [1.2022, 1.0507, 0.7874, 0.2441, 1.0943],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
         [0.4181, 1.1530, 0.2361, 0.1072, 0.6228],
         [0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
         [0.5364, 0.1017, 0.2276, 1.1333, 0.5364]]])
dp
tensor([[[0.1910, 1.0102, 0.9767, 0.0000, 0.2881],
         [0.7028, 0.7615, 1.1196, 0.1449, 0.0463],
         [0.7549, 0.0000, 0.1248, 0.9500, 0.9484],
         [0.3327, 0.0000, 0.4135, 1.0409, 0.1293],
         [1.2022, 1.0507, 0.7874, 0.2441, 0.0000],
         [0.0000, 0.0291, 0.7125, 0.3697, 0.7847],
         [1.1591, 0.1397, 1.1471, 0.8107, 0.0000],
         [0.0000, 1.1530, 0.2361, 0.1072, 0.6228],
         [0.4835, 0.2229, 0.9602, 0.0000, 0.4619],
         [0.5364, 0.1017, 0.2276, 1.1333, 0.0000]]])
As you can see, the full-embed version zeroes entire rows (whole token embeddings), while nn.Dropout zeroes individual elements. Regarding the second part, you are right: dropout should only act at training time, so the code for the forward pass should look like:

def forward(self, X: Tensor) -> Tensor:
    if self.training:
        # drop whole rows only while training, rescaled to keep the expectation
        mask = X.new().resize_((X.size(1), 1)).bernoulli_(
            1 - self.dropout
        ).expand_as(X) / (1 - self.dropout)
        return mask * X
    else:
        return X

Thanks!
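To make the behaviour concrete, here is that forward pass wrapped in a self-contained module (a sketch; the constructor and naming may differ from the library's actual class):

import torch
from torch import nn, Tensor

class FullEmbeddingDropout(nn.Module):
    # Drops entire embedding rows (whole tokens) with probability
    # `dropout`, but only while the module is in training mode
    def __init__(self, dropout: float):
        super().__init__()
        self.dropout = dropout

    def forward(self, X: Tensor) -> Tensor:
        if self.training:
            # Bernoulli mask over the token dimension, broadcast across
            # the embedding dimension, rescaled to preserve the expectation
            mask = X.new().resize_((X.size(1), 1)).bernoulli_(
                1 - self.dropout
            ).expand_as(X) / (1 - self.dropout)
            return mask * X
        else:
            return X

# In eval mode the module is now a no-op, just like nn.Dropout
fed = FullEmbeddingDropout(dropout=0.2)
x = torch.rand(1, 10, 5)
fed.eval()
assert torch.equal(fed(x), x)  # no rows dropped at inference time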
Hello, first of all, thank you for releasing such wonderful code.
I have a question while looking at the code related to the FT-Transformer, specifically pytorch-widedeep/pytorch_widedeep/models/transformers/_embeddings_layers.py, line 173 (commit c287c87).
Is there any particular reason why the bias is added after the dropout function?
Would any problem occur if dropout were placed after adding the bias?