```python
# We "pool" the model by simply taking the hidden state corresponding
# to the first token.
first_token_tensor = hidden_states[:, 0]
pooled_output = self.dense(first_token_tensor)
pooled_output = self.activation(pooled_output)
return pooled_output
```
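For context, here is a minimal self-contained sketch of this pooling pattern (the class name, hidden size, and usage are illustrative, not the actual TAPAS code): the hidden state of the first token is passed through a dense layer followed by `nn.Tanh()`.

```python
import torch
import torch.nn as nn


class Pooler(nn.Module):
    # Illustrative sketch of BERT-style pooling: take the hidden state of
    # the first ([CLS]) token and pass it through a dense layer + tanh.
    def __init__(self, hidden_size):
        super().__init__()
        self.dense = nn.Linear(hidden_size, hidden_size)
        self.activation = nn.Tanh()

    def forward(self, hidden_states):
        # hidden_states: (batch, seq_len, hidden_size)
        first_token_tensor = hidden_states[:, 0]  # (batch, hidden_size)
        pooled_output = self.dense(first_token_tensor)
        return self.activation(pooled_output)


# Usage: batch of 2 sequences of length 5, hidden size 8
pooler = Pooler(8)
out = pooler(torch.randn(2, 5, 8))
print(out.shape)  # torch.Size([2, 8])
```

Note that because of the tanh, every element of the pooled output is squashed into (-1, 1).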
I'm curious about the use of nn.Tanh(). I wasn't able to find more information about that activation in the paper. Is it possible to know where it comes from? Thanks!
I noticed this in the TAPAS pooling layer; the snippet above is from transformers/src/transformers/models/tapas/modeling_tapas.py, lines 696 to 709 at commit d83b0e0.