
TAPAS tanh activation on the pooling layer #14543

Closed
xhluca opened this issue Nov 27, 2021 · 2 comments
@xhluca
Contributor

xhluca commented Nov 27, 2021

I noticed the following in the TAPAS pooling layer:

# Copied from transformers.models.bert.modeling_bert.BertPooler
class TapasPooler(nn.Module):
    def __init__(self, config):
        super().__init__()
        self.dense = nn.Linear(config.hidden_size, config.hidden_size)
        self.activation = nn.Tanh()

    def forward(self, hidden_states):
        # We "pool" the model by simply taking the hidden state corresponding
        # to the first token.
        first_token_tensor = hidden_states[:, 0]
        pooled_output = self.dense(first_token_tensor)
        pooled_output = self.activation(pooled_output)
        return pooled_output

I'm curious about the use of nn.Tanh(). I wasn't able to find more information about that activation in the paper. Is it possible to know where it comes from? Thanks!

@NielsRogge
Contributor

Hi,

The TAPAS authors borrowed this from the original BERT implementation, whose authors decided to apply a tanh layer on top of the first token's hidden state.

The BERT author explains why he did that here.
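To see the effect concretely, here is a minimal, self-contained sketch of the same dense-plus-tanh pooling pattern (the Pooler class name and the hidden size of 768 are stand-ins for illustration, not the actual TAPAS/BERT code). The point is that the pooled vector is squashed into (-1, 1):

import torch
from torch import nn

class Pooler(nn.Module):
    # Same pattern as BertPooler / TapasPooler: a dense layer plus tanh
    # applied to the hidden state of the first ([CLS]) token.
    def __init__(self, hidden_size):
        super().__init__()
        self.dense = nn.Linear(hidden_size, hidden_size)
        self.activation = nn.Tanh()

    def forward(self, hidden_states):
        first_token_tensor = hidden_states[:, 0]  # (batch, hidden_size)
        return self.activation(self.dense(first_token_tensor))

hidden_states = torch.randn(2, 16, 768)  # (batch, seq_len, hidden_size)
pooled = Pooler(768)(hidden_states)
print(pooled.shape)              # torch.Size([2, 768])
print(pooled.abs().max() < 1.0)  # tensor(True): tanh bounds the pooled output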

@xhluca
Contributor Author

xhluca commented Nov 27, 2021

Ah thanks, you are right. They indeed use tanh in the code: https://github.com/google-research/tapas/blob/f3d9f068e6eedb252883049b582516a1294ff951/tapas/models/bert/modeling.py#L269-L277

Wish it was mentioned in the appendix of the TAPAS paper 🤷 Thanks for clarifying!

xhluca closed this as completed Nov 27, 2021