Hi, I was wondering: what's the rationale for having [PAD] tokens in masked spans of length greater than one, instead of just removing the remaining tokens? See splinter/pretraining/masking.py, lines 316 to 323 in 1df4c13. Is the reason just computational efficiency?
Hi @bminixhofer, thanks for expressing your interest in Splinter :)
The main reason is to keep the span start and end indices unchanged. If the remaining tokens were removed instead, masking a span would shift the indices of every span that follows it.
Note that these pad tokens aren't attended to, and that position embeddings are computed only w.r.t. "valid" (non-pad) tokens, so overall this implementation is equivalent to removing them.
Hope my answer is clear :)
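To make the equivalence concrete, here is a toy sketch (not the actual Splinter code; the token IDs and helper names are made up for illustration) of masking a span in place with pads, excluding the pads from attention, and assigning positions only to non-pad tokens:

```python
# Toy IDs, chosen only for this sketch.
MASK_ID, PAD_ID = 1, 0

def mask_span(token_ids, start, end):
    """Replace tokens[start:end] with one mask token followed by pads,
    keeping the sequence length (and all later indices) unchanged."""
    out = list(token_ids)
    out[start] = MASK_ID
    for i in range(start + 1, end):
        out[i] = PAD_ID
    return out

def attention_mask(token_ids):
    """Pads get mask value 0, i.e. they are never attended to."""
    return [0 if t == PAD_ID else 1 for t in token_ids]

def position_ids(token_ids):
    """Assign positions only over non-pad tokens; pads get a
    placeholder position, which is irrelevant since they are masked."""
    pos, ids = 0, []
    for t in token_ids:
        if t == PAD_ID:
            ids.append(0)  # placeholder; never attended to
        else:
            ids.append(pos)
            pos += 1
    return ids

tokens = [10, 11, 12, 13, 14, 15]
masked = mask_span(tokens, 1, 4)   # mask the length-3 span at indices 1..3
# masked == [10, 1, 0, 0, 14, 15]: token 14 keeps its original index 4
```

With the pads removed, the sequence would be [10, 1, 14, 15] with positions [0, 1, 2, 3]; with the pads retained, token 14 stays at index 4 but still receives position 2, so the model sees the same inputs either way, while the precomputed span indices remain valid.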