In https://github.com/chnsh/BERT-NER-CoNLL/blob/master/model.py#L25, shouldn't it be labels[mask.nonzero().squeeze(1)] instead of labels[mask]? If you do labels[mask], then each entry of mask is either 0 or 1, which means you are selecting the 1st or 2nd dimension of labels. The same applies to embedding[mask] on L18.
As far as I understand, mask is not 0 or 1 here. mask is a boolean tensor with the same length as the sequence: True for valid labels, False for invalid ones (e.g. [CLS], [SEP], padding tokens, and the extra sub-word tokens produced by BERT WordPiece tokenisation). So this line returns the values of the labels/embedding tensors at the positions where the entry in mask is True.
A toy example with a numpy array looks like the following:
mask = np.array([False, True, False, True])
labels = np.array([1, 2, 3, 4])
labels[mask]  # -> array([2, 4])
That said, I suppose your snippet labels[mask.nonzero().squeeze(1)] gives the same result here, since mask is one-dimensional.
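To make the difference concrete, here is a minimal sketch in numpy (not the repository's code; the array values are made up for illustration). It compares indexing with a boolean mask, with an integer 0/1 mask (the case the question worries about), and with explicit indices of the True entries. `np.flatnonzero(mask)` stands in for the PyTorch expression `mask.nonzero().squeeze(1)` on a 1-D mask.

```python
import numpy as np

# Hypothetical label ids, chosen only for this illustration.
labels = np.array([10, 20, 30, 40])

# Boolean mask: keeps the entries where the mask is True.
bool_mask = np.array([False, True, False, True])
by_bool = labels[bool_mask]

# Integer 0/1 mask: the values are treated as positions,
# so this picks labels[0] and labels[1] repeatedly.
int_mask = bool_mask.astype(np.int64)
by_int = labels[int_mask]

# Explicit indices of the True entries: same result as the boolean mask.
by_index = labels[np.flatnonzero(bool_mask)]

print(by_bool)   # [20 40]
print(by_int)    # [10 20 10 20]
print(by_index)  # [20 40]
```

So the concern in the question only applies when the mask has an integer dtype; with a boolean mask, both forms select the same positions.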