
Indexes shifting #1

Closed
xinsu626 opened this issue Jul 8, 2021 · 4 comments

Comments

xinsu626 commented Jul 8, 2021

Hello, nice work! Sorry if I missed something. I have a question about the decoder's output in your code.

Based on your code, it seems you're shifting the tokens' position indexes by the number of labels. I was wondering why you shift the tokens instead of shifting the labels. Thank you!

Please find an example below (results from here).

raw_words: ['SOCCER', '-', 'JAPAN', 'GET', 'LUCKY', 'WIN', ',', 'CHINA', 'IN', 'SURPRISE', 'DEFEAT', '.']
target_ids: [0, 11, 2, 20, 3, 1]	# each of the token positions is shifted by 6
word_bpe_ids: [0, 13910, 3376, 2076, 111, 344, 591, 1889, 7777, 226, 23806, 975, 17164, 2156, 3858, 16712, 2808, 31987, 4454, 18819, 5885, 10885, 2571, 479, 2]
word_bpe_tokens: ['<s>', 'ĠSO', 'CC', 'ER', 'Ġ-', 'ĠJ', 'AP', 'AN', 'ĠGET', 'ĠL', 'UCK', 'Y', 'ĠWIN', 'Ġ,', 'ĠCH', 'INA', 'ĠIN', 'ĠSUR', 'PR', 'ISE', 'ĠDE', 'FE', 'AT', 'Ġ.', '</s>']
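
For concreteness, here is a minimal sketch of how I read these ids, assuming the first 6 ids are reserved for the specials and labels ([bos, eos, tag_1, ..., tag_4] is my guess) and everything from 6 up points into the BPE sequence:

```python
TARGET_SHIFT = 6  # assumed: number of special + label ids placed before the pointers

def decode_target_id(tid, word_bpe_tokens):
    """Map one target id back to what it refers to (illustrative only)."""
    if tid == 0:
        return "<bos>"
    if tid == 1:
        return "<eos>"
    if tid < TARGET_SHIFT:
        return f"label_{tid}"  # an entity-type label id (exact mapping assumed)
    # ids >= TARGET_SHIFT are pointers into the encoder's BPE sequence
    return word_bpe_tokens[tid - TARGET_SHIFT]

word_bpe_tokens = ['<s>', 'ĠSO', 'CC', 'ER', 'Ġ-', 'ĠJ', 'AP', 'AN',
                   'ĠGET', 'ĠL', 'UCK', 'Y', 'ĠWIN', 'Ġ,', 'ĠCH', 'INA',
                   'ĠIN', 'ĠSUR', 'PR', 'ISE', 'ĠDE', 'FE', 'AT', 'Ġ.', '</s>']

for tid in [0, 11, 2, 20, 3, 1]:
    print(tid, '->', decode_target_id(tid, word_bpe_tokens))
# e.g. 11 -> 'ĠJ' (start of JAPAN), 20 -> 'ĠCH' (start of CHINA)
```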
yhcc (Owner) commented Jul 9, 2021

Because shifting the labels would make the shift depend on the number of tokens in the input. The index of each special tag would then vary across inputs, which would make it hard for the beam search algorithm to determine when decoding is finished.
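
For example (a minimal sketch, not the repo's beam search): with the labels placed first, eos keeps the constant id 1, so checking whether a hypothesis is finished is a constant comparison:

```python
# Because the special/label ids come before the pointer ids, eos is always 1.
# If the labels were shifted by the input length instead, the eos id would
# differ per example and this check could not be a constant comparison.
EOS_ID = 1

def is_finished(hypothesis):
    """A beam hypothesis is complete once it emits the eos id."""
    return hypothesis[-1] == EOS_ID

print(is_finished([0, 11, 2, 20, 3, 1]))  # True: ends with eos
print(is_finished([0, 11, 2]))            # False: still decoding
```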

xinsu626 (Author) commented Jul 9, 2021

> Because shifting the labels would make the shift depend on the number of tokens in the input. The index of each special tag would then vary across inputs, which would make it hard for the beam search algorithm to determine when decoding is finished.

Hi @yhcc, got it. Thanks for your reply! Sorry, I have a follow-up question. Is it because you put the EOS token in the second position of the label space ([BOS, EOS, Tag1, ...]) that you set the EOS token id to 1 instead of BART's original 2 during generation (the inference phase)?

yhcc (Owner) commented Jul 10, 2021

Yes. We map the eos id 1 to 2 in the forward function of our model (so that BART still gets the proper eos token id).
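
A minimal sketch of that remapping (the function name is made up for illustration; the real logic lives in the model's forward):

```python
import torch

def remap_eos(target_ids: torch.Tensor) -> torch.Tensor:
    """Translate the model's eos id (1) to BART's native </s> id (2)."""
    mapped = target_ids.clone()
    mapped[target_ids == 1] = 2
    return mapped

print(remap_eos(torch.tensor([0, 11, 2, 20, 3, 1])))
# tensor([ 0, 11,  2, 20,  3,  2])
```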

xinsu626 (Author) commented

> Yes. We map the eos id 1 to 2 in the forward function of our model (so that BART still gets the proper eos token id).

@yhcc Thank you! This is really helpful.
