
Can't change BOS token or EOS token for GPT Neo #12

Closed
mallorbc opened this issue Jun 11, 2021 · 3 comments

Comments

@mallorbc

In order to better control the start and stop of generated text, I added BOS and EOS tokens for GPT2xl. This works well: the generated text stops at an appropriate length and starts the way a normal sentence would. However, I want to do the same with GPT Neo, and it does not work. I have discovered that, for some reason, the arguments that normally set BOS and EOS have no effect when GPT Neo is run, even if I change the tokenizer from AutoTokenizer to GPT2Tokenizer. Below is some code that shows what I mean.

    from transformers import GPT2Tokenizer

    # model_args and tokenizer_kwargs come from the surrounding training script
    tokenizer = GPT2Tokenizer.from_pretrained(
        model_args.model_name_or_path, bos_token='<|beginingtext|>',
        eos_token='<|endingtext|>', pad_token='<|pad|>', **tokenizer_kwargs)
    print(tokenizer.eos_token)
    print(tokenizer.bos_token)
    quit()

As I said, when I run this with GPT2xl, the tokens are changed appropriately. When I run it with GPT Neo, both the BOS and EOS tokens remain <|endoftext|>.
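One way to force the change regardless of the checkpoint's defaults is to register the tokens after loading. A minimal sketch, assuming the standard transformers add_special_tokens API; the checkpoint name below is illustrative:

    from transformers import GPT2Tokenizer

    tokenizer = GPT2Tokenizer.from_pretrained('EleutherAI/gpt-neo-1.3B')  # illustrative checkpoint
    # add_special_tokens updates the tokenizer's special-token attributes and
    # adds any genuinely new strings to the vocabulary; it returns the number
    # of tokens that were added
    num_added = tokenizer.add_special_tokens({
        'bos_token': '<|beginingtext|>',
        'eos_token': '<|endingtext|>',
        'pad_token': '<|pad|>',
    })
    print(num_added, tokenizer.bos_token, tokenizer.eos_token)

If any tokens were added, the model's embedding matrix also needs resizing (model.resize_token_embeddings(len(tokenizer))) before training.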

@mallorbc
Author

After looking into this further, this may be a bug outside of this project. I am going to open an issue on the Hugging Face repo. I could be wrong, though.

@bn4t

bn4t commented Jun 12, 2021

Not 100% sure about this, but according to https://github.com/finetuneanon/gpt-neo_finetune_2.7B#dataset-preparation there is no BOS token in GPT Neo.
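The checkpoint's defaults can also be inspected directly. A quick check, assuming the standard transformers API (checkpoint name is illustrative):

    from transformers import AutoTokenizer

    tok = AutoTokenizer.from_pretrained('EleutherAI/gpt-neo-1.3B')
    # shows which special tokens this checkpoint actually defines
    print(tok.special_tokens_map)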

@mallorbc
Author

Thanks. Maybe it's not a bug then. Without a BOS token and an EOS token I can still accomplish my goals; it just takes a different, less elegant method.
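The thread leaves that alternative unstated, but one common pattern is to use plain-text markers instead of registered special tokens and trim the generated text in post-processing. A hypothetical sketch; the marker string and helper below are illustrative, not from the thread:

    # the marker is an ordinary string, not a registered special token,
    # so the tokenizer and model need no changes
    END_MARKER = '<|endingtext|>'

    def truncate_at_marker(generated: str, marker: str = END_MARKER) -> str:
        # keep only the text before the first occurrence of the marker
        return generated.split(marker, 1)[0]

    print(truncate_at_marker('A full sentence.<|endingtext|> trailing text'))
    # -> 'A full sentence.'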
