
History and old tokens #71

Open
galaxyentity904 opened this issue Jul 5, 2021 · 1 comment

Comments


galaxyentity904 commented Jul 5, 2021

I have tried to implement DialoGPT using this code:

# encode the new user input, add the eos_token and return a tensor in PyTorch
new_user_input_ids = tokenizer.encode(input(genpromptfinal) + tokenizer.eos_token, return_tensors='pt')

# append the new user input tokens to the chat history
bot_input_ids = torch.cat([chat_history_ids, new_user_input_ids], dim=-1) if step > 0 else new_user_input_ids

# generate a response while limiting the total chat history to 1000 tokens
chat_history_ids = model.generate(bot_input_ids, max_length=1000, pad_token_id=tokenizer.eos_token_id)

# pretty print the last output tokens from the bot
print("{}".format(tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True)))

which is the code given in the Hugging Face documentation. The problem is that when I try to generate something, there is never a response. To work around this, I modified the code to the following:

# encode the new user input, add the eos_token and return a tensor in PyTorch
new_user_input_ids = tokenizer.encode(genpromptfinal + tokenizer.eos_token, return_tensors='pt')
# append the new user input tokens to the chat history
step = 0
bot_input_ids = torch.cat([new_user_input_ids], dim=-1) if step > 0 else new_user_input_ids
chat_history_ids = model.generate(bot_input_ids, max_length=1000, pad_token_id=tokenizer.eos_token_id)
bot_input_ids = torch.cat([chat_history_ids, new_user_input_ids], dim=-1) if step > 0 else new_user_input_ids
# generate a response while limiting the total chat history to 1000 tokens
chat_history_ids = model.generate(bot_input_ids, max_length=1000, pad_token_id=tokenizer.eos_token_id)
# pretty print the last output tokens from the bot
"{}".format(tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True))

My concern is that this may not retain past tokens when generating. When I try to retain them, I get an error that chat_history_ids was never declared before bot_input_ids, which needs chat_history_ids in order to include the history.
How can I fix this?
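One common way to avoid a "never declared" error like this is to initialize the history to None before the loop and only concatenate when it already exists. Below is a minimal sketch of that bookkeeping (not from the thread): plain Python lists stand in for token tensors, and fake_generate is a stub standing in for model.generate, not the real API.

```python
# Sketch of chat-history bookkeeping; lists stand in for token tensors.

def fake_generate(input_ids):
    # Stand-in for model.generate: echo the prompt, append two "reply" tokens.
    return input_ids + [101, 102]

chat_history_ids = None  # declared before the loop, so it always exists

for user_turn in ([1, 2, 3], [4, 5]):
    new_user_input_ids = user_turn + [0]  # 0 plays the role of eos_token_id

    # Append the new user input to the history only when history exists.
    if chat_history_ids is None:
        bot_input_ids = new_user_input_ids
    else:
        bot_input_ids = chat_history_ids + new_user_input_ids

    chat_history_ids = fake_generate(bot_input_ids)

    # The bot's reply is everything after the prompt.
    reply = chat_history_ids[len(bot_input_ids):]
    print(reply)
```

With real tensors the concatenation would be torch.cat([chat_history_ids, new_user_input_ids], dim=-1), but the None check works the same way and removes the dependence on a step counter.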


archmagos-dominus commented Feb 22, 2022

The code should look something like this instead:

# Let's chat for 4 lines
for step in range(4):
    # encode the new user input, add the eos_token and return a tensor in Pytorch
    new_user_input_ids = tokenizer.encode(input(">> User:") + tokenizer.eos_token, return_tensors='pt')
    # print(new_user_input_ids)

    # append the new user input tokens to the chat history
    bot_input_ids = torch.cat([chat_history_ids, new_user_input_ids], dim=-1) if step > 0 else new_user_input_ids

    # generate a response while limiting the total chat history to 200 tokens
    chat_history_ids = model.generate(
        bot_input_ids, max_length=200,
        pad_token_id=tokenizer.eos_token_id,
        no_repeat_ngram_size=1,
        do_sample=True,
        top_k=100,
        top_p=0.8,
        temperature=0.8
    )

    # pretty print the last output tokens from the bot
    print("{}".format(tokenizer.decode(chat_history_ids[:, bot_input_ids.shape[-1]:][0], skip_special_tokens=True)))

Notice that step is defined in the declaration of the for loop, so chat_history_ids is only referenced once step > 0, after it has been assigned on the first iteration.
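Since max_length caps the combined prompt plus response, a long conversation will eventually exhaust the budget. A hedged workaround (my addition, not from the thread) is to keep only the most recent N tokens of history before each generate call; sketched here with a plain list, where the names MAX_HISTORY and truncate_history are illustrative:

```python
# Sketch: cap the chat history at the most recent MAX_HISTORY token ids
# before feeding it back to the model. A list stands in for a token tensor;
# for a real 2-D tensor the equivalent slice is history_ids[:, -max_tokens:].

MAX_HISTORY = 6  # illustrative limit; real DialoGPT runs would use far more

def truncate_history(history_ids, max_tokens=MAX_HISTORY):
    # Keep only the last max_tokens ids (oldest turns are dropped first).
    return history_ids[-max_tokens:]

history = [1, 2, 3, 4, 5, 6, 7, 8]
print(truncate_history(history))  # the six most recent ids
```

Truncating at a turn boundary (just after an eos token) rather than mid-turn would be cleaner, but a fixed-size tail keeps the example simple.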
