Conversation

ADITYADAS1999 (Contributor) commented Mar 14, 2023

Modify generate() method in GPT2CausalLM to support chatbot #844

The gap we have is about the end_token_id. In a chatbot system like DialoGPT, the user prompt needs to have an end_token appended before generation; otherwise the model will just generate an end_token and stop. One simple fix is to add an argument to the generate() method, e.g., append_end_token=False, and when it is True the prompt will have an end_token appended before calling our sampler.
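For illustration, a minimal sketch of how such a flag could be reflected after tokenization, assuming the prompt is available as a dense [batch, length] integer tensor of token ids and the tokenizer exposes an end_token_id; the helper name and signature here are hypothetical, not the actual KerasNLP internals:

```python
import tensorflow as tf


def append_end_token_to_prompt(token_ids, end_token_id, append_end_token=False):
    """Optionally append an end token to each tokenized prompt.

    `token_ids` is assumed to be a dense [batch, length] integer tensor
    produced by tokenization, before it is handed to the sampler.
    """
    if append_end_token:
        # Build a [batch, 1] column filled with the end token id, matching the
        # dtype of the prompt tensor, and append it along the length axis.
        batch_size = tf.shape(token_ids)[0]
        end_column = tf.fill([batch_size, 1], tf.cast(end_token_id, token_ids.dtype))
        token_ids = tf.concat([token_ids, end_column], axis=-1)
    return token_ids
```

For a DialoGPT-style chatbot built on GPT-2, the appended id would correspond to the <|endoftext|> token, which DialoGPT uses to separate conversation turns.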

cc: @chenmoneygithub

chenmoneygithub (Contributor)

@ADITYADAS1999 Sorry, what does this PR do...? We need to reflect this argument, append_end_token, in the code to add the end_token after tokenization. We are doing some refactoring work in #804, which will affect this PR, so please stay tuned. Thanks!

ADITYADAS1999 (Contributor, Author) commented Mar 15, 2023

Thanks for the info! Actually, I am going to try solving this issue.

chenmoneygithub (Contributor)

@ADITYADAS1999 So to resolve #853, we need to reflect this argument append_end_token in the code; simply adding the token is not enough.

ADITYADAS1999 (Contributor, Author)

Thanks @chenmoneygithub.

Is the work to reflect this argument already in progress by the team?

mattdangerw (Member)

I will go ahead and close this, as there is no implementation here, and this is something that will be taken on by @chenmoneygithub or me, I think!
