WIP: Add chat example #77

Closed
wants to merge 2 commits into from

@maraoz commented Feb 24, 2019

I'm experimenting with the 117M model, trying to build a simple interactive chat example.
Results are acceptable:
[Screenshot from 2019-02-24 18-06-38]

However, I suspect my code has an error, because when I generate text with the other scripts (src/generate_unconditional_samples.py and src/interactive_conditional_samples.py), the output quality seems subjectively better. The approach I've taken is to keep the model in memory and pass the whole conversation so far at each turn to generate the bot's reply.
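The approach described above (one model kept in memory, the full conversation re-sent on every turn) can be sketched roughly as follows. This is not the code from this branch: `chat_turn`, `SEPARATOR`, and the `generate` callable are hypothetical names, and `generate` stands in for whatever function actually samples text from the loaded model.

```python
from typing import Callable, List

# Assumption: entries are separated by a blank line; the separator is a free choice.
SEPARATOR = "\n\n"


def chat_turn(history: List[str], user_line: str,
              generate: Callable[[str], str]) -> str:
    """Add the user's line, prompt with the whole conversation, store the reply."""
    history.append(user_line)
    # The prompt is rebuilt from the full history on every turn.
    prompt = SEPARATOR.join(history) + SEPARATOR
    reply = generate(prompt).strip()
    history.append(reply)
    return reply


def stand_in_model(prompt: str) -> str:
    """Placeholder for a real sampler; it only reports how much context it saw."""
    return "(saw %d characters of conversation)" % len(prompt)


conversation: List[str] = []
chat_turn(conversation, "Hello", stand_in_model)
chat_turn(conversation, "How are you?", stand_in_model)
```

The model object itself is never rebuilt between turns here; only the prompt grows.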

To try out the code on this branch, run:

git remote add maraoz git@github.com:maraoz/gpt-2.git
git fetch maraoz
git checkout maraoz/master -b chat
python3 src/chat.py --seed=1337

Some open questions in case anyone (OpenAI team or community) wants to chime in:

  • Is it correct to use the same model instance for the whole "chat session"? Or should I restore from a checkpoint before generating each new line?
  • Is it correct to prompt the model with the whole conversation at each stage, or should I only send the new dialogue lines?
  • Any other ideas on how to improve output quality? (tweaking temperature, top_k?)

@WuTheFWasThat (Collaborator) commented Feb 25, 2019

hey, try using a double newline instead of a single newline between dialogue entries. this is our fault, owing to the way we processed html while producing our dataset.
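As a small illustration of the double-newline suggestion, a transcript stored with one entry per line could be converted like this. The helper names `join_entries` and `to_double_newlines` are made up for this sketch and are not part of the repository.

```python
from typing import Iterable


def join_entries(entries: Iterable[str], separator: str = "\n\n") -> str:
    """Join non-empty dialogue entries with a blank line between them."""
    cleaned = [entry.strip() for entry in entries]
    return separator.join(entry for entry in cleaned if entry)


def to_double_newlines(transcript: str) -> str:
    """Turn a one-entry-per-line transcript into blank-line-separated entries."""
    return join_entries(transcript.splitlines())
```

For example, `to_double_newlines("Hi there\nHello!\n")` gives `"Hi there\n\nHello!"`.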

regarding your questions:

  • it's correct to use the same model instance for the whole chat session
  • yep, give it the whole conversation at each stage (or at least the last 1024 - gen_length tokens of it)
  • I recommend top_k = 40!
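The last two points can be sketched in plain Python. This is an illustration under assumptions, not the repository's sampling code: `trim_context` and `top_k_sample` are hypothetical helpers, tokens are assumed to be a list of integer ids, and the logits are assumed to be a plain list of floats for one sampling step.

```python
import math
import random
from typing import List, Optional, Sequence

CONTEXT_SIZE = 1024  # the context length mentioned in the reply above


def trim_context(tokens: Sequence[int], gen_length: int,
                 context_size: int = CONTEXT_SIZE) -> List[int]:
    """Keep only the last (context_size - gen_length) tokens of the conversation."""
    budget = context_size - gen_length
    if budget <= 0:
        raise ValueError("gen_length must be smaller than context_size")
    return list(tokens[-budget:])


def top_k_sample(logits: Sequence[float], k: int = 40,
                 temperature: float = 1.0,
                 rng: Optional[random.Random] = None) -> int:
    """Sample a token id from the k highest-scoring logits only."""
    if k <= 0 or temperature <= 0:
        raise ValueError("k and temperature must be positive")
    chooser = rng if rng is not None else random
    # Indices of the k largest logits; everything else gets zero probability.
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    scaled = [logits[i] / temperature for i in top]
    peak = max(scaled)
    # Softmax weights, shifted by the maximum for numerical stability.
    weights = [math.exp(value - peak) for value in scaled]
    return chooser.choices(top, weights=weights, k=1)[0]
```

Trimming from the left keeps the most recent dialogue and leaves room for `gen_length` new tokens inside the context window.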

@maraoz (Author) commented Feb 27, 2019

Thanks for the tips, it's working much better now!
I'll keep playing and report back if I find anything interesting for you to try.

@maraoz closed this Feb 27, 2019

@nerdimite commented Jun 3, 2019

Were you able to get chat-like results using the original interactive_conditional_samples.py? I tried feeding 117M a dataset where each line contained the reply to the previous line, but the model didn't quite talk like a chatbot: it finished my sentence rather than replying to it.

@maraoz (Author) commented Jun 3, 2019

@Nerdimite37 I did get "chatlike" behavior. You should prefix each line with something like "Alice: " or "Bob: " so that GPT-2 "understands" there are multiple parties interacting. I don't know if this makes sense to you, but it kind of worked. Check out how I did it here:
https://github.com/openai/gpt-2/pull/77/files#diff-c7652ec719aa48942ac653e5656ab00aR58
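A rough sketch of the speaker-label idea, written independently of the linked code: `build_prompt` and `extract_reply` are hypothetical helpers, and cutting the continuation at the first line break is an assumption about how to stop the model from also writing the other party's lines.

```python
from typing import List, Tuple


def build_prompt(turns: List[Tuple[str, str]], next_speaker: str) -> str:
    """Label every entry with its speaker and end with the next speaker's open label."""
    lines = ["%s: %s" % (speaker, text) for speaker, text in turns]
    # Ending on the open label nudges the model to continue as that speaker.
    lines.append("%s:" % next_speaker)
    return "\n\n".join(lines)


def extract_reply(continuation: str) -> str:
    """Keep only the first line of the generated text as the reply."""
    stripped = continuation.strip()
    if not stripped:
        return ""
    return stripped.splitlines()[0].strip()
```

With turns `[("Alice", "Hi"), ("Bob", "Hello")]` and `next_speaker="Alice"`, the prompt is `"Alice: Hi\n\nBob: Hello\n\nAlice:"`.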
