This repository was archived by the owner on Apr 8, 2026. It is now read-only.

WIP: Add chat example #77

Closed
maraoz wants to merge 2 commits into openai:master from maraoz:master

maraoz commented Feb 24, 2019

I'm experimenting with the 117M model, trying to build a simple interactive chat example.
Results are acceptable:
[Screenshot from 2019-02-24 18-06-38]

However, I suspect my code has a bug: when I generate text with the other scripts (src/generate_unconditional_samples.py and src/interactive_conditional_samples.py), the output quality seems subjectively better. My approach is to keep the model in memory and pass the whole conversation at each turn to generate the bot's reply.
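The approach described above (one model kept in memory, the full conversation re-sent on every turn) can be sketched roughly as follows. This is not the code from `src/chat.py`: `ChatSession` and `echo_model` are hypothetical names, and `generate` is an assumed stand-in for whatever function samples a continuation from the loaded model.

```python
from typing import Callable, List


class ChatSession:
    """Keeps one generator alive and re-sends the whole transcript each turn."""

    def __init__(self, generate: Callable[[str], str], separator: str = "\n") -> None:
        # `generate` is a hypothetical stand-in for the loaded model:
        # it takes the prompt text and returns the sampled continuation.
        self.generate = generate
        self.separator = separator
        self.history: List[str] = []

    def ask(self, user_line: str) -> str:
        self.history.append(user_line)
        # The prompt is the full conversation so far, not just the new line.
        prompt = self.separator.join(self.history) + self.separator
        reply = self.generate(prompt).strip()
        self.history.append(reply)
        return reply


def echo_model(prompt: str) -> str:
    # Toy generator for demonstration only: reports how many lines it saw.
    return "seen %d lines" % len(prompt.strip().split("\n"))


session = ChatSession(echo_model)
print(session.ask("Hello"))
print(session.ask("How are you?"))
```

The toy generator only makes the growth of the prompt visible; a real session would pass a function that encodes the prompt, samples from the model, and decodes the result.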

To try out the code on this branch, run:

git remote add maraoz git@github.com:maraoz/gpt-2.git
git fetch maraoz
git checkout maraoz/master -b chat
python3 src/chat.py --seed=1337

Some open questions in case anyone (OpenAI team or community) wants to chime in:

  • Is it correct to use the same model instance for the whole "chat session"? Or should I restore from a checkpoint before generating each new line?
  • Is it correct to prompt the model with the whole conversation at each stage, or should I only send the new dialogue lines?
  • Any other ideas on how to improve output quality? (tweaking temperature, top_k?)

@WuTheFWasThat

Contributor

hey, try using a double newline instead of a single newline between dialogue entries. this is our fault, owing to the way we processed HTML while producing our dataset.

regarding your questions:

  • it's correct to use the same model instance for the whole chat session
  • yep, give it the whole conversation at each stage (or at least the last 1024 - gen_length tokens of it)
  • I recommend top_k = 40!
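As a rough illustration of the advice above, the sketch below joins dialogue entries with a double newline and keeps only the last `1024 - gen_length` tokens of the conversation. It assumes token IDs arrive as a plain list of integers; `truncate_context`, `join_turns`, and the fake token list are hypothetical and not part of the repository.

```python
from typing import List

CONTEXT_WINDOW = 1024  # context length quoted in the reply above


def truncate_context(tokens: List[int], gen_length: int,
                     window: int = CONTEXT_WINDOW) -> List[int]:
    """Keep only the most recent tokens, leaving room for the reply."""
    budget = window - gen_length
    if budget <= 0:
        raise ValueError("gen_length must be smaller than the context window")
    # A short conversation is returned unchanged by this slice.
    return tokens[-budget:]


def join_turns(turns: List[str]) -> str:
    # Double newline between dialogue entries, as suggested above.
    return "\n\n".join(turns)


fake_tokens = list(range(2000))  # stand-in for real encoder output
kept = truncate_context(fake_tokens, gen_length=60)
print(len(kept), kept[0], kept[-1])
print(repr(join_turns(["Hi there", "Hello!"])))
```

Truncating from the front drops the oldest dialogue first, so the model always sees the most recent exchange.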

maraoz commented Feb 27, 2019

Author

Thanks for the tips, it's working much better now!
I'll keep playing and report back if I find anything interesting for you to try.

maraoz closed this Feb 27, 2019

nerdimite commented Jun 3, 2019

Were you able to get chat-like results using the original interactive_conditional_samples.py? I tried feeding 117M a dataset in which each line is the reply to the previous line, but the model didn't really talk like a chatbot: it finished my sentence instead of replying to it.

maraoz commented Jun 3, 2019

Author

@Nerdimite37 I did get "chatlike" behavior. You should prefix each line with something like "Alice: " or "Bob: " so that GPT-2 "understands" there are multiple parties interacting. I don't know if this makes sense to you, but it kind of worked. Check out how I did it here:
https://github.com/openai/gpt-2/pull/77/files#diff-c7652ec719aa48942ac653e5656ab00aR58
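For readers who cannot open the diff, here is a minimal sketch of the speaker-label idea. It is not the code from the linked file: `build_prompt` and `first_reply` are hypothetical helpers, the sample continuation is made up, and ending the prompt with a bare "Bob:" cue (no trailing space) is a choice of this sketch rather than something the thread specifies.

```python
from typing import List, Tuple


def build_prompt(turns: List[Tuple[str, str]], next_speaker: str) -> str:
    """Render (speaker, text) pairs and end with a cue for the next speaker."""
    lines = ["%s: %s" % (speaker, text) for speaker, text in turns]
    # The trailing label nudges the model to answer as `next_speaker`.
    lines.append("%s:" % next_speaker)
    return "\n\n".join(lines)


def first_reply(continuation: str) -> str:
    """Keep only the first dialogue entry of the sampled continuation."""
    return continuation.strip().split("\n\n")[0].strip()


prompt = build_prompt([("Alice", "Hi, how are you?")], next_speaker="Bob")
print(repr(prompt))
# Made-up continuation, standing in for text sampled from the model.
print(first_reply(" Fine, thanks.\n\nAlice: Good to hear."))
```

Cutting the continuation at the first double newline keeps the bot from also writing the other speaker's next line.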
