
chat: fix ChatGPT after #1970 #2051

Merged: 1 commit merged into main from fix-chatgpt on Mar 6, 2024
Conversation

@cebtenzzre (Member)
I'm actually not sure how certain parts of this were working before #1970, namely:

  • processRestoreStateFromText() - My understanding is that the LLModel would generate a new reply after processing both the original prompt and the original reply, because we called LLModel::prompt with the default n_predict (not zero). We simply weren't displaying that reply (empty callback), even though it ended up in the LLModel's conversation history (m_context in the case of ChatGPT). I added fakeReply in PR #1970 because the prompt and response now need to happen within the same LLModel::prompt invocation, but as a bonus it fixes this apparent issue.
  • For ChatGPT, decrementing n_past to regenerate the reply meant the initial new reply would be based only on the preceding context, but we never actually removed anything from m_context, so any further replies would still see the old messages. Now we resize m_context as needed (see the sketch after this list).
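
A minimal C++ sketch of the m_context fix described in the second bullet; the type and method names here are hypothetical, and only m_context and n_past come from the discussion:

#include <cstddef>
#include <string>
#include <vector>

// Illustrative sketch, not the actual ChatGPT backend code.
struct ChatGPTContextSketch {
    std::vector<std::string> m_context; // prior conversation messages

    // Called when regenerating the reply at message position n_past.
    void trimForRegenerate(std::size_t n_past) {
        // Previously only n_past was decremented, so the stale reply stayed
        // in m_context and leaked into every later request. Resizing keeps
        // the stored history consistent with n_past.
        if (m_context.size() > n_past)
            m_context.resize(n_past);
    }
};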

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
@cebtenzzre (Member, Author)

Actually, I think my understanding of processRestoreStateFromText is accurate for ChatGPT, but not for other subclasses of LLModel: if responseCallback returns false, n_past is not incremented further, but it is still incremented the first time, so one token of the response would be generated by the LLM and recorded in the history. This was true even before 12f943e.
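
As a rough sketch of that ordering, assuming hypothetical sampleNextToken/evalToken helpers (only responseCallback and n_past are names from the discussion; this is not the actual LLModel::prompt loop):

#include <cstdint>
#include <functional>

// Illustrative generation loop: the token is recorded and counted before
// the callback gets a chance to stop generation.
void generateSketch(int32_t &n_past,
                    const std::function<int32_t()> &sampleNextToken,
                    const std::function<void(int32_t)> &evalToken,
                    const std::function<bool(int32_t)> &responseCallback) {
    for (;;) {
        int32_t tok = sampleNextToken(); // the model produces one token
        evalToken(tok);                  // it enters the model's history
        n_past += 1;                     // and is counted as past context
        if (!responseCallback(tok))      // an always-false callback stops here,
            break;                       // but the first token has already landed
    }
}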

@cebtenzzre (Member, Author)

I tested a version of GPT4All from before #1970, and it does indeed exhibit the behavior I mentioned when you restore a chat session:

decode(n_past=0):
pos=0 1 ''
pos=1 16121 ' ###'
pos=2 9779 ' User'
pos=3 31871 ':'
pos=4 13 '
'
pos=5 10302 'Message'
pos=6 31843 '.'
pos=7 13 '
'
pos=8 8458 '##'
pos=9 31922 '#'
pos=10 13166 ' Response'
pos=11 31871 ':'
pos=12 13 '
'
decode(n_past=13):
pos=13 16644 ' Hello'
decode(n_past=14):
pos=14 16644 ' Hello'
pos=15 31905 '!'
pos=16 312 ' I'
pos=17 705 ' am'
pos=18 1048 ' here'
pos=19 289 ' to'
pos=20 2803 ' assist'
pos=21 365 ' you'

Notice how pos=13 is a token freshly generated by the model based on the current parameters, random state (for temperature sampling), and so on, while pos=14 onward replays what the model generated during the previous session.
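
For contrast, a hedged sketch of the post-#1970 restore path: the saved reply is passed to LLModel::prompt as fakeReply, so it is evaluated as input rather than sampled, and no fresh token like pos=13 appears. The stand-in types and the exact signature below are assumptions, not quoted from the actual API:

#include <cstdint>
#include <functional>
#include <string>

// Hypothetical stand-ins, only to make the call shape concrete.
struct PromptContextLike { int32_t n_past = 0; };
struct LLModelLike {
    void prompt(const std::string &prompt,
                const std::function<bool(int32_t)> &promptCallback,
                const std::function<bool(int32_t, const std::string &)> &responseCallback,
                PromptContextLike &ctx,
                const std::string *fakeReply = nullptr);
};

// Restoring one saved (prompt, reply) turn without generating anything new.
void restoreTurn(LLModelLike &model, PromptContextLike &ctx,
                 const std::string &savedPrompt, const std::string &savedReply) {
    model.prompt(savedPrompt,
                 [](int32_t) { return true; },                      // prompt callback
                 [](int32_t, const std::string &) { return true; }, // response callback
                 ctx,
                 &savedReply); // fakeReply: replayed, not regenerated
}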

@cebtenzzre merged commit 402f515 into main on Mar 6, 2024 (6 of 10 checks passed)
@cebtenzzre deleted the fix-chatgpt branch on Mar 6, 2024 at 19:02