
chat: fix ChatGPT after #1970 #2051

Merged: 1 commit merged into main from fix-chatgpt on Mar 6, 2024
Conversation

@cebtenzzre (Member)
I'm actually not sure how certain parts of this were working before #1970, namely:

  • processRestoreStateFromText() - My understanding is that the LLModel would generate a new reply after processing both the original prompt and the original reply, because we called LLModel::prompt with the default n_predict (not zero). We simply weren't displaying that reply (empty callback), even though it ended up in the LLModel's conversation history (m_context in the case of ChatGPT). I added fakeReply in PR #1970 because the prompt and response now need to happen within the same LLModel::prompt invocation, but as a bonus it fixes this apparent issue.
  • For ChatGPT, decrementing n_past to regenerate the reply meant the initial new reply would be based only on the preceding context, but we never actually removed anything from m_context, so any further replies would still see the old messages. Now we resize m_context as needed (see the sketch after this list).
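
A minimal C++ sketch of the m_context fix described in the second bullet; the type and method names here are hypothetical, and only m_context and n_past come from the discussion:

#include <cstddef>
#include <string>
#include <vector>

// Illustrative sketch, not the actual ChatGPT backend code.
struct ChatGPTContextSketch {
    std::vector<std::string> m_context; // prior conversation messages

    // Called when regenerating the reply at message position n_past.
    void trimForRegenerate(std::size_t n_past) {
        // Previously only n_past was decremented, so the stale reply stayed
        // in m_context and leaked into every later request. Resizing keeps
        // the stored history consistent with n_past.
        if (m_context.size() > n_past)
            m_context.resize(n_past);
    }
};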

Signed-off-by: Jared Van Bortel <jared@nomic.ai>
@cebtenzzre (Member, Author)

Actually, I think my understanding of processRestoreStateFromText is accurate for ChatGPT, but not for other subclasses of LLModel: if responseCallback returns false, n_past is not incremented further, but it is still incremented the first time, so one token of the response would be generated by the LLM and recorded in the history. This was true even before 12f943e.
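
As a rough sketch of that ordering, assuming hypothetical sampleNextToken/evalToken helpers (only responseCallback and n_past are names from the discussion; this is not the actual LLModel::prompt loop):

#include <cstdint>
#include <functional>

// Illustrative generation loop: the token is recorded and counted before
// the callback gets a chance to stop generation.
void generateSketch(int32_t &n_past,
                    const std::function<int32_t()> &sampleNextToken,
                    const std::function<void(int32_t)> &evalToken,
                    const std::function<bool(int32_t)> &responseCallback) {
    for (;;) {
        int32_t tok = sampleNextToken(); // the model produces one token
        evalToken(tok);                  // it enters the model's history
        n_past += 1;                     // and is counted as past context
        if (!responseCallback(tok))      // an always-false callback stops here,
            break;                       // but the first token has already landed
    }
}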

@cebtenzzre (Member, Author)

I tested a version of GPT4All from before #1970, and it does indeed exhibit the behavior I mentioned when you restore a chat session:

decode(n_past=0):
pos=0 1 ''
pos=1 16121 ' ###'
pos=2 9779 ' User'
pos=3 31871 ':'
pos=4 13 '
'
pos=5 10302 'Message'
pos=6 31843 '.'
pos=7 13 '
'
pos=8 8458 '##'
pos=9 31922 '#'
pos=10 13166 ' Response'
pos=11 31871 ':'
pos=12 13 '
'
decode(n_past=13):
pos=13 16644 ' Hello'
decode(n_past=14):
pos=14 16644 ' Hello'
pos=15 31905 '!'
pos=16 312 ' I'
pos=17 705 ' am'
pos=18 1048 ' here'
pos=19 289 ' to'
pos=20 2803 ' assist'
pos=21 365 ' you'

Notice how pos=13 is a token freshly generated by the model based on the current parameters, random state (for temperature sampling), and so on, while pos=14 onward replays what the model generated during the previous session.
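
For contrast, a hedged sketch of the post-#1970 restore path: the saved reply is passed to LLModel::prompt as fakeReply, so it is evaluated as input rather than sampled, and no fresh token like pos=13 appears. The stand-in types and the exact signature below are assumptions, not quoted from the actual API:

#include <cstdint>
#include <functional>
#include <string>

// Hypothetical stand-ins, only to make the call shape concrete.
struct PromptContextLike { int32_t n_past = 0; };
struct LLModelLike {
    void prompt(const std::string &prompt,
                const std::function<bool(int32_t)> &promptCallback,
                const std::function<bool(int32_t, const std::string &)> &responseCallback,
                PromptContextLike &ctx,
                const std::string *fakeReply = nullptr);
};

// Restoring one saved (prompt, reply) turn without generating anything new.
void restoreTurn(LLModelLike &model, PromptContextLike &ctx,
                 const std::string &savedPrompt, const std::string &savedReply) {
    model.prompt(savedPrompt,
                 [](int32_t) { return true; },                      // prompt callback
                 [](int32_t, const std::string &) { return true; }, // response callback
                 ctx,
                 &savedReply); // fakeReply: replayed, not regenerated
}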

@cebtenzzre merged commit 402f515 into main on Mar 6, 2024 (6 of 10 checks passed)
@cebtenzzre deleted the fix-chatgpt branch on Mar 6, 2024 at 19:02