Interactive #47
Comments
@luca-saggese you need to maintain the context on the Node.js side, i.e. you should maintain a list of chat histories in which no item exceeds the context length of your model. That's why llama-node also exposes the tokenizer to Node.js. |
@hlhr202 thanks for the comment. Where should I pass the context to the new query? Within the prompt? |
Yes, your prompt should be a string composed from the chat history. At the same time, you also have to make sure it doesn't exceed the context length limit of the model. |
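For anyone reading later, here is a minimal sketch of the approach described above: keep the chat turns in an array on the Node.js side and build each prompt from the newest turns that fit a token budget. The names `ChatTurn`, `buildPrompt`, and `roughCount` are made up for illustration and are not part of llama-node. `roughCount` is only a whitespace word counter so the sketch runs on its own; in practice you would presumably pass a counter backed by the tokenizer that llama-node exposes.

```typescript
// Hypothetical token counter type: swap in a function backed by the real tokenizer.
type CountTokens = (text: string) => number;

interface ChatTurn {
  role: "user" | "assistant";
  text: string;
}

// Crude stand-in (assumption): one token per whitespace-separated word.
const roughCount: CountTokens = (text) =>
  text.split(/\s+/).filter((w) => w.length > 0).length;

// Build a prompt from the newest turns that fit in `maxTokens`,
// dropping the oldest turns first.
function buildPrompt(
  history: ChatTurn[],
  maxTokens: number,
  countTokens: CountTokens = roughCount
): string {
  const kept: string[] = [];
  let used = 0;
  // Walk backwards so the most recent turns are kept preferentially.
  for (let i = history.length - 1; i >= 0; i--) {
    const line = `${history[i].role}: ${history[i].text}`;
    const cost = countTokens(line);
    if (used + cost > maxTokens) break;
    kept.unshift(line);
    used += cost;
  }
  return kept.join("\n");
}
```

The `role: text` line format is arbitrary; use whatever prompt template your model expects, and leave headroom in `maxTokens` for the tokens the model will generate.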
understood, and what is the point of saveSession and loadSession? |
They are used for accelerating loading. |
@luca-saggese Keeping a list of previous messages in every prompt (as he suggested) works, but it is slow. Instead, during startup I call createCompletion once with the initial prompt, using feedPromptOnly and saveSession. (You can also copy the initial cache file to make future startups faster.) Every new message is then added individually with feedPromptOnly, saveSession and loadSession. To get a bot response, just call without feedPromptOnly as usual. This is still limited by the context length, with the added disadvantage that you can't clear old messages (though it takes a while to run into the 2048-token ctx limit). It also seems to improve "conversation memory" without the extra cost of including more messages in the chat history. |
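A rough sketch of the session flow described in the comment above, for illustration only. The `CompletionOptions` shape is an assumption modeled on the option names mentioned in this thread (`feedPromptOnly`, `saveSession`, `loadSession`); it is not copied from llama-node's type definitions, and `SessionChat` and `Complete` are hypothetical names. The completion function is injected, so you would need to wrap your actual `createCompletion` call to match it.

```typescript
// Assumed option shape, based on the option names used in this thread.
interface CompletionOptions {
  prompt: string;
  feedPromptOnly?: boolean;
  saveSession?: string;
  loadSession?: string;
}

// Hypothetical wrapper type around the real completion call.
type Complete = (opts: CompletionOptions) => Promise<string>;

class SessionChat {
  private complete: Complete;
  private sessionPath: string;

  constructor(complete: Complete, sessionPath: string) {
    this.complete = complete;
    this.sessionPath = sessionPath;
  }

  // Run once at startup: feed the initial prompt and save the session.
  async init(initialPrompt: string): Promise<void> {
    await this.complete({
      prompt: initialPrompt,
      feedPromptOnly: true,
      saveSession: this.sessionPath,
    });
  }

  // Add one new message to the saved session without generating a reply.
  async feed(message: string): Promise<void> {
    await this.complete({
      prompt: message,
      feedPromptOnly: true,
      loadSession: this.sessionPath,
      saveSession: this.sessionPath,
    });
  }

  // Generate a bot response: same call, but without feedPromptOnly.
  async reply(prompt: string): Promise<string> {
    return this.complete({
      prompt,
      loadSession: this.sessionPath,
      saveSession: this.sessionPath,
    });
  }
}
```

Because old messages stay in the session, this still runs into the context limit eventually; copying the session file written by `init` gives you a clean starting point to restore from.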
regarding the context length limit; rustformers/llm#77 might be related |
@end-me-please thanks for the help, here is a working version for anyone interested:
|
can we make it so previous prompts are part of an array? Otherwise it would continuously show the entire history with every response. |
@end-me-please @luca-saggese I can't make it work.
Two weird things:
And then:
The first prompt that I fed is completely ignored... |
I'm new to LLMs and llama but learning fast. I've written a small piece of code to chat via the CLI, but it doesn't seem to follow the context (i.e. work in interactive mode).
Am I missing something?