Description
Hi,
I have no problem using the chat until the prompt + answer token count exceeds the context size; at that point the answer is cut off mid-generation, without ever leaving the "session.ChatAsync" method. That much is expected, since I don't check my prompt size.
So I check the history in my code and remove some old messages whenever the total prompt size would exceed the context size, using "session.History.Messages.Remove()". But even though "context.Tokenize()" confirms the token count is now within the limit, the answer is still cut off mid-generation, as if the trimming had no effect.
So I don't really know what's wrong.
Reproduction Steps
private static uint contextSize = 260;

public void InitChat(string modelPath)
{
    Task.Run(async () =>
    {
        var parameters = new ModelParams(modelPath)
        {
            ContextSize = contextSize, // Limiting context to reproduce the bug quickly, but it happens no matter the context size.
            GpuLayerCount = 5 // How many layers to offload to GPU. Please adjust it according to your GPU memory.
        };
        using var model = LLamaWeights.LoadFromFile(parameters);
        using var context = model.CreateContext(parameters);
        var executor = new InteractiveExecutor(context);
        var chatHistory = new ChatHistory();
        chatHistory.AddMessage(AuthorRole.System, "You are a coding assistant.");
        ChatSession session = new(executor, chatHistory);
        var bannedWords = new List<string>() { "User:" };
        while (true)
        {
            // signalEvent, token and currentMessage are fields set elsewhere (omitted here).
            signalEvent.Wait(token);
            InferenceParams inferenceParams = new InferenceParams()
            {
                MaxTokens = 56,
            };
            // Check that the future context won't exceed contextSize
            // (history + answer MaxTokens + current message length + 20 tokens of safety margin).
            while (context.Tokenize(session.HistoryTransform.HistoryToText(session.History)).Length
                   + inferenceParams.MaxTokens
                   + context.Tokenize(currentMessage).Length
                   + 20 > contextSize)
            {
                // Remove the first message that is not system to free some context.
                session.History.Messages.Remove(session.History.Messages.FirstOrDefault(m => m.AuthorRole != AuthorRole.System));
            }
            string buffer = "";
            await foreach (var text in session.ChatAsync(new ChatHistory.Message(AuthorRole.User, currentMessage), inferenceParams))
            {
                buffer += text;
            }
            signalEvent.Reset();
        }
    });
}
Environment & Configuration
Operating system: Windows 11
.NET runtime version: .NET 8
LLamaSharp version: 0.11.2
CUDA version (if you are using cuda backend): CPU Backend, 0.11.2
CPU & GPU device: AMD Ryzen 7 4800HS
Known Workarounds
I tried the same code with version 0.10 and, oddly, it works: the conversation continues as expected, with old messages removed to keep the context size below the limit.
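One possibility (not confirmed against 0.11.2's source) is that the InteractiveExecutor keeps its own record of already-evaluated tokens, so removing messages from session.History does not shrink what is already in the context's KV cache. Under that assumption, a minimal sketch of a workaround is to rebuild the context and session from the trimmed history before the next ChatAsync call; RebuildSession is a hypothetical helper, not part of the LLamaSharp API:

```csharp
using LLama;
using LLama.Common;

static ChatSession RebuildSession(LLamaWeights model, ModelParams parameters, ChatHistory trimmedHistory)
{
    // A fresh context starts with an empty KV cache, so the next ChatAsync call
    // re-evaluates only the trimmed history instead of the stale full transcript.
    // Note: the previous context should be disposed by the caller to free memory.
    var context = model.CreateContext(parameters);
    var executor = new InteractiveExecutor(context);
    return new ChatSession(executor, trimmedHistory);
}
```

Recreating the context costs a full prompt re-evaluation, but it guarantees that the executor's internal state matches the trimmed history, which may be why the same trimming logic behaved correctly on 0.10.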