How to implement a multi-turn chat with memory (chat context) using onnx.genai and the DeepSeek reasoning model #1312

John0King opened this issue Mar 8, 2025 · 3 comments


I use the following code with the deepseek-r1-1b model, but it does not work well.
I first ask it `1+1=?`.
When I follow up with "add 1 more", it only gives me the result 2, and it loses the opening `<think>` token.

```csharp
using System.Text;
using Microsoft.ML.OnnxRuntimeGenAI;

// model and tokenizer are created earlier, e.g.:
// using Model model = new(modelPath);
// using Tokenizer tokenizer = new(model);

using GeneratorParams generatorParams = new(model);
generatorParams.SetSearchOption("max_length", 4096);
using var tokenizerStream = tokenizer.CreateStream();

List<string> chatHistory = new List<string>();

ulong s = 0;
var sb = new StringBuilder();
do
{
    // A new generator is created every turn, so the full history is re-encoded below.
    using var generator = new Generator(model, generatorParams);
    Console.Write("Enter a prompt: ");
    string prompt = Console.ReadLine()!;
    //var sequences = tokenizer.Encode($"<|begin▁of▁sentence|><|User|>{prompt}<|end▁of▁sentence|>\n<|Assistant|>");
    chatHistory.Add($"<|begin▁of▁sentence|><|User|>{prompt}<|end▁of▁sentence|>\n<|Assistant|>");
    //var sequences = tokenizer.EncodeBatch(chatHistory.ToArray());
    var sequences = tokenizer.Encode(string.Join('\n', chatHistory));
    generator.AppendTokenSequences(sequences);
    sb.Clear();
    sb.Append("<|begin▁of▁sentence|>");
    while (!generator.IsDone())
    {
        //generator.ComputeLogits(); // no longer needed in current onnxruntime-genai versions
        generator.GenerateNextToken();
        // Decode and print only the newest token.
        var str = tokenizerStream.Decode(generator.GetSequence(s)[^1]);
        Console.Write(str);
        sb.Append(str);
    }
    sb.Append("<|end▁of▁sentence|>\n");
    chatHistory.Add(sb.ToString());
    Console.WriteLine();
    //s++;
}
while (true);
```
@natke (Contributor) commented Mar 9, 2025

Hi @John0King,

Have a look at the snippet and see if that helps: https://onnxruntime.ai/docs/genai/howto/migrate.html#add-chat-mode-to-your-c-application-1

You don't need to re-create the generator on each iteration of the loop, and you only need to append the new prompt; the model takes care of the previous context.

Let us know how that goes!
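
For reference, a minimal sketch of that pattern (adapted from the linked guide, not verbatim, and reusing the DeepSeek template tokens from the original post; whether those exact tokens match the model's chat template should be verified): one generator lives across all turns, and each turn appends only the newly encoded prompt.

```csharp
// One generator for the whole conversation; its KV cache accumulates the context.
using var generator = new Generator(model, generatorParams);
using var tokenizerStream = tokenizer.CreateStream();

while (true)
{
    Console.Write("Prompt: ");
    string prompt = Console.ReadLine()!;

    // Append only the new turn; earlier turns are already in the generator's state.
    var sequences = tokenizer.Encode($"<|User|>{prompt}<|Assistant|>");
    generator.AppendTokenSequences(sequences);

    while (!generator.IsDone())
    {
        generator.GenerateNextToken();
        Console.Write(tokenizerStream.Decode(generator.GetSequence(0)[^1]));
    }
    Console.WriteLine();
}
```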

@John0King (Author)

@natke it doesn't help. The reason I use `using var generator = new Generator(model, generatorParams);` inside the loop is that otherwise it loses the reasoning start token `<think>` (and the suggested snippet doesn't fix that, which is why I asked here on GitHub).

I'm also looking for an example of how to create an OpenAI-compatible web API.
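
For illustration, a hedged sketch of such an endpoint using ASP.NET Core minimal APIs over Microsoft.Extensions.AI's `IChatClient`. The `ChatCompletionRequest`/`ChatCompletionMessage` DTOs and the model path below are hypothetical and cover only a fraction of the OpenAI schema, the client construction mirrors the snippet in the next comment, and `IChatClient` method names have shifted across Microsoft.Extensions.AI versions:

```csharp
using Microsoft.Extensions.AI;
using Microsoft.ML.OnnxRuntimeGenAI;

var builder = WebApplication.CreateBuilder(args);
var app = builder.Build();

// Placeholder model path; construction mirrors the snippet in the next comment.
using var model = new Model("path/to/deepseek-r1-model");
using IChatClient client = new OnnxRuntimeGenAIChatClient(new OnnxRuntimeGenAIChatClientOptions
{
    PromptFormatter = (messages, options) => string.Concat(
        messages.Select(m => $"<|begin▁of▁sentence|><|{m.Role}|>{m.Text}<|end▁of▁sentence|>\n")) + "<|Assistant|>",
}, model, false);

// Note: a single generator-backed client is not safe to share across concurrent
// requests without synchronization; a real server needs per-conversation handling.
app.MapPost("/v1/chat/completions", async (ChatCompletionRequest request) =>
{
    var messages = request.Messages
        .Select(m => new ChatMessage(new ChatRole(m.Role), m.Content))
        .ToList();

    var response = await client.GetResponseAsync(messages);

    // Return the minimal subset of the OpenAI chat-completions response shape.
    return Results.Json(new
    {
        id = Guid.NewGuid().ToString("N"),
        @object = "chat.completion",
        choices = new[]
        {
            new { index = 0, message = new { role = "assistant", content = response.Text } }
        }
    });
});

app.Run();

// Hypothetical request DTOs covering only the fields used above.
record ChatCompletionRequest(string Model, List<ChatCompletionMessage> Messages);
record ChatCompletionMessage(string Role, string Content);
```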

@John0King (Author) commented Mar 12, 2025

## The problem

(screenshot showing the problem)

## Code

```csharp
using System.Linq;
using System.Text;
using Microsoft.Extensions.AI;
using Microsoft.ML.OnnxRuntimeGenAI;

using OnnxRuntimeGenAIChatClient client = new OnnxRuntimeGenAIChatClient(new OnnxRuntimeGenAIChatClientOptions
{
    // Formats the whole message list into a single prompt string.
    // Note: this appends <|Assistant|> after every message, not just the last one.
    PromptFormatter = (prompt, context) =>
    {
        var sb = new StringBuilder();
        sb.Append(string.Join("", prompt.Select(x => $"<|begin▁of▁sentence|><|{x.Role}|>{x.Text}<|end▁of▁sentence|>\n<|Assistant|>")));
        return sb.ToString();
    },
}, model, false);

List<ChatMessage> chatMessages = new List<ChatMessage>();
do
{
    Console.WriteLine();
    Console.WriteLine("Prompt:");
    var prompt = Console.ReadLine();
    if (prompt == "exit")
    {
        break;
    }
    chatMessages.Add(new ChatMessage(ChatRole.User, prompt));
    List<ChatResponseUpdate> chatMessageUpdates = [];
    await foreach (var x in client.GetStreamingResponseAsync(chatMessages, new ChatOptions
    {
        MaxOutputTokens = 4096,
        AdditionalProperties = new() { { "max_length", 4096 } },
    }))
    {
        chatMessageUpdates.Add(x);
        Console.Write(x.ToString());
    }
    var response = chatMessageUpdates.ToChatResponse();
    chatMessages.Add(response.Message);
    Console.WriteLine();
}
while (true);
```
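
One thing worth noting in the formatter above: `<|Assistant|>` is appended after every message rather than once at the end, and `ChatRole` renders in lowercase ("user", "assistant") while the DeepSeek tokens in this thread are capitalized ("<|User|>"). A hedged sketch of an alternative formatter, assuming the template tokens used throughout this thread (the model's actual chat template in `tokenizer_config.json` should be the reference):

```csharp
PromptFormatter = (messages, options) =>
{
    var sb = new StringBuilder("<|begin▁of▁sentence|>");
    foreach (var m in messages)
    {
        // Capitalize the role to match the <|User|> / <|Assistant|> token casing.
        var role = m.Role == ChatRole.User ? "User" : "Assistant";
        sb.Append($"<|{role}|>{m.Text}<|end▁of▁sentence|>\n");
    }
    // A single trailing assistant tag so generation continues as the assistant.
    sb.Append("<|Assistant|>");
    return sb.ToString();
};
```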

John0King changed the title from "How to implement a multi-turn chat with memory (chat context) using onnx.genai" to "How to implement a multi-turn chat with memory (chat context) using onnx.genai and the DeepSeek reasoning model" on Mar 12, 2025