# Chat History Reducers in Semantic Kernel

## Overview

This notebook explores Chat History Reducers, a pattern for managing token usage and context window limits when working with Large Language Models (LLMs) in conversational scenarios.

### The Problem

Large Language Models can process only a limited number of tokens at once, a limit referred to as the **context window**. Every token sent adds cost and latency, and a request that exceeds the window is typically rejected by the service. It is therefore essential to manage the size of the input sent to the LLM, particularly with chat completion models, where the history grows with every turn.

### The Solution

Chat History Reducers provide strategies to shrink the chat history when it becomes too large, while preserving the context that matters most for a meaningful conversation.

In [1]:
#r "nuget: Microsoft.SemanticKernel, 1.61.0"

#!import config/Settings.cs

using Microsoft.SemanticKernel;
using Kernel = Microsoft.SemanticKernel.Kernel;

Kernel CreateKernel()
{
    var builder = Kernel.CreateBuilder();

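    var (useAzureOpenAI, model, azureEndpoint, apiKey, orgId, embeddingEndpoint, embeddingApiKey) = Settings.LoadFromFile();

    // Register the chat completion service for the configured provider
    if (useAzureOpenAI)
    {
        builder.AddAzureOpenAIChatCompletion(model, azureEndpoint, apiKey);
    }
    else
    {
        builder.AddOpenAIChatCompletion(model, apiKey, orgId);
    }

    var kernel = builder.Build();

    return kernel;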
}

var kernel = CreateKernel();

In [2]:
#r "nuget: Microsoft.SemanticKernel.Agents.Core, 1.61.0"

## The IChatHistoryReducer Interface

The foundation of chat history reduction is the `IChatHistoryReducer` interface. Each reduction strategy implements its single asynchronous method, `ReduceAsync`.

```csharp
/// <summary>
/// Interface for reducing the chat history before sending it to the chat completion provider.
/// </summary>
public interface IChatHistoryReducer
{
    /// <summary>
    /// Reduce the <see cref="ChatHistory"/> before sending it to the <see cref="IChatCompletionService"/>.
    /// </summary>
    /// <param name="chatHistory">Instance of <see cref="ChatHistory"/> to be reduced.</param>
    /// <param name="cancellationToken">Cancellation token.</param>
    Task<IEnumerable<ChatMessageContent>?> ReduceAsync(ChatHistory chatHistory, CancellationToken cancellationToken);

}
```
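Note the nullable return type of `ReduceAsync`. The sketch below shows one way to consume it, assuming that a `null` result means the reducer left the history unchanged. `ApplyReducerAsync` is a hypothetical helper, not part of Semantic Kernel, and the code assumes the NuGet package from the first cell is already loaded.

```csharp
#pragma warning disable SKEXP0001
using System.Threading;
using System.Threading.Tasks;
using Microsoft.SemanticKernel.ChatCompletion;

// Hypothetical helper (not part of Semantic Kernel): applies a reducer and
// falls back to the original history when the reducer returns null.
async Task<ChatHistory> ApplyReducerAsync(
    IChatHistoryReducer reducer,
    ChatHistory chatHistory,
    CancellationToken cancellationToken = default)
{
    var reduced = await reducer.ReduceAsync(chatHistory, cancellationToken);

    // Assumption: a null result means no reduction was necessary.
    return reduced is null ? chatHistory : new ChatHistory(reduced);
}
```

Returning a new `ChatHistory` rather than mutating the input keeps the original messages available, which can be handy when the full transcript still needs to be stored or displayed.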

## Strategy 1: Truncating Based on Message Count

The simplest approach to reduce chat history is to send only the last N messages to the LLM. 

This is achieved with `ChatHistoryTruncationReducer` from the `Microsoft.SemanticKernel.ChatCompletion` namespace.

In [3]:
#pragma warning disable SKEXP0110
#pragma warning disable SKEXP0001
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Agents;

var biblicalHebrewAgent = new ChatCompletionAgent
{
    Name = "biblicalHebrewAgent",
    Instructions = 
        """
        You are an expert in ancient Hebrew. When you receive messages, you summarize them and translate them into Biblical Hebrew.
        Answer ONLY in Biblical Hebrew!
        """,
    Description = "An expert in ancient Hebrew that translates texts and messages into Hebrew",
    Kernel = kernel,

    // This will only be invoked automatically when the agent is part of a chat
    HistoryReducer = new ChatHistoryTruncationReducer(3),
};

// Creating a long history of messages
var chatHistory = new ChatHistory();
chatHistory.AddUserMessage("The Matrix is everywhere.");
chatHistory.AddUserMessage("It is all around us.");
chatHistory.AddUserMessage("Even now, in this very room.");
chatHistory.AddUserMessage("You can see it when you look out your window...");
chatHistory.AddUserMessage("It is the world that has been pulled over your eyes to blind you from the truth");

var chatHistoryAgentThread = new ChatHistoryAgentThread(chatHistory);
// The agent is not part of a chat here, so run the reducer explicitly
await biblicalHebrewAgent.ReduceAsync(chatHistoryAgentThread.ChatHistory);

// The returned stream is lazy: the agent is only invoked once it is enumerated further down
IAsyncEnumerable<AgentResponseItem<ChatMessageContent>> response = biblicalHebrewAgent.InvokeAsync("remind me everything i just said", chatHistoryAgentThread);

Console.WriteLine("-------- History --------");
foreach(var msg in chatHistoryAgentThread.ChatHistory)
{
    Console.WriteLine(msg);
}


Console.WriteLine("-------- Agent Response --------");
await foreach (var item in response)
{
    Console.WriteLine(item.Message);
}

-------- History --------
Even now, in this very room.
You can see it when you look out your window...
It is the world that has been pulled over your eyes to blind you from the truth
-------- Agent Response --------
גַּם עַתָּה בַּחֶדֶר הַזֶּה.  
תִּרְאֶה בַּאֲשֶׁר תַּבִּיט מִן הַחַלּוֹן.  
הַעוֹלָם שֶׁשֻׁלַּף עַל עֵינֶיךָ לְסַלֵּף אֶת הָאֱמֶת.  

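Conceptually, truncation amounts to keeping the tail of the conversation. The following is a minimal sketch of that idea in plain C#, not the library's implementation: messages are modeled as `(Role, Content)` tuples, `KeepLastMessages` is a hypothetical helper, and preserving the first system message is an assumption made for illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Conceptual sketch only: this is NOT how ChatHistoryTruncationReducer is implemented.
// Keeps the first system message (an assumption) plus the last `targetCount` other messages.
List<(string Role, string Content)> KeepLastMessages(
    IReadOnlyList<(string Role, string Content)> history, int targetCount)
{
    var system = history.Where(m => m.Role == "system").Take(1);
    var recent = history.Where(m => m.Role != "system").TakeLast(targetCount);
    return system.Concat(recent).ToList();
}

var history = new List<(string Role, string Content)>
{
    ("system", "Answer only in Biblical Hebrew."),
    ("user", "The Matrix is everywhere."),
    ("user", "It is all around us."),
    ("user", "Even now, in this very room."),
    ("user", "You can see it when you look out your window..."),
};

var reduced = KeepLastMessages(history, targetCount: 3);

// Prints the system message followed by the last three user messages
foreach (var (role, content) in reduced)
{
    Console.WriteLine($"{role}: {content}");
}
```

The trade-off is visible even in this toy version: everything before the cut is simply gone, so the model cannot answer questions about the earliest messages.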

## Strategy 2: Summarizing Older Messages

A more sophisticated strategy summarizes the older part of the chat history and sends the system message, the summary, and the most recent messages to the LLM. This preserves context while still reducing token usage.

This is done with `ChatHistorySummarizationReducer`.

In [4]:
#pragma warning disable SKEXP0110
#pragma warning disable SKEXP0001
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Agents;

var chatCompletionService = kernel.GetRequiredService<IChatCompletionService>();

var biblicalHebrewAgent = new ChatCompletionAgent
{
    Name = "biblicalHebrewAgent",
    Instructions = 
        """
        You are an expert in ancient Hebrew. When you receive messages, you summarize them and translate them into Biblical Hebrew.
        Answer ONLY in Biblical Hebrew!
        """,
    Description = "An expert in ancient Hebrew that translates texts and messages into Hebrew",
    Kernel = kernel,

    // This will only be invoked automatically when the agent is part of a chat
    HistoryReducer = new ChatHistorySummarizationReducer(chatCompletionService, 
                                                         targetCount: 1,     // Desired number of messages to keep after reduction
                                                         thresholdCount: 2 ) // Extra messages beyond targetCount that must accumulate before summarization triggers
                                                         { 
                                                            UseSingleSummary = true // Fold earlier summaries into a single summary message
                                                         }
};

// Creating a long history of messages
var chatHistory = new ChatHistory();
chatHistory.AddUserMessage("The Matrix is everywhere.");
chatHistory.AddUserMessage("It is all around us.");
chatHistory.AddUserMessage("Even now, in this very room.");
chatHistory.AddUserMessage("You can see it when you look out your window...");
chatHistory.AddUserMessage("It is the world that has been pulled over your eyes to blind you from the truth");

var chatHistoryAgentThread = new ChatHistoryAgentThread(chatHistory);
// The agent is not part of a chat here, so run the reducer explicitly
await biblicalHebrewAgent.ReduceAsync(chatHistoryAgentThread.ChatHistory);

// The returned stream is lazy: the agent is only invoked once it is enumerated further down
IAsyncEnumerable<AgentResponseItem<ChatMessageContent>> response = biblicalHebrewAgent.InvokeAsync("remind me everything i just said", chatHistoryAgentThread);

Console.WriteLine("-------- History --------");
foreach(var msg in chatHistoryAgentThread.ChatHistory)
{
    Console.WriteLine(msg);
}


Console.WriteLine("-------- Agent Response --------");
await foreach (var item in response)
{
    Console.WriteLine(item.Message);
}

-------- History --------
The dialog references a well-known scene from "The Matrix," highlighting the idea that the Matrix is an all-encompassing illusion. It is described as omnipresent, surrounding individuals even in their immediate environment. The Matrix is depicted as a deceptive construct, obscuring the truth from those within it. The user builds the quote sequentially, focusing on its profound implications.
It is the world that has been pulled over your eyes to blind you from the truth
-------- Agent Response --------
הוּא הָעוֹלָם הַנִּמְשַׁךְ עַל עֵינֶיךָ לְהַעֲוֹר מִמְּךָ אֶת הָאֱמֶת.

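The interplay of `targetCount` and `thresholdCount` can be modeled with a little arithmetic. The sketch below is a simplified model, assuming reduction starts once the history holds more than `targetCount + thresholdCount` messages. `ShouldReduce` is a hypothetical helper, and the shipped reducers apply additional rules (for example around system messages) that this sketch ignores.

```csharp
using System;

// Simplified, assumed trigger rule: reduce only when the history has grown
// past the target size by more than the threshold buffer.
bool ShouldReduce(int messageCount, int targetCount, int thresholdCount = 0)
    => messageCount > targetCount + thresholdCount;

// With targetCount: 1 and thresholdCount: 2, three messages are still tolerated...
Console.WriteLine(ShouldReduce(messageCount: 3, targetCount: 1, thresholdCount: 2)); // False

// ...while the five messages used in this notebook exceed the buffer.
Console.WriteLine(ShouldReduce(messageCount: 5, targetCount: 1, thresholdCount: 2)); // True
```

A threshold above zero keeps the reducer from running on every single turn, which matters for summarization because each reduction costs an extra LLM call.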

## Reducing Without an Agent

A reducer can also be used directly, without an agent.

In [5]:
// Creating a long history of messages
var chatHistory = new ChatHistory();
chatHistory.AddUserMessage("The Matrix is everywhere.");
chatHistory.AddUserMessage("It is all around us.");
chatHistory.AddUserMessage("Even now, in this very room.");
chatHistory.AddUserMessage("You can see it when you look out your window...");
chatHistory.AddUserMessage("It is the world that has been pulled over your eyes to blind you from the truth");

var chatCompletionService = kernel.GetRequiredService<IChatCompletionService>();

var historyReducer = new ChatHistorySummarizationReducer(chatCompletionService, targetCount: 1) { UseSingleSummary = true };

var reducedHistory = await historyReducer.ReduceAsync(chatHistory);
Console.WriteLine($"Summary of chat so far: {reducedHistory?.FirstOrDefault()?.Content}");


Summary of chat so far: The user begins quoting a famous speech from "The Matrix" by Morpheus, describing the Matrix as an omnipresent and deceptive construct. They convey that it exists everywhere, even within the confines of a room, visible through observation like looking out a window. The Matrix is characterized as a false reality designed to obscure the truth from humanity. There is an ongoing progression of the original dialogue's themes about perception and deception in the world. Further interaction appears to build on this established context.
